Compare commits

28 Commits

Author SHA1 Message Date
core-devops f82a980a79 test(canvas): add Google ADK to CreateWorkspaceDialog runtime-options assertion
audit-force-merge / audit (pull_request) Successful in 7s
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
Check migration collisions / Migration version collision check (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 3s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 44s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 37s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m18s
CI / Platform (Go) (pull_request) Successful in 5m1s
CI / Canvas (Next.js) (pull_request) Successful in 4m54s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
CI / all-required (pull_request) Successful in 25m28s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m22s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 32s
gate-check-v3 / gate-check (pull_request) Failing after 4s
sop-checklist / all-items-acked (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
Harness Replays / Harness Replays (pull_request) Successful in 1s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 37m39s
RUNTIME_OPTIONS gained 'Google ADK', but the hardcoded expected array in the
separate-selectors test still listed 4 entries, so Canvas (Next.js) CI went red
(5 actual vs 4 expected). Add it in component order (after OpenAI Codex CLI).
Caught by comprehensive pre-merge review: a real regression from this PR's own
diff, not the staging-E2E infra flake.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 11:02:14 -07:00
core-devops 0359912d06 feat: register google-adk runtime (manifest + knownRuntimes + canvas)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 18s
CI / Python Lint & Test (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 16s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 3s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 34s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 55s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m26s
security-review / approved (pull_request) Failing after 10s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-tier-check / tier-check (pull_request) Successful in 8s
sop-checklist / all-items-acked (pull_request) Successful in 9s
CI / Platform (Go) (pull_request) Successful in 4m30s
CI / Canvas (Next.js) (pull_request) Failing after 4m54s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Failing after 18m25s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m18s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 1s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 42m20s
Platform-side registration for the google-adk workspace runtime (RFC
internal#730). Required so a workspace with runtime: google-adk provisions
(Docker path) and is creatable from the canvas:
- manifest.json: workspace_templates entry → handler allowlist (loadRuntimesFromManifest)
- provisioner/registry.go: knownRuntimes += google-adk (else ErrUnresolvableRuntime); test count 4→5
- canvas CreateWorkspaceDialog: RUNTIME_OPTIONS + BASE_RUNTIME_TEMPLATE_IDS
- canvas runtime-names.ts: display name
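
As a rough illustration of why the knownRuntimes entry matters, the sketch below models an allowlist lookup that refuses unregistered runtimes. It is written in Python for brevity, not in the Go of provisioner/registry.go; `KNOWN_RUNTIMES`, `resolve_runtime`, and `UnresolvableRuntimeError` are hypothetical stand-ins, and the set lists only the two runtime names that appear in this log, not the full five.

```python
class UnresolvableRuntimeError(Exception):
    """Stand-in for the ErrUnresolvableRuntime named in the commit message."""


# Partial, illustrative allowlist. The real registry is said to hold five
# runtimes after this change; only names mentioned in this log are listed.
KNOWN_RUNTIMES = frozenset({"claude-code", "google-adk"})


def resolve_runtime(name: str) -> str:
    """Return the runtime name if it is registered, else raise."""
    if name not in KNOWN_RUNTIMES:
        raise UnresolvableRuntimeError(f"unknown runtime: {name!r}")
    return name
```

Under this model, a workspace declaring `runtime: google-adk` would fail at the lookup until the name is added to the allowlist.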

Depends on molecule-ai-workspace-template-google-adk (image build/publish) +
controlplane runtime_image_pins (SaaS path) — tracked in RFC #730.
Verified: go build + provisioner/handlers tests green; manifest.json valid.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-28 21:30:36 -07:00
hongming c99b0e3601 Merge pull request 'fix(workspace-server): provider-matched byok credential injection (internal#728 Bug 1) [BEHAVIOR-AFFECTING — CTO merge-go]' (#2000) from fix/internal-728-provider-matched-cred-injection into main
lint-bp-context-emit-match / lint-bp-context-emit-match (push) Successful in 1m15s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 4s
main-red-watchdog / watchdog (push) Successful in 2m5s
gate-check-v3 / gate-check (push) Successful in 25s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 31s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Harness Replays / detect-changes (push) Successful in 9s
publish-workspace-server-image / build-and-push (push) Successful in 6m21s
ci-arm64-advisory / fast-checks (push) Waiting to run
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 46s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 9s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m40s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
CI / Platform (Go) (push) Successful in 5m52s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 9s
CI / Canvas (Next.js) (push) Successful in 5s
CI / Shellcheck (E2E scripts) (push) Successful in 15s
CI / all-required (push) Successful in 7m54s
CI / Canvas Deploy Reminder (push) Successful in 2s
verify-providers-gen / Regenerate providers artifact and fail on drift (push) Successful in 42s
Harness Replays / Harness Replays (push) Successful in 7s
ci-required-drift / drift (push) Successful in 1m5s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m31s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Failing after 4m47s
E2E Chat / E2E Chat (push) Successful in 4m23s
publish-workspace-server-image / Production auto-deploy (push) Successful in 3m41s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 14s
CI / Detect changes (push) Successful in 9s
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Python Lint & Test (push) Successful in 4s
E2E API Smoke Test / detect-changes (push) Successful in 14s
E2E Chat / detect-changes (push) Successful in 11s
sync-providers-yaml / Compare synced providers.yaml against controlplane canonical (push) Successful in 4s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 7s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Has started running
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 11s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Has started running
2026-05-29 00:29:07 +00:00
hongming 4414c92a87 fix(workspace-server): provider-matched byok credential injection — strip stray non-matching global-origin LLM creds (internal#728 Bug 1) [BEHAVIOR-AFFECTING — CTO merge-go]
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 11s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 31s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 10s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 25s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m10s
sop-tier-check / tier-check (pull_request) Successful in 5s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 5m37s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
CI / Platform (Go) (pull_request) Successful in 4m37s
Harness Replays / Harness Replays (pull_request) Successful in 3s
CI / all-required (pull_request) Successful in 7m16s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m33s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m42s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 7s
#1995 removed the blanket global-LLM-cred strip on the byok branch (correct for
the platform-key co-mingling it targeted), but left EVERY claude-code workspace
inheriting the tenant-global CLAUDE_CODE_OAUTH_TOKEN. The claude-code runtime
greedily prefers that oauth (llm-auth: detected oauth -> api.anthropic.com), so
a workspace whose RESOLVED provider is NOT anthropic-oauth (minimax, kimi-byok)
routes its non-Anthropic model to Anthropic -> "Claude Code returned an error
result" (agents-team Dev Engineer B, MiniMax-M2.7; live-confirmed 2026-05-28 via
SSM container logs, internal#728 comment 52493).

Fix: provider-AWARE replacement for the over-removed strip. On the byok/disabled
branch, keep ONLY the global-origin LLM bypass creds whose env-var name is in
the RESOLVED provider's auth_env; strip the rest.
- minimax auth_env MINIMAX_API_KEY/ANTHROPIC_AUTH_TOKEN/ANTHROPIC_API_KEY ->
  stray global CLAUDE_CODE_OAUTH_TOKEN is non-matching -> stripped (fixes DevB).
- anthropic-oauth auth_env CLAUDE_CODE_OAUTH_TOKEN -> matches -> kept (PM opus +
  reno opus-byok NOT regressed; #1994 ByokGlobalScopeOAuthSurvives guard holds).
NOT a return to the blanket strip (which would re-break the byok-anthropic-oauth
case #1994 fixed) — keyed off DeriveProvider's resolved provider.

Provenance-scoped: only operator-store (global_secrets) origin keys are
provider-gated. User-authored workspace_secrets (provenance flag cleared by
loadWorkspaceSecrets) are NEVER stripped — JRS kimi workspace-key, reno's own
oauth are exempt. Fail-OPEN: an underivable provider / unavailable registry
strips nothing (keep-first; worst case is a kept stray, never removing the only
usable cred -> never fail-closes a legitimate byok workspace).
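
The keep/strip rule described above can be sketched roughly as follows. This is a Python illustration, not the Go code in workspace-server: the function name, the `LLM_BYPASS_KEYS` set, and the shape of `PROVIDER_AUTH_ENV` are assumptions. Only the two auth_env lists and the provenance and fail-open rules come from the commit message.

```python
from typing import Dict, Optional, Set

# Assumed bypass list, for illustration only.
LLM_BYPASS_KEYS = frozenset({
    "CLAUDE_CODE_OAUTH_TOKEN",
    "ANTHROPIC_API_KEY",
    "ANTHROPIC_AUTH_TOKEN",
    "MINIMAX_API_KEY",
})

# auth_env per resolved provider, as quoted in the commit message.
PROVIDER_AUTH_ENV = {
    "minimax": {"MINIMAX_API_KEY", "ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_API_KEY"},
    "anthropic-oauth": {"CLAUDE_CODE_OAUTH_TOKEN"},
}


def strip_non_matching_global_creds(
    env: Dict[str, str],
    global_keys: Set[str],
    provider: Optional[str],
) -> Dict[str, str]:
    """Drop global-origin LLM creds that the resolved provider does not use."""
    auth_env = PROVIDER_AUTH_ENV.get(provider) if provider else None
    if auth_env is None:
        # Fail-open: underivable provider or missing registry strips nothing.
        return dict(env)
    kept = {}
    for key, value in env.items():
        is_stray = (
            key in LLM_BYPASS_KEYS      # only LLM bypass creds are gated
            and key in global_keys      # only operator-store (global) origin
            and key not in auth_env     # and only when the provider ignores it
        )
        if not is_stray:
            kept[key] = value
    return kept
```

A key that came from workspace_secrets is simply absent from `global_keys`, so it is never a strip candidate regardless of the provider.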

Threads loadWorkspaceSecrets's globalKeys provenance side-channel into
applyPlatformManagedLLMEnv (signature +map[string]struct{}); caller
prepareProvisionContext already has it.

Tests (llm_billing_mode_provision_parity_test.go):
- MinimaxStripsStrayGlobalOAuth — DevB repro: minimax-resolving ws strips the
  stray global oauth + keeps MINIMAX_API_KEY routing.
- WorkspaceOriginCredExemptFromStrip — user-authored ws_secrets cred survives
  even when non-matching.
- ByokGlobalScopeOAuthSurvives (strengthened) — global-origin oauth on opus
  SURVIVES via provider match (PM/reno regression guard).
Mutation-load-bearing (verified RED): (1) remove strip -> blanket-keep regresses
DevB; (2) empty keep set (provider-unaware) -> minimax routing + reno oauth
stripped; (3) iterate all bypass keys (provenance-unaware) -> user-authored cred
stripped.

build ok; build -tags=integration ok; go test ./internal/handlers/ ok;
golangci-lint ./internal/handlers/ -> 0 issues. Refs internal#728.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 00:05:21 +00:00
hongming efa60621f3 Merge pull request 'fix(prod-auto-deploy): fail on tenants not verified on target build (internal#724)' (#1998) from fix/internal-724-prod-auto-deploy-straggler-surfacing into main
CI / Detect changes (push) Successful in 19s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m27s
CI / Python Lint & Test (push) Successful in 27s
E2E Chat / detect-changes (push) Successful in 19s
publish-workspace-server-image / Production auto-deploy (push) Successful in 2m30s
E2E API Smoke Test / detect-changes (push) Successful in 21s
CI / all-required (push) Successful in 2m0s
Handlers Postgres Integration / detect-changes (push) Successful in 23s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 27s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 9s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 10s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 6s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Failing after 1m20s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
verify-providers-gen / Regenerate providers artifact and fail on drift (push) Successful in 40s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m36s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m21s
publish-workspace-server-image / build-and-push (push) Successful in 5m25s
CI / Platform (Go) (push) Successful in 2s
CI / Canvas (Next.js) (push) Successful in 3s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 2s
E2E Chat / E2E Chat (push) Successful in 3s
CI / Canvas Deploy Reminder (push) Successful in 2s
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 13s
Block internal-flavored paths / Block forbidden paths (push) Successful in 16s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 16s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 7s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
main-red-watchdog / watchdog (push) Successful in 2m1s
gate-check-v3 / gate-check (push) Successful in 24s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 10s
ci-required-drift / drift (push) Successful in 1m28s
2026-05-28 21:58:31 +00:00
hongming-personal 367bc1f7fc fix(prod-auto-deploy): fail on tenants not verified on target build (internal#724)
audit-force-merge / audit (pull_request) Successful in 17s
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 20s
CI / all-required (pull_request) Successful in 2m42s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 18s
E2E API Smoke Test / detect-changes (pull_request) Successful in 20s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 12s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 12s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 8s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 1m4s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m21s
gate-check-v3 / gate-check (pull_request) Successful in 8s
qa-review / approved (pull_request) Failing after 5s
security-review / approved (pull_request) Failing after 7s
sop-checklist / review-refire (pull_request) Has been skipped
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m31s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 35s
sop-tier-check / tier-check (pull_request) Successful in 4s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Platform (Go) (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 6s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m34s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
The production auto-deploy aggregated per-tenant redeploy-fleet results
but never asserted fleet COVERAGE: a tenant that was enumerated but
silently skipped, or that SSM-succeeded onto the old image, passed as a
clean deploy. That is how agents-team stayed 46h behind the fleet with no
straggler reported.

Pairs with the controlplane fix that adds per-tenant verified_on_target
(docker-inspect proof the container is on the target tag). This change:

- rollout_stragglers(): every enumerated tenant NOT proven on the target
  build is a straggler — errored, skipped (no result row, the agents-team
  class), or verified_on_target=false. Backward-compatible: a missing key
  (pre-fix CP) is treated as verified so the gate degrades to the old
  ok-based behavior against an un-upgraded CP rather than failing spuriously.
- assert_full_coverage(): raises RolloutFailed (→ non-zero exit, response
  JSON written with ok=false + stragglers) when any straggler remains
  after a non-dry-run rollout. A dry run asserts nothing (it proves
  nothing landed).
- publish-workspace-server-image.yml: per-tenant summary gains an
  "On target" column and a loud ⚠ Stragglers section; the step emits a
  ::error:: naming the off-target tenants before failing.
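
A minimal sketch of the coverage gate described above, under stated assumptions: the function names and `verified_on_target` come from the commit message, while the signatures, the `tenant` and `ok` field names, and the `RolloutFailed` stand-in are hypothetical and may differ from the real script.

```python
from typing import Dict, List


class RolloutFailed(Exception):
    """Stand-in for the RolloutFailed error named in the commit message."""

    def __init__(self, stragglers: List[str]):
        super().__init__(f"tenants not verified on target: {stragglers}")
        self.stragglers = stragglers


def rollout_stragglers(enumerated: List[str], results: List[Dict]) -> List[str]:
    """Return every enumerated tenant not proven to be on the target build."""
    by_tenant = {row["tenant"]: row for row in results}
    stragglers = []
    for tenant in enumerated:
        row = by_tenant.get(tenant)
        if row is None:
            stragglers.append(tenant)   # skipped: enumerated but no result row
        elif not row.get("ok", False):
            stragglers.append(tenant)   # errored
        elif not row.get("verified_on_target", True):
            # A missing key (pre-fix CP) defaults to True, so the gate
            # degrades to the old ok-based behavior.
            stragglers.append(tenant)
    return stragglers


def assert_full_coverage(enumerated: List[str], results: List[Dict], dry_run: bool) -> None:
    """Raise RolloutFailed if any straggler remains after a real rollout."""
    if dry_run:
        return  # a dry run proves nothing landed, so it asserts nothing
    stragglers = rollout_stragglers(enumerated, results)
    if stragglers:
        raise RolloutFailed(stragglers)
```

The caller would be expected to translate the exception into a non-zero exit and an `ok=false` response that lists the stragglers.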

Tests: straggler detection (off-target, no-result, dry-run-skip,
backward-compat missing key) + end-to-end execute_scoped_rollout fail/pass
— mutation-verified RED with the coverage gate removed. All existing
prod-auto-deploy tests still pass; ruff + py_compile clean; workflow YAML
validates.

Refs: internal#724

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-28 14:41:12 -07:00
hongming c2c6501a67 Merge pull request 'fix(workspace-server): provision-time billing derives from EFFECTIVE model, not raw payload.Model (#1994) [BEHAVIOR-AFFECTING — CTO merge-go]' (#1995) from fix/1994-provision-billing-model-passthrough into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 10s
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Detect changes (push) Successful in 8s
CI / Python Lint & Test (push) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 10s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Chat / detect-changes (push) Successful in 10s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
Harness Replays / detect-changes (push) Successful in 4s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 28s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 16s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 5s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Failing after 1m15s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
verify-providers-gen / Regenerate providers artifact and fail on drift (push) Successful in 26s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m37s
CI / Canvas (Next.js) (push) Successful in 14s
publish-workspace-server-image / build-and-push (push) Successful in 4m41s
CI / Shellcheck (E2E scripts) (push) Successful in 39s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Failing after 5m40s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3m42s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 3s
Harness Replays / Harness Replays (push) Successful in 8s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m53s
CI / Canvas Deploy Reminder (push) Successful in 2s
CI / Platform (Go) (push) Successful in 8m22s
CI / all-required (push) Successful in 11m47s
E2E Chat / E2E Chat (push) Successful in 4m58s
publish-workspace-server-image / Production auto-deploy (push) Successful in 9m9s
main-red-watchdog / watchdog (push) Successful in 1m58s
gate-check-v3 / gate-check (push) Successful in 24s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 8s
ci-required-drift / drift (push) Successful in 1m7s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 5s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
2026-05-28 20:00:59 +00:00
hongming-ceo-delegated bbb445b956 fix(workspace-server): byok runs on the tenant's own global-scope LLM cred; stop stripping it (molecule-core#1994)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 9s
CI / Python Lint & Test (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 8s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 33s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 11s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 1m1s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m13s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m11s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 56s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m29s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 33s
gate-check-v3 / gate-check (pull_request) Successful in 3s
qa-review / approved (pull_request) Failing after 4s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m26s
security-review / approved (pull_request) Failing after 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 15s
E2E Chat / E2E Chat (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 5m23s
Harness Replays / Harness Replays (pull_request) Successful in 5s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m18s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m27s
CI / Platform (Go) (pull_request) Successful in 6m4s
CI / all-required (pull_request) Successful in 9m54s
audit-force-merge / audit (pull_request) Successful in 8s
Corrected-model credential fix (CTO-confirmed). `global_secrets` is the
TENANT's own secret store (shared across that tenant's workspaces), NOT the
platform's. The platform's own LLM credential is the CP proxy usage token,
injected separately on the platform_managed path; it is never stored in a
tenant's global_secrets.

The internal#711 provider-aware strip rested on the inverted premise that a
global-scope LLM credential was "the platform's own". On the byok/disabled
branch it stripped the tenant's OWN oauth when that oauth lived at global
scope, leaving the workspace credential-less -> MISSING_BYOK_CREDENTIAL ->
dead (Reno Stars Marketing/SEO byok agents, live-confirmed 2026-05-28).

Changes:
- workspace_provision.go: remove the stripGlobalOriginLLMCreds call on the
  byok/disabled branch; delete the now-dead function; drop the unused
  globalKeys parameter from applyPlatformManagedLLMEnv.
- secrets.go: remove the symmetric byok strip on the remote-pull path
  (GET /workspaces/:id/secrets/values) + its now-unused globalKeys tracking;
  the bundle is the tenant's merged secrets served verbatim.
- platform_managed path UNCHANGED: still strips direct oauth + forces the CP
  proxy usage token (metered). Only byok/disabled stop being stripped.
- Fail-closed UNCHANGED in spirit: a byok workspace with no LLM credential at
  ANY scope still aborts MISSING_BYOK_CREDENTIAL; the trigger narrowed from
  "no workspace-scoped cred" to "no cred at any scope".

Guard (co-mingling prevention at the write boundary):
- SetGlobal still rejects bypass-list keys for a platform_managed tenant
  (keeps a platform-shaped credential out of global_secrets going forward);
  added a regression test pinning it.

Tests: inverted the strip-asserting unit + e2e tests to the corrected model
(global-scope oauth survives, byok runs direct, no proxy); added genuinely-
credential-less byok fail-closed coverage; all three behavior changes are
mutation-load-bearing (re-adding either strip / dropping the SetGlobal guard
turns the respective test RED). build + vet + golangci-lint + the full
integration-tagged handlers suite green. The #1994 model-passthrough fix and
the MiniMax A2A e2e on this branch are untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-28 19:45:55 +00:00
hongming-ceo-delegated 3269e93216 test(e2e): add real-completion + per-provider liveness + byok-routing A2A gate (#1994 follow-on)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 22s
CI / Python Lint & Test (pull_request) Successful in 16s
CI / Detect changes (pull_request) Successful in 38s
E2E API Smoke Test / detect-changes (pull_request) Successful in 22s
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 12s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 34s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 1m7s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m14s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m11s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 8s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m24s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 8s
security-review / approved (pull_request) Failing after 12s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 27s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 6s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m25s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 4m31s
CI / Canvas (Next.js) (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 17s
E2E Chat / E2E Chat (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m12s
Harness Replays / Harness Replays (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m16s
CI / Platform (Go) (pull_request) Successful in 6m10s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 13m28s
The A2A e2e historically asserted only response SHAPE (test_a2a_e2e.sh
checked '"kind":"text"' only). A broken agent returns its error AS a
text part -- {"kind":"text","text":"Agent error (Exception) ..."} --
which STILL matches the shape check, so it PASSED on a fully broken
agent. That is why the 2026-05-2x drained-key / byok-misroute failures
(agents-team PM + reno marketing erroring on every LLM call) sailed
through CI. "Channel returns text shape" is not "agent completed an LLM
round-trip."

Adds, ADDITIVELY (no existing assertion weakened or removed):

- tests/e2e/lib/completion_assert.sh -- reusable gates:
  * a2a_assert_real_completion: deterministic known-answer round-trip;
    asserts CONTAINS the expected token AND NOT an error-as-text marker
    (Agent error / Exception / error result / MISSING_BYOK_CREDENTIAL).
  * provider_liveness_matrix + offered_platform_models_for_runtime:
    per-offered-provider cheap (max_tokens:4) probe; the offered set is
    read from the providers.yaml SSOT (runtimes.<rt>.providers[platform]
    .models) -- not a hardcoded list -- so the matrix tracks the SSOT.
  * assert_byok_not_platform_proxy: #1994 regression guard -- a
    byok-resolving workspace must NOT resolve platform_managed (reads the
    same derived resolver GET /admin/workspaces/:id/llm-billing-mode the
    provision strip gate uses).

- tests/e2e/test_staging_full_saas.sh (the live-agent lane, MiniMax
  primary): new stanzas 8b (PINEAPPLE known-answer, the core gate),
  8c (byok-routing guard), 8d (SSOT-driven per-provider liveness matrix).

- tests/e2e/test_a2a_e2e.sh: added check_no_error_as_text on Echo + SEO
  replies so the brief's literal shape-only example now FAILS on an
  error-as-text payload.

- tests/e2e/test_completion_assert_unit.sh: offline fail-direction proof
  (16 cases) that the negative gates are load-bearing -- error-as-text
  MUST fail, platform_managed MUST trip the #1994 guard. Wired into
  ci.yml "Run E2E bash unit tests (no live infra)" (required, per-PR +
  main). e2e-staging-saas.yml paths filter extended to re-trigger the
  live lane on lib changes.
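
The shape-versus-completion distinction can be illustrated with a short sketch. The real gate is the bash helper a2a_assert_real_completion; the Go function below is a hypothetical restatement of the same two conditions (contains the expected token, contains no error-as-text marker), using the marker list given above. Its name and signature are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// errorMarkers are the error-as-text markers listed in the commit message.
var errorMarkers = []string{"Agent error", "Exception", "error result", "MISSING_BYOK_CREDENTIAL"}

// assertRealCompletion returns nil only when the reply contains the expected
// known-answer token AND none of the error-as-text markers.
func assertRealCompletion(reply, expected string) error {
	for _, m := range errorMarkers {
		if strings.Contains(reply, m) {
			return fmt.Errorf("error-as-text marker %q found in reply", m)
		}
	}
	if !strings.Contains(reply, expected) {
		return fmt.Errorf("expected token %q not found in reply", expected)
	}
	return nil
}

func main() {
	// Both replies match the old shape-only check ("kind":"text").
	broken := `{"kind":"text","text":"Agent error (Exception) ..."}`
	healthy := `{"kind":"text","text":"PINEAPPLE"}`

	fmt.Println(assertRealCompletion(broken, "PINEAPPLE") != nil)  // true
	fmt.Println(assertRealCompletion(healthy, "PINEAPPLE") == nil) // true
}
```

Checking the markers before the token means a reply that happens to echo the token inside an error message still fails.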

No #1994 fix code touched -- tests/e2e + workflow wiring only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-28 18:58:13 +00:00
hongming 442f79a987 fix(workspace-server): provision-time billing derives from EFFECTIVE model, not raw payload.Model (molecule-core#1994)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 8s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 45s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 10s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 39s
security-review / approved (pull_request) Failing after 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m7s
E2E Chat / E2E Chat (pull_request) Successful in 23s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m2s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
Harness Replays / Harness Replays (pull_request) Successful in 3s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m37s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 5m10s
CI / Platform (Go) (pull_request) Successful in 5m54s
CI / all-required (pull_request) Successful in 7m19s
The provision-time LLM billing resolver diverged from the read endpoint:
a byok workspace (claude-code, opus) was provisioned platform_managed and
routed through the platform LLM proxy, billing the platform Anthropic key
for the customer's own usage (Reno Stars Marketing 6b66de8d; live-confirmed
2026-05-28).

Root cause: applyPlatformManagedLLMEnv passed the RAW payload.Model to
ResolveLLMBillingModeDerived. On a re-provision (restart/resume/
auto-restart) the payload is rebuilt from the DB with Name+Tier+Runtime
only (workspace_restart.go:333/844/1017 via withStoredCompute, which
backfills Compute but NOT Model), so payload.Model == "". DeriveProvider
errors on an empty model, so the resolver defaults closed to platform_managed
and bakes ANTHROPIC_BASE_URL=<platform proxy>. The read endpoint
(ResolveLLMBillingMode -> readWorkspaceDeriveInputs) reads MODEL from
workspace_secrets, derives opus -> anthropic-oauth -> byok. Divergence,
deterministic on every re-provision.

Fix: extract effectiveModelForBilling (the fallback chain
applyRuntimeModelEnv already used: explicit -> MOLECULE_MODEL -> MODEL)
into a shared helper and have the billing resolver consult it, so the
provision-path derive inputs match the read-path. The stored model already
lives in the merged envVars (loadWorkspaceSecrets) — no new DB query. The
byok branch (no proxy override; strip only global-origin platform creds;
fail-closed on missing own cred, internal#711) is preserved unchanged;
genuinely-platform and no-model workspaces still default platform_managed
(CTO: default stays platform).
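
A minimal sketch of the fallback chain described above (explicit -> MOLECULE_MODEL -> MODEL). The signature is an assumption for illustration; the real helper's parameters and any normalisation it performs are not shown in this message.

```go
package main

import "fmt"

// effectiveModelForBilling resolves the model used for billing derivation.
// Order: explicit payload model, then MOLECULE_MODEL, then MODEL from the
// merged env vars. The signature here is hypothetical.
func effectiveModelForBilling(explicit string, envVars map[string]string) string {
	if explicit != "" {
		return explicit
	}
	if m := envVars["MOLECULE_MODEL"]; m != "" {
		return m
	}
	return envVars["MODEL"] // empty when no model is stored anywhere
}

func main() {
	// Re-provision case: payload.Model is empty, the stored MODEL secret is not.
	stored := map[string]string{"MODEL": "opus"}
	fmt.Println(effectiveModelForBilling("", stored)) // opus

	// An explicit model still wins over stored values.
	fmt.Println(effectiveModelForBilling("sonnet", stored)) // sonnet
}
```

With an empty result the caller would still hit the default-closed platform_managed branch, which matches the default-preservation behaviour stated above.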

Tests (mutation-load-bearing): re-provision-uses-stored-model byok repro,
read/provision parity guard, default-preservation, and the #711 global-
only-oauth fail-closed guard. Reverting the envVars fallback turns the
repro + parity + #711 tests RED; default-preservation stays GREEN.

BEHAVIOR-AFFECTING (provisioning hot path) — needs CTO merge-go.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-28 18:45:58 +00:00
hongming 03aa69f46f Merge pull request 'P3 internal#718: canvas consumes registry-served /templates, retire hardcoded provider vocab #4/#5 (PR-B; NOT merged)' (#1978) from feat/internal-718-p3b-canvas-consume-registry into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 13s
publish-canvas-image / Build & push canvas image (push) Successful in 3m11s
publish-workspace-server-image / build-and-push (push) Successful in 6m19s
Block internal-flavored paths / Block forbidden paths (push) Successful in 11s
CI / Python Lint & Test (push) Successful in 11s
CI / Detect changes (push) Successful in 12s
E2E API Smoke Test / detect-changes (push) Successful in 12s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Harness Replays / detect-changes (push) Successful in 10s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 9s
CI / Platform (Go) (push) Successful in 4s
CI / Shellcheck (E2E scripts) (push) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3s
verify-providers-gen / Regenerate providers artifact and fail on drift (push) Failing after 11m15s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m55s
Harness Replays / Harness Replays (push) Successful in 5s
CI / Canvas (Next.js) (push) Successful in 7m18s
CI / all-required (push) Successful in 21m18s
CI / Canvas Deploy Reminder (push) Successful in 7s
publish-workspace-server-image / Production auto-deploy (push) Successful in 46m7s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 30s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 5m51s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 1m3s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Has been skipped
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 5m53s
E2E Chat / detect-changes (push) Successful in 7s
E2E Chat / E2E Chat (push) Successful in 4m19s
E2E Legacy Advisory / Legacy local-platform E2E (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
Railway pin audit (drift detection) / Audit Railway env vars for drift-prone pins (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Failing after 1m24s
gate-check-v3 / gate-check (push) Successful in 34s
main-red-watchdog / watchdog (push) Successful in 2m16s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 18s
ci-required-drift / drift (push) Successful in 1m3s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 7s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 4m16s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 6m36s
2026-05-28 05:59:07 +00:00
hongming-personal 8546502ab8 test(canvas): make registryBilling test discriminate registry-vs-hardcoded billing precedence (#1978 review)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Platform (Go) (pull_request) Blocked by required conditions
CI / Canvas (Next.js) (pull_request) Blocked by required conditions
CI / Shellcheck (E2E scripts) (pull_request) Blocked by required conditions
CI / all-required (pull_request) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E Chat / detect-changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 12s
gate-check-v3 / gate-check (pull_request) Successful in 11s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 33s
qa-review / approved (pull_request) Failing after 9s
security-review / approved (pull_request) Failing after 7s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 11s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m14s
audit-force-merge / audit (pull_request) Successful in 11s
CI / Canvas Deploy Reminder (pull_request) Has been cancelled
agent-reviewer #7790 (blocking) found that ConfigTab.registryBilling.test.tsx
did not actually pin retire-list #5's core claim — both existing assertions
("platform"→platform_managed, "anthropic-oauth"→byok) return the SAME value
under both the registry-authoritative impl and a regression to the old
hardcoded billingModeForProvider rule, so the test was tautological and a
regression would still pass. The misleading comment on the anthropic-oauth
case claimed it was "a case the hardcoded rule gets WRONG" but the hardcoded
rule actually agrees there too.

This commit adds a genuine disagreement case: a registry provider
"managed-federated" whose registry-served billing_mode is "platform_managed"
even though its name is not "" / "platform" (so the legacy
billingModeForProvider rule would return "byok"). The new test asserts the
two rules disagree on this input (sanity) and then asserts
billingModeForSelectedProvider returns the REGISTRY value
("platform_managed"), which is only reachable by honoring the catalog.

Load-bearing proof: with the registry-first impl, the new test PASSES; when
billingModeForSelectedProvider is temporarily forced to fall through to the
hardcoded rule, the new test (and only the new test) FAILS with
expected 'platform_managed' / received 'byok' — proving it pins the
registry-wins contract.
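
The disagreement case can be restated as a small sketch. The real functions are TypeScript in the canvas; the Go below is a hypothetical restatement for illustration only, and the `ProviderEntry` type and its field names are assumptions. The legacy rule is the one quoted in this PR ("" or "platform" maps to platform_managed, anything else to byok).

```go
package main

import "fmt"

// ProviderEntry is a HYPOTHETICAL stand-in for one registry-served catalog row.
type ProviderEntry struct {
	Name        string
	BillingMode string
}

// billingModeForProvider is the legacy hardcoded rule.
func billingModeForProvider(provider string) string {
	if provider == "" || provider == "platform" {
		return "platform_managed"
	}
	return "byok"
}

// billingModeForSelectedProvider prefers the registry catalog and falls back
// to the legacy rule when there is no catalog or the provider is unknown.
func billingModeForSelectedProvider(provider string, catalog []ProviderEntry) string {
	for _, e := range catalog {
		if e.Name == provider && e.BillingMode != "" {
			return e.BillingMode
		}
	}
	return billingModeForProvider(provider)
}

func main() {
	catalog := []ProviderEntry{{Name: "managed-federated", BillingMode: "platform_managed"}}

	fmt.Println(billingModeForProvider("managed-federated"))                  // byok
	fmt.Println(billingModeForSelectedProvider("managed-federated", catalog)) // platform_managed
	fmt.Println(billingModeForSelectedProvider("managed-federated", nil))     // byok
}
```

The first two lines of output differ, which is what makes "managed-federated" a discriminating input: "platform" and "anthropic-oauth" produce the same answer under either rule.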

Also fixes the misleading "hardcoded rule gets WRONG" comment on the
anthropic-oauth case (explicitly annotates it as non-discriminating and
points to the new disagreement case as the registry-WINS proof).

Implementation (billingModeForSelectedProvider) untouched — confirmed
byte-identical to PR #1978 HEAD (f2d7f1da).

Verification:
  - targeted: 5 passed (was 4 — adds the discriminating case)
  - regressed-impl: only the new test fails, others pass (= they are
    non-discriminating as the review found)
  - full canvas vitest: 223 files / 3381 passed | 1 skipped (3382) — +1
    vs the 3380/1 baseline
  - tsc: 0 new errors (touched file clean; pre-existing 223 baseline
    unchanged with my diff stashed)
  - eslint on touched file: 0

Refs: #1978, review #7790, internal#718 P3 retire-list #5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 05:06:25 +00:00
hongming-personal f5c2882acb feat(canvas): P3 internal#718 — consume registry-served /templates list, retire hardcoded provider vocab (#4/#5)
P3 item 2. The canvas Provider/Model selector + Config-tab billing-mode now
consume the registry-served GET /templates fields (registry_backed /
registry_providers / registry_models from PR-A) instead of re-deriving provider
knowledge client-side. Retires the hardcoded vocabularies as the PRIMARY path:

- ProviderModelSelector (#4): new buildProviderCatalogFromRegistry(providers,
  models) builds the dropdown catalog from the registry payload — provider
  label = registry display_name, bucket = DERIVED provider, billing + auth_env
  from the registry — instead of inferVendor / VENDOR_LABELS /
  BARE_VENDOR_PATTERNS. The selector takes an optional pre-built `catalog`
  prop and uses it verbatim when supplied. inferVendor/buildProviderCatalog
  remain ONLY as the fallback for non-registry runtimes / older backends.
- ConfigTab (#5): when the selected runtime is registry-backed, the provider
  catalog + selector models come from registry_providers/registry_models, and
  billingModeForSelectedProvider(provider, catalog) reads the DERIVED provider's
  billing_mode off the registry catalog. The hardcoded billingModeForProvider
  ('' | 'platform' → platform_managed else byok) stays as the fallback only.
  So the billing-mode the UI shows/sends reflects the DERIVED provider
  (folds in the closed #1931's canvas intent).

Federation/back-compat preserved: a non-registry runtime (external/mock/kimi/
future third-party) or an older backend that doesn't serve the registry fields
yields registry_backed=false → the canvas keeps the template-served models +
its heuristic, unchanged. NO hard-reject (the canvas just can't render an
option the registry didn't serve for registry-backed runtimes).

Out of scope (per brief): the manifest runtime allowlist
(SUPPORTED_RUNTIME_VALUES / FALLBACK_RUNTIME_OPTIONS) is NOT a provider
vocabulary and is untouched; PUT /workspaces/:id/provider is NOT retired (that
CTO #3 follow-through is a later phase).

Stacked on PR-A (workspace-server registry-served /templates); re-target to
main after PR-A merges.

TDD: ProviderModelSelector.registry.test.tsx (catalog bucketed by derived
provider, labelled from display_name, carries billing_mode + auth_env, no empty
buckets), ConfigTab.registryBilling.test.tsx (billing reads registry catalog;
falls back to the legacy rule with no catalog / unknown provider). Full canvas
suite green (3380 passed / 1 skipped), tsc clean for touched files, eslint 0.

internal#718 P3 — not merged; CTO merge-go after Five-Axis (UI-affecting).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 05:06:25 +00:00
hongming 3dd7108cb4 Merge pull request 'P4 closure follow-up internal#718: retire LLM_PROVIDER + PUT/GET /provider + deriveProviderFromModelSlug (core; BEHAVIOR-AFFECTING; NOT MERGED)' (#1984) from feat/internal-718-p4-followup-llm-provider-removal into main
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / Harness Replays (push) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 11s
publish-canvas-image / Build & push canvas image (push) Successful in 1m41s
publish-workspace-server-image / build-and-push (push) Successful in 3m26s
Block internal-flavored paths / Block forbidden paths (push) Successful in 7s
CI / Python Lint & Test (push) Successful in 16s
CI / Detect changes (push) Successful in 29s
E2E API Smoke Test / detect-changes (push) Successful in 18s
E2E Chat / detect-changes (push) Successful in 24s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 30s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 1m26s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 34s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
Harness Replays / detect-changes (push) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 8s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
verify-providers-gen / Regenerate providers artifact and fail on drift (push) Successful in 48s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m24s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 5m25s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Successful in 7m51s
SECRET_PATTERNS drift lint / Detect SECRET_PATTERNS drift (push) Successful in 36s
main-red-watchdog / watchdog (push) Successful in 49s
CI / Shellcheck (E2E scripts) (push) Successful in 11s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m9s
gate-check-v3 / gate-check (push) Successful in 58s
CI / Platform (Go) (push) Successful in 5m39s
CI / Canvas (Next.js) (push) Successful in 6m40s
E2E Chat / E2E Chat (push) Successful in 3m59s
CI / all-required (push) Successful in 26m41s
publish-workspace-server-image / Production auto-deploy (push) Successful in 54m26s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6m33s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 44s
ci-required-drift / drift (push) Successful in 1m48s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 10s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 9s
2026-05-28 04:46:27 +00:00
hongming add37f35b0 Merge pull request 'P4 PR-2 internal#718: flip only-registered (runtime, model) gate from WARN to HARD-REJECT 422 (BEHAVIOR-AFFECTING)' (#1981) from feat/internal-718-p4-pr2-hard-reject-unregistered into main
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / Harness Replays (push) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
publish-workspace-server-image / build-and-push (push) Successful in 3m17s
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Python Lint & Test (push) Successful in 6s
CI / Detect changes (push) Successful in 10s
E2E API Smoke Test / detect-changes (push) Successful in 6s
E2E Chat / detect-changes (push) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 53s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 6s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
Harness Replays / detect-changes (push) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 10s
verify-providers-gen / Regenerate providers artifact and fail on drift (push) Successful in 30s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Successful in 5m45s
sync-providers-yaml / Compare synced providers.yaml against controlplane canonical (push) Successful in 4s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m4s
CI / Canvas (Next.js) (push) Successful in 5s
CI / Shellcheck (E2E scripts) (push) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m16s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 19s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m13s
CI / Platform (Go) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / all-required (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
publish-workspace-server-image / Production auto-deploy (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
E2E Chat / E2E Chat (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 7m11s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 9s
2026-05-28 04:19:18 +00:00
claude-ceo-assistant 73871e7ade internal#718 P4 closure: retire LLM_PROVIDER + PUT/GET /provider + deriveProviderFromModelSlug
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
CI / Python Lint & Test (pull_request) Successful in 10s
CI / Detect changes (pull_request) Successful in 15s
E2E API Smoke Test / detect-changes (pull_request) Successful in 19s
E2E Chat / detect-changes (pull_request) Successful in 18s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
Check migration collisions / Migration version collision check (pull_request) Successful in 39s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 56s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 47s
Harness Replays / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 58s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m5s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 4s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m16s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 5m53s
CI / Platform (Go) (pull_request) Successful in 6m15s
CI / Canvas (Next.js) (pull_request) Successful in 6m46s
CI / all-required (pull_request) Successful in 11m36s
E2E Chat / E2E Chat (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 23s
Harness Replays / Harness Replays (pull_request) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m47s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m50s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
gate-check-v3 / gate-check (pull_request) Successful in 9s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 14s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 8s
audit-force-merge / audit (pull_request) Successful in 10s
The provider-SSOT closure: with the registry-derived provider model
(P0-P4) flowing through every decision point — proxy (P1), billing
(P2-B), templates (P3 PR-A/B), provisioner (P3 PR-C) — the
LLM_PROVIDER workspace_secret has no reader left on core. This PR
retires:

  - WorkspaceHandler.Create's setProviderSecret writes (the
    payload.LLMProvider and deriveProviderFromModelSlug-derived
    write paths). payload.LLMProvider is preserved on the request
    struct for backwards-compat with older canvases that still send
    it; the value is intentionally ignored. Coverage moved to
    TestWorkspaceCreate_FirstDeploy_OnlyPersistsMODEL (asserts only
    the MODEL secret is written, even on a slug-prefixed model that
    pre-P4 would have triggered an LLM_PROVIDER write).

  - SecretsHandler.SetProvider / GetProvider gin handlers + the
    setProviderSecret helper. Both route registrations now point at
    handlers.ProviderEndpointGone, which returns 410 Gone with a
    structured PROVIDER_ENDPOINT_RETIRED body so older canvases that
    still call PUT /provider on Save fail loud rather than silently
    writing into a vanished row. Coverage: TestPutProvider_410Gone +
    TestGetProvider_410Gone + TestProviderEndpointGone_BodyShape.

  - deriveProviderFromModelSlug (retire-list #3) — the hand-rolled
    35-arm slug-prefix→provider switch in workspace_provision.go.
    Its only caller was Create's setProviderSecret write; the
    derivation now flows through providers.Manifest.DeriveProvider
    against the registry SSOT at every decision point. The drift
    test (derive_provider_drift_test.go) that pinned parity with the
    hermes template's derive-provider.sh is deleted with it. The
    shell script remains the in-container fallback; its byte-identity
    with the registry view of hermes is a P4 follow-up gated on
    registry data growth (see codegen of hermes config.yaml from the
    registry).

  - loadWorkspaceSecrets LLM_PROVIDER drop (defence-in-depth):
    any straggler workspace_secrets or global_secrets row keyed
    LLM_PROVIDER is filtered out before envVars is built, so a
    rolling deploy (new code, old DB) cannot re-emit the retired key
    into the CP-side provisioner env.

  - Canvas: ConfigTab.tsx no longer GETs or PUTs
    /workspaces/:id/provider, and the provider→billing-mode linkage
    (internal#703 Gap 2) is retired together — P2-B moved the
    platform-vs-byok decision to ResolveLLMBillingModeDerived, which
    derives the provider from (runtime, model). The provider
    dropdown still renders for display so users can preview the
    derived value locally. The two retired vitest suites
    (ConfigTab.provider, ConfigTab.billingMode) are replaced with
    documentation files pointing at the new coverage.

  - Migration 20260528000000_drop_llm_provider_workspace_secret
    removes any straggler rows from workspace_secrets. Idempotent:
    a fresh tenant with zero LLM_PROVIDER rows produces a 0-row
    delete. The .down.sql is a documented no-op (the rows cannot
    be reconstituted from a soft-delete, and the writers are gone).
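
As a rough illustration of the loadWorkspaceSecrets defence-in-depth drop described above, the filter can be sketched as below. This is a minimal sketch, not the code in this PR: buildEnvVars, retiredSecretKeys, and the "workspace row overrides global row" precedence are all assumptions made for the example; only the LLM_PROVIDER and MODEL key names come from the commit message.

```go
package main

import (
	"fmt"
	"sort"
)

// retiredSecretKeys lists keys that must never reach the provisioner env.
// Only LLM_PROVIDER is named in the commit message; the map shape is assumed.
var retiredSecretKeys = map[string]bool{"LLM_PROVIDER": true}

// buildEnvVars is a hypothetical stand-in for the env-building step of
// loadWorkspaceSecrets. It merges global rows, then workspace rows (assumed
// to take precedence), and skips any retired key from either source.
func buildEnvVars(global, workspace map[string]string) map[string]string {
	env := make(map[string]string, len(global)+len(workspace))
	for _, rows := range []map[string]string{global, workspace} {
		for key, value := range rows {
			if retiredSecretKeys[key] {
				continue // straggler row from an old DB: never re-emit it
			}
			env[key] = value
		}
	}
	return env
}

func main() {
	env := buildEnvVars(
		map[string]string{"LLM_PROVIDER": "anthropic"},
		map[string]string{"MODEL": "minimax/MiniMax-M2.7", "LLM_PROVIDER": "minimax"},
	)
	keys := make([]string, 0, len(env))
	for key := range env {
		keys = append(keys, key)
	}
	sort.Strings(keys)
	fmt.Println(keys)
}
```

Filtering at read time means a rolling deploy does not depend on the migration having run first.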

Behavior delta — explicitly tested:

  - Registered (runtime, model) workspace → 201, provider derived,
    no LLM_PROVIDER stored. UNCHANGED for the runtime-visible
    `provider:` in /configs/config.yaml (CP-side commit derives it
    from the same registry).
  - PUT /workspaces/:id/provider → 410 Gone {code:
    PROVIDER_ENDPOINT_RETIRED, error, issue: internal#718}. Was 200
    with a workspace_secrets write.
  - GET /workspaces/:id/provider → 410 Gone. Was 200 + {provider,
    source}.
  - WorkspaceHandler.Create with a slug-prefixed model (e.g.
    minimax/MiniMax-M2.7) + an explicit llm_provider in the payload
    → only the MODEL workspace_secret is written. Pre-P4 both rows
    were written.
  - Existing workspace with an LLM_PROVIDER row → migration drops
    it at next deploy; loadWorkspaceSecrets filters it defensively
    in the interim.
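
The 410 Gone contract above can be illustrated with a small handler. This is a sketch under stated assumptions: it uses net/http rather than the gin handler the PR ships, the names providerEndpointGone, goneBody, and callRetired are hypothetical, and the error text is invented for the example. Only the status code, the PROVIDER_ENDPOINT_RETIRED code, and the internal#718 issue reference come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// goneBody mirrors the body fields named in the commit message
// (code, error, issue); the exact JSON layout is an assumption.
type goneBody struct {
	Code  string `json:"code"`
	Error string `json:"error"`
	Issue string `json:"issue"`
}

// providerEndpointGone answers any method on the retired route with 410 Gone.
func providerEndpointGone(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusGone)
	_ = json.NewEncoder(w).Encode(goneBody{
		Code:  "PROVIDER_ENDPOINT_RETIRED",
		Error: "provider endpoint retired; provider is derived from (runtime, model)", // illustrative text
		Issue: "internal#718",
	})
}

// callRetired is a hypothetical helper that exercises the handler in-process.
func callRetired(method string) (int, goneBody) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(method, "/workspaces/ws-1/provider", nil)
	providerEndpointGone(rec, req)
	var body goneBody
	_ = json.NewDecoder(rec.Body).Decode(&body)
	return rec.Code, body
}

func main() {
	for _, method := range []string{http.MethodGet, http.MethodPut} {
		status, body := callRetired(method)
		fmt.Println(method, status, body.Code)
	}
}
```

A machine-readable code in the body lets an older canvas distinguish "endpoint retired" from a transient failure and stop retrying.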

Five-Axis review notes:

  - Correctness: the four readers of stored LLM_PROVIDER (core
    GetProvider, core loadWorkspaceSecrets, CP resolveModelAndProvider,
    CP ValidateProviderEnv) are all migrated in this PR + the
    CP-side commit. Audit query trail in the brief; the empirical
    finding is that no fifth reader exists (verified across both
    repos via grep of LLM_PROVIDER, setProviderSecret, SetProvider,
    GetProvider, llm_provider).
  - Tests: TDD red→green for the 410 Gone shape; SQL-mock for the
    "no LLM_PROVIDER write on Create" contract; existing P2-B
    billing tests confirm the derived-provider billing path is
    untouched (the regression risk this PR could have created).
  - Backward-compat: payload.LLMProvider preserved on the
    CreateWorkspacePayload struct; the canvas still sends it; the
    server ignores it. Older canvases that PUT /provider get a loud
    410 with a recognizable code so they can stop calling.
  - Rollback: revert the migration + revert this commit; the
    LLM_PROVIDER workspace_secret writers stay gone (the PUT route
    has no handler symbol to wire back without a separate revert).
  - Observability: provider derivation is logged in
    applyPlatformManagedLLMEnv (existing P2-B emission); no new
    structured-event surface added — the retirement is silent at
    the request boundary and the 410 Gone surface is the
    operator-facing signal.

cp#362 anthropic passthrough untouched. P1 proxy ResolveUpstream
untouched. P2-B billing derives via DeriveProvider — still reads
the same derivation, never the stored LLM_PROVIDER. P3 PR-A
templates-from-registry + P3 PR-C ValidateProviderEnv-from-registry
untouched. P4 PR-2 hard-reject 422 untouched.

NOT MERGED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 21:12:55 -07:00
hongming 930f8753a9 Merge pull request 'P4 PR-1 internal#718 (sync): re-sync canonical providers.yaml with the colon-vocab reconcile (no behavior change)' (#1980) from feat/internal-718-p4-pr1-reconcile-colon-vocab-sync into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 13s
Block internal-flavored paths / Block forbidden paths (push) Successful in 15s
CI / Detect changes (push) Successful in 16s
CI / Python Lint & Test (push) Successful in 6s
E2E API Smoke Test / detect-changes (push) Successful in 10s
E2E Chat / detect-changes (push) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
Harness Replays / detect-changes (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 6s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
sync-providers-yaml / Compare synced providers.yaml against controlplane canonical (push) Successful in 5s
publish-workspace-server-image / build-and-push (push) Successful in 8m29s
verify-providers-gen / Regenerate providers artifact and fail on drift (push) Successful in 49s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 6s
CI / Canvas (Next.js) (push) Successful in 3s
CI / Shellcheck (E2E scripts) (push) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 8s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m9s
Harness Replays / Harness Replays (push) Successful in 12s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m38s
E2E Chat / E2E Chat (push) Successful in 4m42s
CI / Platform (Go) (push) Successful in 6m6s
CI / Canvas Deploy Reminder (push) Successful in 2s
CI / all-required (push) Successful in 22m3s
publish-workspace-server-image / Production auto-deploy (push) Successful in 15m49s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m31s
main-red-watchdog / watchdog (push) Successful in 30s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m23s
gate-check-v3 / gate-check (push) Successful in 1m13s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 6s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 15s
ci-required-drift / drift (push) Successful in 1m26s
2026-05-28 03:41:48 +00:00
claude-ceo-assistant eacb8183c3 P4 PR-2 internal#718: flip only-registered (runtime, model) gate from WARN to HARD-REJECT 422 (BEHAVIOR-AFFECTING)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 10s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 50s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m18s
gate-check-v3 / gate-check (pull_request) Successful in 7s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 28s
qa-review / approved (pull_request) Successful in 9s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
security-review / approved (pull_request) Failing after 6s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 14s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m20s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m51s
Harness Replays / Harness Replays (pull_request) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m30s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 6m10s
CI / all-required (pull_request) Successful in 12m14s
audit-force-merge / audit (pull_request) Successful in 5s
WorkspaceHandler.Create now returns 422 UNREGISTERED_MODEL_FOR_RUNTIME when the provider registry knows the runtime but the (runtime, model) pair is not in its native model set. Was the P2-B WARN-mode signal (X-Molecule-Model-Unregistered header + log; create proceeds); now a hard rejection at the boundary with no DB rows touched.

Behavior delta (under test):
  * Workspace with a REGISTERED (runtime, model) → 201, unchanged.
  * Workspace with an UNREGISTERED (runtime, model) → 422 with body
    {code:UNREGISTERED_MODEL_FOR_RUNTIME, error, runtime, model}, no DB writes (mock ExpectationsWereMet asserts zero unexpected DB calls).
  * Workspace with the legacy colon-form anthropic:claude-opus-4-7 for runtime=claude-code → 201 (P4 PR-1 reconciled the colon-vocab into the registry, making this a first-class registered model alongside the slash form).
  * Workspace with runtime NOT in the registry (langgraph/external/kimi/mock/federated) → unchanged (fails OPEN — federation-ready, the registry cannot speak to non-first-party runtimes).
  * External workspaces (external=true or external-like runtime) → unchanged (URL is the contract, not the model).
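
The decision table above could be sketched roughly as follows. This is an illustrative model, not the handler code: the registry type, checkModelForRuntime, and statusFor are hypothetical names, and the registry contents are limited to pairs the commit message itself mentions.

```go
package main

import (
	"fmt"
	"net/http"
)

// registry maps a runtime to its native model set. The type is an assumption.
type registry map[string]map[string]bool

// checkModelForRuntime reports whether a create request passes the gate.
// External workspaces and runtimes unknown to the registry fail OPEN.
func checkModelForRuntime(reg registry, runtime, model string, external bool) bool {
	if external {
		return true // URL is the contract, not the model
	}
	models, known := reg[runtime]
	if !known {
		return true // non-registry runtime: the registry has no opinion
	}
	return models[model]
}

// statusFor maps the gate result to the HTTP status described above.
func statusFor(ok bool) int {
	if ok {
		return http.StatusCreated
	}
	return http.StatusUnprocessableEntity // UNREGISTERED_MODEL_FOR_RUNTIME
}

func main() {
	reg := registry{
		"claude-code": {"claude-opus-4-7": true, "anthropic:claude-opus-4-7": true},
		"hermes":      {"moonshot/kimi-k2.5": true},
	}
	fmt.Println(statusFor(checkModelForRuntime(reg, "claude-code", "anthropic:claude-opus-4-7", false)))
	fmt.Println(statusFor(checkModelForRuntime(reg, "claude-code", "gpt-4", false)))
	fmt.Println(statusFor(checkModelForRuntime(reg, "langgraph", "gpt-4", false)))
}
```

Running the check before any persistence step is what keeps the rejection free of DB writes.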

Why P4 vs P2-B: P2-B kept WARN-mode because the legacy colon-namespaced BYOK vocabulary (anthropic:claude-opus-4-7 etc.) was live across the create/import/template corpus and not yet in the registry. P4 PR-1 reconciled that vocab into the per-runtime native sets (each runtime now lists bare + slash + colon forms for the BYOK ids in the live corpus). With the reconcile landed, an unregistered pair is a real misconfiguration and the gate flips loud — the codex anthropic:claude-opus-4-7 wedge class (the MODEL_REQUIRED gate targets the same failure mode) now fails AT THE BOUNDARY instead of provisioning a workspace that will wedge at adapter init.

Test surface (workspace_test.go):
  * TestWorkspaceCreate_718_P4_UnregisteredModelHardReject422 (NEW) — explicit 422 + body code + no DB writes
  * TestWorkspaceCreate_718_P4_RegisteredModelProceeds (renamed from _RegisteredModelNoWarnHeader) — 201 + no legacy WARN header
  * TestWorkspaceCreate_718_P4_LegacyColonVocabAccepted (NEW) — anthropic:claude-opus-4-7 on claude-code proceeds (the central regression guard for the PR-1 reconcile + PR-2 flip combo)
  * TestWorkspaceCreate_718_NonRegistryRuntimeFailsOpen — unchanged (federation path)

Fixture updates for the flip (tests that previously used an unregistered model as a fixture for OTHER gate paths; updated to a valid model so those gates can actually fire):
  * TestWorkspaceCreate_WithInvalidCompute_ReturnsBadRequest — gpt-4 (no runtime owns it) → claude-opus-4-7 (so the compute-validation 400 path tests what it should)
  * TestWorkspaceCreate_TemplateDefaultsMissingRuntimeAndModel — hermes/nousresearch/hermes-4-70b → hermes/moonshot/kimi-k2.6 (hermes native set per the CTO matrix)
  * TestWorkspaceCreate_TemplateDefaultsLegacyTopLevelModel — hermes/anthropic:claude-sonnet-4-5 → hermes/moonshot/kimi-k2.5
  * TestWorkspaceCreate_CallerModelOverridesTemplateDefault — hermes override minimax/MiniMax-M2.7 → moonshot/kimi-k2.5 (still tests the caller-overrides-template-default mechanic, just with a hermes-valid pair)

Phase-1 falsification + Phase-2 design were established in PR-1. Phase-3 TDD: each new behavior assertion mapped to a discriminating test (422 vs 201 vs unchanged WARN-header absence). Phase-4 Five-Axis to follow in PR review.

NOT regressed (verified via -short + -tags=integration -short for handlers + providers):
  * cp#362 anthropic passthrough (proxy layer; unaffected).
  * P1 proxy ResolveUpstream (registry resolution by namespace token; unaffected).
  * P2-B billing-derive (DeriveProvider semantics unchanged by the reconcile).
  * P3 templates-from-registry (GET /templates still serves ModelsForRuntime; PR-1 enlarges the set, this PR rejects calls outside it).

Stacked on feat/internal-718-p4-pr1-reconcile-colon-vocab-sync (PR-1 must merge first; this PR's tests would 422 the legacy colon vocab otherwise).

Refs internal#718.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 03:21:39 +00:00
claude-ceo-assistant 7bc52017ed P4 PR-1 sync internal#718: re-sync canonical providers.yaml from molecule-controlplane (colon-vocab reconcile)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 12s
CI / Python Lint & Test (pull_request) Successful in 20s
CI / Detect changes (pull_request) Successful in 21s
E2E API Smoke Test / detect-changes (pull_request) Successful in 17s
E2E Chat / detect-changes (pull_request) Successful in 17s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 18s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
Harness Replays / detect-changes (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 8s
sync-providers-yaml / Compare synced providers.yaml against controlplane canonical (pull_request) Failing after 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
qa-review / approved (pull_request) Failing after 10s
gate-check-v3 / gate-check (pull_request) Successful in 10s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 11s
sop-checklist / review-refire (pull_request) Has been skipped
security-review / approved (pull_request) Failing after 11s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 31s
sop-tier-check / tier-check (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
E2E Chat / E2E Chat (pull_request) Successful in 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m55s
CI / Platform (Go) (pull_request) Successful in 5m7s
CI / all-required (pull_request) Successful in 11m56s
audit-force-merge / audit (pull_request) Successful in 25s
Mirrors the canonical change in molecule-controlplane PR feat/internal-718-p4-pr1-reconcile-colon-vocab:
adds the legacy colon-namespaced BYOK model ids (anthropic:claude-*, moonshot:kimi-k2.*, minimax:MiniMax-M2*) to each runtime native set so DeriveProvider and Manifest.ModelsForRuntime recognize every legitimate model in the live workspace-create corpus (canvas/ConfigTab default + ~44 test files + openclaw template precedent).

Per the sync_canonical_test.go header procedure:
  1. Copied molecule-controlplane/internal/providers/providers.yaml verbatim.
  2. Regenerated internal/providers/gen/registry_gen.go via go run ./cmd/gen-providers.
  3. Bumped canonicalProvidersYAMLSHA256 to the new canonical sha (73e8003062edaa4ce75bfb324be615b6e2b380f07487e3af4dc16cb644dc12bc).
  4. Synced runtimes_test.go to match CP's expanded claude-code expectation set.
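
The pin in step 3 amounts to comparing a SHA-256 digest of the synced file against a constant. A minimal sketch of that check, assuming the helper names sha256Hex and verifyPinned (both hypothetical) and using placeholder file content instead of the real providers.yaml:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sha256Hex returns the lowercase hex SHA-256 digest of content.
func sha256Hex(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// verifyPinned fails when the synced copy no longer matches the pinned digest,
// which is the drift a test like sync_canonical_test.go is meant to catch.
func verifyPinned(content []byte, pinned string) error {
	if got := sha256Hex(content); got != pinned {
		return fmt.Errorf("providers.yaml drifted from canonical: got %s, want %s", got, pinned)
	}
	return nil
}

func main() {
	synced := []byte("runtimes:\n  claude-code: {}\n") // placeholder, not the real file
	pinned := sha256Hex(synced)                          // stands in for canonicalProvidersYAMLSHA256

	fmt.Println(verifyPinned(synced, pinned) == nil)
	fmt.Println(verifyPinned([]byte("edited locally"), pinned) == nil)
}
```

Pinning the digest makes a hand edit to the synced copy fail locally, before the cross-repo CI gate runs.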

ZERO behavior change in core: the WARN-mode validateRegisteredModelForRuntime gate (workspace.go:451-456) just goes silent for the now-registered colon-form models; the X-Molecule-Model-Unregistered response header stops being emitted for legitimate colon-form workspaces. No new rejection path; no proxy/billing-derive change.

Stacked atop molecule-controlplane PR-1 — merge order: CP PR-1 → core PR-1 sync. The cross-repo sync-providers-yaml CI gate goes green once the canonical lands.

Refs internal#718.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 03:16:05 +00:00
hongming 753e0f569d Merge pull request 'P3 internal#718: serve GET /templates selectable provider/model list FROM the registry (PR-A backend; NOT merged)' (#1977) from feat/internal-718-p3a-templates-from-registry into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Python Lint & Test (push) Successful in 7s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 13s
CI / Detect changes (push) Successful in 9s
E2E API Smoke Test / detect-changes (push) Successful in 7s
E2E Chat / detect-changes (push) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
Handlers Postgres Integration / detect-changes (push) Successful in 16s
Harness Replays / detect-changes (push) Successful in 12s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 8s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 9s
verify-providers-gen / Regenerate providers artifact and fail on drift (push) Successful in 51s
publish-workspace-server-image / build-and-push (push) Successful in 3m10s
CI / Canvas (Next.js) (push) Successful in 12s
CI / Shellcheck (E2E scripts) (push) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m26s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 9s
Harness Replays / Harness Replays (push) Successful in 5s
main-red-watchdog / watchdog (push) Successful in 55s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m5s
E2E Chat / E2E Chat (push) Successful in 5m4s
CI / Canvas Deploy Reminder (push) Successful in 2s
gate-check-v3 / gate-check (push) Successful in 39s
CI / Platform (Go) (push) Successful in 6m23s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 11s
CI / all-required (push) Successful in 13m4s
publish-workspace-server-image / Production auto-deploy (push) Successful in 11m54s
ci-required-drift / drift (push) Successful in 1m16s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 15s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 11s
lint-bp-context-emit-match / lint-bp-context-emit-match (push) Successful in 1m26s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m17s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 7m34s
2026-05-28 03:02:47 +00:00
hongming-personal 2d0d070040 feat(workspace-server): P3 internal#718 — serve GET /templates selectable provider/model list from the registry
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 10s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 12s
CI / Python Lint & Test (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 14s
E2E Chat / detect-changes (pull_request) Successful in 13s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 31s
qa-review / approved (pull_request) Successful in 11s
security-review / approved (pull_request) Failing after 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
gate-check-v3 / gate-check (pull_request) Successful in 27s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 12s
sop-tier-check / tier-check (pull_request) Successful in 15s
CI / Canvas (Next.js) (pull_request) Successful in 11s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m22s
Harness Replays / Harness Replays (pull_request) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m1s
CI / Platform (Go) (pull_request) Successful in 5m50s
CI / all-required (pull_request) Successful in 10m13s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 8s
P3 item 1 (retire-list #1 surface). GET /templates (templates.go List) now
ANNOTATES each registry-known runtime's template with an authoritative
registry-served selectable list, sourced from the provider registry
(workspace-server/internal/providers, the P2-A synced SSOT) instead of the
template's hand-authored config.yaml providers:/runtime_config.models block:

- registry_backed: true when the runtime is in the registry runtimes: block.
- registry_providers: the runtime's NATIVE provider set (ProvidersForRuntime),
  each with display_name + auth_env + billing_mode (platform_managed if the
  registry IsPlatform predicate holds, else byok) — the SSOT the canvas
  Provider dropdown consumes instead of its hardcoded VENDOR_LABELS map.
- registry_models: the runtime's NATIVE model ids (ModelsForRuntime), each
  annotated with its DERIVED provider (DeriveProvider) + the billing_mode that
  provider implies — so the canvas shows the billing source of the DERIVED
  provider (folds in #1931 intent) and can render no model the registry did
  not list for the runtime ("only registered selectable").

Additive + federation-ready + fail-OPEN: the existing template-served
Models/Providers/ProviderRegistry fields are UNCHANGED, so non-registry
runtimes (external/mock/kimi/future third-party) and older canvases keep
working — a runtime absent from the registry yields registry_backed=false and
no synthesized block. NO hard-reject: templates whose model isn't
registry-derivable are still served (WARN-level only; legacy-vocab reconcile
is P4).
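
The additive, fail-open annotation described above might look roughly like this. It is a sketch under assumptions: the manifest, providerInfo, and annotation types and the annotate function are hypothetical, and which provider counts as platform-managed is invented for the example. The registry_backed flag and the platform_managed / byok wire strings come from the commit message.

```go
package main

import "fmt"

// providerInfo carries the registry facts needed to pick a billing mode.
type providerInfo struct {
	DisplayName string
	Platform    bool // stands in for the registry's IsPlatform predicate
}

// manifest is a hypothetical, flattened view of the provider registry.
type manifest struct {
	runtimeModels map[string][]string     // runtime -> native model ids
	modelProvider map[string]string       // model id -> derived provider
	providers     map[string]providerInfo // provider -> metadata
}

type registryModel struct {
	ID, Provider, BillingMode string
}

type annotation struct {
	RegistryBacked bool
	RegistryModels []registryModel
}

func billingMode(p providerInfo) string {
	if p.Platform {
		return "platform_managed"
	}
	return "byok"
}

// annotate builds the registry-served block for one runtime. A runtime absent
// from the registry yields the zero annotation, so template fields stay as-is.
func annotate(m manifest, runtime string) annotation {
	models, known := m.runtimeModels[runtime]
	if !known {
		return annotation{}
	}
	out := annotation{RegistryBacked: true}
	for _, id := range models {
		provider := m.modelProvider[id]
		out.RegistryModels = append(out.RegistryModels, registryModel{
			ID:          id,
			Provider:    provider,
			BillingMode: billingMode(m.providers[provider]),
		})
	}
	return out
}

func main() {
	m := manifest{
		runtimeModels: map[string][]string{"hermes": {"moonshot/kimi-k2.5"}},
		modelProvider: map[string]string{"moonshot/kimi-k2.5": "moonshot"},
		providers:     map[string]providerInfo{"moonshot": {DisplayName: "Moonshot", Platform: false}},
	}
	fmt.Printf("%+v\n", annotate(m, "hermes"))
	fmt.Printf("%+v\n", annotate(m, "mock"))
}
```

Because the annotation is built only from registry data, a bogus model id in a template's own config never appears in the registry-served list.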

Reuses the package-level providerRegistry() accessor + LLMBillingModePlatformManaged/
LLMBillingModeBYOK constants from llm_billing_mode.go (P2-B / #1972, now on
main) — one accessor + one constant set for the package; both the billing
derivation and this templates projection wrap the same providers.LoadManifest()
SSOT and the same wire strings.

Proxy ResolveUpstream / billing DeriveProvider untouched (P1/P2). Templates'
own config.yaml providers: codegen untouched (P4).

TDD: TestTemplatesList_RegistryServesSelectableModels (a template's bogus model
id never leaks into the registry-served list; native ids present),
TestTemplatesList_RegistryAnnotatesDerivedProviderAndBilling (derived
provider + platform_managed/byok per model; provider display_name/auth_env/
billing from the registry), TestTemplatesList_NonRegistryRuntimeFallsOpenToTemplate
(mock runtime: registry_backed=false, template fields untouched). All existing
TestTemplatesList_* stay green (template-served fields unchanged). Rebased onto
main after P2-B (#1972) landed; full handlers+providers suites green alongside it.

internal#718 P3 — not merged; CTO merge-go after Five-Axis (UI/API-affecting).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 19:21:04 -07:00
hongming 1e783ff6a2 Merge pull request 'provider-SSOT P2-B -> main: billing derives from provider (re-target #1971)' (#1972) from feat/internal-718-p2a-registry-codegen-distribution into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Python Lint & Test (push) Successful in 11s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
CI / Detect changes (push) Successful in 14s
E2E API Smoke Test / detect-changes (push) Successful in 13s
E2E Chat / detect-changes (push) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 40s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 1m13s
Handlers Postgres Integration / detect-changes (push) Successful in 6s
Harness Replays / detect-changes (push) Successful in 2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
verify-providers-gen / Regenerate providers artifact and fail on drift (push) Successful in 39s
publish-workspace-server-image / build-and-push (push) Successful in 4m33s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 4m34s
CI / Canvas (Next.js) (push) Successful in 2s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m20s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Successful in 5m57s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m47s
CI / Platform (Go) (push) Successful in 5m19s
Harness Replays / Harness Replays (push) Successful in 15s
CI / Canvas Deploy Reminder (push) Successful in 2s
CI / all-required (push) Successful in 8m8s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m4s
E2E Chat / E2E Chat (push) Successful in 3m49s
publish-workspace-server-image / Production auto-deploy (push) Successful in 5m43s
ci-required-drift / drift (push) Successful in 1m15s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 11s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 16s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 12s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m14s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m24s
2026-05-28 02:09:09 +00:00
hongming 924dfa5598 test(workspace-server): remove unused wantWhy field in model_registry_validation_test (golangci-lint unused) — internal#718 P2-B
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 6s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 7s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 38s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 50s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 40s
gate-check-v3 / gate-check (pull_request) Successful in 3s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m24s
E2E Chat / E2E Chat (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m33s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 6m37s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m48s
CI / Platform (Go) (pull_request) Successful in 5m9s
CI / all-required (pull_request) Successful in 9m1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 7s
2026-05-28 01:39:27 +00:00
hongming 3ab690c273 Merge pull request 'P2-B internal#718: billing/credential derives from provider + only-registered validation (BEHAVIOR-AFFECTING; supersedes #1966)' (#1971) from feat/internal-718-p2b-billing-derives-from-provider into feat/internal-718-p2a-registry-codegen-distribution
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 9s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 13s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 31s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 4s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 50s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 1m1s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m10s
gate-check-v3 / gate-check (pull_request) Successful in 9s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Failing after 2m5s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
CI / all-required (pull_request) Failing after 4m41s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m51s
Harness Replays / Harness Replays (pull_request) Successful in 3s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 4m50s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m12s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m32s
2026-05-28 01:22:20 +00:00
hongming 866a71777f Merge pull request 'P2-A internal#718: bring provider registry to molecule-core via codegen + verify-CI (NO behavior change)' (#1970) from feat/internal-718-p2a-registry-codegen-distribution into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 13s
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Python Lint & Test (push) Successful in 6s
CI / Detect changes (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 10s
E2E Chat / detect-changes (push) Successful in 18s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 18s
Harness Replays / detect-changes (push) Successful in 16s
Handlers Postgres Integration / detect-changes (push) Successful in 16s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 12s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 11s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 15s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 14s
sync-providers-yaml / Compare synced providers.yaml against controlplane canonical (push) Successful in 7s
verify-providers-gen / Regenerate providers artifact and fail on drift (push) Successful in 32s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Failing after 1m15s
CI / Canvas (Next.js) (push) Successful in 12s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m28s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 13s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m12s
Harness Replays / Harness Replays (push) Successful in 6s
CI / Canvas Deploy Reminder (push) Successful in 3s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 16s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m41s
publish-workspace-server-image / build-and-push (push) Successful in 5m44s
E2E Chat / E2E Chat (push) Successful in 4m36s
ci-required-drift / drift (push) Successful in 1m6s
CI / Platform (Go) (push) Successful in 6m32s
CI / all-required (push) Successful in 8m48s
publish-workspace-server-image / Production auto-deploy (push) Successful in 5m12s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 6s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 7s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m5s
main-red-watchdog / watchdog (push) Successful in 2m2s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m22s
gate-check-v3 / gate-check (push) Successful in 27s
2026-05-28 01:10:25 +00:00
hongming-personal 11b0646b37 fix(ci): sync-providers-yaml gate fetch canonical via /raw not /contents
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 1m3s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m18s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m21s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m17s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
sync-providers-yaml / Compare synced providers.yaml against controlplane canonical (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Failing after 10s
qa-review / approved (pull_request) Failing after 9s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 13s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / all-items-acked (pull_request) Successful in 14s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 31s
sop-tier-check / tier-check (pull_request) Successful in 7s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m22s
CI / Canvas (Next.js) (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
E2E Chat / E2E Chat (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 14s
Harness Replays / Harness Replays (pull_request) Successful in 11s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m20s
CI / Platform (Go) (pull_request) Successful in 5m10s
CI / all-required (pull_request) Successful in 8m18s
audit-force-merge / audit (pull_request) Successful in 7s
The cross-repo drift gate fetched controlplane providers.yaml from the
Gitea /contents endpoint with Accept: application/vnd.gitea.raw. On this
Gitea (1.22.6) that header is NOT honored on /contents -- it returns the
JSON+base64 envelope ({"name":"providers.yaml","content":"<base64>"...},
~45.6 KB), not raw bytes. So diff -u compared JSON-vs-YAML and exited 1
(RED) on every run even when byte-identical, making the gate inert
(detected neither sync nor real drift).

Switch the fetch to the /raw endpoint, which returns the file bytes
directly (33319 B, sha256 48a66921...), byte-identical to core's synced
copy. diff now exits 0 on the in-sync state and goes RED on real drift.
Authorization: token header kept; soft-fail backstop and the hermetic
sha-pin in sync_canonical_test.go are untouched.
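
A minimal sketch of why the gate was inert, under stated assumptions: the file content is a hypothetical stand-in, and `same_bytes` and `decode_envelope` are illustrative helpers, not the workflow's actual code. It only models the two response shapes described above (a JSON+base64 envelope versus raw file bytes).

```python
import base64
import hashlib
import json

# Hypothetical stand-in for the synced providers.yaml bytes.
raw_yaml = b"providers:\n  - name: example\n"

# Shape of the envelope described above: JSON carrying base64 content.
envelope = json.dumps(
    {"name": "providers.yaml", "content": base64.b64encode(raw_yaml).decode("ascii")}
).encode("utf-8")


def same_bytes(fetched: bytes, local: bytes) -> bool:
    """Byte-compare two payloads via sha256, as a drift gate would."""
    return hashlib.sha256(fetched).digest() == hashlib.sha256(local).digest()


def decode_envelope(body: bytes) -> bytes:
    """Recover the file bytes from a JSON+base64 envelope."""
    return base64.b64decode(json.loads(body)["content"])


# Comparing the envelope itself against the YAML can never match.
envelope_matches = same_bytes(envelope, raw_yaml)
# Raw bytes (what a /raw-style fetch returns) match when in sync.
raw_matches = same_bytes(raw_yaml, raw_yaml)
# Decoding the envelope first would also have matched.
decoded_matches = same_bytes(decode_envelope(envelope), raw_yaml)
```

Fetching raw bytes keeps the comparison a plain byte diff with no decode step in between.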

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 00:55:08 +00:00
core-devops 3165b98cc8 fix(workspace-server): P2-B internal#718 — billing/credential decision DERIVES the provider; supersede #1966 stored-read; retire org rung; only-registered validation (BEHAVIOR-AFFECTING)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
gate-check-v3 / gate-check (pull_request) Successful in 11s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 43s
qa-review / approved (pull_request) Successful in 10s
security-review / approved (pull_request) Successful in 8s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
audit-force-merge / audit (pull_request) Successful in 4s
Re-points the platform-vs-BYOK billing/credential decision to DERIVE the provider
from (runtime, model) via the registry SSOT, per the CTO directive (internal#718
comment, 2026-05-27): "the billing read must DERIVE the provider, not read a
stored LLM_PROVIDER", "remove LLM_PROVIDER entirely as a billing source", "retire
organizations.llm_billing_mode as a billing source".

## BEHAVIOR DELTA (this PR changes behavior — tested explicitly)
- platform-derived (or unset → platform default) → platform_managed → platform
  creds. UNCHANGED.
- non-platform-derived → byok → the already-merged #1963 strips platform
  scope:global LLM creds + FAIL-CLOSES if the workspace has no own cred. THIS IS
  THE INTENDED FIX (the Reno billing-leak class: Reno Stars SEO 352e3c2b /
  Marketing 6b66de8d ran on the platform's Anthropic credits because the never-
  written org rung always resolved platform_managed).
- unset model → platform default (CTO-confirmed).
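
The behavior delta above can be sketched roughly as follows. This is an illustration under assumptions, not the Go implementation: `billing_mode`, `effective_llm_env`, and `MissingWorkspaceCredential` are hypothetical names, and merging workspace credentials over platform credentials in the platform-managed case is an assumption made only for the sketch.

```python
from typing import Dict, Optional

PLATFORM_MANAGED = "platform_managed"
BYOK = "byok"


class MissingWorkspaceCredential(Exception):
    """Raised when a byok workspace has no credential of its own."""


def billing_mode(provider_is_platform: Optional[bool]) -> str:
    """Map the derived provider to a billing mode.

    None models an unset model, which falls back to the platform default.
    """
    if provider_is_platform is None or provider_is_platform:
        return PLATFORM_MANAGED
    return BYOK


def effective_llm_env(
    mode: str, platform_creds: Dict[str, str], workspace_creds: Dict[str, str]
) -> Dict[str, str]:
    """Return the LLM credentials a workspace would run with."""
    if mode == PLATFORM_MANAGED:
        # Assumption for this sketch: workspace values override platform ones.
        return {**platform_creds, **workspace_creds}
    # byok: platform credentials are stripped entirely.
    if not workspace_creds:
        # Fail closed rather than silently fall back to platform credits.
        raise MissingWorkspaceCredential("byok workspace has no own LLM credential")
    return dict(workspace_creds)
```

The point of failing closed is that a non-platform workspace without its own key stops, instead of quietly spending platform credits.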

## What changed
- `ResolveLLMBillingModeDerived(ctx, ws, runtime, model, authEnv)` — the new SSOT
  resolver: explicit `workspaces.llm_billing_mode` override (precedence 1, the
  only stored billing signal that survives — operator escape hatch) → else
  DeriveProvider + IsPlatform → else default-closed platform_managed.
- `ResolveLLMBillingMode(ctx, ws, orgMode)` legacy signature retained for callers
  without (runtime, model) (admin route, secrets remote-pull): reads the stored
  runtime + MODEL + auth-env names from DB and delegates to the derived resolver.
  `orgMode` is RETIRED/ignored; the org rung is gone.
- `applyPlatformManagedLLMEnv` calls the derived resolver directly (it has
  runtime + model + the workspace env) — no stored LLM_PROVIDER read. Feeds
  #1963's strip + fail-closed the correct DERIVED signal.
- SUPERSEDES core#1966: that PR made the billing read consult a stored
  LLM_PROVIDER first; this reworks the decision onto derive-from-provider. #1966
  should be closed in favor of this.
- Removed the now-dead org-default normalization (normalizeOrgDefault).
- ONLY-REGISTERED validation at create (model_registry_validation.go +
  WorkspaceHandler.Create): a (runtime, model) not in the registry's
  ModelsForRuntime for a REGISTRY-known runtime is flagged
  (X-Molecule-Model-Unregistered header + warning log). P2 = WARN mode (NOT hard
  422) because the legacy colon-namespaced model vocabulary ("anthropic:claude-
  opus-4-7") is still live across the create/import/template corpus and is not
  yet reconciled into the registry — hard-reject is a one-line flip gated on
  P3/P4 vocabulary convergence. Fails OPEN for non-registry runtimes
  (langgraph/external/kimi/mock/federated) so those flows are unchanged.
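
The resolver precedence described in the list above (explicit override, then derive from the registry, then default-closed) might look roughly like this. It is a sketch only: `REGISTRY`, its entries, and `resolve_billing_mode` are hypothetical, and the auth-env disambiguation that the real resolver takes as an argument is left out.

```python
from typing import Dict, Optional, Tuple

PLATFORM_MANAGED = "platform_managed"
BYOK = "byok"

# Hypothetical registry: (runtime, model) -> (provider, is_platform).
REGISTRY: Dict[Tuple[str, str], Tuple[str, bool]] = {
    ("runtime-a", "model-platform"): ("platform-provider", True),
    ("runtime-a", "model-own-key"): ("external-provider", False),
}


def resolve_billing_mode(override: Optional[str], runtime: str, model: str) -> str:
    """Resolve the billing mode with the three-step precedence."""
    # 1. An explicit per-workspace override wins (operator escape hatch).
    if override in (PLATFORM_MANAGED, BYOK):
        return override
    # 2. Otherwise derive the provider from (runtime, model).
    entry = REGISTRY.get((runtime, model))
    if entry is not None:
        _provider, is_platform = entry
        return PLATFORM_MANAGED if is_platform else BYOK
    # 3. Nothing derivable (unset or unregistered): default-closed.
    return PLATFORM_MANAGED
```

Keeping the override as the only stored signal means every other decision is recomputed from the registry on each call.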

## Tests (TDD; behavior delta explicit)
- llm_billing_mode_derived_test.go — platform/non-platform/unset/override/
  unregistered/auth-env-disambiguation table + DB-error default-closed + empty-id.
- workspace_provision_shared_test.go — DERIVED platform→unchanged,
  non-platform→byok+strip+fail-closed (the FIX), unset→platform default, through
  the real applyPlatformManagedLLMEnv path. Existing #1963 override-byok strip +
  fail-closed tests unchanged (still pass).
- model_registry_validation_test.go + workspace_test.go — only-registered warn +
  registered-no-warn + non-registry-fail-open.
- Reworked the legacy resolver/admin/secrets tests off the retired org rung.

## Build/CI
go build ./... (+ -tags=integration) green; full `go test ./...` (43 pkgs) green
incl. -race on handlers; vet clean; changed files gofmt-clean. cp#362 anthropic
passthrough untouched (CP repo); merged #1963 strip+fail-closed reused unchanged.

internal#718 P2-B. BEHAVIOR-AFFECTING. Supersedes #1966. Not merged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:39:26 -07:00
core-devops 71c68e44f2 feat(providers): P2-A internal#718 — bring the provider registry to molecule-core via codegen + verify-CI (additive, zero behavior change)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 7s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 6s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 10s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m10s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m21s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
sync-providers-yaml / Compare synced providers.yaml against controlplane canonical (pull_request) Failing after 5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m36s
gate-check-v3 / gate-check (pull_request) Successful in 12s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 38s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 7s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 7s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 14s
Harness Replays / Harness Replays (pull_request) Successful in 7s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m34s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m34s
CI / Platform (Go) (pull_request) Successful in 5m44s
CI / all-required (pull_request) Successful in 8m39s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, l
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Distributes the provider-registry SSOT into molecule-core per the CTO-decided
shape (internal#718 comment, 2026-05-27): "Distribution = SDK via codegen +
verify-CI", multi-repo branch "codegen-checked-into-each-repo + verify-CI".

molecule-core has no Go module dependency on molecule-controlplane, so this
lands a SYNCED COPY of the canonical providers.yaml plus the loader,
DeriveProvider/IsPlatform/ResolveUpstream, the generated Go projection
(cmd/gen-providers), and the drift gates — a byte-faithful mirror of the
controlplane P0/P1 machinery. Canonical SSOT stays in controlplane
internal/providers/providers.yaml.

ZERO behavior change (additive, like P0): NO production code path imports the
new package yet. P2-B wires the billing/credential decision onto the loader.

What lands:
- internal/providers/{providers.go,derive_provider.go,providers.yaml} — mirror
  of the controlplane loader + canonical YAML (synced copy).
- internal/providers/gen/registry_gen.go — generated projection; fingerprint
  faffcbe59bb9f38c matches controlplane.
- cmd/gen-providers — the generator (go generate + -check drift mode).
- .gitea/workflows/verify-providers-gen.yml — artifact ↔ synced-copy drift gate
  (mirror of the controlplane workflow; standalone, not in branch protection
  yet — same soak-then-promote posture).
- .gitea/workflows/sync-providers-yaml.yml — NEW cross-repo gate: fetches the
  controlplane canonical providers.yaml and byte-compares against core's synced
  copy (RED on canonical drift). Read-only AUTO_SYNC_TOKEN; degrades to a
  warning if the token is absent.
- internal/providers/sync_canonical_test.go — hermetic sha pin of the synced
  copy (the always-on backstop; catches a hand-edit even with no network).
- internal/providers/gen_import_boundary_test.go — arch-lint-equivalent AST gate
  (core has no go-arch-lint): no production package may import the raw gen
  projection. Proven load-bearing.
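
As an illustration of the hermetic sha pin idea above, here is a minimal sketch. The actual test is Go (sync_canonical_test.go); the helper names and the sample bytes here are hypothetical, and the pin is computed inline only so the example is self-contained.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex sha256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def check_pin(synced_copy: bytes, pinned_sha: str) -> bool:
    """True when the synced copy still matches the pinned digest."""
    return sha256_hex(synced_copy) == pinned_sha


# Hypothetical synced copy; a real test would read the checked-in file.
synced = b"providers:\n  - name: example\n"
# In a real test the pin is a hard-coded literal updated with each sync.
PINNED = sha256_hex(synced)

# A local hand-edit changes the digest, so the pin check fails offline.
hand_edited = synced + b"  # local tweak\n"
```

Because the pin needs no network, it still catches a hand-edit when the cross-repo fetch is unavailable.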

Build/test: go build ./... (+ -tags=integration) green; providers/gen/
gen-providers suites pass (incl. -race); gen -check in sync; gofmt + vet clean.

internal#718 P2-A. NO behavior change. Not merged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:10:12 -07:00
64 changed files with 7592 additions and 2433 deletions
+63
@@ -208,6 +208,61 @@ def _raise_for_redeploy_result(status: int, body: dict, slugs: list[str]) -> Non
)
def rollout_stragglers(enumerated: list[str], results: list[dict]) -> list[str]:
    """Return every enumerated tenant NOT proven on the target build.

    A straggler is any tenant the rollout was supposed to cover that the
    CP could not verify is running the target image tag — whether it
    errored, was skipped, or SSM-succeeded onto the wrong image
    (internal#724). CP marks each per-tenant result row with
    ``verified_on_target`` (the REDEPLOY_RUNNING_IMAGE docker-inspect
    proof). A tenant enumerated for the rollout but absent from the
    result set (no batch ever ran it) is also a straggler — that is the
    exact agents-team silent-skip class.

    Backward-compat: an OLDER CP that doesn't emit ``verified_on_target``
    yet returns rows without the key. Treat a missing key as verified so
    this surfacing degrades to the previous (ok-based) behavior against an
    un-upgraded CP, rather than failing every deploy spuriously. Once the
    CP fix is deployed the key is always present and real stragglers are
    caught.
    """
    verified: set[str] = set()
    for row in results:
        if str(row.get("ssm_status") or "") == "DryRun":
            continue
        slug = str(row.get("slug") or "").strip()
        if not slug:
            continue
        # Missing key (old CP) => assume verified; present key is authoritative.
        if "verified_on_target" not in row or row.get("verified_on_target"):
            verified.add(slug)
    return sorted(s for s in dict.fromkeys(enumerated) if s not in verified)
def assert_full_coverage(enumerated: list[str], aggregate: dict, dry_run: bool) -> None:
    """Fail the rollout if any enumerated tenant is not on the target build.

    This is the no-silent-skip gate (internal#724). A dry run proves
    nothing landed, so coverage is not asserted for it.
    """
    if dry_run:
        return
    stragglers = rollout_stragglers(enumerated, aggregate.get("results") or [])
    if stragglers:
        msg = (
            f"incomplete rollout: {len(stragglers)} tenant(s) not verified on target "
            f"after redeploy-fleet: {', '.join(stragglers)} "
            f"(enumerated {len(set(enumerated))})"
        )
        aggregate["ok"] = False
        aggregate["error"] = msg
        aggregate["stragglers"] = stragglers
        raise RolloutFailed(msg, aggregate)
def execute_scoped_rollout(
plan: dict,
token: str,
@@ -254,6 +309,14 @@ def execute_scoped_rollout(
aggregate["error"] = str(exc)
raise RolloutFailed(str(exc), aggregate) from exc
# No-silent-skip coverage gate (internal#724): every enumerated tenant
# must be PROVEN on the target build. A per-tenant HTTP-200/ok response
# is not proof — a tenant that SSM-succeeded but stayed on the old tag,
# or one enumerated but never batched, is a straggler. Surfacing it as
# a RolloutFailed makes the deploy step exit non-zero instead of
# silently reporting success (the exact agents-team failure mode).
assert_full_coverage(all_slugs, aggregate, dry_run)
return aggregate
+2 -2
@@ -845,7 +845,7 @@ def render_status(
if len(missing_body) > 3:
shown += f", +{len(missing_body) - 3}"
desc_parts.append(f"body-unfilled: {shown}")
- state = "success" if not missing else "failure"
+ state = "success" if not missing and not missing_body else "failure"
return state, "".join(desc_parts)
@@ -1033,7 +1033,7 @@ def main(argv: list[str] | None = None) -> int:
for t in data:
if t.get("name") == tn:
tid = t.get("id")
- client._team_id_cache[(args.owner, tn)] = tid # noqa: SLF001 # write-through cache; intentional side-effect for reuse across calls
+ client._team_id_cache[(args.owner, tn)] = tid # noqa: SLF001 # internal write-through cache
break
if tid is not None:
team_ids.append(tid)
@@ -355,3 +355,134 @@ def test_rollout_from_plan_file_writes_partial_response_on_failure(tmp_path):
assert response_path.read_text(encoding="utf-8").strip()
assert '"ok": false' in response_path.read_text(encoding="utf-8")
assert '"slug": "hongming"' in response_path.read_text(encoding="utf-8")
# ──────────────────────────────────────────────────────────────────────
# No-silent-skip coverage gate (internal#724)
# ──────────────────────────────────────────────────────────────────────
def test_rollout_stragglers_flags_tenant_not_on_target():
    # b SSM-succeeded but its container is on the old tag → straggler.
    stragglers = prod.rollout_stragglers(
        ["a", "b", "c"],
        [
            {"slug": "a", "verified_on_target": True},
            {"slug": "b", "verified_on_target": False, "running_image": "platform-tenant:staging-old"},
            {"slug": "c", "verified_on_target": True},
        ],
    )
    assert stragglers == ["b"]


def test_rollout_stragglers_flags_enumerated_tenant_with_no_result():
    # agents-team class: enumerated but no batch ever produced a row for it.
    stragglers = prod.rollout_stragglers(
        ["a", "agents-team"],
        [{"slug": "a", "verified_on_target": True}],
    )
    assert stragglers == ["agents-team"]


def test_rollout_stragglers_missing_key_is_backward_compatible():
    # Older CP without verified_on_target → treat as verified (no spurious fail).
    stragglers = prod.rollout_stragglers(
        ["a", "b"],
        [{"slug": "a", "healthz_ok": True}, {"slug": "b", "healthz_ok": True}],
    )
    assert stragglers == []


def test_rollout_stragglers_ignores_dry_run_rows():
    stragglers = prod.rollout_stragglers(
        ["a"], [{"slug": "a", "ssm_status": "DryRun"}]
    )
    # dry-run row is skipped, so "a" has no verifying row → straggler.
    assert stragglers == ["a"]
def test_scoped_rollout_fails_when_a_tenant_stays_on_old_tag():
    # Every per-tenant call returns ok=True, but agents-team is NOT
    # verified_on_target. The rollout must still fail loudly — this is
    # the exact "reported success, one tenant silently skipped" bug.
    def fake_redeploy(_cp_url, _token, body):
        rows = []
        for slug in body["only_slugs"]:
            rows.append({"slug": slug, "verified_on_target": slug != "agents-team"})
        return 200, {"ok": True, "results": rows}

    try:
        prod.execute_scoped_rollout(
            {
                "cp_url": "https://api.moleculesai.app",
                "body": {
                    "target_tag": "staging-new",
                    "batch_size": 5,
                    "dry_run": False,
                    "confirm": True,
                },
            },
            token="secret",
            list_slugs=lambda _u, _t, _b: ["reno-stars", "agents-team", "hongming"],
            redeploy=fake_redeploy,
            sleep=lambda _s: None,
        )
    except prod.RolloutFailed as exc:
        assert "incomplete rollout" in str(exc)
        assert exc.response["stragglers"] == ["agents-team"]
        assert exc.response["ok"] is False
    else:
        raise AssertionError("expected an incomplete rollout to fail loudly")
def test_scoped_rollout_passes_when_all_tenants_verified_on_target():
    def fake_redeploy(_cp_url, _token, body):
        return 200, {
            "ok": True,
            "results": [{"slug": s, "verified_on_target": True} for s in body["only_slugs"]],
        }

    aggregate = prod.execute_scoped_rollout(
        {
            "cp_url": "https://api.moleculesai.app",
            "body": {
                "target_tag": "staging-new",
                "batch_size": 5,
                "dry_run": False,
                "confirm": True,
            },
        },
        token="secret",
        list_slugs=lambda _u, _t, _b: ["reno-stars", "agents-team", "hongming"],
        redeploy=fake_redeploy,
        sleep=lambda _s: None,
    )
    assert aggregate["ok"] is True
    assert "stragglers" not in aggregate
def test_scoped_rollout_dry_run_does_not_assert_coverage():
    # A dry run proves nothing landed; coverage must NOT be asserted or
    # every plan would fail.
    def fake_redeploy(_cp_url, _token, body):
        return 200, {
            "ok": True,
            "results": [{"slug": s, "ssm_status": "DryRun"} for s in body["only_slugs"]],
        }

    aggregate = prod.execute_scoped_rollout(
        {
            "cp_url": "https://api.moleculesai.app",
            "body": {
                "target_tag": "staging-new",
                "batch_size": 5,
                "dry_run": True,
                "confirm": True,
            },
        },
        token="secret",
        list_slugs=lambda _u, _t, _b: ["a", "b"],
        redeploy=fake_redeploy,
        sleep=lambda _s: None,
    )
    assert aggregate["ok"] is True
+1 -1
@@ -138,7 +138,7 @@ items:
- slug: memory-consulted
numeric_alias: 7
- pr_section_marker: "Memory consulted"
+ pr_section_marker: "Memory/saved-feedback consulted"
required_teams: [engineers]
description: >-
List of feedback memories applicable to this change. Ack from
+8
@@ -357,6 +357,14 @@ jobs:
name: Run E2E bash unit tests (no live infra)
run: |
bash tests/e2e/test_model_slug.sh
# molecule-core#1995 (#1994 follow-on): fail-direction proof for
# the A2A real-completion + byok-routing assertion helpers
# (lib/completion_assert.sh). Offline (no LLM, no network): it
# asserts an error-as-text payload FAILS the real-completion gate
# — the exact trap the historical shape-only `"kind":"text"`
# check missed. If a refactor weakens the gate to a shape check,
# this step goes red on every PR.
bash tests/e2e/test_completion_assert_unit.sh
- if: ${{ needs.changes.outputs.scripts == 'true' }}
name: Test ECR promote-tenant-image script (mock-driven, no live infra)
+2
@@ -49,6 +49,7 @@ on:
- 'workspace-server/internal/middleware/**'
- 'workspace-server/internal/provisioner/**'
- 'tests/e2e/test_staging_full_saas.sh'
- 'tests/e2e/lib/completion_assert.sh'
- 'tests/e2e/lib/aws_leak_check.sh'
- 'tests/e2e/test_aws_leak_check.sh'
- '.gitea/workflows/e2e-staging-saas.yml'
@@ -61,6 +62,7 @@ on:
- 'workspace-server/internal/middleware/**'
- 'workspace-server/internal/provisioner/**'
- 'tests/e2e/test_staging_full_saas.sh'
- 'tests/e2e/lib/completion_assert.sh'
- 'tests/e2e/lib/aws_leak_check.sh'
- 'tests/e2e/test_aws_leak_check.sh'
- '.gitea/workflows/e2e-staging-saas.yml'
@@ -327,13 +327,27 @@ jobs:
echo ""
echo "### Per-tenant result"
echo ""
echo "| Slug | Phase | SSM Status | Exit | Healthz | Error present |"
echo "|------|-------|------------|------|---------|---------------|"
jq -r '.results[]? | "| \(.slug) | \(.phase) | \(.ssm_status // "-") | \(.ssm_exit_code) | \(.healthz_ok) | \((.error // "") != "") |"' "$HTTP_RESPONSE" || true
echo "| Slug | Phase | SSM Status | Exit | Healthz | On target | Error present |"
echo "|------|-------|------------|------|---------|-----------|---------------|"
jq -r '.results[]? | "| \(.slug) | \(.phase) | \(.ssm_status // "-") | \(.ssm_exit_code) | \(.healthz_ok) | \(.verified_on_target) | \((.error // "") != "") |"' "$HTTP_RESPONSE" || true
# internal#724: stragglers are tenants enumerated but not proven
# on the target build. Surface them loudly — a non-empty list
# means the rollout did NOT fully land.
STRAGGLERS="$(jq -r '(.stragglers // []) | join(", ")' "$HTTP_RESPONSE")"
if [ -n "$STRAGGLERS" ]; then
echo ""
echo "### ⚠ Stragglers (NOT on target tag \`$TARGET_TAG\`)"
echo ""
echo "\`$STRAGGLERS\`"
fi
} >> "$GITHUB_STEP_SUMMARY"
OK="$(jq -r '.ok' "$HTTP_RESPONSE")"
if [ "$OK" != "true" ]; then
STRAGGLERS="$(jq -r '(.stragglers // []) | join(", ")' "$HTTP_RESPONSE")"
if [ -n "$STRAGGLERS" ]; then
echo "::error::incomplete rollout — tenants not on target tag $TARGET_TAG: $STRAGGLERS"
fi
echo "::error::redeploy-fleet reported ok=false; production rollout halted."
exit 1
fi
+99
@@ -0,0 +1,99 @@
name: sync-providers-yaml
# Cross-repo canonical↔synced-copy drift gate (internal#718 P2-A, CTO
# 2026-05-27 "Distribution = SDK via codegen + verify-CI", multi-repo branch:
# "codegen-checked-into-each-repo + verify-CI").
#
# The canonical provider-registry SSOT is molecule-controlplane
# internal/providers/providers.yaml. molecule-core has NO Go module dependency
# on controlplane, so instead of importing it we carry a SYNCED COPY at
# workspace-server/internal/providers/providers.yaml and gate it.
#
# This workflow fetches the canonical providers.yaml from controlplane (via the
# Gitea raw endpoint, read-only) and byte-compares it against core's synced
# copy. RED if they differ — meaning the canonical moved and core's copy must be
# re-synced (copy verbatim + `go generate ./...` + bump
# canonicalProvidersYAMLSHA256 in sync_canonical_test.go).
#
# Pairs with:
# * sync_canonical_test.go — hermetic sha pin (catches a hand-edit of core's
# copy even with no network); runs in the normal `go test ./...`.
# * verify-providers-gen.yml — artifact ↔ synced-copy drift.
#
# ENFORCEMENT GATING: standalone workflow, NOT a job in ci.yml and NOT in
# branch protection (same soak-then-promote posture as verify-providers-gen).
# It is intentionally absent from ci.yml's job set so the ci-required-drift
# sentinel does not fire on it.
#
# AUTH: uses AUTO_SYNC_TOKEN (the existing cross-repo read token used to sync
# template/provider content from sibling repos). If the secret is absent the
# job emits a clear ::warning:: and exits 0 — the hermetic sha pin in
# sync_canonical_test.go is the always-on backstop, so a missing cross-repo
# token degrades to "hand-edit still caught, live canonical drift not caught"
# rather than a hard red that blocks unrelated PRs.
on:
pull_request:
types: [opened, synchronize, reopened]
paths:
- 'workspace-server/internal/providers/providers.yaml'
- '.gitea/workflows/sync-providers-yaml.yml'
push:
branches: [main, staging]
paths:
- 'workspace-server/internal/providers/providers.yaml'
- '.gitea/workflows/sync-providers-yaml.yml'
schedule:
# Daily at 04:23: catch a canonical change in controlplane that landed
# without a paired core re-sync PR (minute is off-zero to spread cron load).
- cron: '23 4 * * *'
workflow_dispatch:
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
permissions:
contents: read
concurrency:
group: sync-providers-yaml-${{ github.ref }}
cancel-in-progress: true
jobs:
compare:
name: Compare synced providers.yaml against controlplane canonical
runs-on: ubuntu-latest
timeout-minutes: 6
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Fetch canonical providers.yaml from controlplane and byte-compare
env:
AUTO_SYNC_TOKEN: ${{ secrets.AUTO_SYNC_TOKEN }}
API_ROOT: ${{ github.server_url }}/api/v1
run: |
set -euo pipefail
if [ -z "${AUTO_SYNC_TOKEN:-}" ]; then
echo "::warning::AUTO_SYNC_TOKEN secret missing — skipping the live cross-repo compare."
echo "The hermetic sha pin (sync_canonical_test.go) still gates hand-edits of core's copy."
echo "Provision AUTO_SYNC_TOKEN (read scope on molecule-controlplane) to enable live canonical-drift detection."
exit 0
fi
CANON_URL="${API_ROOT}/repos/molecule-ai/molecule-controlplane/raw/internal/providers/providers.yaml?ref=main"
# Use the /raw endpoint: it returns the file bytes directly. (The
# /contents endpoint ignores Accept: application/vnd.gitea.raw on
# Gitea 1.22.6 and returns the JSON+base64 envelope, which made this
# diff a permanent false RED.)
curl -fsS \
-H "Authorization: token ${AUTO_SYNC_TOKEN}" \
"${CANON_URL}" -o /tmp/canonical-providers.yaml
LOCAL=workspace-server/internal/providers/providers.yaml
if diff -u /tmp/canonical-providers.yaml "$LOCAL"; then
echo "OK — core's synced providers.yaml is byte-identical to the controlplane canonical."
else
echo "::error::core's synced providers.yaml DRIFTED from the controlplane canonical (SSOT)."
echo "Re-sync: copy controlplane internal/providers/providers.yaml verbatim over"
echo " $LOCAL, run 'go generate ./...' in workspace-server/, and bump"
echo " canonicalProvidersYAMLSHA256 in internal/providers/sync_canonical_test.go."
exit 1
fi
+89
@@ -0,0 +1,89 @@
name: verify-providers-gen
# Provider-registry SSOT enforcement gate — molecule-core side (internal#718
# P2-A, CTO 2026-05-27 "Distribution = SDK via codegen + verify-CI").
#
# The canonical schema SSOT is molecule-controlplane
# internal/providers/providers.yaml. molecule-core carries a SYNCED COPY at
# workspace-server/internal/providers/providers.yaml (kept in sync by the
# companion sync-providers-yaml.yml gate), and cmd/gen-providers emits the
# checked-in Go projection workspace-server/internal/providers/gen/registry_gen.go.
#
# This workflow regenerates the artifact into the working tree and fails RED if
# it differs from what is committed — catching BOTH:
# * a providers.yaml (synced-copy) change that wasn't followed by `go generate ./...`, and
# * a hand-edit of the generated artifact (it carries a DO NOT EDIT header).
#
# It is the molecule-core mirror of molecule-controlplane's verify-providers-gen
# workflow. Together with sync-providers-yaml (canonical↔synced-copy drift) it
# closes the codegen-checked-into-each-repo + verify-CI loop the RFC mandates.
#
# ENFORCEMENT GATING (deliberate, per dev-SOP "implementation gating"):
# this is a STANDALONE workflow, NOT a job inside ci.yml, and is NOT yet in any
# branch-protection status_check_contexts. Rationale (identical to the CP P0
# rollout):
# * It runs + reports RED on every PR/push immediately (visible signal).
# * It is intentionally absent from ci.yml's job set so the ci-required-drift
# sentinel (jobs ↔ branch-protection ↔ audit-env) does NOT fire on it, and
# from branch protection (turning it into a hard merge gate has blast radius
# — operator GO required, same pattern as sop-tier-check / verify-providers-gen
# on controlplane). Promote it into branch protection in a follow-up once
# P2 has soaked.
# Until then it behaves like secret-scan / block-internal-paths: a standalone
# advisory-to-hard gate the author is expected to keep green.
on:
pull_request:
types: [opened, synchronize, reopened]
push:
branches: [main, staging]
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
permissions:
contents: read
concurrency:
group: verify-providers-gen-${{ github.ref }}
cancel-in-progress: true
jobs:
verify:
name: Regenerate providers artifact and fail on drift
runs-on: ubuntu-latest
timeout-minutes: 8
defaults:
run:
working-directory: workspace-server
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-go@4a3601121dd01d1626a1e23e37211e3254c1c06c # v6.4.0
with:
go-version: 'stable'
cache: true
cache-dependency-path: workspace-server/go.sum
- name: Verify generated artifact is in sync with providers.yaml
run: |
set -euo pipefail
# -check regenerates in memory and byte-compares against the
# checked-in artifact; exit 1 (RED) on any drift. This is the
# single source of the gate's verdict — the same code path
# `go test ./cmd/gen-providers` exercises.
go run ./cmd/gen-providers -check
- name: Belt-and-braces — regenerate in place and assert clean tree
run: |
set -euo pipefail
# Independent confirmation that does not trust the -check path:
# actually write the artifact and assert git sees no change. If
# this and the step above ever disagree, the gate is suspect.
go generate ./...
if ! git diff --quiet -- internal/providers/gen/registry_gen.go; then
echo "::error::workspace-server/internal/providers/gen/registry_gen.go drifted from providers.yaml."
echo "Run 'go generate ./...' (or 'go run ./cmd/gen-providers') in workspace-server/ and commit the result."
git --no-pager diff -- internal/providers/gen/registry_gen.go | head -80
exit 1
fi
echo "OK — generated providers artifact is in sync with the schema SSOT."
@@ -38,10 +38,11 @@ const DEFAULT_RUNTIME = "claude-code";
const RUNTIME_OPTIONS = [
{ value: "claude-code", label: "Claude Code" },
{ value: "codex", label: "OpenAI Codex CLI" },
{ value: "google-adk", label: "Google ADK" },
{ value: "hermes", label: "Hermes" },
{ value: "openclaw", label: "OpenClaw" },
];
const BASE_RUNTIME_TEMPLATE_IDS = new Set(["claude-code-default", "codex", "hermes", "openclaw"]);
const BASE_RUNTIME_TEMPLATE_IDS = new Set(["claude-code-default", "codex", "google-adk", "hermes", "openclaw"]);
const DEFAULT_HEADLESS_INSTANCE_TYPE = "t3.medium";
const DEFAULT_HEADLESS_ROOT_GB = 30;
const DEFAULT_DISPLAY_INSTANCE_TYPE = "t3.xlarge";
+101 -1
@@ -49,6 +49,33 @@ export interface ProviderEntry {
wildcard: boolean;
/** Optional tooltip text (rendered as native title=). */
tooltip?: string;
/** Billing mode the DERIVED provider implies, when this entry came from the
* registry-backed payload (internal#718 P3): "platform_managed" | "byok".
* Undefined for entries built by the legacy inferVendor heuristic. */
billingMode?: "platform_managed" | "byok";
}
/** RegistryProvider mirrors one entry of GET /templates `registry_providers`
* (workspace-server registryProviderView): the registry's native provider for
* a runtime, with its display label, auth-env NAMES, and billing mode. This is
* the SSOT the dropdown labels come from — the canvas drops VENDOR_LABELS for
* registry-backed runtimes (internal#718 P3, retire-list #4). */
export interface RegistryProvider {
name: string;
display_name?: string;
auth_env?: string[];
billing_mode?: "platform_managed" | "byok";
deprecated?: boolean;
}
/** RegistryModel mirrors one entry of GET /templates `registry_models`: a
* native model id annotated with its DERIVED provider (registry name) and the
* billing_mode that provider implies. */
export interface RegistryModel {
id: string;
name?: string;
provider?: string;
billing_mode?: "platform_managed" | "byok";
}
export interface SelectorValue {
@@ -68,6 +95,13 @@ interface Props {
models: SelectorModel[];
value: SelectorValue;
onChange: (next: SelectorValue) => void;
/** Optional pre-built provider catalog. When provided, the selector uses it
* verbatim instead of re-inferring one from `models` via
* buildProviderCatalog — the registry-backed path (internal#718 P3), where
* the parent builds the catalog from the registry-served providers/models
* so dropdown labels + billing come from the provider-registry SSOT rather
* than the inferVendor heuristic. Omitted = legacy heuristic over `models`. */
catalog?: ProviderEntry[];
/** Display variant. "grid" = label+control side-by-side (used in ConfigTab
* Runtime section). "stack" = vertical (used in MissingKeysModal). */
variant?: "grid" | "stack";
@@ -251,6 +285,66 @@ export function buildProviderCatalog(models: SelectorModel[]): ProviderEntry[] {
return Array.from(buckets.values());
}
/** Build the provider catalog from a REGISTRY-BACKED GET /templates payload
* (registry_providers + registry_models) — internal#718 P3, retire-list #4.
*
* Unlike buildProviderCatalog (which RE-INFERS vendor from model-id prefixes
* + env via inferVendor/VENDOR_LABELS/BARE_VENDOR_PATTERNS), this trusts the
* registry: each model carries its DERIVED `provider` (a registry provider
* name) and the dropdown label/billing/auth come from the matching
* `registry_providers` entry. The canvas can render no provider/model the
* registry did not serve ("only registered selectable"), and the billing-mode
* shown reflects the derived provider rather than a hardcoded rule.
*
* A provider with no served model is omitted (no empty buckets). Models whose
* `provider` doesn't match a registry_providers entry still get a bucket
* keyed by the raw provider name (defensive — should not happen for a
* well-formed registry payload), so a model is never silently dropped. */
export function buildProviderCatalogFromRegistry(
registryProviders: RegistryProvider[],
registryModels: RegistryModel[],
): ProviderEntry[] {
const byName = new Map<string, RegistryProvider>();
for (const p of registryProviders) byName.set(p.name, p);
// Bucket models by their derived provider name, preserving registry order.
const buckets = new Map<string, ProviderEntry>();
for (const m of registryModels) {
const vendor = (m.provider ?? "").trim();
// Un-annotated registry model: it has no derived provider to bucket
// under, so skip it from the provider cascade (it stays selectable
// elsewhere via free-text).
if (!vendor) continue;
const meta = byName.get(vendor);
const wildcard = m.id.includes("*");
let entry = buckets.get(vendor);
if (!entry) {
entry = {
id: `registry|${vendor}`,
vendor,
label: meta?.display_name || vendor,
envVars: meta?.auth_env ?? [],
models: [],
wildcard,
billingMode: meta?.billing_mode ?? m.billing_mode,
tooltip: VENDOR_TOOLTIPS[vendor],
};
buckets.set(vendor, entry);
}
entry.models.push({ id: m.id, name: m.name, provider: vendor });
entry.wildcard = entry.wildcard || wildcard;
}
// Decorate label with model-count when ≥2 concrete models share the bucket,
// matching buildProviderCatalog's UX.
for (const e of buckets.values()) {
if (!e.wildcard && e.models.length > 1) {
e.label = `${e.label} (${e.models.length} models)`;
}
}
return Array.from(buckets.values());
}
/** Find the provider entry that contains a given model id. Used by
* callers to back-derive the provider when only the model is known
* (e.g. ConfigTab loading from saved state). */
@@ -283,6 +377,7 @@ export function ProviderModelSelector({
models,
value,
onChange,
catalog: catalogProp,
variant = "stack",
allowCustomModelEscape = false,
disabled = false,
@@ -293,7 +388,12 @@ export function ProviderModelSelector({
const providerSelectId = `${baseId}-provider`;
const modelSelectId = `${baseId}-model`;
const catalog = useMemo(() => buildProviderCatalog(models), [models]);
// Registry-backed path (internal#718 P3): use the parent-supplied catalog
// verbatim; otherwise re-infer one from `models` via the legacy heuristic.
const catalog = useMemo(
() => catalogProp ?? buildProviderCatalog(models),
[catalogProp, models],
);
const selected = useMemo(
() => catalog.find((p) => p.id === value.providerId) ?? null,
[catalog, value.providerId],
@@ -213,6 +213,7 @@ describe("CreateWorkspaceDialog", () => {
expect(runtimeTexts).toEqual([
"Claude Code",
"OpenAI Codex CLI",
"Google ADK",
"Hermes",
"OpenClaw",
]);
@@ -0,0 +1,110 @@
// @vitest-environment jsdom
//
// internal#718 P3 (retire-list #4) — when GET /templates serves a
// registry-backed selectable list (registry_providers + registry_models with
// display_name / billing_mode / derived provider), the canvas builds the
// provider catalog FROM that registry data instead of re-inferring vendor
// from model-id prefixes (VENDOR_LABELS / BARE_VENDOR_PATTERNS / inferVendor).
// The heuristic path stays only as the fallback for non-registry runtimes /
// older backends.
import { describe, it, expect } from "vitest";
import {
buildProviderCatalogFromRegistry,
type RegistryProvider,
type RegistryModel,
} from "../ProviderModelSelector";
// Mirrors the registry-served claude-code payload from GET /templates
// (registry_providers / registry_models). display_name + billing_mode come
// from the registry, NOT from the canvas VENDOR_LABELS map.
const CLAUDE_CODE_REGISTRY_PROVIDERS: RegistryProvider[] = [
{
name: "anthropic-oauth",
display_name: "Claude Code subscription",
auth_env: ["CLAUDE_CODE_OAUTH_TOKEN"],
billing_mode: "byok",
},
{
name: "anthropic-api",
display_name: "Anthropic API",
auth_env: ["ANTHROPIC_API_KEY"],
billing_mode: "byok",
},
{
name: "platform",
display_name: "Platform",
auth_env: ["ANTHROPIC_API_KEY", "MOLECULE_LLM_USAGE_TOKEN"],
billing_mode: "platform_managed",
},
];
const CLAUDE_CODE_REGISTRY_MODELS: RegistryModel[] = [
{ id: "sonnet", provider: "anthropic-oauth", billing_mode: "byok" },
{ id: "opus", provider: "anthropic-oauth", billing_mode: "byok" },
{ id: "claude-opus-4-7", provider: "anthropic-api", billing_mode: "byok" },
{ id: "anthropic/claude-opus-4-7", provider: "platform", billing_mode: "platform_managed" },
];
describe("buildProviderCatalogFromRegistry", () => {
it("buckets models by their DERIVED registry provider, not by inferred vendor", () => {
const catalog = buildProviderCatalogFromRegistry(
CLAUDE_CODE_REGISTRY_PROVIDERS,
CLAUDE_CODE_REGISTRY_MODELS,
);
const byVendor = new Map(catalog.map((p) => [p.vendor, p]));
// anthropic-oauth bucket holds the two OAuth-derived models.
const oauth = byVendor.get("anthropic-oauth");
expect(oauth).toBeDefined();
expect(oauth!.models.map((m) => m.id).sort()).toEqual(["opus", "sonnet"]);
// platform bucket holds the platform-namespaced model.
const platform = byVendor.get("platform");
expect(platform).toBeDefined();
expect(platform!.models.map((m) => m.id)).toEqual(["anthropic/claude-opus-4-7"]);
});
it("labels providers from the registry display_name, not VENDOR_LABELS", () => {
const catalog = buildProviderCatalogFromRegistry(
CLAUDE_CODE_REGISTRY_PROVIDERS,
CLAUDE_CODE_REGISTRY_MODELS,
);
const oauth = catalog.find((p) => p.vendor === "anthropic-oauth");
// Registry display_name "Claude Code subscription". Decoration with the
// model count by the catalog builder is acceptable; assert only that the
// label carries the registry value, not an inferred one.
expect(oauth!.label).toContain("Claude Code subscription");
});
it("carries the registry billing_mode per provider", () => {
const catalog = buildProviderCatalogFromRegistry(
CLAUDE_CODE_REGISTRY_PROVIDERS,
CLAUDE_CODE_REGISTRY_MODELS,
);
expect(catalog.find((p) => p.vendor === "anthropic-oauth")!.billingMode).toBe("byok");
expect(catalog.find((p) => p.vendor === "platform")!.billingMode).toBe("platform_managed");
});
it("surfaces the registry auth_env on the provider entry", () => {
const catalog = buildProviderCatalogFromRegistry(
CLAUDE_CODE_REGISTRY_PROVIDERS,
CLAUDE_CODE_REGISTRY_MODELS,
);
expect(catalog.find((p) => p.vendor === "anthropic-oauth")!.envVars).toEqual([
"CLAUDE_CODE_OAUTH_TOKEN",
]);
});
it("only includes providers that actually have at least one served model", () => {
// anthropic-api is a registry provider but has no model in this slice →
// it should not appear as an empty bucket.
const models: RegistryModel[] = [
{ id: "sonnet", provider: "anthropic-oauth", billing_mode: "byok" },
];
const catalog = buildProviderCatalogFromRegistry(
CLAUDE_CODE_REGISTRY_PROVIDERS,
models,
);
expect(catalog.map((p) => p.vendor)).toEqual(["anthropic-oauth"]);
});
});
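The doc comment on buildProviderCatalogFromRegistry describes two edge cases the tests above do not exercise: a model whose `provider` has no matching registry_providers entry still gets a bucket keyed by the raw name, and a model with no `provider` is skipped. The sketch below illustrates both under stated assumptions: `SketchProvider`, `SketchModel`, `SketchEntry`, and `sketchCatalog` are hypothetical, reduced stand-ins for the real types and builder (bucketing only, no wildcard or label decoration), and `unlisted-vendor`, `mystery-model`, and `orphan-model` are made-up values.

```typescript
// Hypothetical, trimmed stand-ins for RegistryProvider / RegistryModel /
// ProviderEntry. Only the fields this sketch needs are kept.
type SketchBilling = "platform_managed" | "byok";
interface SketchProvider { name: string; display_name?: string; billing_mode?: SketchBilling; }
interface SketchModel { id: string; provider?: string; billing_mode?: SketchBilling; }
interface SketchEntry { vendor: string; label: string; billingMode?: SketchBilling; modelIds: string[]; }

// Reduced re-implementation of the bucketing step only (an assumption-laden
// illustration, not the canvas implementation).
function sketchCatalog(providers: SketchProvider[], models: SketchModel[]): SketchEntry[] {
  const byName = new Map<string, SketchProvider>();
  for (const p of providers) byName.set(p.name, p);

  const buckets = new Map<string, SketchEntry>();
  for (const m of models) {
    const vendor = (m.provider ?? "").trim();
    if (!vendor) continue; // un-annotated model: nothing to bucket under
    const meta = byName.get(vendor);
    let entry = buckets.get(vendor);
    if (!entry) {
      entry = {
        vendor,
        label: meta?.display_name || vendor, // raw name when the registry has no entry
        billingMode: meta?.billing_mode ?? m.billing_mode, // model-level value as fallback
        modelIds: [],
      };
      buckets.set(vendor, entry);
    }
    entry.modelIds.push(m.id);
  }
  return Array.from(buckets.values());
}

const sketchResult = sketchCatalog(
  [{ name: "anthropic-api", display_name: "Anthropic API", billing_mode: "byok" }],
  [
    { id: "claude-opus-4-7", provider: "anthropic-api" },
    { id: "mystery-model", provider: "unlisted-vendor", billing_mode: "byok" },
    { id: "orphan-model" }, // no provider annotation
  ],
);
```

In this sketch the unmatched model lands in an `unlisted-vendor` bucket labelled with the raw name and takes its billing mode from the model itself, while `orphan-model` appears in no bucket.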
+172 -95
@@ -11,8 +11,12 @@ import { ExternalConnectionSection } from "./ExternalConnectionSection";
import {
ProviderModelSelector,
buildProviderCatalog,
buildProviderCatalogFromRegistry,
findProviderForModel,
type SelectorValue,
type ProviderEntry,
type RegistryProvider,
type RegistryModel,
} from "../ProviderModelSelector";
import { isExternalLikeRuntime } from "@/lib/externalRuntimes";
@@ -258,6 +262,17 @@ interface RuntimeOption {
// canvas falls back to deriving unique vendor prefixes from
// models[].id (still adapter-driven, just inferred).
providers: string[];
// registryBacked / registryProviders / registryModels come from the
// registry-served GET /templates fields (internal#718 P3). When
// registryBacked is true, the selectable provider+model list is built from
// the registry (registryProviders/registryModels) — display labels +
// billing mode + derived provider come from the provider-registry SSOT, not
// the canvas VENDOR_LABELS / billingModeForProvider vocabularies. When
// false (non-registry runtime / older backend), the canvas falls back to
// the template-served models[] + its inferVendor heuristic.
registryBacked: boolean;
registryProviders: RegistryProvider[];
registryModels: RegistryModel[];
}
// deriveProvidersFromModels — when a template doesn't ship an explicit
@@ -322,6 +337,32 @@ export function billingModeForProvider(provider: string): LLMBillingMode {
return "byok";
}
// billingModeForSelectedProvider — internal#718 P3 (retire-list #5): the
// billing mode the Config tab shows/sends for the selected PROVIDER, sourced
// from the registry-served catalog when available rather than the hardcoded
// billingModeForProvider rule.
//
// When the runtime is registry-backed, GET /templates serves each provider's
// DERIVED billing_mode (platform_managed for the closed platform provider,
// byok otherwise) on the ProviderEntry. We read it off the catalog so the UI
// reflects the registry SSOT, the same derived provider that billing and
// credential emission key off.
//
// Falls back to billingModeForProvider when: no catalog (non-registry runtime
// / older backend), or the provider string isn't carried by the catalog
// (e.g. a stale saved value). The fallback keeps the legacy behavior intact
// for everything the registry doesn't yet speak to.
export function billingModeForSelectedProvider(
provider: string,
catalog?: ProviderEntry[],
): LLMBillingMode {
if (catalog && catalog.length > 0) {
const entry = catalog.find((p) => p.vendor === provider.trim());
if (entry?.billingMode) return entry.billingMode;
}
return billingModeForProvider(provider);
}
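The lookup-then-fallback order described above can be sketched in isolation. This is a minimal illustration, assuming a simplified legacy rule: `legacyBillingMode` is a hypothetical stand-in for billingModeForProvider (whose full body is not shown in this diff), and `custom-proxy` and `retired-vendor` are made-up provider names.

```typescript
type Billing = "platform_managed" | "byok";
interface BillingCatalogEntry { vendor: string; billingMode?: Billing; }

// ASSUMPTION: stand-in for the legacy hardcoded rule. The real
// billingModeForProvider may differ; only its "byok" default is visible above.
function legacyBillingMode(provider: string): Billing {
  return provider.trim() === "platform" ? "platform_managed" : "byok";
}

// Same shape as billingModeForSelectedProvider: the catalog entry wins when it
// carries a billing mode, otherwise the legacy rule decides.
function resolveBillingMode(provider: string, catalog?: BillingCatalogEntry[]): Billing {
  if (catalog && catalog.length > 0) {
    const entry = catalog.find((p) => p.vendor === provider.trim());
    if (entry?.billingMode) return entry.billingMode;
  }
  return legacyBillingMode(provider);
}

const billingCatalog: BillingCatalogEntry[] = [
  { vendor: "custom-proxy", billingMode: "platform_managed" }, // hypothetical provider
  { vendor: "anthropic-oauth", billingMode: "byok" },
];

// The registry value overrides what the stand-in legacy rule would say ("byok").
const billingFromRegistry = resolveBillingMode(" custom-proxy ", billingCatalog);
// A stale saved value is not in the catalog, so the legacy rule applies.
const billingForStale = resolveBillingMode("retired-vendor", billingCatalog);
// No catalog at all (non-registry runtime or older backend): legacy rule.
const billingNoCatalog = resolveBillingMode("platform");
```

Note that the lookup trims the selected provider string but compares against `vendor` as stored, so a catalog entry whose `vendor` carried stray whitespace would never match.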
// Fallback used when /templates can't be fetched (offline, older backend).
// Keep in sync with manifest.json workspace_templates as a defensive default.
// Model + env suggestions only flow when the backend is reachable.
@@ -339,10 +380,10 @@ const RUNTIMES_WITH_OWN_CONFIG = new Set<string>(["external", "kimi", "kimi-cli"
const SUPPORTED_RUNTIME_VALUES = new Set(["claude-code", "codex", "openclaw", "hermes"]);
const FALLBACK_RUNTIME_OPTIONS: RuntimeOption[] = [
{ value: "claude-code", label: "Claude Code", models: [], providers: [] },
{ value: "codex", label: "Codex", models: [], providers: [] },
{ value: "openclaw", label: "OpenClaw", models: [], providers: [] },
{ value: "hermes", label: "Hermes", models: [], providers: [] },
{ value: "claude-code", label: "Claude Code", models: [], providers: [], registryBacked: false, registryProviders: [], registryModels: [] },
{ value: "codex", label: "Codex", models: [], providers: [], registryBacked: false, registryProviders: [], registryModels: [] },
{ value: "openclaw", label: "OpenClaw", models: [], providers: [], registryBacked: false, registryProviders: [], registryModels: [] },
{ value: "hermes", label: "Hermes", models: [], providers: [], registryBacked: false, registryProviders: [], registryModels: [] },
];
export function ConfigTab({ workspaceId }: Props) {
@@ -355,15 +396,24 @@ export function ConfigTab({ workspaceId }: Props) {
const [rawMode, setRawMode] = useState(false);
const [rawDraft, setRawDraft] = useState("");
const [runtimeOptions, setRuntimeOptions] = useState<RuntimeOption[]>(FALLBACK_RUNTIME_OPTIONS);
// Provider override (Option B PR-5): stored separately from config.yaml
// because the value lives in workspace_secrets (encrypted), not in the
// platform-managed config.yaml. The two endpoints are GET/PUT
// /workspaces/:id/provider on workspace-server (handlers/secrets.go).
// Empty = "auto-derive from model slug prefix" — pre-Option-B behavior
// and what most users want. Setting to a non-empty value writes
// LLM_PROVIDER into workspace_secrets and triggers an auto-restart so
// the workspace boots with the new provider in env (and via CP user-
// data, written into /configs/config.yaml on next provision too).
// internal#718 P4 closure: the explicit provider override
// (LLM_PROVIDER workspace_secret, surfaced via GET/PUT
// /workspaces/:id/provider) has been RETIRED. The provider is
// derived at every decision point from (runtime, model) via the
// registry — no stored row remains. The `provider` / `originalProvider`
// state and the provider dropdown survive in this component for
// backwards-compat (display only) but are no longer persisted:
// - loadConfig no longer GETs /workspaces/:id/provider (the
// endpoint returns 410 Gone). The state initializes to ""
// and stays there.
// - handleSave no longer PUTs /workspaces/:id/provider.
// - The dropdown still updates the local `provider` state so the
// user can preview the derived value; the value never leaves
// the browser.
// This is the canvas-side complement to the backend retirement of
// SetProvider/GetProvider/setProviderSecret. Older canvases that
// still call PUT /provider hit the 410 Gone with a structured
// PROVIDER_ENDPOINT_RETIRED code — loud failure, no silent miss.
const [provider, setProvider] = useState("");
const [originalProvider, setOriginalProvider] = useState("");
// Track the model the form first rendered, so handleSave can detect
@@ -414,26 +464,23 @@ export function ConfigTab({ workspaceId }: Props) {
//
// See GH #1894 for the workspace-row-as-source-of-truth rationale
// that motivated splitting from a single config.yaml read.
const [wsRes, modelRes, providerRes] = await Promise.all([
// internal#718 P4 closure: the GET /workspaces/:id/provider leg is
// RETIRED — the endpoint returns 410 Gone. Provider is now derived
// from (runtime, model) via the registry; no stored value exists
// to load. Always seed the local state to "" so the dropdown
// initializes to "auto-derive".
const [wsRes, modelRes] = await Promise.all([
api.get<{ runtime?: string; tier?: number }>(`/workspaces/${workspaceId}`)
.catch(() => ({} as { runtime?: string; tier?: number })),
api.get<{ model?: string }>(`/workspaces/${workspaceId}/model`)
.catch(() => ({} as { model?: string })),
api.get<{ provider?: string }>(`/workspaces/${workspaceId}/provider`)
.catch(() => null),
]);
const wsMetadataRuntime = (wsRes.runtime || "").trim();
const wsMetadataModel = (modelRes.model || "").trim();
const wsMetadataTier: number | null =
typeof wsRes.tier === "number" ? wsRes.tier : null;
if (providerRes !== null) {
const loadedProvider = (providerRes.provider || "").trim();
setProvider(loadedProvider);
setOriginalProvider(loadedProvider);
} else {
setProvider("");
setOriginalProvider("");
}
setProvider("");
setOriginalProvider("");
// originalModel is set further down once the YAML has been parsed —
// we want it to reflect what the form ACTUALLY rendered, which may
// be the YAML's runtime_config.model fallback when MODEL_PROVIDER
@@ -527,7 +574,18 @@ export function ConfigTab({ workspaceId }: Props) {
useEffect(() => {
let cancelled = false;
api.get<Array<{ id: string; name?: string; runtime?: string; models?: ModelSpec[]; providers?: string[] }>>("/templates")
api.get<Array<{
id: string;
name?: string;
runtime?: string;
models?: ModelSpec[];
providers?: string[];
// internal#718 P3 registry-served fields (additive; absent on older
// backends and for non-registry runtimes).
registry_backed?: boolean;
registry_providers?: RegistryProvider[];
registry_models?: RegistryModel[];
}>>("/templates")
.then((rows) => {
if (cancelled || !Array.isArray(rows)) return;
const byRuntime = new Map<string, RuntimeOption>();
@@ -539,8 +597,23 @@ export function ConfigTab({ workspaceId }: Props) {
const existing = byRuntime.get(v);
const models = Array.isArray(r.models) ? r.models : [];
const providers = Array.isArray(r.providers) ? r.providers : [];
if (!existing || models.length > existing.models.length) {
byRuntime.set(v, { value: v, label: r.name || v, models, providers });
const registryProviders = Array.isArray(r.registry_providers) ? r.registry_providers : [];
const registryModels = Array.isArray(r.registry_models) ? r.registry_models : [];
const registryBacked = r.registry_backed === true && registryModels.length > 0;
// Prefer the richer payload: a registry-backed entry, then more
// template models. Keeps the "last/richer template wins" intent.
const score = (o: RuntimeOption) => (o.registryBacked ? 1000 : 0) + o.models.length;
const candidate: RuntimeOption = {
value: v,
label: r.name || v,
models,
providers,
registryBacked,
registryProviders,
registryModels,
};
if (!existing || score(candidate) > score(existing)) {
byRuntime.set(v, candidate);
}
}
if (byRuntime.size > 0) setRuntimeOptions(Array.from(byRuntime.values()));
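The "prefer the richer payload" merge in this hunk can be read as a per-runtime maximum over a score. The sketch below isolates that step under stated assumptions: `RuntimeCandidate`, `richness`, and `pickRichest` are hypothetical names, `modelCount` stands in for `models.length`, and `templateId` exists only to tell the rows apart.

```typescript
interface RuntimeCandidate {
  templateId: string;     // illustration-only identifier
  value: string;          // runtime name, e.g. "claude-code"
  modelCount: number;     // stands in for models.length
  registryBacked: boolean;
}

// Same scoring shape as the diff: registry-backed rows get a large offset,
// then the template model count breaks the remaining comparisons.
const richness = (o: RuntimeCandidate): number =>
  (o.registryBacked ? 1000 : 0) + o.modelCount;

function pickRichest(rows: RuntimeCandidate[]): Map<string, RuntimeCandidate> {
  const byRuntime = new Map<string, RuntimeCandidate>();
  for (const row of rows) {
    const existing = byRuntime.get(row.value);
    // Strict ">" means an equal-scoring later row does not replace the earlier one.
    if (!existing || richness(row) > richness(existing)) byRuntime.set(row.value, row);
  }
  return byRuntime;
}

const picked = pickRichest([
  { templateId: "a", value: "claude-code", modelCount: 12, registryBacked: false },
  { templateId: "b", value: "claude-code", modelCount: 3, registryBacked: true },
  { templateId: "c", value: "hermes", modelCount: 2, registryBacked: false },
  { templateId: "d", value: "hermes", modelCount: 2, registryBacked: false },
]);
```

Here the registry-backed row `b` wins over `a` despite listing fewer models, and the `hermes` tie keeps the earlier row `c`. The 1000 offset only dominates while template model lists stay well below 1000 entries.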
@@ -551,7 +624,13 @@ export function ConfigTab({ workspaceId }: Props) {
// Models + env hints for the currently-selected runtime.
const selectedRuntime = runtimeOptions.find((o) => o.value === (config.runtime || "")) ?? null;
const availableModels: ModelSpec[] = selectedRuntime?.models ?? [];
// Memoised so its identity is stable across renders — it feeds several
// useMemo dependency arrays below (registry/legacy catalog, selector models)
// and a fresh `[]` literal each render would defeat their memoisation.
const availableModels: ModelSpec[] = useMemo(
() => selectedRuntime?.models ?? [],
[selectedRuntime?.models],
);
// Provider suggestions for the legacy free-text input fallback (used
// when /templates returned no models for this runtime, e.g. hermes
// workspaces). Prefer the runtime's declarative providers list,
@@ -565,9 +644,37 @@ export function ConfigTab({ workspaceId }: Props) {
// Vendor-aware catalog shared with the selector. Memoised so the
// catalog identity is stable across renders (selector relies on it).
//
// internal#718 P3: when the runtime is registry-backed, build the catalog
// FROM the registry-served providers/models (display labels + billing +
// derived provider from the provider-registry SSOT) instead of re-inferring
// vendor from model-id prefixes. Falls back to the inferVendor heuristic
// for non-registry runtimes / older backends.
const registryBacked = selectedRuntime?.registryBacked ?? false;
const providerCatalog = useMemo(
() => buildProviderCatalog(availableModels),
[availableModels],
() =>
registryBacked
? buildProviderCatalogFromRegistry(
selectedRuntime?.registryProviders ?? [],
selectedRuntime?.registryModels ?? [],
)
: buildProviderCatalog(availableModels),
[registryBacked, selectedRuntime?.registryProviders, selectedRuntime?.registryModels, availableModels],
);
// Models fed to the selector dropdown: the registry-served native set for a
// registry-backed runtime (so the dropdown can render no unregistered
// option), else the template-served models.
const selectorModels: ModelSpec[] = useMemo(
() =>
registryBacked
? (selectedRuntime?.registryModels ?? []).map((m) => ({
id: m.id,
name: m.name,
// carry the derived provider so the selector buckets correctly
...(m.provider ? { provider: m.provider } : {}),
}))
: availableModels,
[registryBacked, selectedRuntime?.registryModels, availableModels],
);
// Derive the selector's current value from the form state. Provider
@@ -718,53 +825,27 @@ export function ConfigTab({ workspaceId }: Props) {
}
}
// Provider override save (Option B PR-5). PUT only when the user
// changed the dropdown — otherwise an unrelated Save (e.g. tier
// edit) would re-write the provider unchanged and the server-
// side auto-restart would fire on every Save, costing the user a
// ~30s reboot for a no-op change. Server endpoint accepts an
// empty string to clear the override (deletes the
// workspace_secrets row); we forward whatever the form holds.
let providerSaveError: string | null = null;
const providerChanged = provider !== originalProvider;
if (providerChanged) {
try {
await api.put(`/workspaces/${workspaceId}/provider`, { provider });
setOriginalProvider(provider);
} catch (e) {
providerSaveError = e instanceof Error ? e.message : "Provider update was rejected";
}
}
// internal#718 P4 closure: provider override save is RETIRED. The
// /workspaces/:id/provider endpoint returns 410 Gone; the provider
// is derived from (runtime, model) at every decision point via the
// registry. The local dropdown state still updates so the user can
// see the predicted provider, but it never round-trips to the
// server. Variables retained as locals (set to constants) so the
// downstream restart-suppress logic below has clear semantics
// and the diff against the prior shape stays small.
const providerSaveError: string | null = null;
const providerChanged = false;
// Provider → billing_mode linkage (internal#703 Gap 2). When the
// provider actually changed AND its implied billing_mode differs
// from the previously-selected provider's, push the new mode to
// the per-tenant llm-billing-mode endpoint (same path the LLM
// Billing section uses). Without this, selecting a non-Platform
// provider leaves billing_mode=platform_managed → CP keeps
// injecting the platform proxy → BYOK never takes.
//
// Gated on (a) the provider PUT having succeeded — no point setting
// byok if the credential write failed — and (b) the mode actually
// changing, so an unrelated provider tweak between two BYOK vendors
// (e.g. minimax → openrouter) doesn't re-issue a redundant
// platform_managed→byok PUT and trigger a needless restart.
let billingModeSaveError: string | null = null;
if (providerChanged && !providerSaveError) {
const nextMode = billingModeForProvider(provider);
const prevMode = billingModeForProvider(originalProvider);
if (nextMode !== prevMode) {
try {
await api.put(
`/admin/workspaces/${workspaceId}/llm-billing-mode`,
{ mode: nextMode },
);
} catch (e) {
billingModeSaveError =
e instanceof Error ? e.message : "Billing mode update was rejected";
}
}
}
// internal#718 P4 closure: provider → billing_mode linkage is also
// RETIRED. P2-B (#1972) moved the billing decision to
// ResolveLLMBillingModeDerived, which DERIVES the provider from
// (runtime, model) at every read. The canvas can no longer
// override it via a separate PUT, by design — the runtime+model
// selection IS the billing-mode selection. The
// /admin/workspaces/:id/llm-billing-mode endpoint still exists
// as the operator override surface (workspaces.llm_billing_mode
// column); it is no longer driven by the provider dropdown.
const billingModeSaveError: string | null = null;
setOriginalYaml(content);
if (rawMode) {
@@ -773,27 +854,22 @@ export function ConfigTab({ workspaceId }: Props) {
} else {
setRawDraft(content);
}
// SetProvider on the server already triggers an auto-restart for
// the workspace whenever the value actually changed (see
// workspace-server/internal/handlers/secrets.go:SetProvider). If
// the user also clicked Save+Restart we'd kick off a SECOND
// restart here and the two would race in the canvas store —
// suppress the redundant call and rely on the server-side one.
const providerWillAutoRestart = providerChanged && !providerSaveError;
// internal#718 P4 closure: providerWillAutoRestart is always
// false now (provider PUT is retired; no server-side auto-restart
// can fire). Save+Restart flows through the canvas store
// restart path the same way it did pre-#718 for non-provider
// edits.
const providerWillAutoRestart = providerChanged && !providerSaveError;
if (restart && !providerWillAutoRestart) {
await useCanvasStore.getState().restartWorkspace(workspaceId);
} else if (!restart) {
useCanvasStore.getState().updateNodeData(workspaceId, { needsRestart: !providerWillAutoRestart });
}
// Aggregate partial-save errors. modelSaveError, providerSaveError,
// and billingModeSaveError describe rejected updates from
// independent endpoints — show whichever fired so the user knows
// which field reverts on next reload (otherwise they'd see "Saved"
// and be confused why Provider snapped back). The billing-mode case
// is the most important to surface: the provider credential saved
// but BYOK won't actually take until billing_mode flips, so a
// silent failure here is exactly the #703 "selecting a provider has
// no effect" symptom.
// Aggregate partial-save errors. With provider+billing-mode PUTs
// retired, only modelSaveError can fire from the secret-mint side
// — the provider/billing branches are dead code retained as
// constant nulls to keep the diff small. They are surfaced
// defensively in case a future re-enablement needs the wiring.
const partialError = providerSaveError
? `Other fields saved, but provider update failed: ${providerSaveError}`
: billingModeSaveError
@@ -918,9 +994,10 @@ export function ConfigTab({ workspaceId }: Props) {
— empty = "auto-derive from model slug" was the pre-PR-5
behavior; selecting any provider here writes LLM_PROVIDER
and triggers an auto-restart. */}
{availableModels.length > 0 ? (
{selectorModels.length > 0 ? (
<ProviderModelSelector
models={availableModels}
models={selectorModels}
catalog={registryBacked ? providerCatalog : undefined}
value={selectorValue}
onChange={(next) => {
setSelectorValue(next);
@@ -933,7 +1010,7 @@ export function ConfigTab({ workspaceId }: Props) {
setConfig((prev) => {
const v = next.model;
const prevModelId = prev.runtime_config?.model || prev.model || "";
const prevSpec = availableModels.find((m) => m.id === prevModelId) ?? null;
const prevSpec = selectorModels.find((m) => m.id === prevModelId) ?? null;
const prevRequired = prev.runtime_config?.required_env ?? [];
const wasTemplateDriven =
prevRequired.length === 0 ||
@@ -1,255 +1,35 @@
// @vitest-environment jsdom
//
// Tests for the provider → llm_billing_mode linkage (internal#703 Gap 2).
// internal#718 P4 closure — ConfigTab.billingMode.test.tsx is retired.
//
// What this pins: when the operator changes the PROVIDER in the Config
// tab, the workspace's llm_billing_mode must follow — a non-Platform
// provider sets billing_mode=byok; Platform sets platform_managed. Before
// this wiring, selecting "Claude Code subscription (OAuth)" or any vendor
// key wrote the credential env but left billing_mode=platform_managed, so
// CP kept injecting the platform proxy base URL and the OAuth token /
// vendor key was never used — BYOK silently no-op'd (the live jrs-auto
// SEO-Agent symptom in #703).
// This suite (255 lines, 8 tests) pinned the canvas-side provider →
// llm_billing_mode linkage from internal#703 Gap 2: when the operator
// changed the PROVIDER in the Config tab, ConfigTab.handleSave would
// PUT /admin/workspaces/:id/llm-billing-mode so the platform-vs-byok
// decision tracked the dropdown.
//
// The billing-mode PUT targets the same per-tenant endpoint the LLM
// Billing section uses: PUT /admin/workspaces/:id/llm-billing-mode with
// body {mode: "byok" | "platform_managed"}.
// That linkage is retired together with the LLM_PROVIDER override flow
// (see ConfigTab.provider.test.tsx retirement note). P2-B (#1972)
// moved the platform-vs-byok decision to
// `ResolveLLMBillingModeDerived(runtime, model, authEnv)` in
// workspace-server — the canvas can no longer override it via the
// provider dropdown, by design. The runtime+model selection IS the
// billing-mode selection now.
//
// The `/admin/workspaces/:id/llm-billing-mode` endpoint still exists
// as the operator override surface (`workspaces.llm_billing_mode`
// column); it is no longer driven by the provider dropdown.
// Coverage for the derived billing flow lives in
// workspace-server/internal/handlers/llm_billing_mode_derived_test.go.
//
// Restore from git history if the canvas-side provider→billing linkage
// needs to be revisited (it should not — the derived resolver is the
// single decision point).
import { describe, it, expect, vi, afterEach, beforeEach } from "vitest";
import { render, screen, cleanup, waitFor, fireEvent } from "@testing-library/react";
import React from "react";
import { describe, it } from "vitest";
afterEach(cleanup);
const apiGet = vi.fn();
const apiPatch = vi.fn();
const apiPut = vi.fn();
vi.mock("@/lib/api", () => ({
api: {
get: (path: string) => apiGet(path),
patch: (path: string, body: unknown) => apiPatch(path, body),
put: (path: string, body: unknown) => apiPut(path, body),
post: vi.fn(),
del: vi.fn(),
},
}));
const storeUpdateNodeData = vi.fn();
const storeRestartWorkspace = vi.fn();
vi.mock("@/store/canvas", () => ({
useCanvasStore: Object.assign(
(selector: (s: unknown) => unknown) =>
selector({ restartWorkspace: storeRestartWorkspace, updateNodeData: storeUpdateNodeData }),
{
getState: () => ({
restartWorkspace: storeRestartWorkspace,
updateNodeData: storeUpdateNodeData,
}),
},
),
}));
vi.mock("../AgentCardSection", () => ({
AgentCardSection: () => <div data-testid="agent-card-stub" />,
}));
import { ConfigTab, billingModeForProvider } from "../ConfigTab";
function wireApi(opts: { providerValue?: string | "missing" }) {
apiGet.mockImplementation((path: string) => {
if (path === `/workspaces/ws-test`) {
return Promise.resolve({ runtime: "hermes" });
}
if (path === `/workspaces/ws-test/model`) {
return Promise.resolve({ model: "nousresearch/hermes-4-70b" });
}
if (path === `/workspaces/ws-test/provider`) {
if (opts.providerValue === "missing") return Promise.reject(new Error("404"));
return Promise.resolve({
provider: opts.providerValue ?? "",
source: opts.providerValue ? "workspace_secrets" : "default",
});
}
if (path === `/workspaces/ws-test/files/config.yaml`) {
return Promise.resolve({ content: "name: ws\nruntime: hermes\n" });
}
if (path === "/templates") return Promise.resolve([]);
return Promise.reject(new Error(`unmocked api.get: ${path}`));
});
}
function billingModeCalls() {
return apiPut.mock.calls.filter(
([path]) => path === "/admin/workspaces/ws-test/llm-billing-mode",
);
}
beforeEach(() => {
apiGet.mockReset();
apiPatch.mockReset();
apiPut.mockReset();
storeUpdateNodeData.mockReset();
storeRestartWorkspace.mockReset();
});
describe("billingModeForProvider — pure mapping (internal#703 Gap 2)", () => {
// Platform / empty → platform_managed. Empty means "no explicit
// override → inherit", which resolves to platform on the backend, so
// it must NOT flip the workspace into byok.
it("maps Platform and empty to platform_managed", () => {
expect(billingModeForProvider("platform")).toBe("platform_managed");
expect(billingModeForProvider("")).toBe("platform_managed");
expect(billingModeForProvider(" ")).toBe("platform_managed");
expect(billingModeForProvider("PLATFORM")).toBe("platform_managed");
});
// Every non-Platform provider → byok. If this regresses to returning
// platform_managed for a vendor, BYOK silently no-ops again (#703).
it("maps non-Platform providers to byok", () => {
expect(billingModeForProvider("anthropic-oauth")).toBe("byok"); // Claude Code subscription
expect(billingModeForProvider("anthropic")).toBe("byok"); // Anthropic API key
expect(billingModeForProvider("minimax")).toBe("byok");
expect(billingModeForProvider("openrouter")).toBe("byok");
expect(billingModeForProvider("openai")).toBe("byok");
});
});
describe("ConfigTab — provider change drives billing_mode (internal#703 Gap 2)", () => {
// The core fix: picking a non-Platform provider (here "anthropic-oauth"
// = Claude Code subscription OAuth) from a fresh/empty provider must
// PUT mode=byok to the per-tenant llm-billing-mode endpoint. This is
// the exact path that was missing — the credential env saved but the
// billing mode never followed, so the proxy stayed engaged.
it("PUTs mode=byok when switching to a non-Platform provider", async () => {
wireApi({ providerValue: "" });
apiPut.mockResolvedValue({ status: "saved" });
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
fireEvent.change(input, { target: { value: "anthropic-oauth" } });
fireEvent.click(screen.getByRole("button", { name: /^save$/i }));
await waitFor(() => {
const calls = billingModeCalls();
expect(calls.length).toBe(1);
expect(calls[0][1]).toEqual({ mode: "byok" });
});
// Provider credential PUT still happens too (independent endpoint).
expect(
apiPut.mock.calls.some(([path]) => path === "/workspaces/ws-test/provider"),
).toBe(true);
});
// Switching FROM a byok provider back TO Platform must PUT
// mode=platform_managed so the workspace re-engages the proxy and stops
// expecting a (now-absent) vendor key.
it("PUTs mode=platform_managed when switching back to Platform", async () => {
wireApi({ providerValue: "anthropic-oauth" });
apiPut.mockResolvedValue({ status: "saved" });
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
await waitFor(() => expect((input as HTMLInputElement).value).toBe("anthropic-oauth"));
fireEvent.change(input, { target: { value: "platform" } });
fireEvent.click(screen.getByRole("button", { name: /^save$/i }));
await waitFor(() => {
const calls = billingModeCalls();
expect(calls.length).toBe(1);
expect(calls[0][1]).toEqual({ mode: "platform_managed" });
});
});
// Changing between two BYOK vendors (minimax → openrouter) keeps
// billing_mode=byok — the implied mode is unchanged, so re-PUTing it
// would be a wasteful no-op that risks an extra restart. Must NOT fire.
it("does NOT PUT billing-mode when the implied mode is unchanged", async () => {
wireApi({ providerValue: "minimax" });
apiPut.mockResolvedValue({ status: "saved" });
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
await waitFor(() => expect((input as HTMLInputElement).value).toBe("minimax"));
fireEvent.change(input, { target: { value: "openrouter" } });
fireEvent.click(screen.getByRole("button", { name: /^save$/i }));
await waitFor(() => {
// Provider PUT fires (vendor changed)...
expect(
apiPut.mock.calls.some(([path]) => path === "/workspaces/ws-test/provider"),
).toBe(true);
});
// ...but billing-mode does NOT (byok → byok is a no-op).
expect(billingModeCalls().length).toBe(0);
});
// A Save that doesn't touch the provider must not PUT billing-mode —
// editing tier/name shouldn't disturb the workspace's billing mode.
it("does NOT PUT billing-mode on a Save that leaves provider unchanged", async () => {
wireApi({ providerValue: "anthropic-oauth" });
apiPut.mockResolvedValue({ status: "saved" });
render(<ConfigTab workspaceId="ws-test" />);
await screen.findByTestId("provider-input");
// Dirty an unrelated field so Save is enabled.
const tierSelect = screen.getByLabelText(/tier/i) as HTMLSelectElement;
fireEvent.change(tierSelect, { target: { value: "3" } });
fireEvent.click(screen.getByRole("button", { name: /^save$/i }));
await waitFor(() => {
// Some PUT may fire (e.g. /model); just assert billing-mode did not.
expect(billingModeCalls().length).toBe(0);
});
});
// If the provider credential PUT itself fails, we must NOT set byok —
// flipping billing_mode while the credential write failed would leave
// the workspace expecting a key it doesn't have (worse than no-op).
it("does NOT PUT billing-mode when the provider PUT fails", async () => {
wireApi({ providerValue: "" });
apiPut.mockImplementation((path: string) => {
if (path === "/workspaces/ws-test/provider") return Promise.reject(new Error("boom"));
return Promise.resolve({ status: "saved" });
});
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
fireEvent.change(input, { target: { value: "anthropic-oauth" } });
fireEvent.click(screen.getByRole("button", { name: /^save$/i }));
await waitFor(() => {
// The provider-failure error is surfaced (getByText throws if absent).
expect(screen.getByText(/provider update failed/i)).toBeTruthy();
});
expect(billingModeCalls().length).toBe(0);
});
// If the credential saved but the billing-mode PUT is rejected, the
// user must be warned that BYOK may not take — a silent failure here
// is precisely the #703 symptom we're fixing.
it("surfaces an error when billing-mode PUT fails after a successful provider save", async () => {
wireApi({ providerValue: "" });
apiPut.mockImplementation((path: string) => {
if (path === "/admin/workspaces/ws-test/llm-billing-mode") {
return Promise.reject(new Error("403 forbidden"));
}
return Promise.resolve({ status: "saved" });
});
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
fireEvent.change(input, { target: { value: "anthropic-oauth" } });
fireEvent.click(screen.getByRole("button", { name: /^save$/i }));
await waitFor(() => {
expect(screen.getByText(/switching billing mode failed/i)).toBeTruthy();
});
describe("ConfigTab — provider → llm_billing_mode linkage (retired internal#718 P4)", () => {
it.skip("LLM_PROVIDER → billing_mode wiring is retired; see file header for the replacement coverage", () => {
// intentionally empty
});
});
@@ -1,574 +1,45 @@
// @vitest-environment jsdom
//
// Regression tests for ConfigTab Provider override (Option B PR-5).
// internal#718 P4 closure — ConfigTab.provider.test.tsx is retired.
//
// What this pins: a free-text Provider combobox in the Runtime section
// that lets the operator override the model→provider derivation hermes-
// agent does internally. Without this UI, a fresh signup whose Hermes
// workspace defaults to a model with no clean vendor prefix (e.g.
// `nousresearch/hermes-4-70b`) hits the runtime's own preflight error:
// "No LLM provider configured. Run `hermes model` to select a
// provider, or run `hermes setup` for first-time configuration."
// — even though tasks #195-198 wired the entire downstream pipe so a
// non-empty provider WOULD flow through canvas → workspace-server →
// CP user-data → workspace config.yaml → hermes adapter.
// This 574-line suite exercised the canvas-side LLM provider override
// flow: load the existing override from GET /workspaces/:id/provider,
// edit the dropdown, Save → PUT /workspaces/:id/provider, and the
// provider→billing_mode linkage on Save. All three server-side pieces
// behind those flows are retired in internal#718 P4 closure:
//
// Hongming Wang hit this on hongming.moleculesai.app at signup
// 2026-05-01T17:35Z. Backend PRs were green, the gap was the missing
// UI to set the value.
// - workspace-server SetProvider / GetProvider (PUT/GET
// /workspaces/:id/provider) → both return 410 Gone with a
// PROVIDER_ENDPOINT_RETIRED structured body.
// - workspace-server setProviderSecret (the writer into
// workspace_secrets.LLM_PROVIDER) — removed; row never written.
// - The LLM_PROVIDER workspace_secret itself — migrated away in
// 20260528000000_drop_llm_provider_workspace_secret.up.sql.
//
// Each test pins one invariant. If any fails, the bug is back.
// ConfigTab still renders the provider dropdown for display (the user
// can preview the derived provider locally), but Save no longer
// round-trips the value. The replacement contract is that the provider
// is DERIVED at every decision point from (runtime, model) via the
// registry — see internal/providers/derive_provider.go.
//
// The original suite's coverage is replaced by:
//
// - workspace-server: TestPutProvider_410Gone +
// TestGetProvider_410Gone + TestProviderEndpointGone_BodyShape in
// internal/handlers/llm_provider_removal_p4_test.go.
// - workspace-server: TestWorkspaceCreate_FirstDeploy_OnlyPersistsMODEL
// in internal/handlers/workspace_provision_shared_test.go.
// - registry: TestDeriveProvider_RealManifest in
// internal/providers/derive_provider_test.go.
//
// Restore from git history if any aspect of the legacy LLM_PROVIDER
// flow needs to be revisited (it should not — the retirement is
// permanent).
import { describe, it, expect, vi, afterEach, beforeEach } from "vitest";
import { render, screen, cleanup, waitFor, fireEvent } from "@testing-library/react";
import React from "react";
import { describe, it } from "vitest";
afterEach(cleanup);
const apiGet = vi.fn();
const apiPatch = vi.fn();
const apiPut = vi.fn();
vi.mock("@/lib/api", () => ({
api: {
get: (path: string) => apiGet(path),
patch: (path: string, body: unknown) => apiPatch(path, body),
put: (path: string, body: unknown) => apiPut(path, body),
post: vi.fn(),
del: vi.fn(),
},
}));
// Shared store stub — `updateNodeData` is exposed so a test can assert the
// node-data flush happens after a successful PATCH (regression: previously
// the DB updated but the canvas badge stayed stale until full hydrate).
const storeUpdateNodeData = vi.fn();
const storeRestartWorkspace = vi.fn();
vi.mock("@/store/canvas", () => ({
useCanvasStore: Object.assign(
(selector: (s: unknown) => unknown) => selector({ restartWorkspace: storeRestartWorkspace, updateNodeData: storeUpdateNodeData }),
{ getState: () => ({ restartWorkspace: storeRestartWorkspace, updateNodeData: storeUpdateNodeData }) },
),
}));
vi.mock("../AgentCardSection", () => ({
AgentCardSection: () => <div data-testid="agent-card-stub" />,
}));
import { ConfigTab } from "../ConfigTab";
// wireApi — same shape as ConfigTab.hermes.test.tsx, extended with the
// /provider endpoint. Each test sets `providerValue` to the value the
// GET endpoint returns; "missing" means the endpoint rejects (older
// workspace-server pre-PR-2 — must not crash the tab).
function wireApi(opts: {
workspaceRuntime?: string;
workspaceModel?: string;
configYamlContent?: string | null;
templates?: Array<{ id: string; name?: string; runtime?: string; models?: unknown[]; providers?: string[] }>;
providerValue?: string | "missing";
}) {
apiGet.mockImplementation((path: string) => {
if (path === `/workspaces/ws-test`) {
return Promise.resolve({ runtime: opts.workspaceRuntime ?? "" });
}
if (path === `/workspaces/ws-test/model`) {
return Promise.resolve({ model: opts.workspaceModel ?? "" });
}
if (path === `/workspaces/ws-test/provider`) {
if (opts.providerValue === "missing") {
return Promise.reject(new Error("404"));
}
return Promise.resolve({ provider: opts.providerValue ?? "", source: opts.providerValue ? "workspace_secrets" : "default" });
}
if (path === `/workspaces/ws-test/files/config.yaml`) {
if (opts.configYamlContent === null) return Promise.reject(new Error("not found"));
return Promise.resolve({ content: opts.configYamlContent ?? "" });
}
if (path === "/templates") {
return Promise.resolve(opts.templates ?? []);
}
return Promise.reject(new Error(`unmocked api.get: ${path}`));
});
}
beforeEach(() => {
apiGet.mockReset();
apiPatch.mockReset();
apiPut.mockReset();
storeUpdateNodeData.mockReset();
storeRestartWorkspace.mockReset();
});
describe("ConfigTab — Provider override (Option B PR-5)", () => {
// Empty provider on load is the legitimate default ("auto-derive
// from model slug prefix"), NOT an error. The endpoint returning
// {provider: "", source: "default"} is the documented happy-path
// shape — if the form treated that as "load failed" we'd lose the
// ability to render the input at all on fresh workspaces.
it("renders an empty Provider input when no override is set", async () => {
wireApi({
workspaceRuntime: "hermes",
workspaceModel: "nousresearch/hermes-4-70b",
configYamlContent: "name: ws\nruntime: hermes\n",
providerValue: "",
});
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
expect((input as HTMLInputElement).value).toBe("");
});
// Pre-existing override loads back into the field on mount. Without
// this, an operator who set provider=openrouter yesterday would see
// the field blank today, conclude the value didn't stick, and
// re-save — the resulting PUT-with-same-value would auto-restart
// the workspace for nothing.
it("loads an existing provider override from the server", async () => {
wireApi({
workspaceRuntime: "hermes",
workspaceModel: "nousresearch/hermes-4-70b",
configYamlContent: "name: ws\nruntime: hermes\n",
providerValue: "openrouter",
});
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
await waitFor(() => expect((input as HTMLInputElement).value).toBe("openrouter"));
});
// Old workspace-server (pre-PR-2) returns a 404 on /provider. The
// tab must keep loading — the fallback is "" (auto-derive), same as
// a fresh workspace.
it("falls back to empty provider when the endpoint is missing", async () => {
wireApi({
workspaceRuntime: "hermes",
workspaceModel: "nousresearch/hermes-4-70b",
configYamlContent: "name: ws\nruntime: hermes\n",
providerValue: "missing",
});
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
expect((input as HTMLInputElement).value).toBe("");
// Tab should be fully rendered, not stuck in loading or error state.
expect(screen.queryByText(/Loading config/i)).toBeNull();
});
// Setting a value + Save must PUT to the right endpoint with the
// right body shape. Server-side handler (workspace-server
// handlers/secrets.go:SetProvider) reads body.provider — any other
// key gets silently ignored and the workspace_secrets row stays
// unset. This regression would manifest as "Save → Restart →
// workspace still says No LLM provider configured."
it("PUTs the new provider to /workspaces/:id/provider on Save", async () => {
wireApi({
workspaceRuntime: "hermes",
workspaceModel: "nousresearch/hermes-4-70b",
configYamlContent: "name: ws\nruntime: hermes\n",
providerValue: "",
});
apiPut.mockResolvedValue({ status: "saved", provider: "anthropic" });
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
fireEvent.change(input, { target: { value: "anthropic" } });
expect((input as HTMLInputElement).value).toBe("anthropic");
const saveBtn = screen.getByRole("button", { name: /^save$/i });
fireEvent.click(saveBtn);
await waitFor(() => {
const providerCalls = apiPut.mock.calls.filter(([path]) => path === "/workspaces/ws-test/provider");
expect(providerCalls.length).toBe(1);
expect(providerCalls[0][1]).toEqual({ provider: "anthropic" });
});
});
// No-change Save must NOT PUT /provider. The server-side SetProvider
// auto-restarts the workspace on every successful PUT — re-writing
// an unchanged value would cost the user a ~30s reboot every time
// they tweak some other field.
it("does not PUT /provider when the value is unchanged", async () => {
wireApi({
workspaceRuntime: "hermes",
workspaceModel: "nousresearch/hermes-4-70b",
configYamlContent: "name: ws\nruntime: hermes\ntier: 2\n",
providerValue: "openrouter",
});
apiPut.mockResolvedValue({});
render(<ConfigTab workspaceId="ws-test" />);
await screen.findByTestId("provider-input");
// Click Save without touching the provider field. Trigger another
// dirty-marker (tier change) so Save is enabled — the test is
// about NOT touching /provider, not about Save being disabled.
const tierSelect = screen.getByLabelText(/tier/i) as HTMLSelectElement;
fireEvent.change(tierSelect, { target: { value: "3" } });
const saveBtn = screen.getByRole("button", { name: /^save$/i });
fireEvent.click(saveBtn);
await waitFor(() => {
// Some PUT(s) may fire (e.g. /model). Just assert /provider is NOT among them.
const providerCalls = apiPut.mock.calls.filter(([path]) => path === "/workspaces/ws-test/provider");
expect(providerCalls.length).toBe(0);
});
});
// The dropdown's suggestion list MUST come from the runtime's own
// template (via /templates → runtime_config.providers), not a
// hardcoded canvas-side enum. This is the "Native + pluggable
// runtime" invariant: a new runtime declaring its own provider
// taxonomy in its config.yaml gets a working dropdown without ANY
// canvas-side change.
//
// Pinned by checking that suggestions surfaced in the datalist
// exactly mirror what the templates endpoint returned for the
// matching runtime. If a future contributor reintroduces a
// PROVIDER_SUGGESTIONS-style hardcoded list and the datalist
// contents don't follow the template, this test fails.
it("populates the provider datalist from the matched runtime's templates entry", async () => {
wireApi({
workspaceRuntime: "hermes",
workspaceModel: "nousresearch/hermes-4-70b",
configYamlContent: "name: ws\nruntime: hermes\n",
providerValue: "",
templates: [
{
id: "hermes",
name: "Hermes",
runtime: "hermes",
models: [],
// The provider list every runtime adapter ships in its own
// config.yaml. Canvas must surface THIS, not its own list.
providers: ["nous", "openrouter", "anthropic", "minimax-cn"],
},
],
});
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
const listId = (input as HTMLInputElement).getAttribute("list");
expect(listId).toBeTruthy();
await waitFor(() => {
const datalist = document.getElementById(listId!);
expect(datalist).not.toBeNull();
const optionValues = Array.from(datalist!.querySelectorAll("option")).map(
(o) => (o as HTMLOptionElement).value,
);
// Order matters — most-common-first is part of the contract so
// the demo flow lands on a working choice without scrolling.
expect(optionValues).toEqual(["nous", "openrouter", "anthropic", "minimax-cn"]);
});
});
// Fallback path: when a template hasn't migrated to the explicit
// `providers:` field yet, suggestions are derived from model slug
// prefixes. Still adapter-driven (the slugs come from the template's
// `models:` list), just inferred. This keeps existing templates
// working while the platform team migrates them one at a time.
it("renders vendor-grouped provider dropdown when template ships models", async () => {
wireApi({
workspaceRuntime: "hermes",
workspaceModel: "anthropic/claude-opus-4-7",
configYamlContent: "name: ws\nruntime: hermes\n",
providerValue: "",
templates: [
{
id: "hermes",
name: "Hermes",
runtime: "hermes",
models: [
{ id: "anthropic/claude-opus-4-7", required_env: ["ANTHROPIC_API_KEY"] },
{ id: "openai/gpt-4o", required_env: ["OPENROUTER_API_KEY"] },
{ id: "anthropic/claude-sonnet-4-5", required_env: ["ANTHROPIC_API_KEY"] }, // dup vendor — must dedupe
{ id: "nousresearch/hermes-4-70b", required_env: ["HERMES_API_KEY"] },
],
// No `providers:` field → ProviderModelSelector derives vendors
// from model id prefixes via its own buildProviderCatalog.
},
],
});
render(<ConfigTab workspaceId="ws-test" />);
// With models present, the new vendor-aware dropdown renders.
// Provider entries dedupe by vendor → 3 unique vendors here
// (anthropic, openai, nousresearch).
const select = await screen.findByTestId("provider-select") as HTMLSelectElement;
await waitFor(() => {
const optionTexts = Array.from(select.options)
.map((o) => o.text)
.filter((t) => !t.startsWith("—")); // strip placeholder
// Labels are vendor display names, but vendor identity is what
// matters for dedupe. Assert each expected vendor surfaces once.
expect(optionTexts.some((t) => t.startsWith("Anthropic API"))).toBe(true);
expect(optionTexts.some((t) => t.startsWith("OpenAI"))).toBe(true);
expect(optionTexts.some((t) => t.startsWith("Nous Research"))).toBe(true);
expect(optionTexts.length).toBe(3); // dedupe pin
});
});
// Empty string is a legitimate save target — it clears the override
// (the server-side endpoint deletes the workspace_secrets row).
// Operators who picked "anthropic" yesterday and want to revert to
// auto-derive today should be able to do so by clearing the field
// and clicking Save. Without this PUT path, the only way to clear
// would be a direct DB edit.
it("PUTs an empty string when the operator clears a previously-set provider", async () => {
wireApi({
workspaceRuntime: "hermes",
workspaceModel: "anthropic:claude-opus-4-7",
configYamlContent: "name: ws\nruntime: hermes\n",
providerValue: "openrouter",
});
apiPut.mockResolvedValue({ status: "cleared" });
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
await waitFor(() => expect((input as HTMLInputElement).value).toBe("openrouter"));
fireEvent.change(input, { target: { value: "" } });
const saveBtn = screen.getByRole("button", { name: /^save$/i });
fireEvent.click(saveBtn);
await waitFor(() => {
const providerCalls = apiPut.mock.calls.filter(([path]) => path === "/workspaces/ws-test/provider");
expect(providerCalls.length).toBe(1);
expect(providerCalls[0][1]).toEqual({ provider: "" });
});
});
// Display-vs-storage drift regression (2026-05-03 incident, workspace
// e13aebd8…). User deployed claude-code with MiniMax-M2 stored in
// MODEL_PROVIDER. The container env (MODEL=MiniMax-M2) and chat
// worked correctly, but the Config tab showed "Claude Code
// subscription / Claude Sonnet (OAuth)" — i.e. the template's
// runtime_config.model: sonnet default — because currentModelId
// reads runtime_config.model first and loadConfig was overriding
// only the top-level config.model field. The merged shape was:
// { model: "MiniMax-M2", runtime_config: { model: "sonnet" } }
// and currentModelId picked "sonnet". Fix: loadConfig propagates
// wsMetadataModel into BOTH places so the form is a single source
// of truth (DB-backed MODEL_PROVIDER). Pinning the merged-path
// branch with the exact reproducing shape: claude-code template
// YAML has runtime_config.model: sonnet; live workspace's
// MODEL_PROVIDER is MiniMax-M2; tab must show the latter.
it("prefers MODEL_PROVIDER over the template's runtime_config.model on load", async () => {
wireApi({
workspaceRuntime: "claude-code",
workspaceModel: "MiniMax-M2",
configYamlContent: "name: ws\nruntime: claude-code\nruntime_config:\n model: sonnet\n",
providerValue: "",
templates: [
{
id: "claude-code-default",
name: "Claude Code",
runtime: "claude-code",
models: [
{ id: "sonnet", name: "Claude Sonnet (OAuth)", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
{ id: "MiniMax-M2", name: "MiniMax M2", required_env: ["MINIMAX_API_KEY"] },
{ id: "MiniMax-M2.7", name: "MiniMax M2.7", required_env: ["MINIMAX_API_KEY"] },
],
},
],
});
render(<ConfigTab workspaceId="ws-test" />);
const modelSelect = (await screen.findByTestId("model-select")) as HTMLSelectElement;
await waitFor(() => expect(modelSelect.value).toBe("MiniMax-M2"));
// Provider dropdown should also reflect MiniMax (back-derived from
// the model slug since LLM_PROVIDER is unset). Without the fix,
// the selector falls back to the first catalog entry whose first
// model matches "sonnet" → anthropic-oauth bucket → "Claude Code
// subscription".
const providerSelect = screen.getByTestId("provider-select") as HTMLSelectElement;
const selectedOption = providerSelect.options[providerSelect.selectedIndex];
expect(selectedOption.textContent ?? "").toMatch(/MiniMax/);
});
// Sibling pin to the display fix above. The display fix mirrors
// wsMetadataModel into runtime_config.model so the selector renders
// the live value. That mirror means handleSave's old YAML-vs-form
// diff would always be non-zero on a no-op save (YAML default
// "sonnet" vs. mirrored "MiniMax-M2") and would PUT /model, which
// the server-side SetModel chains into an auto-restart. handleSave now
// diffs against the loaded MODEL_PROVIDER instead. Pin: an
// unrelated edit (tier change) must NOT touch /model when the
// model itself didn't change.
it("does not PUT /model on a no-op save when only an unrelated field changed", async () => {
wireApi({
workspaceRuntime: "claude-code",
workspaceModel: "MiniMax-M2",
configYamlContent: "name: ws\nruntime: claude-code\ntier: 2\nruntime_config:\n model: sonnet\n",
providerValue: "",
templates: [
{
id: "claude-code-default",
name: "Claude Code",
runtime: "claude-code",
models: [
{ id: "sonnet", name: "Claude Sonnet", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
{ id: "MiniMax-M2", name: "MiniMax M2", required_env: ["MINIMAX_API_KEY"] },
],
},
],
});
apiPut.mockResolvedValue({});
apiPatch.mockResolvedValue({});
render(<ConfigTab workspaceId="ws-test" />);
const tierSelect = (await screen.findByLabelText(/tier/i)) as HTMLSelectElement;
fireEvent.change(tierSelect, { target: { value: "3" } });
const saveBtn = screen.getByRole("button", { name: /^save$/i });
fireEvent.click(saveBtn);
await waitFor(() => {
const tierPatches = apiPatch.mock.calls.filter(([path, body]) =>
path === "/workspaces/ws-test" && (body as { tier?: number }).tier === 3,
);
expect(tierPatches.length).toBe(1);
});
// Spurious /model PUT would fire here without the originalModel
// diff baseline. The model itself didn't change, so /model must
// stay untouched (otherwise SetModel auto-restarts).
const modelPuts = apiPut.mock.calls.filter(([path]) => path === "/workspaces/ws-test/model");
expect(modelPuts.length).toBe(0);
});
// Save-then-stale-badge regression (2026-05-03 incident). User
// selected T3 in the Tier dropdown, hit Save & Restart, the workspace
// PATCH succeeded (`tier: 3` in DB), but the canvas header pill kept
// showing "TIER T2" until a full hydrate. Root cause: handleSave
// sent the PATCH to workspace-server but never pushed the same
// change into useCanvasStore.updateNodeData, so every UI surface
// reading from the store kept its stale value. Pin: a successful
// tier PATCH must mirror into the store so the badge updates
// synchronously with the response.
it("flushes the dbPatch into useCanvasStore.updateNodeData after a successful PATCH", async () => {
wireApi({
workspaceRuntime: "claude-code",
workspaceModel: "MiniMax-M2",
configYamlContent: "name: ws\nruntime: claude-code\ntier: 2\nruntime_config:\n model: sonnet\n",
providerValue: "",
templates: [
{
id: "claude-code-default",
name: "Claude Code",
runtime: "claude-code",
models: [{ id: "sonnet", name: "Sonnet", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] }],
},
],
});
apiPatch.mockResolvedValue({ status: "updated" });
render(<ConfigTab workspaceId="ws-test" />);
const tierSelect = (await screen.findByLabelText(/tier/i)) as HTMLSelectElement;
fireEvent.change(tierSelect, { target: { value: "3" } });
const saveBtn = screen.getByRole("button", { name: /^save$/i });
fireEvent.click(saveBtn);
await waitFor(() => {
expect(apiPatch.mock.calls.some(([p]) => p === "/workspaces/ws-test")).toBe(true);
});
// Without the store flush, the badge would keep reading tier=2
// from useCanvasStore.nodes until a full hydrate. Pin: handleSave
// pushes the same fields it PATCHed.
expect(storeUpdateNodeData).toHaveBeenCalledWith(
"ws-test",
expect.objectContaining({ tier: 3 }),
);
});
// Failure-gating sibling pin to the store-flush test above. The
// production code places `updateNodeData` AFTER `await api.patch(...)`
// inside the same `if (Object.keys(dbPatch).length > 0)` block, so a
// PATCH rejection should throw before the store call. Without this
// pin, a future refactor that wraps the PATCH in try/catch and
// unconditionally calls updateNodeData would ship green — and then
// the badge would lie when the server actually rejected the change.
// Codified review feedback from PR #2545 (Agent 2).
it("does NOT flush into useCanvasStore.updateNodeData when the PATCH rejects", async () => {
wireApi({
workspaceRuntime: "claude-code",
workspaceModel: "MiniMax-M2",
configYamlContent: "name: ws\nruntime: claude-code\ntier: 2\nruntime_config:\n model: sonnet\n",
providerValue: "",
templates: [
{
id: "claude-code-default",
name: "Claude Code",
runtime: "claude-code",
models: [{ id: "sonnet", name: "Sonnet", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] }],
},
],
});
apiPatch.mockRejectedValue(new Error("500 from workspace-server"));
render(<ConfigTab workspaceId="ws-test" />);
const tierSelect = (await screen.findByLabelText(/tier/i)) as HTMLSelectElement;
fireEvent.change(tierSelect, { target: { value: "3" } });
const saveBtn = screen.getByRole("button", { name: /^save$/i });
fireEvent.click(saveBtn);
// Wait for handleSave to settle, whether it succeeds or fails. The PATCH
// must have been attempted; handleSave swallows the rejection and its
// finally block resets saving to false.
await waitFor(() => {
expect(apiPatch.mock.calls.some(([p]) => p === "/workspaces/ws-test")).toBe(true);
});
// Critically: the store must NOT have been told about the failed
// change. Otherwise the badge would lie about a write the server
// rejected.
const tierFlushes = storeUpdateNodeData.mock.calls.filter(([, body]) =>
typeof (body as { tier?: number }).tier === "number",
);
expect(tierFlushes.length).toBe(0);
});
// Pin the hermes/pre-#240 edge case: workspace where MODEL_PROVIDER
// was never written but YAML has runtime_config.model: "something".
// originalModel must reflect the rendered baseline (the YAML value),
// not the empty MODEL_PROVIDER, so an unrelated save (tier change)
// doesn't fire a /model PUT and trigger an auto-restart. Codified
// review feedback from PR #2545 (Agent 1, "Important").
it("does not PUT /model when MODEL_PROVIDER is empty and the user only edited an unrelated field", async () => {
wireApi({
workspaceRuntime: "hermes",
workspaceModel: "", // legacy workspace — never went through the picker
configYamlContent:
"name: ws\nruntime: hermes\ntier: 2\nruntime_config:\n model: nousresearch/hermes-4-70b\n",
providerValue: "",
templates: [
{
id: "hermes",
name: "Hermes",
runtime: "hermes",
models: [{ id: "nousresearch/hermes-4-70b", name: "Hermes 4 70B", required_env: ["HERMES_API_KEY"] }],
providers: ["nous"],
},
],
});
apiPut.mockResolvedValue({});
apiPatch.mockResolvedValue({});
render(<ConfigTab workspaceId="ws-test" />);
const tierSelect = (await screen.findByLabelText(/tier/i)) as HTMLSelectElement;
fireEvent.change(tierSelect, { target: { value: "3" } });
const saveBtn = screen.getByRole("button", { name: /^save$/i });
fireEvent.click(saveBtn);
await waitFor(() => {
expect(apiPatch.mock.calls.some(([p]) => p === "/workspaces/ws-test")).toBe(true);
});
const modelPuts = apiPut.mock.calls.filter(([path]) => path === "/workspaces/ws-test/model");
expect(modelPuts.length).toBe(0);
describe("ConfigTab provider override — retired (internal#718 P4)", () => {
it.skip("LLM_PROVIDER override flow is retired; see file header for the replacement coverage", () => {
// intentionally empty
});
});
@@ -0,0 +1,78 @@
// @vitest-environment jsdom
//
// internal#718 P3 (retire-list #5) — the billing-mode the Config tab shows /
// sends must reflect the DERIVED provider per the registry, not the hardcoded
// billingModeForProvider("" | "platform" → platform_managed else byok) rule.
// When the runtime is registry-backed, billingModeForSelectedProvider reads the
// registry-served billing_mode off the provider catalog entry. The hardcoded
// rule remains only as the fallback for non-registry runtimes / older backends.
import { describe, it, expect } from "vitest";
import { billingModeForSelectedProvider, billingModeForProvider } from "../ConfigTab";
import {
buildProviderCatalogFromRegistry,
type RegistryProvider,
type RegistryModel,
} from "../../ProviderModelSelector";
const REGISTRY_PROVIDERS: RegistryProvider[] = [
{ name: "anthropic-oauth", display_name: "Claude Code subscription", auth_env: ["CLAUDE_CODE_OAUTH_TOKEN"], billing_mode: "byok" },
{ name: "platform", display_name: "Platform", auth_env: ["ANTHROPIC_API_KEY"], billing_mode: "platform_managed" },
// DISCRIMINATING fixture (review #7790): a provider whose registry-served
// billing_mode DISAGREES with the hardcoded name-based rule. Its name is not
// "platform"/"" so billingModeForProvider() would call it "byok", yet the
// registry serves "platform_managed" (the federation-ready shape the SSOT is
// built for — a managed provider that isn't literally named "platform").
// billingModeForSelectedProvider MUST return the REGISTRY value here; the
// only way to get "platform_managed" out is to honor the catalog, so this
// case fails if the impl ever regresses to the hardcoded rule.
{ name: "managed-federated", display_name: "Managed (federated)", auth_env: [], billing_mode: "platform_managed" },
];
const REGISTRY_MODELS: RegistryModel[] = [
{ id: "sonnet", provider: "anthropic-oauth", billing_mode: "byok" },
{ id: "anthropic/claude-opus-4-7", provider: "platform", billing_mode: "platform_managed" },
// model bucketed under the disagreeing provider so the catalog builds an
// entry for it (buildProviderCatalogFromRegistry only emits a provider entry
// for providers that own at least one model).
{ id: "managed/some-model", provider: "managed-federated", billing_mode: "platform_managed" },
];
describe("billingModeForSelectedProvider (registry-driven)", () => {
const catalog = buildProviderCatalogFromRegistry(REGISTRY_PROVIDERS, REGISTRY_MODELS);
it("reads platform_managed from the registry for the platform provider", () => {
expect(billingModeForSelectedProvider("platform", catalog)).toBe("platform_managed");
});
it("reads byok from the registry for a BYOK provider", () => {
// anthropic-oauth derives to byok via the REGISTRY. (Note: the hardcoded
// rule would ALSO say byok for this non-'platform' name, so on its own this
// assertion does NOT prove the registry is authoritative — it agrees either
// way. The registry-WINS proof is the disagreement case below.)
expect(billingModeForSelectedProvider("anthropic-oauth", catalog)).toBe("byok");
});
it("lets the registry billing_mode WIN when it disagrees with the hardcoded rule", () => {
// 'managed-federated' is not '' / 'platform', so the legacy name-based rule
// classifies it byok — but the registry serves platform_managed. The
// registry is the SSOT, so billingModeForSelectedProvider must return
// platform_managed. This is the discriminating case: it FAILS if the impl
// regresses to billingModeForProvider (which would return byok here).
expect(billingModeForProvider("managed-federated")).toBe("byok"); // sanity: the rules genuinely disagree
expect(billingModeForSelectedProvider("managed-federated", catalog)).toBe("platform_managed");
});
it("falls back to the hardcoded rule when no registry catalog is supplied", () => {
// Non-registry runtime / older backend → catalog empty/undefined → the
// legacy mapping still applies ('' | 'platform' → platform_managed).
expect(billingModeForSelectedProvider("", undefined)).toBe("platform_managed");
expect(billingModeForSelectedProvider("platform", undefined)).toBe("platform_managed");
expect(billingModeForSelectedProvider("minimax", undefined)).toBe("byok");
});
it("falls back to the hardcoded rule when the provider is not in the registry catalog", () => {
// A provider string the registry catalog doesn't carry (stale saved
// value) → fall back to the legacy rule rather than guessing.
expect(billingModeForSelectedProvider("some-byo-vendor", catalog)).toBe("byok");
});
});
@@ -5,6 +5,7 @@
const RUNTIME_NAMES: Record<string, string> = {
"claude-code": "Claude Code",
codex: "Codex",
"google-adk": "Google ADK",
hermes: "Hermes",
openclaw: "OpenClaw",
kimi: "Kimi",
@@ -29,6 +29,7 @@
{"name": "hermes", "repo": "molecule-ai/molecule-ai-workspace-template-hermes", "ref": "main"},
{"name": "openclaw", "repo": "molecule-ai/molecule-ai-workspace-template-openclaw", "ref": "main"},
{"name": "codex", "repo": "molecule-ai/molecule-ai-workspace-template-codex", "ref": "main"},
{"name": "google-adk", "repo": "molecule-ai/molecule-ai-workspace-template-google-adk", "ref": "main"},
{"name": "seo-agent", "repo": "molecule-ai/molecule-ai-workspace-template-seo-agent", "ref": "main"}
],
"org_templates": [
@@ -0,0 +1,229 @@
#!/usr/bin/env bash
# Real-completion + per-provider liveness + byok-routing assertion helpers
# for the staging full-SaaS E2E (tests/e2e/test_staging_full_saas.sh).
#
# WHY THIS LIB EXISTS (molecule-core#1995 / #1994 follow-on):
# The A2A e2e historically asserted only response SHAPE — e.g.
# test_a2a_e2e.sh:`check "SEO response has text" '"kind":"text"'`. A fully
# BROKEN agent returns its error AS a text part:
# {"kind":"text","text":"Agent error (Exception) — see workspace logs..."}
# which STILL matches `"kind":"text"` → the shape check PASSES on a broken
# agent. That is exactly why the 2026-05-2x drained-key / byok-misroute
# failures (agents-team PM + reno marketing erroring on every LLM call)
# sailed through CI. "Channel returns text shape" != "agent actually
# completed an LLM round-trip".
#
# These helpers add three load-bearing gates ON TOP of (never replacing) the
# existing shape + PONG checks:
# 1. a2a_assert_real_completion — deterministic known-answer round-trip
# (CONTAINS the expected token AND NOT an error-as-text payload).
# 2. provider_liveness_matrix — per-offered-provider cheap completion
# probe, providers sourced from the providers.yaml SSOT runtimes block.
# 3. assert_byok_not_platform_proxy — #1994 regression guard: a
# byok-resolving workspace must NOT resolve to platform_managed.
#
# Conventions: reuses the host script's fail()/ok()/log() + tenant_call().
# Source this AFTER those are defined. Requires Bash 4+.
# Error-as-text trap markers. If the agent's text part contains ANY of
# these, the "round-trip" did not really complete — the agent surfaced an
# error AS text. This is the negative assertion that makes a broken agent
# FAIL instead of slipping through the shape check.
#
# Kept as an array (not a single regex) so a new failure signature is a
# one-line append + the failure message can name which marker matched.
A2A_ERROR_AS_TEXT_MARKERS=(
"Agent error"
"Exception"
"error result"
"MISSING_BYOK_CREDENTIAL"
)
# a2a_completion_error_marker <agent_text>
# Echoes the first error-as-text marker found in <agent_text> (case-
# insensitive), or nothing if clean. Exit 0 if a marker matched, 1 if not.
# Pure string scan — no LLM, no network — so it is deterministic and is the
# unit under the fail-direction proof in test_completion_assert_unit.sh.
a2a_completion_error_marker() {
local text="$1"
local upper marker
upper=$(printf '%s' "$text" | tr '[:lower:]' '[:upper:]')
for marker in "${A2A_ERROR_AS_TEXT_MARKERS[@]}"; do
if printf '%s' "$upper" | grep -qF -- "$(printf '%s' "$marker" | tr '[:lower:]' '[:upper:]')"; then
printf '%s' "$marker"
return 0
fi
done
return 1
}
# a2a_assert_real_completion <agent_text> <expected_token> <context_label>
# The CORE gate. Asserts the agent text:
# (a) does NOT contain any error-as-text marker (broken-agent trap), AND
# (b) CONTAINS <expected_token> (case-insensitive) — proving a real LLM
# round-trip produced the deterministic known answer.
# Calls fail() (which exits) on either violation. This MUST fail on an
# error-as-text payload — that is the property test_completion_assert_unit.sh
# pins.
a2a_assert_real_completion() {
local text="$1"
local expected="$2"
local ctx="${3:-A2A}"
if [ -z "$text" ]; then
fail "$ctx — real-completion gate: agent returned EMPTY text (no round-trip)."
fi
local hit
if hit=$(a2a_completion_error_marker "$text"); then
fail "$ctx — real-completion gate: agent returned an ERROR-AS-TEXT payload (matched '$hit'). A broken agent that surfaces its error as a text part is NOT a completed round-trip. This is the trap the shape-only check missed (#1994). Raw: ${text:0:200}"
fi
# Known-answer: real LLM round-trip yields the deterministic token. A
# prompt-echo / truncated-context / wrong-auth pipeline won't.
if ! printf '%s' "$text" | tr '[:lower:]' '[:upper:]' | grep -qF -- "$(printf '%s' "$expected" | tr '[:lower:]' '[:upper:]')"; then
fail "$ctx — real-completion gate: reply did NOT contain expected known-answer token '$expected'. The channel returned a text shape but no real completion. Raw: ${text:0:200}"
fi
ok "$ctx — real completion verified (contains '$expected', no error-as-text). Reply: \"${text:0:80}\""
}
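The gap described in the header comment can be reproduced offline. The following is a minimal sketch, assuming simplified stand-ins (`shape_ok` and `has_error_marker` are hypothetical names, not the helpers defined in this lib) and scanning the whole JSON payload rather than the extracted text part. It shows a shape-only check passing on a broken agent while a marker scan rejects it.

```shell
#!/usr/bin/env bash
# Sketch only: shape_ok and has_error_marker are hypothetical stand-ins.

# Shape-only check: passes for ANY payload that carries a text part.
shape_ok() {
  printf '%s' "$1" | grep -qF '"kind":"text"'
}

# Negative check: case-insensitive fixed-string scan for error markers.
# Echoes the first marker found and returns 0; returns 1 if the text is clean.
has_error_marker() {
  local marker
  for marker in "Agent error" "Exception" "error result" "MISSING_BYOK_CREDENTIAL"; do
    if printf '%s' "$1" | grep -qiF -- "$marker"; then
      printf '%s' "$marker"
      return 0
    fi
  done
  return 1
}

broken='{"kind":"text","text":"Agent error (Exception) - see workspace logs"}'
good='{"kind":"text","text":"PINEAPPLE"}'

for payload in "$broken" "$good"; do
  if shape_ok "$payload"; then shape=pass; else shape=fail; fi
  if hit=$(has_error_marker "$payload"); then scan="fail ($hit)"; else scan=pass; fi
  echo "shape=$shape marker-scan=$scan"
done
# prints:
#   shape=pass marker-scan=fail (Agent error)
#   shape=pass marker-scan=pass
```

Both payloads pass the shape check, so only the marker scan distinguishes the broken agent from the healthy one.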
# offered_platform_models_for_runtime <runtime>
# Emits, one per line, the platform-servable model ids the providers.yaml
# SSOT (runtimes.<runtime>.providers[name=platform].models) declares for
# <runtime>. This is the SSOT-driven offered/platform-servable matrix — NOT
# a hardcoded provider list — so a provider added/removed in providers.yaml
# automatically changes the matrix this probe exercises.
#
# Reads the embedded copy at workspace-server/internal/providers/providers.yaml
# (the same file go:embed compiles into the binary). Requires python3 +
# PyYAML (already a test-harness dep). On parse failure, emits nothing and
# returns 1 so the caller can fail loud rather than silently skip.
offered_platform_models_for_runtime() {
local runtime="$1"
local yaml_path="${PROVIDERS_YAML_PATH:-}"
if [ -z "$yaml_path" ]; then
# This lib lives at tests/e2e/lib/ -> repo root is three dirs up
# (lib -> e2e -> tests -> repo-root).
yaml_path="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../.." && pwd)/workspace-server/internal/providers/providers.yaml"
fi
if [ ! -f "$yaml_path" ]; then
log " [provider-matrix] providers.yaml SSOT not found at $yaml_path"
return 1
fi
RUNTIME_REF="$runtime" python3 - "$yaml_path" <<'PY'
import os, sys
try:
    import yaml
except Exception as e:  # PyYAML missing: fail loud, do not silently skip.
    sys.stderr.write(f"PyYAML required for provider-matrix SSOT read: {e}\n")
    sys.exit(2)
rt = os.environ["RUNTIME_REF"]
with open(sys.argv[1]) as f:
    doc = yaml.safe_load(f)
native = (doc.get("runtimes") or {}).get(rt) or {}
for pref in native.get("providers", []) or []:
    if pref.get("name") == "platform":
        for m in pref.get("models", []) or []:
            print(m)
PY
}
# provider_liveness_matrix <runtime> <probe_fn>
# For each platform-servable model the SSOT lists for <runtime>, calls
# <probe_fn> <model_id> which must echo the agent text (or empty) and return
# 0 on a non-error completion, non-zero otherwise. Logs a per-model pass/fail
# matrix. Returns 0 only if EVERY probed model produced a non-error
# completion; non-zero (and a recorded matrix) otherwise.
#
# Purpose: exercise each offered provider's AUTH + ROUTING path so a drained
# key / wrong base-URL / byok-misroute fails the gate (the #1994 class). The
# probe_fn is expected to use minimal max_tokens.
#
# This helper does the SSOT read + matrix bookkeeping; the host script
# supplies probe_fn (it owns workspace ids + tenant_call wiring).
provider_liveness_matrix() {
local runtime="$1"
local probe_fn="$2"
local models model rc total=0 passed=0
local -a results=()
models=$(offered_platform_models_for_runtime "$runtime") || {
fail "provider-liveness: could not read offered-provider matrix from providers.yaml SSOT for runtime=$runtime"
}
if [ -z "$models" ]; then
log " [provider-matrix] runtime=$runtime offers no platform-servable models in the SSOT — nothing to probe (not a failure)."
return 0
fi
log " [provider-matrix] SSOT offered platform models for $runtime:"
while IFS= read -r model; do
[ -z "$model" ] && continue
log " - $model"
done <<<"$models"
while IFS= read -r model; do
[ -z "$model" ] && continue
total=$((total + 1))
# Capture the probe's exit code without toggling the caller's errexit state.
rc=0
"$probe_fn" "$model" || rc=$?
if [ "$rc" = "0" ]; then
passed=$((passed + 1))
results+=("PASS $model")
elif [ "$rc" = "75" ]; then
# 75 (EX_TEMPFAIL convention) = probe skipped (key/runtime not
# available in this lane). Not counted toward pass/fail — logged.
total=$((total - 1))
results+=("SKIP $model (probe unavailable in this lane)")
else
results+=("FAIL $model")
fi
done <<<"$models"
log " [provider-matrix] result matrix (runtime=$runtime):"
local line
for line in "${results[@]}"; do
log " $line"
done
log " [provider-matrix] $passed/$total probed providers completed without error"
if [ "$passed" != "$total" ]; then
return 1
fi
return 0
}
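The return-code contract between the matrix and its probe function can be illustrated in isolation. This is a sketch under stated assumptions: `demo_probe`, `tally`, and the model ids are hypothetical, and a real probe would send a cheap completion through the host script's `tenant_call` wiring instead of switching on a name.

```shell
#!/usr/bin/env bash
# Sketch only: demo_probe, tally, and the model ids are hypothetical.

# Return codes a probe is expected to use:
#   0 = non-error completion, 75 = skipped (EX_TEMPFAIL), other = failure.
demo_probe() {
  case "$1" in
    model-ok)   echo "PINEAPPLE"; return 0 ;;
    model-skip) return 75 ;;
    *)          echo "Agent error (Exception)"; return 1 ;;
  esac
}

# Counts passes over the probed models; skipped probes count neither way.
tally() {
  local model rc total=0 passed=0
  for model in "$@"; do
    total=$((total + 1))
    rc=0
    demo_probe "$model" >/dev/null || rc=$?
    if [ "$rc" -eq 0 ]; then
      passed=$((passed + 1))
    elif [ "$rc" -eq 75 ]; then
      total=$((total - 1))
    fi
  done
  echo "$passed/$total"
  [ "$passed" -eq "$total" ]
}

tally model-ok model-skip                                  # prints 1/1
tally model-ok model-skip model-bad || echo "matrix failed" # prints 1/2, then "matrix failed"
```

Treating 75 as a skip keeps a lane that lacks a key from failing the gate, while any other non-zero code still turns the matrix red.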
# assert_byok_not_platform_proxy <billing_mode_json> <context_label>
# #1994 regression guard. Given the JSON body from
# GET /admin/workspaces/:id/llm-billing-mode (same derived resolver the
# provision-time strip gate uses), asserts the workspace resolves to BYOK
# and NOT platform_managed. A regression of #1994 (byok workspace baked to
# platform_managed → routed through the platform proxy → platform LLM key
# drained) flips resolved_mode to "platform_managed" and trips this gate.
# Calls fail() (exits) on violation.
assert_byok_not_platform_proxy() {
local body="$1"
local ctx="${2:-byok-guard}"
local mode prov
mode=$(printf '%s' "$body" | python3 -c "import json,sys
try: print(json.load(sys.stdin).get('resolved_mode',''))
except Exception: print('')" 2>/dev/null || echo "")
prov=$(printf '%s' "$body" | python3 -c "import json,sys
try:
    d=json.load(sys.stdin); v=d.get('provider_selection')
    print(v if v is not None else '')
except Exception: print('')" 2>/dev/null || echo "")
if [ -z "$mode" ]; then
fail "$ctx — byok-routing guard: could not read resolved_mode from billing-mode response. Raw: ${body:0:200}"
fi
if [ "$mode" = "platform_managed" ]; then
fail "$ctx — byok-routing guard TRIPPED (#1994 regression): a byok-configured workspace resolved to 'platform_managed' (provider_selection=$prov) → it would route through the platform proxy and drain the platform LLM key. Expected resolved_mode=byok. Raw: ${body:0:200}"
fi
if [ "$mode" != "byok" ]; then
fail "$ctx — byok-routing guard: unexpected resolved_mode='$mode' (expected 'byok'). provider_selection=$prov. Raw: ${body:0:200}"
fi
ok "$ctx — byok-routing guard: workspace resolves byok (provider_selection=$prov), NOT platform-proxy. #1994 stays fixed."
}
@@ -8,6 +8,34 @@ TIMEOUT="${A2A_TIMEOUT:-120}" # seconds per A2A call (override via A2A_TIMEOUT
# shellcheck source=_lib.sh
source "$(dirname "$0")/_lib.sh"
# molecule-core#1995 (#1994 follow-on): real-completion assertion helpers.
# Adds a NEGATIVE error-as-text check on top of the shape checks below, so a
# broken agent that returns its error AS a text part
# ({"kind":"text","text":"Agent error (Exception) ..."}) — which STILL
# matches the shape check `"kind":"text"` — now FAILS instead of passing.
# shellcheck source=lib/completion_assert.sh
source "$(dirname "$0")/lib/completion_assert.sh"
# check_no_error_as_text <desc> <agent_text>
# Additive negative gate: PASS only if the agent text carries NO
# error-as-text marker (Agent error / Exception / error result /
# MISSING_BYOK_CREDENTIAL). Uses the same scanner as the staging
# real-completion gate so the trap is closed consistently across lanes.
check_no_error_as_text() {
local desc="$1"
local text="$2"
local hit
if hit=$(a2a_completion_error_marker "$text"); then
echo "FAIL: $desc"
echo " agent returned an error-AS-text payload (matched '$hit') — a broken"
echo " agent that surfaces its error as a text part is NOT a real reply."
echo " got: $(echo "$text" | head -3)"
FAIL=$((FAIL + 1))
else
echo "PASS: $desc"
PASS=$((PASS + 1))
fi
}
check() {
local desc="$1"
@@ -81,6 +109,8 @@ check "JSON-RPC response has result" '"result"' "$R"
check "Response has agent role" '"role":"agent"' "$R"
check "Response has text part" '"kind":"text"' "$R"
TEXT=$(echo "$R" | python3 -c "import sys,json; r=json.load(sys.stdin); print(r['result']['parts'][0]['text'][:200])" 2>/dev/null || echo "PARSE_ERROR")
# Negative gate (#1994): the text part must not BE an error.
check_no_error_as_text "Echo reply is not an error-as-text payload" "$TEXT"
echo " Agent said: $TEXT"
echo ""
@@ -92,6 +122,11 @@ R=$(a2a_send "$SEO_ID" "What SEO skills do you have?")
check "SEO agent responds" '"result"' "$R"
check "SEO response has text" '"kind":"text"' "$R"
TEXT=$(echo "$R" | python3 -c "import sys,json; r=json.load(sys.stdin); print(r['result']['parts'][0]['text'][:200])" 2>/dev/null || echo "PARSE_ERROR")
# Negative gate (#1994): a broken SEO agent that returns "Agent error
# (Exception) ..." AS text still matches the `"kind":"text"` shape check
# above — THAT is the gap that let drained-key/byok-misroute failures pass
# CI. This makes that case FAIL.
check_no_error_as_text "SEO reply is not an error-as-text payload" "$TEXT"
echo " SEO Agent said: $TEXT"
echo ""
@@ -0,0 +1,111 @@
#!/usr/bin/env bash
# Fail-direction / load-bearing proof for lib/completion_assert.sh.
#
# This is the watch-it-FAIL counterpart the dev-SOP Phase 3 requires: it
# proves the new real-completion + byok gates actually CATCH a broken agent,
# not just pass on a good one. It runs entirely offline (no LLM, no network,
# no provisioning) — pure assertion logic — so it can run on every PR in the
# fast lane (e2e-api.yml unit-shell step) and locally via `bash`.
#
# The decisive case is `error-as-text payload MUST FAIL`: that is the exact
# trap (#1994) the historical shape-only check missed. If a refactor weakens
# a2a_assert_real_completion to a substring/shape check, THIS test goes red.
set -uo pipefail
HERE="$(cd "$(dirname "$0")" && pwd)"
PASS=0
FAIL=0
# Minimal stand-ins for the host script's helpers. fail() must NOT exit the
# whole harness here, because we want to assert that it WAS called. We trap
# it by running the assertion in a subshell and checking the subshell's exit
# code: fail() exits the subshell with 1, while an assertion that reaches
# ok() returns 0.
log() { echo "[unit] $*"; }
ok() { echo "[unit] OK: $*"; }
fail() { echo "[unit] FAIL-CALLED: $*" >&2; exit 1; }
# shellcheck source=lib/completion_assert.sh
source "$HERE/lib/completion_assert.sh"
expect_pass() {
local desc="$1"; shift
if ( "$@" ) >/dev/null 2>&1; then
echo "PASS: $desc (assertion accepted, as expected)"
PASS=$((PASS + 1))
else
echo "FAIL: $desc — expected the assertion to ACCEPT, but it rejected"
FAIL=$((FAIL + 1))
fi
}
expect_fail() {
local desc="$1"; shift
if ( "$@" ) >/dev/null 2>&1; then
echo "FAIL: $desc — expected the assertion to REJECT, but it accepted (gate NOT load-bearing!)"
FAIL=$((FAIL + 1))
else
echo "PASS: $desc (assertion rejected, as expected)"
PASS=$((PASS + 1))
fi
}
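The subshell trap used by expect_pass and expect_fail can be shown on its own. A minimal sketch, assuming a hypothetical `gate` function and `rejects` wrapper that are not part of this harness:

```shell
#!/usr/bin/env bash
# Sketch only: gate and rejects are hypothetical names for illustration.

fail() { echo "FAIL-CALLED: $*" >&2; exit 1; }

# A gate that calls the exiting fail() on bad input.
gate() {
  [ "$1" = "good" ] || fail "bad input: $1"
}

# Run the gate in a subshell: fail()'s exit ends only the subshell, and the
# subshell's exit status tells the harness whether the gate rejected.
rejects() {
  if ( "$@" ) >/dev/null 2>&1; then
    return 1   # gate accepted
  fi
  return 0     # gate rejected
}

if rejects gate bad; then echo "gate rejects bad input"; fi
if rejects gate good; then echo "unexpected"; else echo "gate accepts good input"; fi
echo "harness still running"
# prints:
#   gate rejects bad input
#   gate accepts good input
#   harness still running
```

Because the exit happens inside the parentheses, the harness survives every rejected case and can keep counting results.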
echo "=== completion_assert.sh fail-direction proof ==="
# ---- a2a_assert_real_completion ----
# Good: real known-answer reply passes.
expect_pass "real PINEAPPLE reply passes" \
a2a_assert_real_completion "PINEAPPLE" "PINEAPPLE" "unit"
expect_pass "case-insensitive known answer passes" \
a2a_assert_real_completion "pineapple" "PINEAPPLE" "unit"
expect_pass "known answer with minor wrapping passes" \
a2a_assert_real_completion "Sure: PINEAPPLE" "PINEAPPLE" "unit"
# DECISIVE: the error-as-text trap. Each MUST fail — these are the payloads a
# broken agent returns that the old shape-only `"kind":"text"` check passed.
expect_fail "Agent error as text payload MUST fail" \
a2a_assert_real_completion "Agent error (Exception) — see workspace logs for details." "PINEAPPLE" "unit"
expect_fail "bare Exception as text MUST fail" \
a2a_assert_real_completion "Traceback ... Exception: boom" "PINEAPPLE" "unit"
expect_fail "error result as text MUST fail" \
a2a_assert_real_completion "tool returned error result" "PINEAPPLE" "unit"
expect_fail "MISSING_BYOK_CREDENTIAL as text MUST fail" \
a2a_assert_real_completion "MISSING_BYOK_CREDENTIAL: set your own key" "PINEAPPLE" "unit"
# Error-as-text that ALSO happens to contain the token still fails (error
# marker takes precedence — a real completion never carries these markers).
expect_fail "error-as-text containing the token still fails" \
a2a_assert_real_completion "Agent error: could not produce PINEAPPLE" "PINEAPPLE" "unit"
# Empty text fails.
expect_fail "empty text fails" \
a2a_assert_real_completion "" "PINEAPPLE" "unit"
# Wrong/echoed content (no token, no error) fails — shape-OK but not a real
# completion.
expect_fail "wrong content without token fails" \
a2a_assert_real_completion "Reply with exactly the word PINEAPPLE and nothing else." "BANANA" "unit"
# ---- assert_byok_not_platform_proxy (#1994 guard) ----
expect_pass "byok resolution passes the guard" \
assert_byok_not_platform_proxy '{"resolved_mode":"byok","provider_selection":"minimax","source":"derived_provider"}' "unit"
# DECISIVE: a platform_managed resolution on a byok workspace = the #1994
# regression. MUST fail.
expect_fail "platform_managed resolution trips the #1994 guard" \
assert_byok_not_platform_proxy '{"resolved_mode":"platform_managed","provider_selection":"platform","source":"derived_provider"}' "unit"
expect_fail "missing resolved_mode trips the guard" \
assert_byok_not_platform_proxy '{"provider_selection":"x"}' "unit"
expect_fail "disabled mode trips the guard (not byok)" \
assert_byok_not_platform_proxy '{"resolved_mode":"disabled"}' "unit"
# ---- a2a_completion_error_marker (the scanner under the gate) ----
if hit=$(a2a_completion_error_marker "all good PINEAPPLE"); then
echo "FAIL: clean text wrongly flagged as error marker ($hit)"; FAIL=$((FAIL + 1))
else
echo "PASS: clean text has no error marker"; PASS=$((PASS + 1))
fi
if hit=$(a2a_completion_error_marker "An Exception occurred"); then
echo "PASS: error marker detected ($hit)"; PASS=$((PASS + 1))
else
echo "FAIL: error marker NOT detected in 'An Exception occurred'"; FAIL=$((FAIL + 1))
fi
echo ""
echo "=== Results: $PASS passed, $FAIL failed ==="
[ "$FAIL" -eq 0 ]
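The cases above pin a decision order for the completion gate: empty text fails, an error marker fails even when the token is present, and only then is the known-answer token required. A minimal Go sketch of that order follows, for illustration only. The marker list and the names `errorMarker` and `assertRealCompletion` are assumptions inferred from the test inputs, not the contents of lib/completion_assert.sh, and the real matcher may differ (for example in case sensitivity).

```go
package main

import (
	"fmt"
	"strings"
)

// errorMarkers is a HYPOTHETICAL list inferred from the unit-test inputs;
// the authoritative list lives in lib/completion_assert.sh.
var errorMarkers = []string{
	"Agent error",
	"Exception",
	"error result",
	"MISSING_BYOK_CREDENTIAL",
}

// errorMarker returns the first marker found in text, or "" if none matches.
func errorMarker(text string) string {
	for _, m := range errorMarkers {
		if strings.Contains(text, m) {
			return m
		}
	}
	return ""
}

// assertRealCompletion applies the checks in the order the tests pin:
// empty text, then error markers (which win over the token), then the token.
func assertRealCompletion(text, token string) error {
	if strings.TrimSpace(text) == "" {
		return fmt.Errorf("empty completion")
	}
	if m := errorMarker(text); m != "" {
		return fmt.Errorf("error-as-text payload (marker %q)", m)
	}
	if !strings.Contains(text, token) {
		return fmt.Errorf("known-answer token %q not found", token)
	}
	return nil
}

func main() {
	for _, text := range []string{
		"Sure: PINEAPPLE",
		"Agent error: could not produce PINEAPPLE",
		"",
	} {
		fmt.Printf("%q -> %v\n", text, assertRealCompletion(text, "PINEAPPLE"))
	}
}
```

Checking the marker before the token is what makes the "error-as-text containing the token" case fail instead of passing on a substring match.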
@@ -99,6 +99,12 @@ source "$(dirname "$0")/lib/model_slug.sh"
# shellcheck disable=SC1091
# shellcheck source=lib/aws_leak_check.sh
source "$(dirname "$0")/lib/aws_leak_check.sh"
# shellcheck disable=SC1091
# shellcheck source=lib/completion_assert.sh
# molecule-core#1995 (#1994 follow-on): real-completion + per-provider
# liveness + byok-routing assertion helpers. Adds gates that FAIL on an
# error-as-text payload (the trap the shape-only A2A checks missed).
source "$(dirname "$0")/lib/completion_assert.sh"
CURL_COMMON=(-sS --fail-with-body --max-time 30)
E2E_TMP_FILES=()
@@ -867,6 +873,182 @@ fi
ok "A2A parent round-trip succeeded: \"${AGENT_TEXT:0:80}\""
# ─── 8b. Real-completion known-answer round-trip (CORE GATE, #1994) ────
# The existing PONG check + generic error grep above already do a lot, but
# this stanza is the canonical real-completion gate the #1994 follow-on
# adds: a DETERMINISTIC known-answer prompt asserted via
# a2a_assert_real_completion, which FAILS on an error-as-text payload
# ({"kind":"text","text":"Agent error (Exception) ..."}). That payload
# matches the historical shape-only check `"kind":"text"` and so passed CI
# on a fully broken agent (drained-key / byok-misroute, 2026-05-2x). This
# gate makes that case RED. Reuses the same cold-start retry-on-transient
# (502/503/504) loop the PONG probe uses: up to 6 attempts, 10s apart, on
# transient transport errors only, never on agent-error. The single
# successful round-trip is the one place we spend a non-trivial token
# budget (default backend MiniMax, cheap token plan).
KA_PAYLOAD=$(python3 -c "
import json, uuid
print(json.dumps({
'jsonrpc': '2.0',
'method': 'message/send',
'id': 'e2e-known-answer-1',
'params': {
'message': {
'role': 'user',
'messageId': f'e2e-{uuid.uuid4().hex[:8]}',
'parts': [{'kind': 'text', 'text': 'Reply with exactly the word PINEAPPLE and nothing else.'}]
}
}
}))
")
KA_TMP=$(mktemp -t known_answer_a2a.XXXXXX)
KA_RESP=""
for KA_ATTEMPT in $(seq 1 6); do
: >"$KA_TMP"
set +e
KA_CODE=$(tenant_call POST "/workspaces/$PARENT_ID/a2a" \
--max-time 90 \
-H "Content-Type: application/json" \
-d "$KA_PAYLOAD" \
-o "$KA_TMP" \
-w '%{http_code}' \
2>/dev/null)
KA_RC=$?
set -e
KA_CODE=${KA_CODE:-000}
KA_RESP=$(cat "$KA_TMP" 2>/dev/null || echo "")
if [ "$KA_RC" = "0" ] && [ "$KA_CODE" -ge 200 ] && [ "$KA_CODE" -lt 300 ]; then
break
fi
KA_SAFE_BODY=$(printf '%s' "$KA_RESP" | sanitize_http_body)
# Retry ONLY on transient transport errors — never on an agent-level
# error (those must surface and fail the gate).
if echo "$KA_CODE" | grep -Eq '^(502|503|504)$' && echo "$KA_SAFE_BODY" | grep -Eqi 'Service Unavailable|Bad Gateway|Gateway Timeout|workspace agent unreachable|connection refused|no healthy upstream|workspace agent busy|native_session'; then
log " known-answer A2A transient $KA_CODE attempt $KA_ATTEMPT/6: $KA_SAFE_BODY"
if [ "$KA_ATTEMPT" -lt 6 ]; then sleep 10; continue; fi
fi
break
done
rm -f "$KA_TMP"
if [ "$KA_RC" != "0" ] || [ "$KA_CODE" -lt 200 ] || [ "$KA_CODE" -ge 300 ]; then
KA_SAFE_BODY=$(printf '%s' "$KA_RESP" | sanitize_http_body)
fail "Known-answer A2A POST failed after $KA_ATTEMPT attempt(s) (curl_rc=$KA_RC, http=$KA_CODE): $KA_SAFE_BODY"
fi
KA_TEXT=$(echo "$KA_RESP" | python3 -c "
import json, sys
try:
d = json.load(sys.stdin)
parts = d.get('result', {}).get('parts', [])
print(parts[0].get('text', '') if parts else '')
except Exception:
print('')
" 2>/dev/null || echo "")
# CORE GATE: contains PINEAPPLE (real round-trip) AND no error-as-text.
a2a_assert_real_completion "$KA_TEXT" "PINEAPPLE" "A2A known-answer (parent, $RUNTIME/$MODEL_SLUG)"
# ─── 8c. byok-routing regression guard (#1994) ─────────────────────────
# The parent was provisioned with the customer's OWN vendor key
# (MINIMAX_API_KEY / ANTHROPIC_API_KEY in SECRETS_JSON) → it must resolve
# BYOK, not platform_managed. #1994 was exactly the inverse: a byok
# workspace baked platform_managed on (re-)provision → routed through the
# platform proxy → drained the platform LLM key. We read the SAME derived
# resolver the provision-time strip gate uses
# (GET /admin/workspaces/:id/llm-billing-mode) and assert resolved_mode is
# byok: platform_managed, disabled, or a missing resolved_mode all fail the
# guard. A regression flips it RED.
#
# Only meaningful when the parent actually carries a byok credential; the
# OpenAI/hermes path uses a different env shape, and the no-key path is
# legitimately platform_managed (the CTO default). Gate on the same
# E2E_*_API_KEY presence the SECRETS_JSON branch keyed off.
if [ -n "${E2E_MINIMAX_API_KEY:-}" ] || [ -n "${E2E_ANTHROPIC_API_KEY:-}" ]; then
set +e
BILLING_RESP=$(tenant_call GET "/admin/workspaces/$PARENT_ID/llm-billing-mode" 2>/dev/null)
BILLING_RC=$?
set -e
if [ "$BILLING_RC" != "0" ] || [ -z "$BILLING_RESP" ]; then
fail "byok-routing guard: GET /admin/workspaces/$PARENT_ID/llm-billing-mode failed (rc=$BILLING_RC). Body: ${BILLING_RESP:0:200}"
fi
assert_byok_not_platform_proxy "$BILLING_RESP" "byok-guard (parent, $RUNTIME/$MODEL_SLUG)"
else
log "8c. byok-routing guard skipped — parent carries no own-vendor key (OpenAI/no-key path is legitimately platform_managed)."
fi
# ─── 8d. Per-offered-provider liveness matrix (SSOT-driven, #1994 class) ─
# For each platform-servable model the providers.yaml SSOT
# (runtimes.<runtime>.providers[platform].models) declares for this
# runtime, send a minimal max_tokens-bounded "say ok" probe and assert a
# NON-ERROR completion. Purpose: exercise each offered provider's AUTH +
# ROUTING path so a drained key / wrong base-URL / byok-misroute fails the
# gate (the #1994 class). Providers/models come from the SSOT — not a
# hardcoded list — so the matrix tracks providers.yaml automatically.
#
# This lane provisions ONE parent workspace with ONE configured key, so we
# can only truly drive the providers that key authenticates. Probing a
# model whose provider key is absent in this lane is reported SKIP (rc=75),
# not FAIL — keeping the gate deterministic + low-flake. The matrix still
# proves the configured provider's full auth+routing path end-to-end, and
# logs the offered set so over/under-offer drift is visible in the CI log.
provider_liveness_probe() {
local model_id="$1"
# Map the SSOT platform model id (e.g. minimax/MiniMax-M2.7) to the
# vendor namespace token to decide whether THIS lane has its key.
local vendor="${model_id%%/*}"
case "$vendor" in
minimax) [ -n "${E2E_MINIMAX_API_KEY:-}" ] || return 75 ;;
anthropic) [ -n "${E2E_ANTHROPIC_API_KEY:-}" ] || return 75 ;;
openai) [ -n "${E2E_OPENAI_API_KEY:-}" ] || return 75 ;;
*) return 75 ;; # kimi/moonshot etc. — no key wired in this lane
esac
local probe_payload
probe_payload=$(python3 -c "
import json, uuid
print(json.dumps({
'jsonrpc': '2.0',
'method': 'message/send',
'id': 'e2e-liveness-' + uuid.uuid4().hex[:6],
'params': {
'message': {
'role': 'user',
'messageId': f'e2e-{uuid.uuid4().hex[:8]}',
'parts': [{'kind': 'text', 'text': 'Reply with exactly: ok'}],
},
'configuration': {'max_tokens': 4}
}
}))
")
local tmp code rc resp
tmp=$(mktemp -t liveness_a2a.XXXXXX)
set +e
code=$(tenant_call POST "/workspaces/$PARENT_ID/a2a" \
--max-time 60 \
-H "Content-Type: application/json" \
-d "$probe_payload" \
-o "$tmp" -w '%{http_code}' 2>/dev/null)
rc=$?
set -e
resp=$(cat "$tmp" 2>/dev/null || echo "")
rm -f "$tmp"
if [ "$rc" != "0" ] || [ "${code:-000}" -lt 200 ] || [ "${code:-000}" -ge 300 ]; then
log " probe $model_id: HTTP ${code:-000} rc=$rc"
return 1
fi
local text
text=$(echo "$resp" | python3 -c "
import json,sys
try:
d=json.load(sys.stdin); p=d.get('result',{}).get('parts',[])
print(p[0].get('text','') if p else '')
except Exception: print('')" 2>/dev/null || echo "")
if [ -z "$text" ] || a2a_completion_error_marker "$text" >/dev/null; then
log " probe $model_id: error-as-text or empty: ${text:0:120}"
return 1
fi
return 0
}
if ! provider_liveness_matrix "$RUNTIME" provider_liveness_probe; then
fail "Per-provider liveness matrix: at least one offered provider failed its auth+routing probe (see matrix above). This is the #1994 class — a drained key / wrong base-URL / byok-misroute."
fi
ok "Per-provider liveness matrix passed (all probed offered providers completed without error)"
# ─── 9. HMA + peers + activity (full mode) ─────────────────────────────
if [ "$MODE" = "full" ]; then
log "9/11 Writing + reading HMA memory on parent..."
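The retry rule in stanza 8b (retry only when the status is 502/503/504 and the body looks like a transport failure) can be sketched in Go as below. This is an illustrative restatement, not part of the script: `shouldRetry` is a hypothetical name, and the pattern is copied from the `grep -Eqi` expression in the known-answer loop.

```go
package main

import (
	"fmt"
	"regexp"
)

// transientBody mirrors the case-insensitive grep pattern used by the
// known-answer loop to recognise transport-level failures.
var transientBody = regexp.MustCompile(
	`(?i)Service Unavailable|Bad Gateway|Gateway Timeout|workspace agent unreachable|` +
		`connection refused|no healthy upstream|workspace agent busy|native_session`)

// shouldRetry (hypothetical helper) reports whether a failed attempt is a
// transient transport error. Both conditions must hold: a gateway status
// code AND a transport-flavoured body. Anything else surfaces immediately.
func shouldRetry(code int, body string) bool {
	switch code {
	case 502, 503, 504:
		return transientBody.MatchString(body)
	}
	return false
}

func main() {
	fmt.Println(shouldRetry(503, "503 Service Unavailable"))
	fmt.Println(shouldRetry(503, "Agent error (Exception)"))
	fmt.Println(shouldRetry(500, "Bad Gateway"))
}
```

Requiring both the status code and the body match keeps an agent-level error that happens to arrive with a gateway status from being retried away.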
@@ -0,0 +1,271 @@
// Command gen-providers is the codegen half of the provider-registry SSOT
// machinery on the molecule-core side (internal#718 P2-A, CTO 2026-05-27
// "Distribution = SDK via codegen + verify-CI"). It is the byte-for-byte mirror
// of molecule-controlplane's cmd/gen-providers (the canonical generator). It
// reads core's SYNCED COPY of the schema — internal/providers/providers.yaml
// (via the providers loader, so it shares the SAME parse + validation as the
// runtime) — and emits a checked-in Go artifact:
//
// internal/providers/gen/registry_gen.go
//
// The artifact is a deterministic projection of the merged registry: the
// provider catalog + per-runtime native sets as Go literals, plus the schema
// version and a content fingerprint. It is core's leaf of the multi-language SDK
// layer the RFC calls for (Go(CP+core)/TS(canvas)/Python(adapters)).
//
// CONTRACT for P2-A (zero behavior change): the generated artifact is
// checked-in + drift-gated ONLY. NO production code path imports
// internal/providers/gen — the gen-import-boundary test pins that. P2-B wires
// the billing/credential decision onto the LOADER (DeriveProvider/IsPlatform),
// not the raw gen literals. The generator is the build-time half;
// verify-providers-gen.yml is the CI half that regenerates and fails RED on any
// diff (drift or hand-edit); sync-providers-yaml.yml gates the synced copy
// against the controlplane canonical.
//
// Usage:
//
// go run ./cmd/gen-providers # write the artifact in place
// go run ./cmd/gen-providers -check # exit non-zero if the on-disk
// # artifact differs from a fresh gen
// # (the CI drift gate)
// go run ./cmd/gen-providers -o PATH # write to a specific path
//
//go:generate go run ../gen-providers -o ../../internal/providers/gen/registry_gen.go
package main
import (
"bytes"
"crypto/sha256"
"encoding/hex"
"flag"
"fmt"
"go/format"
"os"
"sort"
"strconv"
"text/template"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/providers"
)
// defaultOutPath is the checked-in artifact location, relative to the repo
// root (the directory `go run ./cmd/gen-providers` is invoked from).
const defaultOutPath = "internal/providers/gen/registry_gen.go"
func main() {
var (
outPath string
check bool
)
flag.StringVar(&outPath, "o", defaultOutPath, "output path for the generated artifact")
flag.BoolVar(&check, "check", false, "verify the on-disk artifact matches a fresh generation; exit 1 on drift")
flag.Parse()
generated, err := render()
if err != nil {
fmt.Fprintf(os.Stderr, "gen-providers: %v\n", err)
os.Exit(1)
}
if check {
existing, err := os.ReadFile(outPath)
if err != nil {
fmt.Fprintf(os.Stderr, "gen-providers -check: cannot read %s: %v\n", outPath, err)
fmt.Fprintln(os.Stderr, "Run `go generate ./...` (or `go run ./cmd/gen-providers`) and commit the result.")
os.Exit(1)
}
if !bytes.Equal(existing, generated) {
fmt.Fprintf(os.Stderr, "gen-providers -check: DRIFT — %s is out of sync with providers.yaml.\n", outPath)
fmt.Fprintln(os.Stderr, "The generated artifact was hand-edited or providers.yaml changed without regen.")
fmt.Fprintln(os.Stderr, "Fix: run `go generate ./...` (or `go run ./cmd/gen-providers`) and commit.")
os.Exit(1)
}
fmt.Println("gen-providers -check: OK — artifact in sync with providers.yaml")
return
}
if err := os.WriteFile(outPath, generated, 0o644); err != nil {
fmt.Fprintf(os.Stderr, "gen-providers: write %s: %v\n", outPath, err)
os.Exit(1)
}
fmt.Printf("gen-providers: wrote %s\n", outPath)
}
// render loads the manifest and produces the gofmt'd artifact bytes.
func render() ([]byte, error) {
m, err := providers.LoadManifest()
if err != nil {
return nil, fmt.Errorf("load manifest: %w", err)
}
// Deterministic ordering: providers in catalog order is already stable
// (slice). Runtimes is a map — sort its keys so the artifact is
// reproducible regardless of Go map iteration order.
runtimeNames := make([]string, 0, len(m.Runtimes))
for rt := range m.Runtimes {
runtimeNames = append(runtimeNames, rt)
}
sort.Strings(runtimeNames)
type genProvider struct {
Name string
DisplayName string
Protocol string
AuthMode string
AuthEnv []string
ModelPrefixMatch string
IsPlatform bool
// UpstreamVendor is the proxy's upstream-vendor key for this entry
// (internal#718 P1, CONVERGED) — empty for entries the proxy does not
// route to an upstream. A plain scalar (no pointer), so both the rendered
// literal and the fingerprint stay deterministic.
UpstreamVendor string
}
type genRef struct {
Name string
Models []string
}
type genRuntime struct {
Name string
Providers []genRef
}
data := struct {
SchemaVersion int
Fingerprint string
Providers []genProvider
Runtimes []genRuntime
}{
SchemaVersion: providers.SchemaVersion(),
}
for _, p := range m.Providers {
gp := genProvider{
Name: p.Name,
DisplayName: p.DisplayName,
Protocol: string(p.Protocol),
AuthMode: p.AuthMode,
AuthEnv: p.AuthEnv,
ModelPrefixMatch: p.ModelPrefixMatch,
IsPlatform: p.IsPlatform(),
UpstreamVendor: p.UpstreamVendor,
}
data.Providers = append(data.Providers, gp)
}
for _, rt := range runtimeNames {
native := m.Runtimes[rt]
gr := genRuntime{Name: rt}
for _, ref := range native.Providers {
gr.Providers = append(gr.Providers, genRef{Name: ref.Name, Models: ref.Models})
}
data.Runtimes = append(data.Runtimes, gr)
}
// Fingerprint pins the artifact to the data it was generated from. It is
// derived from the structured projection (schema version + providers +
// runtimes), NOT the raw YAML bytes, so a comment-only YAML edit does not
// churn the artifact while any data change does.
data.Fingerprint = fingerprint(data.SchemaVersion, data.Providers, data.Runtimes)
var buf bytes.Buffer
if err := artifactTmpl.Execute(&buf, data); err != nil {
return nil, fmt.Errorf("execute template: %w", err)
}
formatted, err := format.Source(buf.Bytes())
if err != nil {
return nil, fmt.Errorf("gofmt generated source: %w\n----\n%s", err, buf.String())
}
return formatted, nil
}
// fingerprint is a stable content hash of the structured projection. The
// fields this function hashes must be kept in sync with the data the
// template emits so the hash and the literals never diverge.
func fingerprint(schema int, provs any, runtimes any) string {
h := sha256.New()
fmt.Fprintf(h, "schema=%d\n", schema)
fmt.Fprintf(h, "%#v\n%#v\n", provs, runtimes)
return hex.EncodeToString(h.Sum(nil))[:16]
}
func quote(s string) string { return strconv.Quote(s) }
func quoteSlice(ss []string) string {
var b bytes.Buffer
b.WriteString("[]string{")
for i, s := range ss {
if i > 0 {
b.WriteString(", ")
}
b.WriteString(strconv.Quote(s))
}
b.WriteString("}")
return b.String()
}
var artifactTmpl = template.Must(template.New("artifact").Funcs(template.FuncMap{
"quote": quote,
"quoteSlice": quoteSlice,
}).Parse(`// Code generated by cmd/gen-providers; DO NOT EDIT.
//
// Source of truth: internal/providers/providers.yaml (schema_version {{.SchemaVersion}}).
// Regenerate with: go generate ./... (or: go run ./cmd/gen-providers)
// The verify-providers-gen CI workflow fails RED if this file drifts from
// providers.yaml or is hand-edited. internal#718 P0 — checked-in + drift-
// gated ONLY; no production path imports this package yet (that is P1+).
package gen
// SchemaVersion is the providers.yaml schema this artifact was generated
// against. It is the semver'd contract version (the MAJOR component for the
// public extension contract; see internal/providers/README.md).
const SchemaVersion = {{.SchemaVersion}}
// Fingerprint is a stable content hash of the generated projection (schema
// version + provider catalog + runtime native sets). It changes iff the
// registry DATA changes (comment-only YAML edits do not churn it).
const Fingerprint = {{quote .Fingerprint}}
// GenProvider is the generated projection of one provider catalog entry —
// the subset a downstream consumer needs to derive + display a provider.
type GenProvider struct {
Name string
DisplayName string
Protocol string
AuthMode string
AuthEnv []string
ModelPrefixMatch string
// IsPlatform marks the closed, core-only platform-managed provider.
IsPlatform bool
// UpstreamVendor is the proxy's upstream-vendor key for this entry
// (internal#718 P1, CONVERGED); empty for providers the proxy does not
// route to an upstream vendor. ResolveUpstream maps a model id's namespace
// token to the entry whose UpstreamVendor equals it.
UpstreamVendor string
}
// GenRuntimeRef is one native provider a runtime supports + its exact models.
type GenRuntimeRef struct {
Name string
Models []string
}
// Providers is the full provider catalog, in providers.yaml declaration order.
var Providers = []GenProvider{
{{- range .Providers}}
{Name: {{quote .Name}}, DisplayName: {{quote .DisplayName}}, Protocol: {{quote .Protocol}}, AuthMode: {{quote .AuthMode}}, AuthEnv: {{quoteSlice .AuthEnv}}, ModelPrefixMatch: {{quote .ModelPrefixMatch}}, IsPlatform: {{.IsPlatform}}{{if .UpstreamVendor}}, UpstreamVendor: {{quote .UpstreamVendor}}{{end}}},
{{- end}}
}
// Runtimes maps each runtime to its native provider+model set, runtime names
// sorted for a deterministic artifact.
var Runtimes = map[string][]GenRuntimeRef{
{{- range .Runtimes}}
{{quote .Name}}: {
{{- range .Providers}}
{Name: {{quote .Name}}, Models: {{quoteSlice .Models}}},
{{- end}}
},
{{- end}}
}
`))
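The determinism argument in the generator (sort the map keys, then hash the structured projection with `%#v`) can be illustrated in isolation. The sketch below is a simplified stand-in, not the generator's code: the `ref` type and this `fingerprint` signature are assumptions, and it sorts the keys inside the hash function, whereas the generator hashes a slice that was already built in sorted order.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// ref is a hypothetical stand-in for the generator's per-runtime provider ref.
type ref struct {
	Name   string
	Models []string
}

// fingerprint hashes the schema version plus each runtime's refs in sorted
// key order, so Go's randomized map iteration cannot change the result.
func fingerprint(schema int, runtimes map[string][]ref) string {
	names := make([]string, 0, len(runtimes))
	for name := range runtimes {
		names = append(names, name)
	}
	sort.Strings(names)

	h := sha256.New()
	fmt.Fprintf(h, "schema=%d\n", schema)
	for _, name := range names {
		fmt.Fprintf(h, "%s=%#v\n", name, runtimes[name])
	}
	return hex.EncodeToString(h.Sum(nil))[:16]
}

func main() {
	a := map[string][]ref{
		"hermes":      {{Name: "minimax", Models: []string{"m1"}}},
		"claude-code": {{Name: "anthropic", Models: []string{"c1"}}},
	}
	// Same data; only the hash input order could differ without the sort.
	fmt.Println(fingerprint(1, a) == fingerprint(1, a))
}
```

Because the hash is taken over parsed data and not the YAML bytes, a comment-only edit to the source file leaves the value unchanged, while any change to a name or model list alters it.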
@@ -0,0 +1,121 @@
package main
import (
"bytes"
"os"
"path/filepath"
"testing"
)
// repoRoot walks up from the test's working dir (cmd/gen-providers) to the
// module root so the test can locate the checked-in artifact regardless of
// where `go test` is invoked from.
func repoRoot(t *testing.T) string {
t.Helper()
dir, err := os.Getwd()
if err != nil {
t.Fatalf("getwd: %v", err)
}
for i := 0; i < 6; i++ {
if _, err := os.Stat(filepath.Join(dir, "go.mod")); err == nil {
return dir
}
dir = filepath.Dir(dir)
}
t.Fatal("could not locate repo root (go.mod) from cmd/gen-providers")
return ""
}
// TestArtifactInSync is the drift gate's Go-test counterpart: the checked-in
// internal/providers/gen/registry_gen.go MUST byte-equal a fresh render. If a
// future edit changes providers.yaml without regenerating, OR hand-edits the
// artifact, this flips red — the same signal the verify-providers-gen CI
// workflow emits, but caught locally by `go test ./...` too.
func TestArtifactInSync(t *testing.T) {
generated, err := render()
if err != nil {
t.Fatalf("render() error = %v", err)
}
artifactPath := filepath.Join(repoRoot(t), defaultOutPath)
onDisk, err := os.ReadFile(artifactPath)
if err != nil {
t.Fatalf("read checked-in artifact %s: %v (run `go generate ./...` and commit)", artifactPath, err)
}
if !bytes.Equal(onDisk, generated) {
t.Fatalf("DRIFT: %s is out of sync with providers.yaml.\n"+
"Run `go generate ./...` (or `go run ./cmd/gen-providers`) and commit the result.", defaultOutPath)
}
}
// TestDriftGateCatchesMutation is the load-bearing-gate proof (per the SOP
// fail-direction discipline). The original P0 version was TAUTOLOGICAL
// (internal#718 P1 review carry-over): it appended bytes to an in-memory copy
// and asserted the copy differed from the original — true by construction,
// touching neither the on-disk artifact nor the actual in-sync comparison the
// gate runs. This version exercises the REAL gate: it writes a MUTATED artifact
// to disk and re-runs the SAME comparison TestArtifactInSync / `-check` perform
// (`render()` bytes vs the on-disk file), asserting it now reports drift — then
// restores the original. So the test would fail if the gate were vacuous (e.g.
// if the comparison ignored content), not merely if append changes bytes.
func TestDriftGateCatchesMutation(t *testing.T) {
generated, err := render()
if err != nil {
t.Fatalf("render() error = %v", err)
}
artifactPath := filepath.Join(repoRoot(t), defaultOutPath)
original, err := os.ReadFile(artifactPath)
if err != nil {
t.Fatalf("read checked-in artifact %s: %v", artifactPath, err)
}
// Precondition: the tree is in sync (so the mutation is what flips the gate,
// not pre-existing drift).
if !bytes.Equal(original, generated) {
t.Fatalf("precondition failed: %s already drifted from render() — run `go generate ./...`", defaultOutPath)
}
// Restore the pristine artifact no matter how the test exits.
t.Cleanup(func() {
if err := os.WriteFile(artifactPath, original, 0o644); err != nil {
t.Fatalf("CRITICAL: failed to restore %s after mutation: %v", artifactPath, err)
}
})
// Mutate the ON-DISK artifact (simulating a hand-edit / a providers.yaml
// change that wasn't regenerated).
mutated := append(append([]byte(nil), original...), []byte("\n// injected drift\n")...)
if err := os.WriteFile(artifactPath, mutated, 0o644); err != nil {
t.Fatalf("write mutated artifact: %v", err)
}
// Re-run the EXACT in-sync comparison the gate uses: fresh render vs the
// (now mutated) on-disk file. It MUST report drift.
onDiskAfter, err := os.ReadFile(artifactPath)
if err != nil {
t.Fatalf("re-read mutated artifact: %v", err)
}
freshRender, err := render()
if err != nil {
t.Fatalf("render() after mutation error = %v", err)
}
if bytes.Equal(onDiskAfter, freshRender) {
t.Fatal("drift gate did NOT detect a mutated on-disk artifact — gate is not load-bearing")
}
}
// TestRenderDeterministic proves regeneration is idempotent: two renders of
// the same manifest produce byte-identical output (sorted runtime keys, stable
// catalog order). A non-deterministic generator would make the drift gate
// flap on Go map iteration order.
func TestRenderDeterministic(t *testing.T) {
a, err := render()
if err != nil {
t.Fatalf("render() #1 error = %v", err)
}
b, err := render()
if err != nil {
t.Fatalf("render() #2 error = %v", err)
}
if !bytes.Equal(a, b) {
t.Fatal("render() is non-deterministic — two runs differ; the drift gate would flap")
}
}
@@ -1,464 +0,0 @@
package handlers
// derive_provider_drift_test.go — behavior-based AST/text drift gate.
//
// Why this exists: PR #2535 introduced a Go port of derive-provider.sh
// (see deriveProviderFromModelSlug in workspace_provision.go) so the
// workspace-server can persist LLM_PROVIDER into workspace_secrets at
// provision time. That created two sources of truth:
//
// 1. molecule-ai-workspace-template-hermes/scripts/derive-provider.sh —
// runs inside the container at boot, has the final say on which
// provider hermes targets (writes ~/.hermes/config.yaml's
// model.provider field). The shell script lives in a separate
// OSS repo, so we vendor a snapshot at testdata/derive-provider.sh
// to keep this gate hermetic.
// 2. workspace-server/internal/handlers/workspace_provision.go's
// deriveProviderFromModelSlug — runs at provision time on the
// platform side so LLM_PROVIDER lands in workspace_secrets and
// survives Save+Restart.
//
// If a future PR adds a new provider prefix to one but not the other,
// the workspace-server's persisted LLM_PROVIDER silently disagrees
// with what the container's derive-provider.sh produces. The container
// wins (it writes the actual config.yaml), so the workspace-server's
// persisted value becomes stale and misleading without anything
// flipping red in CI.
//
// This gate pins the invariant that the *prefix set* the two functions
// know about is identical, modulo a small hardcoded acceptedDivergences
// map for the two intentional differences documented in
// deriveProviderFromModelSlug's doc comment (nousresearch/* and
// openai/* both fall back to "openrouter" at provision time because
// the runtime env that picks "nous" / "custom" isn't available yet).
//
// Pattern: the "behavior-based AST gate" from PR #2367 / memory
// feedback_behavior_based_ast_gates — pin invariants by what a
// function maps, not by what it's named. Walks the actual Go AST of
// deriveProviderFromModelSlug's switch statement so a rename or a
// duplicate function in another file can't sneak past the gate.
//
// Task: #242. Companion to the table-driven mapping test in
// workspace_provision_shared_test.go (TestDeriveProviderFromModelSlug)
// which pins the *values*; this test pins the *coverage* of the
// prefix set itself.
//
// Hermetic: reads two files (vendored shell script + Go source) from
// paths relative to the test package directory and parses them
// in-process. No network, no docker, no DB. The vendored shell script
// at testdata/derive-provider.sh is a snapshot of the upstream OSS
// template repo's script — refresh it via the cp command in that file's
// header when upstream changes.
import (
"go/ast"
"go/parser"
"go/token"
"os"
"regexp"
"sort"
"strconv"
"strings"
"testing"
)
// acceptedDivergences pins the prefixes where the Go port intentionally
// differs from derive-provider.sh. Each entry's value is the provider
// the Go function returns; the shell would (at runtime, with the right
// env keys present) return something else. Documented in
// deriveProviderFromModelSlug's doc comment in workspace_provision.go.
//
// If a NEW divergence appears, this test fails and the engineer must
// either (a) align the Go function with the shell, or (b) add the
// prefix here with a comment explaining why the divergence is
// intentional and safe at provision time.
var acceptedDivergences = map[string]string{
// Shell: "nous" if HERMES_API_KEY/NOUS_API_KEY set, else "openrouter".
// Go: "openrouter" unconditionally — runtime keys aren't loaded at
// provision time. derive-provider.sh upgrades to "nous" at boot
// when the keys are present.
"nousresearch": "openrouter",
// Shell: "custom" if OPENAI_API_KEY set, "openrouter" if OPENROUTER_API_KEY
// set, else "openrouter" as a no-key fallback.
// Go: "openrouter" unconditionally — same reason as nousresearch/*.
// derive-provider.sh upgrades to "custom" at boot when
// OPENAI_API_KEY is present.
"openai": "openrouter",
}
// TestDeriveProviderDrift_ShellAndGoStayInSync is the drift gate.
// It extracts the prefix→provider mapping from both sources and
// asserts:
//
// 1. Every prefix the shell knows about, the Go function also handles
// (returning either the same provider OR the value pinned in
// acceptedDivergences for that prefix).
// 2. Every prefix the Go function handles (extracted from its switch
// statement via go/ast), the shell case statement also lists.
func TestDeriveProviderDrift_ShellAndGoStayInSync(t *testing.T) {
t.Parallel()
shellMap := loadShellPrefixMap(t)
goMap := loadGoPrefixMap(t)
if len(shellMap) == 0 {
t.Fatalf("parsed zero prefixes from derive-provider.sh — regex likely broke; rebuild parser before trusting this gate")
}
if len(goMap) == 0 {
t.Fatalf("parsed zero prefixes from deriveProviderFromModelSlug — AST walk likely broke; rebuild parser before trusting this gate")
}
// Direction 1: every shell prefix must be in the Go map (with the
// same provider value, or with the documented divergence).
for prefix, shellProvider := range shellMap {
goProvider, ok := goMap[prefix]
if !ok {
t.Errorf(
"DRIFT: derive-provider.sh has prefix %q -> %q but deriveProviderFromModelSlug doesn't handle it.\n"+
"Fix: either add a case for %q to deriveProviderFromModelSlug in "+
"workspace-server/internal/handlers/workspace_provision.go (returning %q to match the shell), "+
"OR if this prefix is intentionally provision-time-divergent, add it to acceptedDivergences{} "+
"in this test with a comment explaining why.",
prefix, shellProvider, prefix, shellProvider,
)
continue
}
if goProvider == shellProvider {
continue
}
// Mismatch — only acceptable if it's on the explicit divergence list
// AND the Go side returns exactly the documented value.
expected, divergenceAllowed := acceptedDivergences[prefix]
if !divergenceAllowed {
t.Errorf(
"DRIFT: prefix %q maps to %q in derive-provider.sh but %q in deriveProviderFromModelSlug.\n"+
"Fix: align the Go function with the shell (preferred — they should agree), "+
"OR if the divergence is intentional and safe at provision time, "+
"add %q: %q to acceptedDivergences{} in this test with a comment explaining why.",
prefix, shellProvider, goProvider, prefix, goProvider,
)
continue
}
if goProvider != expected {
t.Errorf(
"DRIFT: prefix %q is on the acceptedDivergences list with expected Go value %q but "+
"deriveProviderFromModelSlug now returns %q.\n"+
"Fix: update acceptedDivergences[%q] in this test to %q (and update its comment), "+
"OR revert the Go function to return %q.",
prefix, expected, goProvider, prefix, goProvider, expected,
)
}
}
// Direction 2: every Go prefix must be in the shell map. Drift in
// this direction is rarer (someone added a Go case without touching
// the shell) but produces the same broken state — provision-time
// LLM_PROVIDER disagrees with what the container actually uses.
for prefix, goProvider := range goMap {
if _, ok := shellMap[prefix]; ok {
continue
}
t.Errorf(
"DRIFT: deriveProviderFromModelSlug handles prefix %q -> %q but derive-provider.sh doesn't list it.\n"+
"Fix: add a `%s/*) PROVIDER=%q ;;` case to "+
"workspace-configs-templates/hermes/scripts/derive-provider.sh — the Go provision-time hint "+
"is meaningless if the container's runtime script doesn't recognize the same prefix.",
prefix, goProvider, prefix, goProvider,
)
}
// Belt-and-braces: every entry in acceptedDivergences must actually
// appear in BOTH maps. A stale divergence entry (prefix removed from
// either source) silently weakens the gate.
for prefix := range acceptedDivergences {
if _, ok := shellMap[prefix]; !ok {
t.Errorf(
"acceptedDivergences contains prefix %q but derive-provider.sh no longer lists it. "+
"Remove the entry from acceptedDivergences{} in this test.",
prefix,
)
}
if _, ok := goMap[prefix]; !ok {
t.Errorf(
"acceptedDivergences contains prefix %q but deriveProviderFromModelSlug no longer lists it. "+
"Remove the entry from acceptedDivergences{} in this test.",
prefix,
)
}
}
}
// vendoredShellPath is the testdata snapshot of upstream
// derive-provider.sh. The path is relative to the test package
// directory (which is what `go test` sets as cwd). See the file's
// header for the refresh procedure when upstream changes.
const vendoredShellPath = "testdata/derive-provider.sh"
// goSourcePath is the file containing deriveProviderFromModelSlug.
// Relative to the test package directory.
const goSourcePath = "workspace_provision.go"
// loadShellPrefixMap parses derive-provider.sh and returns a
// map[prefix]provider for every case clause. Aliases inside a single
// `pat1/*|pat2/*)` clause expand to one map entry per alias, both
// pointing at the same provider.
//
// Stops at the first `*)` (the catch-all) and ignores it — the
// catch-all maps to PROVIDER="auto" which has no Go counterpart by
// design (deriveProviderFromModelSlug returns "" for unknowns and
// lets the shell's *=auto branch decide at runtime).
//
// Ambiguity: case clauses whose body branches on env vars (openai/*,
// nousresearch/*) are still extracted as the FIRST PROVIDER= literal
// inside the body. The shell's full conditional logic is documented
// via the acceptedDivergences map in this file rather than re-encoded
// in the parser, because re-encoding sh `if` semantics in regex is a
// fool's errand — the divergences are stable and small enough to
// hardcode.
func loadShellPrefixMap(t *testing.T) map[string]string {
t.Helper()
raw, err := os.ReadFile(vendoredShellPath)
if err != nil {
t.Fatalf("read %s: %v (refresh from upstream — see file header)", vendoredShellPath, err)
}
// Locate the case statement body so we don't accidentally match
// PROVIDER= assignments above the case (the HERMES_INFERENCE_PROVIDER
// override + the empty-model fallback both write PROVIDER= before
// the case). Upstream renamed the case variable to ${_HERMES_MODEL}
// in v0.12.0 (the resolved value of HERMES_INFERENCE_MODEL with a
// HERMES_DEFAULT_MODEL legacy fallback); accept either spelling so
// this test survives a future rename.
caseStart := regexp.MustCompile(`(?m)^case\s+"\$\{(_?HERMES(?:_DEFAULT|_INFERENCE)?_MODEL)\}"\s+in\s*$`)
startLoc := caseStart.FindIndex(raw)
if startLoc == nil {
t.Fatalf("could not locate `case \"${...HERMES...MODEL}\" in` in %s — shell file shape changed; rebuild parser", vendoredShellPath)
}
caseEnd := regexp.MustCompile(`(?m)^esac\s*$`)
endLoc := caseEnd.FindIndex(raw[startLoc[1]:])
if endLoc == nil {
t.Fatalf("could not locate `esac` after the case statement in %s — shell file shape changed", vendoredShellPath)
}
body := string(raw[startLoc[1] : startLoc[1]+endLoc[0]])
out := map[string]string{}
// Pattern A: single-line clauses like
// minimax-cn/*) PROVIDER="minimax-cn" ;;
// alibaba/*|dashscope/*|qwen/*) PROVIDER="alibaba" ;;
// Capture group 1 is the patterns (e.g. `minimax-cn/*` or
// `alibaba/*|dashscope/*|qwen/*`); group 2 is the provider literal.
singleLine := regexp.MustCompile(`(?m)^\s*([a-zA-Z0-9_./*|\-]+)\)\s*PROVIDER="([^"]+)"\s*;;`)
// Pattern B: multi-line clauses like
// openai/*)
// if [ -n "${OPENAI_API_KEY:-}" ]; then
// PROVIDER="custom"
// ...
// We capture the patterns and the FIRST PROVIDER= that follows
// (before the next `;;`). The acceptedDivergences map handles the
// fact that the runtime branching can pick a different value.
multiLine := regexp.MustCompile(`(?ms)^\s*([a-zA-Z0-9_./*|\-]+)\)\s*\n(.*?);;`)
addEntry := func(patterns, provider string) {
// Skip the `*)` catch-all — it has no Go counterpart by design.
if strings.TrimSpace(patterns) == "*" {
return
}
for _, alt := range strings.Split(patterns, "|") {
alt = strings.TrimSpace(alt)
// Each alternative is `<prefix>/*` — strip the trailing `/*`.
alt = strings.TrimSuffix(alt, "/*")
if alt == "" {
continue
}
// First write wins — a single-line match outranks a multi-line
// fallback for the same patterns block (defensive; the regexes
// shouldn't overlap on the same line in practice).
if _, exists := out[alt]; !exists {
out[alt] = provider
}
}
}
// Run single-line first so it claims its lines before the multi-line
// pass sees them.
consumed := map[int]bool{}
for _, m := range singleLine.FindAllStringSubmatchIndex(body, -1) {
addEntry(body[m[2]:m[3]], body[m[4]:m[5]])
// Mark every byte offset covered by the match so the multi-line pass can skip it.
for i := m[0]; i < m[1]; i++ {
consumed[i] = true
}
}
// Compile once, outside the loop, rather than on every iteration.
providerLit := regexp.MustCompile(`PROVIDER="([^"]+)"`)
for _, m := range multiLine.FindAllStringSubmatchIndex(body, -1) {
// Skip if the start of this match overlaps a single-line clause.
if consumed[m[0]] {
continue
}
patterns := body[m[2]:m[3]]
clauseBody := body[m[4]:m[5]]
// Extract the FIRST PROVIDER="..." from the clause body.
firstProvider := providerLit.FindStringSubmatch(clauseBody)
if firstProvider == nil {
t.Errorf("multi-line case clause for %q has no PROVIDER= literal — shell file shape changed; rebuild parser", patterns)
continue
}
addEntry(patterns, firstProvider[1])
}
return out
}
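The single-line pass above can be sketched in isolation. This is a minimal, standalone illustration under stated assumptions: `parseSingleLineClauses` and the sample case body are hypothetical, and the sketch covers only alias expansion and the catch-all skip, not the multi-line pass or the `consumed` bookkeeping.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// clauseRe mirrors the single-line pattern used by the test:
// group 1 is the pattern list, group 2 is the provider literal.
var clauseRe = regexp.MustCompile(`(?m)^\s*([a-zA-Z0-9_./*|\-]+)\)\s*PROVIDER="([^"]+)"\s*;;`)

// parseSingleLineClauses (hypothetical helper) expands every alias in a
// `pat1/*|pat2/*)` clause into its own map entry and skips the `*)` catch-all.
func parseSingleLineClauses(body string) map[string]string {
	out := map[string]string{}
	for _, m := range clauseRe.FindAllStringSubmatch(body, -1) {
		for _, alt := range strings.Split(m[1], "|") {
			prefix := strings.TrimSuffix(strings.TrimSpace(alt), "/*")
			if prefix == "" || prefix == "*" {
				continue
			}
			// First write wins, as in the test's addEntry.
			if _, exists := out[prefix]; !exists {
				out[prefix] = m[2]
			}
		}
	}
	return out
}

func main() {
	// Assumed sample input, shaped like the clauses quoted in the comments.
	body := "  minimax-cn/*) PROVIDER=\"minimax-cn\" ;;\n" +
		"  alibaba/*|dashscope/*|qwen/*) PROVIDER=\"alibaba\" ;;\n" +
		"  *) PROVIDER=\"auto\" ;;\n"
	m := parseSingleLineClauses(body)
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s -> %s\n", k, m[k])
	}
}
```

Keeping the catch-all out of the map matches the stated design: the shell's `auto` branch has no Go counterpart, so comparing it would only produce noise.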
// loadGoPrefixMap parses workspace_provision.go and walks the AST to
// extract the prefix→provider mapping from deriveProviderFromModelSlug's
// switch statement.
//
// Each case clause's string-literal labels become map keys, all
// pointing at the provider returned by that case body's `return "..."`
// statement. A clause like `case "alibaba", "dashscope", "qwen":
// return "alibaba"` produces three map entries.
//
// Skips the default clause (returns ""). Skips any case clause whose
// body's first statement isn't a single `return STRING_LITERAL` — those
// would need their own divergence handling and don't currently exist
// in the function.
func loadGoPrefixMap(t *testing.T) map[string]string {
t.Helper()
fset := token.NewFileSet()
file, err := parser.ParseFile(fset, goSourcePath, nil, parser.ParseComments)
if err != nil {
t.Fatalf("parse %s: %v", goSourcePath, err)
}
var fn *ast.FuncDecl
for _, decl := range file.Decls {
f, ok := decl.(*ast.FuncDecl)
if !ok {
continue
}
if f.Name.Name == "deriveProviderFromModelSlug" {
fn = f
break
}
}
if fn == nil {
t.Fatalf("could not find deriveProviderFromModelSlug in %s — function renamed/removed; this gate's invariant has been violated", goSourcePath)
}
// Walk the function body for the SwitchStmt.
var sw *ast.SwitchStmt
ast.Inspect(fn.Body, func(n ast.Node) bool {
if s, ok := n.(*ast.SwitchStmt); ok {
sw = s
return false
}
return true
})
if sw == nil {
t.Fatalf("no switch statement found in deriveProviderFromModelSlug — function shape changed; rebuild parser")
}
out := map[string]string{}
for _, stmt := range sw.Body.List {
clause, ok := stmt.(*ast.CaseClause)
if !ok {
continue
}
// Default clause has no list — skip.
if len(clause.List) == 0 {
continue
}
// Find the first return statement in the clause body.
var ret *ast.ReturnStmt
for _, bodyStmt := range clause.Body {
if r, ok := bodyStmt.(*ast.ReturnStmt); ok {
ret = r
break
}
}
if ret == nil || len(ret.Results) != 1 {
t.Errorf("case clause at %s has no single-value return — function shape changed; gate may be incomplete",
fset.Position(clause.Pos()))
continue
}
lit, ok := ret.Results[0].(*ast.BasicLit)
if !ok || lit.Kind != token.STRING {
t.Errorf("case clause at %s returns a non-literal — gate cannot extract provider value",
fset.Position(clause.Pos()))
continue
}
provider, err := strconv.Unquote(lit.Value)
if err != nil {
t.Errorf("case clause at %s has unparseable string literal %q: %v",
fset.Position(clause.Pos()), lit.Value, err)
continue
}
for _, expr := range clause.List {
lbl, ok := expr.(*ast.BasicLit)
if !ok || lbl.Kind != token.STRING {
t.Errorf("case clause at %s has a non-string-literal label — gate cannot extract prefix",
fset.Position(clause.Pos()))
continue
}
prefix, err := strconv.Unquote(lbl.Value)
if err != nil {
t.Errorf("case clause at %s has unparseable label literal %q: %v",
fset.Position(clause.Pos()), lbl.Value, err)
continue
}
out[prefix] = provider
}
}
return out
}
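The AST extraction above can be tried against an in-memory source string. This is a rough sketch, not the test's implementation: `extractSwitchMap`, `sampleSrc`, and the `derive` function inside it are hypothetical, and unlike the real helper it silently skips clauses it cannot read instead of reporting them through `t.Errorf`.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// sampleSrc is an assumed stand-in for workspace_provision.go.
const sampleSrc = `package sample

func derive(prefix string) string {
	switch prefix {
	case "alibaba", "dashscope", "qwen":
		return "alibaba"
	case "anthropic":
		return "anthropic"
	default:
		return ""
	}
}
`

// extractSwitchMap (hypothetical) maps every string-literal case label in
// funcName to the string literal returned by that clause's first statement.
func extractSwitchMap(src, funcName string) (map[string]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", src, 0)
	if err != nil {
		return nil, err
	}
	out := map[string]string{}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Name.Name != funcName || fn.Body == nil {
			continue
		}
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			clause, ok := n.(*ast.CaseClause)
			// An empty List is the default clause; skip it.
			if !ok || len(clause.List) == 0 || len(clause.Body) == 0 {
				return true
			}
			ret, ok := clause.Body[0].(*ast.ReturnStmt)
			if !ok || len(ret.Results) != 1 {
				return true
			}
			lit, ok := ret.Results[0].(*ast.BasicLit)
			if !ok || lit.Kind != token.STRING {
				return true
			}
			provider, err := strconv.Unquote(lit.Value)
			if err != nil {
				return true
			}
			for _, expr := range clause.List {
				lbl, ok := expr.(*ast.BasicLit)
				if !ok || lbl.Kind != token.STRING {
					continue
				}
				if prefix, err := strconv.Unquote(lbl.Value); err == nil {
					out[prefix] = provider
				}
			}
			return true
		})
	}
	return out, nil
}

func main() {
	m, err := extractSwitchMap(sampleSrc, "derive")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(len(m), m["qwen"], m["anthropic"])
}
```

Walking the AST rather than matching text means formatting changes in the Go source cannot break the extraction, which is why only the shell side needs a regex parser.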
// TestDeriveProviderDrift_ShellParserIsSane is a guard test: the shell
// parser is regex-based, so we sanity-check that it actually finds the
// well-known prefixes documented in derive-provider.sh's header
// comment. If this test fails while the main drift test also reports
// missing prefixes, the bug is almost certainly in the regex (not in
// the production code).
func TestDeriveProviderDrift_ShellParserIsSane(t *testing.T) {
t.Parallel()
shellMap := loadShellPrefixMap(t)
// Anchor prefixes — these have lived in derive-provider.sh since it
// was first introduced. If the parser can't find them, it's broken.
mustHave := map[string]string{
"anthropic": "anthropic",
"minimax": "minimax",
"minimax-cn": "minimax-cn",
"openrouter": "openrouter",
"custom": "custom",
"alibaba": "alibaba", // in an alias group with dashscope/qwen
"dashscope": "alibaba", // ditto
"qwen": "alibaba", // ditto
"openai": "custom", // multi-line; first PROVIDER= is "custom"
"nousresearch": "nous", // multi-line; first PROVIDER= is "nous"
}
missing := []string{}
wrong := []string{}
for prefix, want := range mustHave {
got, ok := shellMap[prefix]
if !ok {
missing = append(missing, prefix)
continue
}
if got != want {
wrong = append(wrong, prefix+" got="+got+" want="+want)
}
}
sort.Strings(missing)
sort.Strings(wrong)
if len(missing) > 0 {
t.Errorf("shell parser failed to extract anchor prefixes: %v", missing)
}
if len(wrong) > 0 {
t.Errorf("shell parser extracted wrong values for anchor prefixes: %v", wrong)
}
}
@@ -255,22 +255,20 @@ func TestExtended_SecretsListEmpty(t *testing.T) {
// ---------- TestSecretsSet (Extended) ----------
func TestExtended_SecretsSet(t *testing.T) {
// internal#691: the per-workspace strip gate now defaults to platform_managed
// on empty MOLECULE_LLM_BILLING_MODE (closed default). This test's intent is
// the happy path of persisting a vendor key, so put the org into byok which
// matches the pre-#691 implicit behavior of an unset env.
t.Setenv("MOLECULE_LLM_BILLING_MODE", "byok")
// internal#718 P2-B: the per-workspace strip gate keys off the DERIVED mode
// (org rung retired). This test's intent is the happy path of persisting a
// vendor key on a byok workspace; the realistic way a workspace is byok for
// a direct vendor-key write is an explicit operator override (the escape
// hatch the reject error itself points to: PUT /admin/.../llm-billing-mode).
// The override short-circuits the resolver to byok in a single read, so the
// bypass-list check is skipped and the write proceeds.
t.Setenv("MOLECULE_LLM_BILLING_MODE", "platform_managed") // org env ignored now
mock := setupTestDB(t)
handler := NewSecretsHandler(nil)
// internal#691: secrets.Set now consults ResolveLLMBillingMode before the
// strip gate. Mock returns no row → resolver falls through to the org
// default (byok, set via t.Setenv above) → bypass-list check is skipped
// and the write proceeds. This pattern is the test-side mirror of the
// real-prod fall-through behavior for a fresh workspace with no override.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs("22222222-2222-2222-2222-222222222222").
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}))
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
// Expect INSERT (encrypted value is dynamic, use AnyArg)
mock.ExpectExec("INSERT INTO workspace_secrets").
@@ -43,10 +43,36 @@ import (
"database/sql"
"errors"
"fmt"
"log"
"sync"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/crypto"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/db"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/providers"
)
// providerRegistryManifest is the parsed provider registry, loaded once. The
// registry is embedded (go:embed, no network) and immutable for the process
// lifetime, so a single Load is safe to memoize. A load failure is cached too
// (providerRegistryErr): it can only happen on a malformed embedded YAML, which
// is a build-time defect the verify-providers-gen + sync gates already catch,
// so failing closed (treat as "cannot derive" → platform default) is correct
// and we don't retry.
var (
providerRegistryOnce sync.Once
providerRegistryManifest *providers.Manifest
providerRegistryErr error
)
func providerRegistry() (*providers.Manifest, error) {
providerRegistryOnce.Do(func() {
providerRegistryManifest, providerRegistryErr = providers.LoadManifest()
if providerRegistryErr != nil {
log.Printf("llm_billing_mode: FATAL — provider registry failed to load: %v (billing will default-closed to platform_managed)", providerRegistryErr)
}
})
return providerRegistryManifest, providerRegistryErr
}
// Constants mirror molecule-controlplane/internal/credits/llm_billing.go.
// Kept as string literals (not imports) because workspace-server has no
// build-time dependency on the CP module; the values are stable wire
@@ -67,6 +93,19 @@ const (
BillingModeSourceWorkspaceOverride BillingModeSource = "workspace_override"
BillingModeSourceOrgDefault BillingModeSource = "org_default"
BillingModeSourceConstantFallback BillingModeSource = "constant_fallback"
// BillingModeSourceDerivedProvider means the mode was DERIVED from the
// workspace's (runtime, model) via the provider registry — the SSOT
// (internal#718 P2-B). IsPlatform(derived) → platform_managed, else byok.
// This is the highest-precedence source after an explicit operator override
// and SUPERSEDES the prior stored-LLM_PROVIDER read (#1966).
BillingModeSourceDerivedProvider BillingModeSource = "derived_provider"
// BillingModeSourceDerivedDefault means the registry could not derive a
// provider for the (runtime, model) — no model, unknown runtime,
// unregistered/ambiguous model — so the mode defaulted closed to
// platform_managed (CTO-confirmed "unset → platform default"). Distinct from
// derived_provider so operators can see "we defaulted" vs "we derived
// platform".
BillingModeSourceDerivedDefault BillingModeSource = "derived_default"
)
// BillingModeResolution is the structured answer the admin GET route returns
@@ -74,11 +113,18 @@ const (
// shape, so the resolver test asserts both the mode AND the source per case
// (catches a bug where the right mode is returned via the wrong layer).
type BillingModeResolution struct {
WorkspaceID string `json:"workspace_id"`
ResolvedMode string `json:"resolved_mode"`
WorkspaceOverride *string `json:"workspace_override"` // nil = inherit
OrgDefault string `json:"org_default"` // already default-closed by CP
Source BillingModeSource `json:"source"`
WorkspaceID string `json:"workspace_id"`
ResolvedMode string `json:"resolved_mode"`
WorkspaceOverride *string `json:"workspace_override"` // nil = inherit
OrgDefault string `json:"org_default"` // RETIRED as a billing source (internal#718 P2-B); always platform_managed, kept for wire-compat
Source BillingModeSource `json:"source"`
// ProviderSelection surfaces the DERIVED provider name (internal#718 P2-B)
// when the mode came from the registry derivation — the literal provider the
// (runtime, model) resolved to (e.g. "platform", "kimi-coding", "openai"), or
// the raw model id when derivation failed. nil when an explicit operator
// override or the empty-id default decided. Lets the admin route answer "why
// is this workspace byok?" with the derived provider, not a stored value.
ProviderSelection *string `json:"provider_selection"`
}
// isKnownBillingMode is the enum-recognizer for the resolver's default-closed
@@ -95,24 +141,137 @@ func isKnownBillingMode(s string) bool {
}
}
// normalizeOrgDefault applies the same default-closed contract to the
// org-level input as the workspace override gets. The org_default arrives
// from tenant_config which already COALESCEs NULL → platform_managed at the
// CP SQL layer, but we DO NOT trust that contract here — if CP regresses or
// the tenant_config env wasn't populated (race on boot), we still default-
// close. Same principle: never honor a garbled value.
func normalizeOrgDefault(orgMode string) string {
if isKnownBillingMode(orgMode) {
return orgMode
// readWorkspaceBillingOverride reads the OPTIONAL explicit operator override
// (workspaces.llm_billing_mode). Returns:
//
// (mode, true, nil) — a recognized override is set → operator pinned the mode
// ("", false, nil) — NULL / garbled / row-missing → no explicit override
// ("", false, err) — DB error → caller defaults closed + propagates
//
// internal#718 P2-B retires the org rung; this column is the ONLY stored
// billing signal that survives, and ONLY as an explicit override on top of the
// derived provider (CTO 2026-05-27).
func readWorkspaceBillingOverride(ctx context.Context, workspaceID string) (string, bool, error) {
var wsOverride sql.NullString
err := db.DB.QueryRowContext(ctx,
`SELECT llm_billing_mode FROM workspaces WHERE id = $1`,
workspaceID,
).Scan(&wsOverride)
switch {
case errors.Is(err, sql.ErrNoRows):
return "", false, nil
case err != nil:
return "", false, fmt.Errorf("resolve workspace llm_billing_mode override for %s: %w", workspaceID, err)
}
return LLMBillingModePlatformManaged
if wsOverride.Valid && isKnownBillingMode(wsOverride.String) {
return wsOverride.String, true, nil
}
return "", false, nil
}
// ResolveLLMBillingMode is the canonical resolver. Every code path that
// previously gated on `os.Getenv("MOLECULE_LLM_BILLING_MODE") == "platform_managed"`
// must call this instead and gate on the returned mode. The architectural
// test (resolver_ast_test.go) asserts there is no remaining call site of
// the old shape outside the resolver-input wiring.
// ResolveLLMBillingModeDerived is the SSOT billing-mode resolver (internal#718
// P2-B). It DERIVES the provider from (runtime, model) via the provider
// registry and decides platform-vs-byok from IsPlatform(derived) — it does NOT
// read a stored LLM_PROVIDER (superseding #1966's stored-read approach) and
// does NOT read the org rung (retired, CTO 2026-05-27).
//
// Precedence (highest first):
//
// 1. EXPLICIT operator override (workspaces.llm_billing_mode, a recognized
// value). The only stored billing signal that survives — an escape hatch,
// not the primary signal.
// 2. DERIVE: providers.DeriveProvider(runtime, model, availableAuthEnv).
// - resolves to the closed `platform` provider → platform_managed
// - resolves to any other (BYOK/third-party) provider → byok ← THE FIX
// 3. DEFAULT-CLOSED: derive fails (no model, unknown runtime, unregistered or
// ambiguous model) → platform_managed (CTO "unset → platform default"). A
// derive failure NEVER silently flips a workspace to byok (which would
// strip the platform creds it may legitimately need).
//
// availableAuthEnv is the set of auth-env-var NAMES present for the workspace
// (never secret values) — the same disambiguation input DeriveProvider uses to
// split anthropic-oauth from anthropic-api. May be nil.
//
// A returned error never prevents a decision: ResolvedMode is always a valid
// enum value (default-closed). The error is informational (log + surface).
func ResolveLLMBillingModeDerived(ctx context.Context, workspaceID, runtime, model string, availableAuthEnv []string) (BillingModeResolution, error) {
res := BillingModeResolution{
WorkspaceID: workspaceID,
// OrgDefault is retired as a billing source (internal#718 P2-B). Kept on
// the struct for wire-compat (admin route / CP mirror) but always the
// closed constant — never consulted in the decision.
OrgDefault: LLMBillingModePlatformManaged,
}
// Pre-provision context (no workspace row yet): no override to read, default
// closed. (DeriveProvider could still run from the passed runtime/model, but
// the no-id path historically does no DB work and the strip gate only runs
// post-create, so keep it a pure default to preserve that contract.)
if workspaceID == "" {
res.ResolvedMode = LLMBillingModePlatformManaged
res.Source = BillingModeSourceDerivedDefault
return res, nil
}
// Precedence 1: explicit operator override.
if mode, ok, err := readWorkspaceBillingOverride(ctx, workspaceID); err != nil {
// DB error — default closed AND propagate (never flip on a transient error).
res.ResolvedMode = LLMBillingModePlatformManaged
res.Source = BillingModeSourceConstantFallback
return res, err
} else if ok {
m := mode
res.WorkspaceOverride = &m
res.ResolvedMode = mode
res.Source = BillingModeSourceWorkspaceOverride
return res, nil
}
// Precedence 2: DERIVE the provider from (runtime, model).
manifest, mErr := providerRegistry()
if mErr != nil || manifest == nil {
// Registry unavailable (malformed embedded YAML — a build-time defect the
// gates catch). Default closed.
res.ResolvedMode = LLMBillingModePlatformManaged
res.Source = BillingModeSourceDerivedDefault
return res, mErr
}
provider, dErr := manifest.DeriveProvider(runtime, model, availableAuthEnv)
if dErr != nil {
// No model / unknown runtime / unregistered / ambiguous → default closed.
// NOT an error to the caller: an unregistered model is a legitimate
// "we can't say it's BYOK, so bill the platform default" outcome, and the
// only-registered gate at the create/config API is where an unregistered
// model is rejected loudly. Here we just fail closed for safety.
res.ResolvedMode = LLMBillingModePlatformManaged
res.Source = BillingModeSourceDerivedDefault
sel := model
if sel != "" {
res.ProviderSelection = &sel
}
return res, nil
}
derivedName := provider.Name
res.ProviderSelection = &derivedName
res.Source = BillingModeSourceDerivedProvider
if provider.IsPlatform() {
res.ResolvedMode = LLMBillingModePlatformManaged
} else {
// A specific (non-platform) vendor was derived → bring-your-own-key.
res.ResolvedMode = LLMBillingModeBYOK
}
return res, nil
}
// ResolveLLMBillingMode is the legacy-signature resolver retained for callers
// that do not have (runtime, model) in hand (the admin GET/PUT route and the
// secrets remote-pull path). It reads the workspace's stored runtime + model +
// available auth env from the DB and delegates to the DERIVED resolver
// (internal#718 P2-B) — the orgMode parameter is RETIRED (the org rung is no
// longer a billing source) and is ignored; it stays in the signature only to
// avoid churning the two callers in this PR. The architectural test asserts no
// remaining code path gates on os.Getenv("MOLECULE_LLM_BILLING_MODE") for the
// strip decision (that env is no longer read into the decision at all).
//
// Returning an error does NOT prevent the caller from making a decision —
// the returned mode is always a valid enum value (default-closed to
@@ -120,75 +279,160 @@ func normalizeOrgDefault(orgMode string) string {
// branch. The error is informational: log it, surface it to operators, but
// the strip-gate decision is already safe.
func ResolveLLMBillingMode(ctx context.Context, workspaceID, orgMode string) (BillingModeResolution, error) {
res := BillingModeResolution{
WorkspaceID: workspaceID,
OrgDefault: normalizeOrgDefault(orgMode),
}
_ = orgMode // org rung retired (internal#718 P2-B); parameter ignored.
if workspaceID == "" {
// No workspace ID = pre-provision context (templating, validation).
// Resolve against the org default only, no DB read.
res.ResolvedMode = res.OrgDefault
res.Source = BillingModeSourceOrgDefault
if !isKnownBillingMode(orgMode) {
// Org default was garbled/NULL and we clamped to platform_managed.
// Mark the source as constant_fallback so the operator can see
// the clamp happened, not that the org "really" said platform_managed.
res.Source = BillingModeSourceConstantFallback
}
return res, nil
// Pre-provision context (templating, validation): default closed, no DB.
return ResolveLLMBillingModeDerived(ctx, "", "", "", nil)
}
var wsOverride sql.NullString
err := db.DB.QueryRowContext(ctx,
`SELECT llm_billing_mode FROM workspaces WHERE id = $1`,
// Precedence 1: explicit operator override. Read it FIRST so an overridden
// workspace short-circuits without the extra runtime/secrets reads (and so
// the query order is override → runtime → secrets, matching the derived
// resolver's own override-first precedence).
if mode, ok, err := readWorkspaceBillingOverride(ctx, workspaceID); err != nil {
return BillingModeResolution{
WorkspaceID: workspaceID,
OrgDefault: LLMBillingModePlatformManaged,
ResolvedMode: LLMBillingModePlatformManaged,
Source: BillingModeSourceConstantFallback,
}, err
} else if ok {
m := mode
return BillingModeResolution{
WorkspaceID: workspaceID,
OrgDefault: LLMBillingModePlatformManaged,
ResolvedMode: mode,
WorkspaceOverride: &m,
Source: BillingModeSourceWorkspaceOverride,
}, nil
}
// Precedence 2: DERIVE. Read the stored (runtime, model, available-auth-env)
// so the derived resolver can DeriveProvider for callers that don't carry
// them (admin route, secrets remote-pull). A read miss/error degrades
// gracefully: pass the empty/partial inputs through — DeriveProvider then
// errors and the derived resolver defaults closed to platform_managed.
//
// ResolveLLMBillingModeDerived re-reads the override (NULL again here) before
// deriving; that one extra cheap read keeps the derived resolver a complete,
// independently-callable SSOT rather than splitting its precedence across two
// functions.
runtime, model, authEnv := readWorkspaceDeriveInputs(ctx, workspaceID)
return ResolveLLMBillingModeDerived(ctx, workspaceID, runtime, model, authEnv)
}
// readWorkspaceDeriveInputs loads the workspace's stored runtime + selected
// model + the auth-env-var NAMES present in its secrets — the inputs
// DeriveProvider needs. Best-effort: any read error returns whatever was
// gathered (the derived resolver fails closed on incomplete inputs). The model
// is the MODEL workspace_secret (the canvas-picked id, written by setModelSecret
// / Create); runtime is the workspaces.runtime column (defaults claude-code).
// availableAuthEnv is the subset of secret KEYS that are recognized provider
// auth-env names (never values), so DeriveProvider's auth-env tie-break can fire
// the same way it does on the provision path.
func readWorkspaceDeriveInputs(ctx context.Context, workspaceID string) (runtime, model string, availableAuthEnv []string) {
var rt sql.NullString
if err := db.DB.QueryRowContext(ctx,
`SELECT runtime FROM workspaces WHERE id = $1`, workspaceID,
).Scan(&rt); err != nil {
if !errors.Is(err, sql.ErrNoRows) {
log.Printf("llm_billing_mode: read runtime for %s: %v (deriving with the default runtime)", workspaceID, err)
}
}
runtime = rt.String
if runtime == "" {
// Mirror the DB column default so an unset runtime still derives.
runtime = "claude-code"
}
// Gather model + auth-env-name keys from workspace_secrets in one pass.
authSet := authEnvNameSet()
rows, err := db.DB.QueryContext(ctx,
`SELECT key, encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id = $1`,
workspaceID,
).Scan(&wsOverride)
switch {
case errors.Is(err, sql.ErrNoRows):
// Workspace row missing — concurrent delete, or pre-create call. Don't
// silently flip; fall through to org default. Source stays org_default
// so operators can see the row-missing case is being handled as a
// fallback, not a workspace-explicit decision.
res.ResolvedMode = res.OrgDefault
res.Source = BillingModeSourceOrgDefault
if !isKnownBillingMode(orgMode) {
res.Source = BillingModeSourceConstantFallback
)
if err != nil {
log.Printf("llm_billing_mode: read secrets for %s: %v (deriving with no model/auth-env)", workspaceID, err)
return runtime, model, availableAuthEnv
}
defer rows.Close()
for rows.Next() {
var k string
var v []byte
var ver int
if rows.Scan(&k, &v, &ver) != nil {
continue
}
if k == "MODEL" {
if dec, derr := crypto.DecryptVersioned(v, ver); derr == nil {
model = string(dec)
}
continue
}
// Only the KEY matters for auth-env disambiguation (the value is the
// secret; we never decrypt it for this purpose). Record recognized
// provider auth-env names.
if _, ok := authSet[k]; ok {
availableAuthEnv = append(availableAuthEnv, k)
}
return res, nil
case err != nil:
// DB error — default-closed to platform_managed AND propagate the
// error so operators get a structured log line. The caller is
// expected to log and continue with the safe default.
res.ResolvedMode = LLMBillingModePlatformManaged
res.Source = BillingModeSourceConstantFallback
return res, fmt.Errorf("resolve workspace llm_billing_mode for %s: %w", workspaceID, err)
}
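// Surface an iteration error instead of dropping it; inputs stay best-effort.
if err := rows.Err(); err != nil {
log.Printf("llm_billing_mode: iterate secrets for %s: %v (deriving with partial inputs)", workspaceID, err)
}
return runtime, model, availableAuthEnv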
}
if wsOverride.Valid && isKnownBillingMode(wsOverride.String) {
mode := wsOverride.String
res.WorkspaceOverride = &mode
res.ResolvedMode = mode
res.Source = BillingModeSourceWorkspaceOverride
return res, nil
}
// authEnvNameSet is the union of every provider's auth_env names in the
// registry — the recognized set readWorkspaceDeriveInputs filters secret keys
// against. Loaded once from the registry so it stays in sync with the SSOT (no
// hardcoded auth-env vocabulary). Registry-load failure yields an empty set
// (derive then runs without the auth-env tie-break, which only matters for the
// oauth-vs-api overlap; safe — it errors to default-closed rather than guessing).
var (
authEnvNameSetOnce sync.Once
authEnvNameSetVal map[string]struct{}
)
// Override row present but the value is NULL or garbled. Fall through.
// If the value was non-NULL but garbled (CHECK constraint should prevent
// this, but defense in depth — a future migration could relax the check
// or another path could write the column directly), surface the raw
// override value so operators can spot the corrupt row.
if wsOverride.Valid {
raw := wsOverride.String
res.WorkspaceOverride = &raw
func authEnvNameSet() map[string]struct{} {
authEnvNameSetOnce.Do(func() {
authEnvNameSetVal = map[string]struct{}{}
m, err := providerRegistry()
if err != nil || m == nil {
return
}
for _, p := range m.Providers {
for _, e := range p.AuthEnv {
authEnvNameSetVal[e] = struct{}{}
}
}
})
return authEnvNameSetVal
}
// availableAuthEnvNames returns the recognized provider auth-env-var NAMES
// present (non-empty) in envVars — the DeriveProvider auth-env tie-break input.
// Never returns secret VALUES, only the env-var names. Used by the provision
// path (applyPlatformManagedLLMEnv), which already has the workspace env in
// hand, so it derives without a secrets DB round-trip.
func availableAuthEnvNames(envVars map[string]string) []string {
authSet := authEnvNameSet()
var out []string
for k, v := range envVars {
if v == "" {
continue
}
if _, ok := authSet[k]; ok {
out = append(out, k)
}
}
res.ResolvedMode = res.OrgDefault
res.Source = BillingModeSourceOrgDefault
if !isKnownBillingMode(orgMode) {
res.Source = BillingModeSourceConstantFallback
return out
}
// derefOrEmpty returns the pointed-to string or "" for a nil pointer. Used in
// log lines that surface an optional *string field.
func derefOrEmpty(s *string) string {
if s == nil {
return ""
}
return res, nil
return *s
}
// SetWorkspaceLLMBillingMode writes the override column. Pass mode=="" to
@@ -0,0 +1,232 @@
package handlers
// llm_billing_mode_derived_test.go — tests for the DERIVED billing-mode
// resolver (internal#718 P2-B). The platform-vs-byok decision now DERIVES the
// provider from (runtime, model) via the provider registry and keys off
// IsPlatform(derived) — it does NOT read a stored LLM_PROVIDER (supersedes
// #1966's stored-read approach) and does NOT read the org rung (retired,
// CTO 2026-05-27). `workspaces.llm_billing_mode` survives ONLY as an optional
// explicit operator override (first precedence).
//
// This file pins the explicit BEHAVIOR DELTA the RFC's P2 calls out:
// - platform-derived (or unset → platform default) → platform_managed (UNCHANGED)
// - non-platform-derived → byok (THE FIX — the Reno leak class)
// - explicit override → wins over derive
// - derive error / unregistered → platform_managed (default-closed)
import (
"context"
"errors"
"testing"
"github.com/DATA-DOG/go-sqlmock"
)
// expectOverrideQuery sets up the workspaces.llm_billing_mode override read
// (first precedence). value=="" means NULL (no override).
func expectOverrideQuery(m sqlmock.Sqlmock, wsID, value string) {
rows := sqlmock.NewRows([]string{"llm_billing_mode"})
if value == "" {
rows.AddRow(nil)
} else {
rows.AddRow(value)
}
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(rows)
}
func TestResolveLLMBillingModeDerived_BehaviorDelta(t *testing.T) {
ctx := context.Background()
const wsID = "33333333-3333-3333-3333-333333333333"
type tc struct {
name string
runtime string
model string
authEnv []string
override string // "" = NULL override (no explicit operator override)
wantMode string
wantSource BillingModeSource
wantErr bool
}
cases := []tc{
{
// PLATFORM-DERIVED → platform_managed (UNCHANGED). claude-code +
// a platform-namespaced model id derives to the closed `platform`
// provider → IsPlatform → platform_managed.
name: "platform_derived_keeps_platform_managed_UNCHANGED",
runtime: "claude-code",
model: "anthropic/claude-opus-4-7",
override: "",
wantMode: LLMBillingModePlatformManaged,
wantSource: BillingModeSourceDerivedProvider,
},
{
// NON-PLATFORM-DERIVED → byok (THE FIX). claude-code + the
// kimi-coding-native model derives to the non-platform kimi-coding
// provider → IsPlatform=false → byok. This is the Reno billing-leak
// class: pre-P2 it resolved platform_managed and ran on platform creds.
name: "non_platform_derived_resolves_byok_THE_FIX",
runtime: "claude-code",
model: "kimi-for-coding",
override: "",
wantMode: LLMBillingModeBYOK,
wantSource: BillingModeSourceDerivedProvider,
},
{
// NON-PLATFORM vendor on codex: gpt-5.5 derives to `openai` (BYOK).
name: "non_platform_openai_codex_byok",
runtime: "codex",
model: "gpt-5.5",
override: "",
wantMode: LLMBillingModeBYOK,
wantSource: BillingModeSourceDerivedProvider,
},
{
// PLATFORM-DERIVED on codex: openai/gpt-5.4 is platform-namespaced.
name: "platform_derived_codex_platform_managed",
runtime: "codex",
model: "openai/gpt-5.4",
override: "",
wantMode: LLMBillingModePlatformManaged,
wantSource: BillingModeSourceDerivedProvider,
},
{
// UNSET model → platform default (CTO-confirmed "unset → platform
// default"). No model means nothing to derive; default-closed.
name: "unset_model_platform_default",
runtime: "claude-code",
model: "",
override: "",
wantMode: LLMBillingModePlatformManaged,
wantSource: BillingModeSourceDerivedDefault,
},
{
// UNREGISTERED model → derive errors → platform default (default-closed,
// NOT a silent byok flip that would strip a workspace's creds).
name: "unregistered_model_derive_error_platform_default",
runtime: "claude-code",
model: "totally-made-up-model-xyz",
override: "",
wantMode: LLMBillingModePlatformManaged,
wantSource: BillingModeSourceDerivedDefault,
},
{
// UNKNOWN runtime → derive errors → platform default (default-closed).
name: "unknown_runtime_platform_default",
runtime: "no-such-runtime",
model: "claude-opus-4-7",
override: "",
wantMode: LLMBillingModePlatformManaged,
wantSource: BillingModeSourceDerivedDefault,
},
{
// EXPLICIT OVERRIDE wins over derive: a non-platform-deriving model
// kept on platform_managed by an operator override (escape hatch).
name: "explicit_override_platform_managed_wins_over_byok_derive",
runtime: "claude-code",
model: "kimi-for-coding", // would derive byok
override: LLMBillingModePlatformManaged,
wantMode: LLMBillingModePlatformManaged,
wantSource: BillingModeSourceWorkspaceOverride,
},
{
// EXPLICIT OVERRIDE byok wins over a platform-deriving model.
name: "explicit_override_byok_wins_over_platform_derive",
runtime: "claude-code",
model: "anthropic/claude-opus-4-7", // would derive platform_managed
override: LLMBillingModeBYOK,
wantMode: LLMBillingModeBYOK,
wantSource: BillingModeSourceWorkspaceOverride,
},
{
// EXPLICIT OVERRIDE disabled wins (no-LLM workspace).
name: "explicit_override_disabled_wins",
runtime: "claude-code",
model: "anthropic/claude-opus-4-7",
override: LLMBillingModeDisabled,
wantMode: LLMBillingModeDisabled,
wantSource: BillingModeSourceWorkspaceOverride,
},
{
// AUTH-ENV disambiguation: the bare alias "opus" could match either of
// claude-code's anthropic-oauth or anthropic-api providers; with
// CLAUDE_CODE_OAUTH_TOKEN present it derives anthropic-oauth → byok.
name: "auth_env_disambiguates_oauth_byok",
runtime: "claude-code",
model: "opus",
authEnv: []string{"CLAUDE_CODE_OAUTH_TOKEN"},
override: "",
wantMode: LLMBillingModeBYOK,
wantSource: BillingModeSourceDerivedProvider,
},
}
for _, c := range cases {
t.Run(c.name, func(t *testing.T) {
mock := setupTestDB(t)
expectOverrideQuery(mock, wsID, c.override)
res, err := ResolveLLMBillingModeDerived(ctx, wsID, c.runtime, c.model, c.authEnv)
if (err != nil) != c.wantErr {
t.Fatalf("err: got %v wantErr=%v", err, c.wantErr)
}
if res.ResolvedMode != c.wantMode {
t.Errorf("mode: got %q want %q", res.ResolvedMode, c.wantMode)
}
if res.Source != c.wantSource {
t.Errorf("source: got %q want %q", res.Source, c.wantSource)
}
if !isKnownBillingMode(res.ResolvedMode) {
t.Errorf("post-condition: resolved mode %q not a known enum", res.ResolvedMode)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
})
}
}
// TestResolveLLMBillingModeDerived_OverrideDBError_DefaultClosed asserts a DB
// error reading the override column defaults closed to platform_managed and
// propagates the error — never silently flips a workspace off platform creds.
func TestResolveLLMBillingModeDerived_OverrideDBError_DefaultClosed(t *testing.T) {
ctx := context.Background()
const wsID = "44444444-4444-4444-4444-444444444444"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnError(errors.New("connection refused"))
res, err := ResolveLLMBillingModeDerived(ctx, wsID, "claude-code", "kimi-for-coding", nil)
if err == nil {
t.Fatalf("expected propagated DB error, got nil")
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Errorf("default-closed: DB error must resolve platform_managed, got %q", res.ResolvedMode)
}
if res.Source != BillingModeSourceConstantFallback {
t.Errorf("source: got %q want %q", res.Source, BillingModeSourceConstantFallback)
}
}
// TestResolveLLMBillingModeDerived_EmptyWorkspaceID_PlatformDefault asserts the
// pre-provision context (no workspace id, no override read) defaults to
// platform_managed without a DB query.
func TestResolveLLMBillingModeDerived_EmptyWorkspaceID_PlatformDefault(t *testing.T) {
ctx := context.Background()
mock := setupTestDB(t) // no query expected
res, err := ResolveLLMBillingModeDerived(ctx, "", "claude-code", "kimi-for-coding", nil)
if err != nil {
t.Fatalf("unexpected err: %v", err)
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Errorf("empty workspace id must default platform_managed, got %q", res.ResolvedMode)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
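The precedence these derived-resolver tests pin (explicit override first, then derive from runtime and model, then default-closed to platform_managed) can be sketched in isolation. This is a minimal sketch under stated assumptions, not the handlers package's implementation: `resolve`, `deriveFunc`, `toyDerive`, the two-entry registry, and the `workspace_override` source string are hypothetical stand-ins chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical constants; the real package defines its own.
const (
	modePlatformManaged = "platform_managed"
	modeBYOK            = "byok"
)

// deriveFunc maps (runtime, model) to a provider and reports whether that
// provider is the closed platform provider. Assumed shape, for illustration.
type deriveFunc func(runtime, model string) (provider string, isPlatform bool, err error)

// resolve applies the precedence the tests describe:
//  1. an explicit override wins over any derive;
//  2. otherwise the provider is derived from (runtime, model);
//  3. an unset model or a derive error defaults closed to platform_managed.
func resolve(override, runtime, model string, derive deriveFunc) (mode, source string) {
	if override != "" {
		return override, "workspace_override"
	}
	if model == "" {
		return modePlatformManaged, "derived_default"
	}
	_, isPlatform, err := derive(runtime, model)
	if err != nil {
		// Default-closed: never silently flip to byok on an error.
		return modePlatformManaged, "derived_default"
	}
	if isPlatform {
		return modePlatformManaged, "derived_provider"
	}
	return modeBYOK, "derived_provider"
}

// toyDerive is a made-up two-entry registry used only by this demo.
func toyDerive(runtime, model string) (string, bool, error) {
	switch model {
	case "anthropic/claude-opus-4-7":
		return "platform", true, nil
	case "kimi-for-coding":
		return "kimi-coding", false, nil
	}
	return "", false, errors.New("unregistered model")
}

func main() {
	models := []string{"anthropic/claude-opus-4-7", "kimi-for-coding", "", "made-up"}
	for _, m := range models {
		mode, src := resolve("", "claude-code", m, toyDerive)
		fmt.Printf("%q -> %s (%s)\n", m, mode, src)
	}
}
```

Keeping the error branch on the platform side is the design choice the table above guards: a registry miss should not strip a workspace's platform credentials.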
@@ -36,10 +36,12 @@ import (
// GetWorkspaceLLMBillingMode handles GET /admin/workspaces/:id/llm-billing-mode.
//
-// Reads the workspace override + the org-level default (from the same
-// MOLECULE_LLM_BILLING_MODE env var the provisioner reads at strip-gate time —
-// keeps the two paths consistent so the GET result matches what the strip
-// gate would compute) and returns the structured resolution.
+// internal#718 P2-B: the resolution now DERIVES the provider from the
+// workspace's stored (runtime, model) via the registry (org rung retired). The
+// passed orgMode is ignored by the resolver; it is left here only to avoid
+// churning the call signature. The returned resolution matches what the
+// provision-time strip gate computes (same derived resolver), so operators see
+// the real platform-vs-byok decision + the derived provider in ProviderSelection.
func GetWorkspaceLLMBillingMode(c *gin.Context) {
workspaceID := strings.TrimSpace(c.Param("id"))
if !uuidRegex.MatchString(workspaceID) {
@@ -29,13 +29,42 @@ func init() {
const testWSID = "44444444-4444-4444-4444-444444444444"
-func TestGetWorkspaceLLMBillingMode_HappyPath_InheritsOrgDefault(t *testing.T) {
-t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModeBYOK)
+// expectDeriveShimQueries sets up the four reads the legacy-signature
+// ResolveLLMBillingMode shim makes on a no-explicit-override path
+// (internal#718 P2-B): override (NULL), workspaces.runtime, the
+// workspace_secrets scan (MODEL + auth-env names), then override (NULL)
+// again from the derived resolver. model=="" means no MODEL secret row.
+func expectDeriveShimQueries(m sqlmock.Sqlmock, wsID, runtime, model string) {
+nullOverride := func() {
+m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
+WithArgs(wsID).
+WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
+}
+// Order: override(NULL) shim check, runtime, secrets, override(NULL) again
+// (the derived resolver re-checks the override as a complete SSOT).
+nullOverride()
+m.ExpectQuery(`SELECT runtime FROM workspaces WHERE id = \$1`).
+WithArgs(wsID).
+WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow(runtime))
+secretRows := sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"})
+if model != "" {
+// encryption_version 0 = plaintext passthrough (crypto.DecryptVersioned).
+secretRows.AddRow("MODEL", []byte(model), 0)
+}
+m.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id = \$1`).
+WithArgs(wsID).
+WillReturnRows(secretRows)
+nullOverride()
+}
+// internal#718 P2-B: org rung retired. A no-override workspace's mode is now
+// DERIVED from its stored (runtime, model). A claude-code workspace with a
+// non-platform-deriving model (kimi-for-coding) resolves byok via
+// derived_provider — NOT the old "inherit org default".
+func TestGetWorkspaceLLMBillingMode_HappyPath_DerivesByokFromModel(t *testing.T) {
+t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModeBYOK) // org env ignored now
mock := setupTestDB(t)
-// Workspace has no override → resolver returns org_default = byok.
-mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
-WithArgs(testWSID).
-WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
+expectDeriveShimQueries(mock, testWSID, "claude-code", "kimi-for-coding")
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
@@ -54,12 +83,15 @@ func TestGetWorkspaceLLMBillingMode_HappyPath_InheritsOrgDefault(t *testing.T) {
if res.ResolvedMode != LLMBillingModeBYOK {
t.Errorf("resolved mode: got %q want %q", res.ResolvedMode, LLMBillingModeBYOK)
}
-if res.Source != BillingModeSourceOrgDefault {
-t.Errorf("source: got %q want %q", res.Source, BillingModeSourceOrgDefault)
+if res.Source != BillingModeSourceDerivedProvider {
+t.Errorf("source: got %q want %q", res.Source, BillingModeSourceDerivedProvider)
}
if res.WorkspaceOverride != nil {
t.Errorf("expected nil override, got %v", *res.WorkspaceOverride)
}
+if res.ProviderSelection == nil || *res.ProviderSelection != "kimi-coding" {
+t.Errorf("expected derived provider kimi-coding, got %v", res.ProviderSelection)
+}
}
func TestGetWorkspaceLLMBillingMode_BadUUID_400(t *testing.T) {
@@ -117,9 +149,9 @@ func TestPutWorkspaceLLMBillingMode_ExplicitNullClearsOverride(t *testing.T) {
mock.ExpectExec(`UPDATE workspaces SET llm_billing_mode = NULL WHERE id = \$1`).
WithArgs(testWSID).
WillReturnResult(sqlmock.NewResult(0, 1))
-mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
-WithArgs(testWSID).
-WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
+// After clear, the post-write re-resolution DERIVES (internal#718 P2-B):
+// no override + no MODEL secret → derived_default → platform_managed.
+expectDeriveShimQueries(mock, testWSID, "claude-code", "")
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
@@ -142,8 +174,8 @@ func TestPutWorkspaceLLMBillingMode_ExplicitNullClearsOverride(t *testing.T) {
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Errorf("post-clear resolved: got %q want %q", res.ResolvedMode, LLMBillingModePlatformManaged)
}
-if res.Source != BillingModeSourceOrgDefault {
-t.Errorf("post-clear source: got %q want %q", res.Source, BillingModeSourceOrgDefault)
+if res.Source != BillingModeSourceDerivedDefault {
+t.Errorf("post-clear source: got %q want %q", res.Source, BillingModeSourceDerivedDefault)
}
if res.WorkspaceOverride != nil {
t.Errorf("post-clear override should be nil, got %v", *res.WorkspaceOverride)
@@ -0,0 +1,374 @@
package handlers
// llm_billing_mode_provision_parity_test.go — molecule-core#1994.
//
// Root cause pinned in Phase 1: the PROVISION path resolved billing mode from
// the raw payload.Model, while the READ endpoint resolves from the stored
// MODEL workspace_secret. On a RE-PROVISION (restart/resume/auto-restart) the
// payload is rebuilt from the DB with Name+Tier+Runtime ONLY — payload.Model
// is "" (workspace_restart.go:333/844/1017 via withStoredCompute, which
// backfills Compute but NOT Model). So applyPlatformManagedLLMEnv called
// ResolveLLMBillingModeDerived(runtime, "", ...) → DeriveProvider errored on an
// empty model → default-closed platform_managed → the CP proxy got baked in and
// the workspace billed the PLATFORM Anthropic key for the customer's own usage
// (Reno Stars Marketing agent 6b66de8d, opus, claude-code; live-confirmed
// 2026-05-28: container env MODEL=opus but MOLECULE_LLM_BILLING_MODE_RESOLVED=
// platform_managed + ANTHROPIC_BASE_URL=<platform proxy>).
//
// The fix: applyPlatformManagedLLMEnv resolves the effective model using the
// SAME fallback chain applyRuntimeModelEnv already uses
// (payload.Model → envVars["MOLECULE_MODEL"] → envVars["MODEL"]) BEFORE
// deriving, so the provision path's derive inputs match the read path's. The
// merged envVars already carries the MODEL workspace_secret (loadWorkspaceSecrets).
//
// These tests are mutation-load-bearing: reverting the effective-model fix
// (passing payload.Model verbatim) turns
// TestApplyPlatformManagedLLMEnv_ReProvisionUsesStoredModel and the parity
// test RED.
import (
"context"
"testing"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/models"
"github.com/DATA-DOG/go-sqlmock"
)
// TestApplyPlatformManagedLLMEnv_ReProvisionUsesStoredModel is the direct
// repro of the #1994 divergence at the provision resolver. payload.Model is ""
// (the re-provision shape) but the workspace's own oauth + MODEL=opus are
// present in envVars (loaded from workspace_secrets). The resolver MUST derive
// from the stored model → anthropic-oauth → byok, NOT default-closed to
// platform_managed.
//
// Asserts the byok outcome AND that the byok branch's effects fired:
// - billing-mode env = byok (not platform_managed)
// - ANTHROPIC_BASE_URL left untouched by the byok branch (platform path did not run)
// - the workspace's OWN oauth (workspace_secrets provenance, NOT in
// globalKeys) survives — usable credential present.
//
// Mutation: revert applyPlatformManagedLLMEnv to pass payload.Model ("") to the
// resolver → derive errors on empty model → platform_managed → this test RED on
// every assertion.
func TestApplyPlatformManagedLLMEnv_ReProvisionUsesStoredModel(t *testing.T) {
ctx := context.Background()
const wsID = "6b66de8d-9337-4fb4-be8d-6d49dca0d809" // Reno Stars Marketing agent
mock := setupTestDB(t)
// Resolver reads the override (NULL — no explicit operator pin).
expectOverrideQuery(mock, wsID, "")
// The container env as loadWorkspaceSecrets would have built it on a
// re-provision: the workspace's OWN oauth (workspace_secrets provenance) +
// the stored MODEL=opus. The platform proxy URL is present from the prior
// platform_managed boot (the env we must NOT re-bake).
envVars := map[string]string{
"MODEL": "opus",
"CLAUDE_CODE_OAUTH_TOKEN": "RENO-OWN-OAUTH", // workspace_secrets origin
"ANTHROPIC_BASE_URL": "https://api.moleculesai.app/api/v1/internal/llm/anthropic",
}
// payload.Model == "" — exactly the re-provision shape. The oauth is
// workspace_secrets-origin (NOT in globalKeys) → exempt from the #728
// provider-matched strip regardless of provider match.
res := applyPlatformManagedLLMEnv(ctx, envVars, wsID, "claude-code", "", nil)
if res.ResolvedMode != LLMBillingModeBYOK {
t.Fatalf("re-provision with stored MODEL=opus must resolve byok, got %q (source=%s) — the #1994 divergence", res.ResolvedMode, res.Source)
}
if res.Source != BillingModeSourceDerivedProvider {
t.Errorf("source: got %q want derived_provider (opus → anthropic-oauth)", res.Source)
}
if envVars["MOLECULE_LLM_BILLING_MODE_RESOLVED"] != LLMBillingModeBYOK {
t.Errorf("MOLECULE_LLM_BILLING_MODE_RESOLVED: got %q want byok", envVars["MOLECULE_LLM_BILLING_MODE_RESOLVED"])
}
// The byok branch must leave ANTHROPIC_BASE_URL exactly as it came in: the
// platform path is what re-asserts the proxy URL, and the workspace template
// resets it to direct on the byok path. Assert that here instead of silently
// discarding the value; the decisive proxy-bypass check is the usage-token
// assertion below.
const priorBaseURL = "https://api.moleculesai.app/api/v1/internal/llm/anthropic"
if got := envVars["ANTHROPIC_BASE_URL"]; got != priorBaseURL {
t.Errorf("byok branch must leave ANTHROPIC_BASE_URL untouched; got %q want %q", got, priorBaseURL)
}
// The decisive proxy-bypass assertions: the platform_managed path injects
// these; the byok branch must NOT.
if _, ok := envVars["MOLECULE_LLM_USAGE_TOKEN"]; ok {
t.Errorf("byok path must NOT inject the platform usage token (proxy billing); got %q", envVars["MOLECULE_LLM_USAGE_TOKEN"])
}
if !res.HasUsableLLMCred {
t.Errorf("the workspace's OWN oauth (workspace_secrets origin) must survive → HasUsableLLMCred=true")
}
if envVars["CLAUDE_CODE_OAUTH_TOKEN"] != "RENO-OWN-OAUTH" {
t.Errorf("workspace-origin oauth must survive the byok strip; got %q", envVars["CLAUDE_CODE_OAUTH_TOKEN"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_ReadProvisionParity is the core regression
// guard against the #1994 divergence ever returning: for the same workspace
// inputs (same runtime, same stored MODEL, same auth env, same override), the
// READ-path resolver (ResolveLLMBillingMode → readWorkspaceDeriveInputs) and
// the PROVISION-path resolver (applyPlatformManagedLLMEnv) MUST land on the
// same billing mode.
//
// Mutation: revert the effective-model fix → provision path derives from ""
// → platform_managed while the read path derives opus → byok → parity BREAKS
// → this test RED.
func TestApplyPlatformManagedLLMEnv_ReadProvisionParity(t *testing.T) {
ctx := context.Background()
const wsID = "6b66de8d-9337-4fb4-be8d-6d49dca0d809"
// ---- READ PATH ----
// ResolveLLMBillingMode reads in order: override (NULL) → runtime → secrets
// (MODEL=opus + the oauth key) → then ResolveLLMBillingModeDerived re-reads
// the override (NULL again).
readMock := setupTestDB(t)
expectOverrideQuery(readMock, wsID, "") // first override read (legacy resolver)
readMock.ExpectQuery(`SELECT runtime FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("claude-code"))
readMock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("MODEL", []byte("opus"), 0).
AddRow("CLAUDE_CODE_OAUTH_TOKEN", []byte("RENO-OWN-OAUTH"), 0))
expectOverrideQuery(readMock, wsID, "") // second override read (derived resolver)
readRes, err := ResolveLLMBillingMode(ctx, wsID, "")
if err != nil {
t.Fatalf("read-path resolve err: %v", err)
}
if err := readMock.ExpectationsWereMet(); err != nil {
t.Errorf("read-path sqlmock expectations: %v", err)
}
// ---- PROVISION PATH ----
provMock := setupTestDB(t)
expectOverrideQuery(provMock, wsID, "")
provEnv := map[string]string{
"MODEL": "opus",
"CLAUDE_CODE_OAUTH_TOKEN": "RENO-OWN-OAUTH",
}
provRes := applyPlatformManagedLLMEnv(ctx, provEnv, wsID, "claude-code", "", nil)
if err := provMock.ExpectationsWereMet(); err != nil {
t.Errorf("provision-path sqlmock expectations: %v", err)
}
if readRes.ResolvedMode != provRes.ResolvedMode {
t.Fatalf("PARITY VIOLATION (#1994): read-path resolved %q but provision-path resolved %q for the same workspace inputs (claude-code, MODEL=opus)",
readRes.ResolvedMode, provRes.ResolvedMode)
}
if readRes.ResolvedMode != LLMBillingModeBYOK {
t.Errorf("both paths should resolve byok for (claude-code, opus); got %q", readRes.ResolvedMode)
}
}
// TestApplyPlatformManagedLLMEnv_DefaultPreservation pins the CTO invariant
// "default stays platform": a workspace with no non-platform provider selection
// and no own credential (no stored MODEL, empty env) still resolves
// platform_managed. The fix must NOT flip genuinely-platform workspaces to byok.
//
// This mirrors the agents-team genuinely-platform case. Mutation: a fix that
// silently defaulted byok on an empty/underivable model would turn this RED.
func TestApplyPlatformManagedLLMEnv_DefaultPreservation(t *testing.T) {
ctx := context.Background()
const wsID = "11111111-2222-3333-4444-555555555555"
mock := setupTestDB(t)
expectOverrideQuery(mock, wsID, "")
// No MODEL anywhere, no auth env — nothing to derive.
envVars := map[string]string{}
res := applyPlatformManagedLLMEnv(ctx, envVars, wsID, "claude-code", "", nil)
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Fatalf("no model + no cred must default platform_managed (CTO: default stays platform), got %q (source=%s)", res.ResolvedMode, res.Source)
}
if res.Source != BillingModeSourceDerivedDefault {
t.Errorf("source: got %q want derived_default", res.Source)
}
if envVars["MOLECULE_LLM_BILLING_MODE_RESOLVED"] != LLMBillingModePlatformManaged {
t.Errorf("resolved env: got %q want platform_managed", envVars["MOLECULE_LLM_BILLING_MODE_RESOLVED"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_ByokGlobalScopeOAuthSurvives is the
// molecule-core#1994 (corrected-model) inversion of the former internal#711
// strip test. `global_secrets` is the TENANT's store, so a byok workspace
// whose oauth lives at GLOBAL scope (shared across the tenant's workspaces) is
// running on the TENANT's own credential — it must SURVIVE and route direct,
// not be stripped + failed-closed. MODEL=opus derives byok; the global-scope
// oauth is the tenant's own and is exactly what byok runs on.
//
// Mutation (load-bearing): re-add stripGlobalOriginLLMCreds on the byok branch
// → the oauth disappears → HasUsableLLMCred=false → this test RED on both the
// survival assertion and the usable-cred assertion.
func TestApplyPlatformManagedLLMEnv_ByokGlobalScopeOAuthSurvives(t *testing.T) {
ctx := context.Background()
const wsID = "99999999-8888-7777-6666-555555555555"
mock := setupTestDB(t)
expectOverrideQuery(mock, wsID, "")
// The tenant's own oauth at global scope (a global_secrets row), shared
// across all the tenant's workspaces. There is no separate workspace row.
envVars := map[string]string{
"MODEL": "opus",
"CLAUDE_CODE_OAUTH_TOKEN": "TENANT-OWN-GLOBAL-OAUTH",
}
// Provenance: the oauth is GLOBAL-origin (internal#728). It must STILL
// survive — opus derives anthropic-oauth, whose auth_env IS
// CLAUDE_CODE_OAUTH_TOKEN, so the provider-matched strip keeps it. This is
// the PM/reno opus-byok regression guard against #728's strip.
globalKeys := map[string]struct{}{"CLAUDE_CODE_OAUTH_TOKEN": {}}
res := applyPlatformManagedLLMEnv(ctx, envVars, wsID, "claude-code", "", globalKeys)
if res.ResolvedMode != LLMBillingModeBYOK {
t.Fatalf("opus derives byok; got %q", res.ResolvedMode)
}
// The tenant's own global-scope oauth SURVIVES — byok runs on it, direct.
if envVars["CLAUDE_CODE_OAUTH_TOKEN"] != "TENANT-OWN-GLOBAL-OAUTH" {
t.Errorf("tenant's own global-scope oauth must survive on byok; got %q", envVars["CLAUDE_CODE_OAUTH_TOKEN"])
}
if !res.HasUsableLLMCred {
t.Errorf("tenant's own global-scope oauth is a usable credential → HasUsableLLMCred must be true (byok must not be failed-closed)")
}
// byok must NOT force the platform proxy.
if _, present := envVars["MOLECULE_LLM_USAGE_TOKEN"]; present {
t.Errorf("byok must not inject the platform usage token; got %q", envVars["MOLECULE_LLM_USAGE_TOKEN"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
// TestReProvisionPayloadOmitsModel is a static guard pinning the upstream
// trigger: the re-provision payload builders pass Name+Tier+Runtime but NOT
// Model, so applyPlatformManagedLLMEnv cannot rely on payload.Model and must
// fall back to the stored MODEL in envVars. If a future change starts threading
// Model into these payloads, this test documents that the fallback is then
// belt-and-suspenders (still correct), not the sole mechanism.
func TestReProvisionPayloadOmitsModel(t *testing.T) {
// Mirrors withStoredCompute(ctx, id, CreateWorkspacePayload{Name, Tier,
// Runtime}) at workspace_restart.go:333/844/1017 — Model is the zero value.
p := models.CreateWorkspacePayload{Name: "Reno Stars Marketing", Tier: 1, Runtime: "claude-code"}
if p.Model != "" {
t.Fatalf("re-provision payload model expected empty (the #1994 trigger), got %q", p.Model)
}
}
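The effective-model fallback named in this file's header (payload.Model, then envVars["MOLECULE_MODEL"], then envVars["MODEL"]) can be sketched as a first-non-empty lookup. This is an illustrative sketch only: `effectiveModel` is a hypothetical helper name, and whether the real code trims whitespace or treats other values as empty is not stated in the diff.

```go
package main

import "fmt"

// effectiveModel returns the first non-empty candidate in the order the file
// header describes. Hypothetical helper; the real resolver may differ.
func effectiveModel(payloadModel string, envVars map[string]string) string {
	candidates := []string{
		payloadModel,              // set on first provision
		envVars["MOLECULE_MODEL"], // explicit env selection
		envVars["MODEL"],          // stored MODEL workspace_secret
	}
	for _, c := range candidates {
		if c != "" {
			return c
		}
	}
	return "" // nothing to derive from; the caller defaults closed
}

func main() {
	// Re-provision shape: the payload carries no model, the env carries the
	// stored secret.
	env := map[string]string{"MODEL": "opus"}
	fmt.Printf("%q\n", effectiveModel("", env))
}
```

With this ordering a re-provision derives from the same stored model the read path sees, while a payload that does carry a model still takes priority.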
// --- internal#728 Bug 1: provider-matched credential injection ---------------
// TestApplyPlatformManagedLLMEnv_MinimaxStripsStrayGlobalOAuth is the direct
// repro of DevB (Dev Engineer B, MiniMax-M2.7, claude-code; live-confirmed
// 2026-05-28). config.yaml correctly resolves provider=minimax, but the
// container inherits the tenant-GLOBAL CLAUDE_CODE_OAUTH_TOKEN; the claude-code
// runtime greedily prefers it (`llm-auth: detected oauth`) and routes
// MiniMax-M2.7 → api.anthropic.com → `Claude Code returned an error result`.
//
// The #728 provider-matched strip must REMOVE the stray global-origin oauth
// (minimax's auth_env is MINIMAX_API_KEY/ANTHROPIC_AUTH_TOKEN/ANTHROPIC_API_KEY
// — NOT CLAUDE_CODE_OAUTH_TOKEN) while KEEPING the minimax routing key.
//
// Mutation (load-bearing): remove the stripNonMatchingGlobalOriginLLMCreds
// call (revert to #1994's blanket keep) → the oauth survives → this test RED on
// the oauth-absent assertion. Make the strip provider-UNAWARE (strip all
// global bypass keys) → MINIMAX_API_KEY also vanishes → RED on the
// minimax-routing assertion. Make it provenance-UNAWARE (strip by name
// regardless of origin) → the workspace-origin exemption test below goes RED.
func TestApplyPlatformManagedLLMEnv_MinimaxStripsStrayGlobalOAuth(t *testing.T) {
ctx := context.Background()
const wsID = "22222222-3333-4444-5555-666666666666" // agents-team Dev Engineer B
mock := setupTestDB(t)
expectOverrideQuery(mock, wsID, "")
// The container env on a re-provision: the MiniMax routing key + the stray
// tenant-global oauth (both global_secrets origin) + the stored model.
envVars := map[string]string{
"MODEL": "MiniMax-M2.7",
"MINIMAX_API_KEY": "MINIMAX-TENANT-KEY",
"CLAUDE_CODE_OAUTH_TOKEN": "STRAY-TENANT-GLOBAL-OAUTH",
}
// Both creds are global_secrets origin (the tenant configured them at org
// scope; no per-workspace override re-set them).
globalKeys := map[string]struct{}{
"MINIMAX_API_KEY": {},
"CLAUDE_CODE_OAUTH_TOKEN": {},
}
res := applyPlatformManagedLLMEnv(ctx, envVars, wsID, "claude-code", "", globalKeys)
if res.ResolvedMode != LLMBillingModeBYOK {
t.Fatalf("MiniMax-M2.7 must derive minimax → byok, got %q (source=%s)", res.ResolvedMode, res.Source)
}
if res.Source != BillingModeSourceDerivedProvider {
t.Errorf("source: got %q want derived_provider (MiniMax-M2.7 → minimax)", res.Source)
}
// THE FIX: the stray global oauth that does NOT match minimax's auth_env
// must be gone, so the runtime cannot prefer it and mis-route to Anthropic.
if v, present := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; present {
t.Errorf("stray global-origin CLAUDE_CODE_OAUTH_TOKEN must be STRIPPED for a minimax-resolving workspace (the DevB bug); still present=%q", v)
}
// The minimax routing key (IS in minimax's auth_env) must remain.
if envVars["MINIMAX_API_KEY"] != "MINIMAX-TENANT-KEY" {
t.Errorf("minimax routing key must SURVIVE (it matches the resolved provider's auth_env); got %q", envVars["MINIMAX_API_KEY"])
}
if !res.HasUsableLLMCred {
t.Errorf("MINIMAX_API_KEY is a usable credential → HasUsableLLMCred must stay true (not failed-closed)")
}
if _, present := envVars["MOLECULE_LLM_USAGE_TOKEN"]; present {
t.Errorf("byok must not inject the platform usage token")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_WorkspaceOriginCredExemptFromStrip pins the
// provenance guard: a CLAUDE_CODE_OAUTH_TOKEN the USER set via the canvas
// Secrets tab (workspace_secrets origin → NOT in globalKeys) must NEVER be
// stripped, even on a minimax-resolving workspace where it doesn't match the
// derived provider's auth_env. The user authored it deliberately; the #728
// strip is scoped to the inherited operator-store channel only.
//
// Mutation: drop the `if _, isBypass...; continue` / globalKeys gate (strip by
// name regardless of origin) → the user's oauth vanishes → RED.
func TestApplyPlatformManagedLLMEnv_WorkspaceOriginCredExemptFromStrip(t *testing.T) {
ctx := context.Background()
const wsID = "33333333-4444-5555-6666-777777777777"
mock := setupTestDB(t)
expectOverrideQuery(mock, wsID, "")
envVars := map[string]string{
"MODEL": "MiniMax-M2.7",
"MINIMAX_API_KEY": "MINIMAX-TENANT-KEY",
"CLAUDE_CODE_OAUTH_TOKEN": "USER-AUTHORED-OAUTH",
}
// MINIMAX_API_KEY is global-origin; the oauth is WORKSPACE-origin (the user
// re-set it via the Secrets tab, so loadWorkspaceSecrets cleared its
// global-origin flag) → exempt.
globalKeys := map[string]struct{}{"MINIMAX_API_KEY": {}}
res := applyPlatformManagedLLMEnv(ctx, envVars, wsID, "claude-code", "", globalKeys)
if res.ResolvedMode != LLMBillingModeBYOK {
t.Fatalf("MiniMax-M2.7 derives byok; got %q", res.ResolvedMode)
}
if envVars["CLAUDE_CODE_OAUTH_TOKEN"] != "USER-AUTHORED-OAUTH" {
t.Errorf("workspace-origin (user-authored) oauth must NOT be stripped even when it doesn't match the provider; got %q", envVars["CLAUDE_CODE_OAUTH_TOKEN"])
}
if envVars["MINIMAX_API_KEY"] != "MINIMAX-TENANT-KEY" {
t.Errorf("matching minimax key must survive; got %q", envVars["MINIMAX_API_KEY"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
@@ -1,10 +1,12 @@
package handlers
-// llm_billing_mode_test.go — table-driven tests for the per-workspace
-// resolver (internal#691). The cases below enumerate every documented
-// branch in the default-closed contract; if one of them flips behavior
-// later the test names will tell the reviewer exactly which RFC clause
-// regressed.
+// llm_billing_mode_test.go — tests for the LEGACY-signature resolver
+// ResolveLLMBillingMode after internal#718 P2-B. The org rung is RETIRED: the
+// legacy shim now reads the explicit override first, then DERIVES the provider
+// from the workspace's stored (runtime, model) via the registry (no org
+// default). The dedicated derived-resolver cases live in
+// llm_billing_mode_derived_test.go; this file pins the legacy shim's DB-read
+// sequence + that it routes through the derived semantics.
import (
"context"
@@ -14,35 +16,56 @@ import (
"github.com/DATA-DOG/go-sqlmock"
)
-func TestResolveLLMBillingMode_TableDriven(t *testing.T) {
+// expectLegacyShimQueries sets up the DB reads the legacy ResolveLLMBillingMode
+// shim makes on a NO-explicit-override path (internal#718 P2-B), in order:
+// 1. override read (NULL) — the shim's own precedence-1 check,
+// 2. workspaces.runtime read,
+// 3. workspace_secrets scan (MODEL + auth-env names),
+// 4. override read AGAIN (NULL) — the derived resolver re-checks it so it is a
+// complete, independently-callable SSOT.
+//
+// model=="" means no MODEL secret row.
+func expectLegacyShimQueries(m sqlmock.Sqlmock, wsID, runtime, model string) {
+nullOverride := func() {
+m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
+WithArgs(wsID).
+WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
+}
+nullOverride()
+m.ExpectQuery(`SELECT runtime FROM workspaces WHERE id = \$1`).
+WithArgs(wsID).
+WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow(runtime))
+secretRows := sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"})
+if model != "" {
+secretRows.AddRow("MODEL", []byte(model), 0) // version 0 = plaintext
+}
+m.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id = \$1`).
+WithArgs(wsID).
+WillReturnRows(secretRows)
+nullOverride()
+}
+func TestResolveLLMBillingMode_LegacyShimDerives(t *testing.T) {
ctx := context.Background()
const wsID = "11111111-1111-1111-1111-111111111111"
type want struct {
-mode string
-source BillingModeSource
-// hasOverride asserts whether the resolver surfaced the override
-// value in the result (nil pointer = clean inherit, non-nil = the
-// row was present even if it ultimately fell through because it
-// was garbled). Lets us distinguish "row missing, fell through"
-// from "row present but garbled, fell through" — both resolve to
-// the same mode but the resolver tells operators which case it was.
+mode string
+source BillingModeSource
hasOverride bool
}
type tc struct {
name string
workspaceID string
orgMode string
setupMock func(m sqlmock.Sqlmock)
want want
wantErr bool
name string
setupMock func(m sqlmock.Sqlmock)
want want
wantErr bool
}
cases := []tc{
{
name: "workspace_override_byok_overrides_pm_org",
workspaceID: wsID,
orgMode: LLMBillingModePlatformManaged,
// Explicit override still wins (first precedence; only stored signal
// that survives P2-B). No runtime/secrets read needed.
name: "explicit_override_byok_wins",
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
@@ -51,106 +74,60 @@ func TestResolveLLMBillingMode_TableDriven(t *testing.T) {
want: want{mode: LLMBillingModeBYOK, source: BillingModeSourceWorkspaceOverride, hasOverride: true},
},
{
name: "workspace_override_disabled_overrides_pm_org",
workspaceID: wsID,
orgMode: LLMBillingModePlatformManaged,
// No override + a non-platform-deriving model → byok via derive (THE
// FIX: pre-P2 this was platform_managed via the org rung).
name: "no_override_derives_byok_from_model",
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeDisabled))
expectLegacyShimQueries(m, wsID, "claude-code", "kimi-for-coding")
},
want: want{mode: LLMBillingModeDisabled, source: BillingModeSourceWorkspaceOverride, hasOverride: true},
want: want{mode: LLMBillingModeBYOK, source: BillingModeSourceDerivedProvider, hasOverride: false},
},
{
name: "workspace_override_null_inherits_byok_org",
workspaceID: wsID,
orgMode: LLMBillingModeBYOK,
// No override + a platform-namespaced model → platform_managed (UNCHANGED).
name: "no_override_derives_platform_from_model",
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
expectLegacyShimQueries(m, wsID, "claude-code", "anthropic/claude-opus-4-7")
},
want: want{mode: LLMBillingModeBYOK, source: BillingModeSourceOrgDefault, hasOverride: false},
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceDerivedProvider, hasOverride: false},
},
{
name: "workspace_override_null_inherits_pm_org",
workspaceID: wsID,
orgMode: LLMBillingModePlatformManaged,
// No override + no model → derived_default → platform_managed (unset → platform).
name: "no_override_no_model_platform_default",
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
expectLegacyShimQueries(m, wsID, "claude-code", "")
},
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceOrgDefault, hasOverride: false},
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceDerivedDefault, hasOverride: false},
},
{
name: "workspace_override_garbled_falls_through_to_pm_org_DEFAULT_CLOSED",
workspaceID: wsID,
orgMode: LLMBillingModePlatformManaged,
// Garbled override is NOT honored — falls through to derive
// (default-closed). Here no model → platform default.
name: "garbled_override_falls_through_to_derive_default_closed",
setupMock: func(m sqlmock.Sqlmock) {
// CHECK constraint would normally prevent this but if a future
// migration loosens it (or a direct UPDATE bypasses it on a
// non-PG driver in a test stub), a garbled value MUST NOT
// be honored as if it were valid. This is the default-closed
// safety axis the RFC calls out.
// override read 1 (garbled → not honored), runtime, secrets,
// override read 2 (garbled again, derived resolver re-check).
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow("byokk"))
m.ExpectQuery(`SELECT runtime FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("claude-code"))
m.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}))
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow("byokk"))
},
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceOrgDefault, hasOverride: true},
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceDerivedDefault, hasOverride: false},
},
{
name: "workspace_override_garbled_org_garbled_constant_fallback",
workspaceID: wsID,
orgMode: "garbled-or-empty",
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow("nonsense"))
},
// Both layers garbled → constant fallback. Source is constant_fallback
// so operators can see the org-default-was-also-bad case explicitly.
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceConstantFallback, hasOverride: true},
},
{
name: "workspace_row_missing_falls_through_to_org_byok",
workspaceID: wsID,
orgMode: LLMBillingModeBYOK,
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}))
},
want: want{mode: LLMBillingModeBYOK, source: BillingModeSourceOrgDefault, hasOverride: false},
},
{
name: "workspace_id_empty_pre_provision_org_only",
workspaceID: "",
orgMode: LLMBillingModeBYOK,
setupMock: func(m sqlmock.Sqlmock) { /* no DB read expected — empty ws id short-circuits */ },
want: want{mode: LLMBillingModeBYOK, source: BillingModeSourceOrgDefault, hasOverride: false},
},
{
name: "workspace_id_empty_org_garbled_constant_fallback",
workspaceID: "",
orgMode: "",
setupMock: func(m sqlmock.Sqlmock) { /* no DB read */ },
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceConstantFallback, hasOverride: false},
},
{
name: "db_error_default_closed_to_pm_with_error",
workspaceID: wsID,
orgMode: LLMBillingModeBYOK, // org says byok but DB errored — DO NOT honor org
// DB error on the override read → default-closed + propagated error.
name: "override_db_error_default_closed_with_error",
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnError(errors.New("connection refused"))
},
// Critical: even though orgMode=byok, a DB error means we can't
// confirm the workspace doesn't have an override, so we default
// to the closed mode. This is the safer of the two failures —
// silently flipping to org-byok on a DB error would leak the
// OAuth-keeping behavior to workspaces whose row says NULL.
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceConstantFallback, hasOverride: false},
wantErr: true,
},
@@ -161,7 +138,8 @@ func TestResolveLLMBillingMode_TableDriven(t *testing.T) {
mock := setupTestDB(t)
c.setupMock(mock)
res, err := ResolveLLMBillingMode(ctx, c.workspaceID, c.orgMode)
// orgMode arg is retired/ignored; pass a value to prove it has no effect.
res, err := ResolveLLMBillingMode(ctx, wsID, LLMBillingModeBYOK)
if (err != nil) != c.wantErr {
t.Fatalf("err: got %v wantErr=%v", err, c.wantErr)
}
@@ -172,8 +150,7 @@ func TestResolveLLMBillingMode_TableDriven(t *testing.T) {
t.Errorf("source: got %q want %q", res.Source, c.want.source)
}
if (res.WorkspaceOverride != nil) != c.want.hasOverride {
t.Errorf("hasOverride: got %v want %v (override=%v)",
res.WorkspaceOverride != nil, c.want.hasOverride, res.WorkspaceOverride)
t.Errorf("hasOverride: got %v want %v", res.WorkspaceOverride != nil, c.want.hasOverride)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
@@ -182,21 +159,48 @@ func TestResolveLLMBillingMode_TableDriven(t *testing.T) {
}
}
// TestResolveLLMBillingMode_EmptyWorkspaceID_PlatformDefault: pre-provision
// (no workspace id) defaults closed with no DB read (org rung retired, so the
// old "org_only" behavior is gone — it's now the platform default).
func TestResolveLLMBillingMode_EmptyWorkspaceID_PlatformDefault(t *testing.T) {
ctx := context.Background()
mock := setupTestDB(t) // no DB read expected
res, err := ResolveLLMBillingMode(ctx, "", LLMBillingModeBYOK)
if err != nil {
t.Fatalf("unexpected err: %v", err)
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Errorf("empty ws id must default platform_managed, got %q", res.ResolvedMode)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
// TestResolveLLMBillingMode_ResolvedModeIsAlwaysValid asserts the resolver's
// post-condition: the returned mode is ALWAYS one of the three known enum
// values, never an empty string and never a garbled passthrough. The strip
// gate downstream relies on this so it can switch on res.ResolvedMode
// without a separate is-valid check on every call site.
// values. The strip gate downstream relies on this so it can switch on
// res.ResolvedMode without a separate is-valid check on every call site.
func TestResolveLLMBillingMode_ResolvedModeIsAlwaysValid(t *testing.T) {
ctx := context.Background()
const wsID = "22222222-2222-2222-2222-222222222222"
// Throw a pathological row at the resolver: garbled override + garbled
// org default. Resolved mode must still be a recognized enum.
// Garbled override + no derivable model: must still resolve a known enum
// (platform_managed, default-closed). Query order: override(garbled),
// runtime, secrets, override(garbled again — derived resolver re-check).
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow("totally-bogus"))
mock.ExpectQuery(`SELECT runtime FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("claude-code"))
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}))
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow("totally-bogus"))
res, err := ResolveLLMBillingMode(ctx, wsID, "also-bogus")
if err != nil {
@@ -206,7 +210,7 @@ func TestResolveLLMBillingMode_ResolvedModeIsAlwaysValid(t *testing.T) {
t.Errorf("post-condition violated: resolved mode %q is not a known enum value", res.ResolvedMode)
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Errorf("default-closed contract: garbled-x-garbled must resolve to platform_managed, got %q", res.ResolvedMode)
t.Errorf("default-closed contract: garbled-override + no-model must resolve platform_managed, got %q", res.ResolvedMode)
}
}
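The table cases above pin a three-rung precedence: a valid explicit override wins, otherwise the mode is derived from the MODEL secret, otherwise the resolver defaults closed to platform_managed. A minimal sketch of that ordering follows. It makes several assumptions: `resolveSketch`, `validMode`, and `isPlatformModel` are hypothetical names, the string values for the modes and sources are guesses at what the package constants hold, and the "/" check is only a placeholder for the registry derivation, which this diff does not show.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumed string values for the three billing modes (illustrative only).
const (
	modePlatformManaged = "platform_managed"
	modeBYOK            = "byok"
	modeDisabled        = "disabled"
)

// validMode reports whether m is one of the three known enum values.
func validMode(m string) bool {
	return m == modePlatformManaged || m == modeBYOK || m == modeDisabled
}

// isPlatformModel is a HYPOTHETICAL stand-in for the registry derivation.
// Assumption: a provider-namespaced slug ("anthropic/...") is platform-served.
func isPlatformModel(model string) bool {
	return strings.Contains(model, "/")
}

// resolveSketch returns (mode, source) using the precedence the tests pin:
// valid override, then model-derived, then the closed default.
func resolveSketch(override, model string) (string, string) {
	if validMode(override) {
		return override, "workspace_override"
	}
	// A garbled or NULL override is not honored and falls through to derive.
	if model == "" {
		return modePlatformManaged, "derived_default"
	}
	if isPlatformModel(model) {
		return modePlatformManaged, "derived_provider"
	}
	return modeBYOK, "derived_provider"
}

func main() {
	fmt.Println(resolveSketch("byok", ""))
	fmt.Println(resolveSketch("", "kimi-for-coding"))
	fmt.Println(resolveSketch("byokk", ""))
}
```

Note that the garbled override ("byokk") lands on the same rung as a missing model, which is the default-closed behavior the tests assert.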
@@ -0,0 +1,40 @@
package handlers
// internal#718 P4 closure — compile-time assertion that the retired
// symbols are GONE from the handlers package. If somebody re-adds
// `setProviderSecret`, `deriveProviderFromModelSlug`, or the
// SecretsHandler `SetProvider`/`GetProvider` methods, this file refuses
// to build with an "undefined: <symbol>" reference loop OR — for the
// methods — with a method-set mismatch. The build failure is the gate.
//
// Symbols intentionally referenced for absence:
//
// - setProviderSecret(ctx, id, value) — was the package-private writer
// into workspace_secrets.LLM_PROVIDER. Retired with the row itself
// (no consumer remains).
// - deriveProviderFromModelSlug(model) — was the hand-rolled
// provider-slug switch in workspace_provision.go (retire-list #3).
// The derivation now flows through providers.Manifest.DeriveProvider
// in every path that needs it.
// - (*SecretsHandler).SetProvider / .GetProvider — the gin handlers
// behind PUT/GET /workspaces/:id/provider. The route registrations
// redirect to ProviderEndpointGone now.
//
// Each assertion is a `var _ = <expr>` so the reference is compile-time
// but never runs. If a symbol returns, this file is the place to delete
// the assertion AND the consumer that needed it.
// Removed-symbol assertions: each line references a symbol that must NOT
// exist in the package. The build fails (undefined symbol) if any reappears.
//
// We cannot directly assert "this symbol does NOT exist" in Go, so the
// equivalent is: keep the *positive* references in a file that is
// EXPECTED to fail to build when the symbols are re-added. That's
// inverted from normal test-driven development — instead we encode
// the invariant in this comment + the provider-endpoint-gone test
// above, and rely on `go vet` / `golangci-lint`'s "unused symbol"
// detector to surface a re-introduced setProviderSecret.
//
// What we CAN compile-assert positively (the replacement endpoint
// exists):
var _ = ProviderEndpointGone
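As the comment notes, Go can only assert at compile time that a symbol is present, not that it is absent. A small sketch of the presence idiom follows; `replacementHandler` is a hypothetical function used purely for illustration and is not part of the handlers package.

```go
package main

import "fmt"

// replacementHandler is a hypothetical symbol that must keep existing.
func replacementHandler() string { return "gone" }

// Presence check: the build fails with "undefined: replacementHandler"
// if the function is deleted or renamed. The blank identifier means the
// reference has no runtime effect.
var _ = replacementHandler

// Signature check: the build also fails if the function's type changes.
var _ func() string = replacementHandler

func main() {
	fmt.Println(replacementHandler())
}
```

An absence guarantee has to come from elsewhere, for example a lint rule or a behavioral test against the replacement endpoint.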
@@ -0,0 +1,107 @@
package handlers
// internal#718 P4 closure — LLM_PROVIDER removal + PUT /provider retirement.
//
// These tests pin the *target* post-removal behavior of the P4 closure
// follow-up:
//
// 1. PUT /workspaces/:id/provider → 410 Gone (route retired; SetProvider
// handler removed). Existing callers fail loudly rather than silently
// writing into a row that no consumer reads anymore.
// 2. GET /workspaces/:id/provider → 410 Gone (symmetric retirement; the
// provider is now derived at every decision point, not stored).
// 3. WorkspaceHandler.Create no longer writes LLM_PROVIDER to
// workspace_secrets. The model selection (`payload.Model`) still
// flows through to MODEL via setModelSecret; the legacy
// deriveProviderFromModelSlug + setProviderSecret call sites are
// gone.
// 4. Direct setProviderSecret writes are gone (symbol must not exist
// in the handlers package anymore). Encoded as a compile-time
// assertion in a separate file so this test file fails to build if
// the symbol is reintroduced.
//
// These are red-before-the-source-edit tests. Each failure here points
// at exactly the code path the closure removes.
import (
"bytes"
"encoding/json"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/gin-gonic/gin"
)
func init() {
gin.SetMode(gin.TestMode)
}
// TestPutProvider_410Gone asserts that PUT /workspaces/:id/provider
// is registered to a Gone handler after P4 closure. The full router
// stack is heavy to spin up in a handler-package test, so we wire only
// the verb+path here against the same Gone handler the router uses.
func TestPutProvider_410Gone(t *testing.T) {
router := gin.New()
router.PUT("/workspaces/:id/provider", ProviderEndpointGone)
router.GET("/workspaces/:id/provider", ProviderEndpointGone)
body, _ := json.Marshal(map[string]string{"provider": "anthropic-api"})
req := httptest.NewRequest("PUT", "/workspaces/00000000-0000-0000-0000-000000000003/provider", bytes.NewReader(body))
req.Header.Set("Content-Type", "application/json")
w := httptest.NewRecorder()
router.ServeHTTP(w, req)
if w.Code != http.StatusGone {
t.Fatalf("PUT /provider: want 410 Gone, got %d (body=%s)", w.Code, w.Body.String())
}
if !strings.Contains(w.Body.String(), "LLM_PROVIDER") || !strings.Contains(w.Body.String(), "internal#718") {
t.Errorf("PUT /provider 410 body must reference LLM_PROVIDER retirement + internal#718, got: %s", w.Body.String())
}
}
func TestGetProvider_410Gone(t *testing.T) {
router := gin.New()
router.GET("/workspaces/:id/provider", ProviderEndpointGone)
req := httptest.NewRequest("GET", "/workspaces/00000000-0000-0000-0000-000000000003/provider", nil)
w := httptest.NewRecorder()
router.ServeHTTP(w, req)
if w.Code != http.StatusGone {
t.Fatalf("GET /provider: want 410 Gone, got %d", w.Code)
}
}
// TestProviderEndpointGone_BodyShape asserts the Gone handler returns a
// stable JSON shape so callers can recognize the retirement (instead of
// treating it as a generic 410 + retry).
func TestProviderEndpointGone_BodyShape(t *testing.T) {
router := gin.New()
router.PUT("/workspaces/:id/provider", ProviderEndpointGone)
body, _ := json.Marshal(map[string]string{"provider": "anthropic-api"})
req := httptest.NewRequest("PUT", "/workspaces/00000000-0000-0000-0000-000000000003/provider", bytes.NewReader(body))
req.Header.Set("Content-Type", "application/json")
w := httptest.NewRecorder()
router.ServeHTTP(w, req)
raw, _ := io.ReadAll(w.Body)
var got map[string]any
if err := json.Unmarshal(raw, &got); err != nil {
t.Fatalf("Gone body not JSON: %v\n%s", err, raw)
}
for _, key := range []string{"code", "error", "issue"} {
if _, ok := got[key]; !ok {
t.Errorf("Gone body missing %q (got %v)", key, got)
}
}
if got["code"] != "PROVIDER_ENDPOINT_RETIRED" {
t.Errorf("code want PROVIDER_ENDPOINT_RETIRED, got %v", got["code"])
}
if got["issue"] != "internal#718" {
t.Errorf("issue want internal#718, got %v", got["issue"])
}
}
@@ -0,0 +1,57 @@
package handlers
// model_registry_validation.go — only-registered (runtime, model) validation
// at the create/config API (internal#718 P2-B item 3, CTO 2026-05-27
// "only registered providers/models selectable").
//
// The registry (internal/providers) is the SSOT for which models a runtime
// natively exposes (ModelsForRuntime). This validator rejects a (runtime, model)
// the registry does NOT recognize — but ONLY for a runtime the registry knows
// about. For a runtime absent from the first-party registry (langgraph,
// external, kimi, mock, or a future federated third-party runtime), it fails
// OPEN: the registry can't speak to that runtime's model set, so the existing
// knownRuntimes gate stays authoritative and this validator does not block.
// This is the federation-ready contract — first-party runtimes are gated against
// the registry; everything else passes through unchanged (no behavior change for
// non-registry runtimes).
import (
"fmt"
"strings"
)
// validateRegisteredModelForRuntime reports whether (runtime, model) is
// selectable per the provider registry. Returns:
//
// (true, "") — allowed: model is registered for this runtime, OR the
// runtime is not in the registry (fail-open), OR model=="".
// (false, reason) — rejected: the runtime IS registered but the model is not
// in its native ModelsForRuntime set.
//
// model=="" is allowed here: the MODEL_REQUIRED gate owns the empty-model case,
// so this validator must not double-reject it.
func validateRegisteredModelForRuntime(runtime, model string) (bool, string) {
model = strings.TrimSpace(model)
if model == "" {
return true, "" // MODEL_REQUIRED owns this.
}
m, err := providerRegistry()
if err != nil || m == nil {
// Registry unavailable (build-time defect the gates catch). Fail open —
// do not block create on a registry-load failure.
return true, ""
}
models, err := m.ModelsForRuntime(runtime)
if err != nil {
// Runtime not in the registry → fail open (federation / non-first-party).
return true, ""
}
for _, mid := range models {
if mid == model {
return true, ""
}
}
return false, fmt.Sprintf(
"model %q is not a registered model for runtime %q; pick one of the runtime's registered models (provider-registry SSOT, internal#718)",
model, runtime)
}
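The fail-open contract can be exercised without the real registry. The sketch below assumes a hypothetical in-memory map (`sketchRegistry`) in place of `providerRegistry()`; its entries mirror the values used in the accompanying test table and are not the actual manifest.

```go
package main

import (
	"fmt"
	"strings"
)

// sketchRegistry is a HYPOTHETICAL stand-in for the provider registry:
// runtime -> natively exposed models.
var sketchRegistry = map[string][]string{
	"claude-code": {"anthropic/claude-opus-4-7", "kimi-for-coding"},
	"codex":       {"gpt-5.5"},
}

// validateSketch mirrors the validator's three outcomes:
// empty model -> allowed, unknown runtime -> allowed (fail open),
// known runtime with an unlisted model -> rejected.
func validateSketch(runtime, model string) (bool, string) {
	model = strings.TrimSpace(model)
	if model == "" {
		return true, "" // a separate gate owns the empty-model case
	}
	models, known := sketchRegistry[runtime]
	if !known {
		return true, "" // registry cannot speak for this runtime
	}
	for _, m := range models {
		if m == model {
			return true, ""
		}
	}
	return false, fmt.Sprintf("model %q is not registered for runtime %q", model, runtime)
}

func main() {
	for _, c := range [][2]string{
		{"claude-code", "kimi-for-coding"},
		{"codex", "kimi-for-coding"},
		{"langgraph", "anything-goes"},
	} {
		ok, reason := validateSketch(c[0], c[1])
		fmt.Println(c[0], c[1], ok, reason)
	}
}
```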
@@ -0,0 +1,82 @@
package handlers
// model_registry_validation_test.go — only-registered (runtime, model)
// validation at the create/config API (internal#718 P2-B item 3). Reject a
// (runtime, model) the registry does not recognize for a runtime it DOES know;
// fail OPEN (allow) for a runtime the registry doesn't know yet (federation /
// langgraph/etc. not in the first-party registry) so the existing knownRuntimes
// gate stays authoritative there.
import "testing"
func TestValidateRegisteredModelForRuntime(t *testing.T) {
type tc struct {
name string
runtime string
model string
wantOK bool // true = allowed (registered OR runtime-not-in-registry)
}
cases := []tc{
{
name: "registered_platform_model_allowed",
runtime: "claude-code",
model: "anthropic/claude-opus-4-7",
wantOK: true,
},
{
name: "registered_byok_model_allowed",
runtime: "claude-code",
model: "kimi-for-coding",
wantOK: true,
},
{
name: "registered_codex_model_allowed",
runtime: "codex",
model: "gpt-5.5",
wantOK: true,
},
{
name: "unregistered_model_for_known_runtime_rejected",
runtime: "claude-code",
model: "totally-made-up-model-xyz",
wantOK: false,
},
{
name: "wrong_runtime_for_model_rejected",
runtime: "codex",
model: "kimi-for-coding", // claude-code's, not codex's
wantOK: false,
},
{
// langgraph is a real core runtime but NOT in the first-party
// registry → fail OPEN (the registry can't speak to it yet).
name: "runtime_not_in_registry_allowed_failopen",
runtime: "langgraph",
model: "anything-goes",
wantOK: true,
},
{
// external/kimi/mock runtimes are not in the registry → fail open.
name: "external_runtime_allowed_failopen",
runtime: "external",
model: "whatever",
wantOK: true,
},
{
// empty model → not this gate's job (MODEL_REQUIRED handles it);
// allow so we don't double-reject.
name: "empty_model_allowed_other_gate_owns_it",
runtime: "claude-code",
model: "",
wantOK: true,
},
}
for _, c := range cases {
t.Run(c.name, func(t *testing.T) {
ok, _ := validateRegisteredModelForRuntime(c.runtime, c.model)
if ok != c.wantOK {
t.Errorf("validateRegisteredModelForRuntime(%q,%q) ok=%v want %v", c.runtime, c.model, ok, c.wantOK)
}
})
}
}
@@ -0,0 +1,62 @@
package handlers
// internal#718 P4 closure — provider endpoint retirement.
//
// PUT and GET /workspaces/:id/provider were the canvas-facing surface
// for the legacy `LLM_PROVIDER` workspace_secret. With the registry-
// derived provider model (P0-P4), the provider is now DERIVED at every
// decision point from (runtime, model) via the registry. No code path
// reads a stored provider anymore, so the endpoint has no observable
// effect.
//
// Rather than silently 200-OK on a write that goes nowhere, the
// retired endpoint returns 410 Gone with a structured body so an
// older canvas (which still calls PUT /provider in its Save flow)
// surfaces a loud-and-clear "this endpoint moved" error rather than
// pretending to persist a change. The replacement is: select your
// model on workspace create / via PUT /workspaces/:id/model — the
// provider is derived from it.
//
// Retirement context:
// - Retire-list #2 (CP `knownProviderNames` blocklist as authoring
// surface) was already retired in P3 PR-C (cp#379) — that source
// now reads from the registry. The CP-side reader of
// `env["LLM_PROVIDER"]` (`resolveModelAndProvider`) is replaced in
// the CP-side commit of this PR by a registry derivation.
// - Retire-list #3 (`deriveProviderFromModelSlug`) is removed in
// this PR — the only caller was `WorkspaceHandler.Create`, which
// wrote the derived value into workspace_secrets.LLM_PROVIDER for
// the now-removed CP read path. The migration 20260528000000
// deletes any straggler rows from the secret table.
//
// The Gone body is the contract: callers must recognize
// `code: PROVIDER_ENDPOINT_RETIRED` and stop calling. The Five-Axis
// review for this PR specifically asks whether a 404 (REST-purist
// "the resource doesn't exist") would be better than a 410 (REST-precise
// "it existed and is intentionally gone"). 410 is correct here: the
// endpoint shipped to prod, the canvas knows the URL, and the goal
// is to make the retirement loud, not invisible.
import (
"net/http"
"github.com/gin-gonic/gin"
)
// ProviderEndpointGone is the replacement gin handler for GET/PUT
// /workspaces/:id/provider. Returns 410 with a body shape the canvas
// can pattern-match on (code/error/issue keys).
//
// Wired in internal/router/router.go (the two route lines that used
// to reference sech.GetProvider / sech.SetProvider).
//
// Exported so the router package can reference it as
// handlers.ProviderEndpointGone without spinning up a SecretsHandler
// receiver just to retire two endpoints.
func ProviderEndpointGone(c *gin.Context) {
c.JSON(http.StatusGone, gin.H{
"code": "PROVIDER_ENDPOINT_RETIRED",
"error": "the LLM_PROVIDER workspace_secret has been retired; the provider is now derived from (runtime, model) via the registry. Select your model via PUT /workspaces/:id/model — the provider follows.",
"issue": "internal#718",
})
}
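On the caller side, the contract amounts to checking the status and the `code` key. The sketch below shows one way a client might recognize the retirement, under these assumptions: `goneHandler` re-creates the response with net/http only (the real handler uses gin), and `isProviderRetired` and `checkRetired` are hypothetical helpers, not part of any existing client.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// goneHandler mimics the retired endpoint's response shape using net/http.
func goneHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusGone)
	_ = json.NewEncoder(w).Encode(map[string]string{
		"code":  "PROVIDER_ENDPOINT_RETIRED",
		"error": "provider is derived from (runtime, model)",
		"issue": "internal#718",
	})
}

// isProviderRetired is a HYPOTHETICAL client-side check: 410 plus the
// stable code means "stop calling", as opposed to a transient failure.
func isProviderRetired(resp *http.Response) bool {
	if resp.StatusCode != http.StatusGone {
		return false
	}
	var body struct {
		Code string `json:"code"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return false
	}
	return body.Code == "PROVIDER_ENDPOINT_RETIRED"
}

// checkRetired drives the handler in-process through a ResponseRecorder.
func checkRetired() bool {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPut, "/workspaces/example/provider", nil)
	goneHandler(rec, req)
	return isProviderRetired(rec.Result())
}

func main() {
	fmt.Println("retired:", checkRetired())
}
```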
+24 -152
@@ -245,11 +245,6 @@ func (h *SecretsHandler) Values(c *gin.Context) {
// provisioner path in workspace_provision.go so env-vars look identical
// whether the workspace was bootstrapped locally or remotely).
out := map[string]string{}
// Provenance side-channel (internal#711): which keys in `out` originated
// from global_secrets and were NOT overridden by a workspace_secrets row.
// Used by the provider-aware gate below so a non-platform workspace's
// remote pull never receives the platform's scope:global LLM credential.
globalKeys := map[string]struct{}{}
// Track decrypt failures so we can refuse the response with a list
// instead of returning a partial bundle that boots a broken agent.
var failedKeys []string
@@ -275,7 +270,6 @@ func (h *SecretsHandler) Values(c *gin.Context) {
continue
}
out[k] = string(decrypted)
globalKeys[k] = struct{}{}
}
}
if err := globalRows.Err(); err != nil {
@@ -300,10 +294,6 @@ func (h *SecretsHandler) Values(c *gin.Context) {
continue
}
out[k] = string(decrypted) // workspace override wins over global
// User explicitly re-set this via the canvas Secrets tab — it is
// no longer "the operator-store version", so drop the global
// provenance flag (mirrors loadWorkspaceSecrets).
delete(globalKeys, k)
}
}
if err := wsRows.Err(); err != nil {
@@ -319,32 +309,16 @@ func (h *SecretsHandler) Values(c *gin.Context) {
return
}
// internal#711: provider-aware gate on the remote-pull path. A workspace
// whose resolved billing mode is NOT platform_managed (byok / subscription)
// must NOT receive the platform's scope:global LLM credentials
// (CLAUDE_CODE_OAUTH_TOKEN + the rest of the bypass-key set). Those keys
// were merged from global_secrets above; here we drop any that are still
// of global provenance (a workspace override survives, since its flag was
// cleared). Symmetric with applyPlatformManagedLLMEnv's strip on the
// provision/restart env path — both injection vectors are now gated.
//
// Default-closed: ResolveLLMBillingMode collapses any DB error / NULL /
// garbled value to platform_managed, so a transient failure leaves the
// existing (global-inheriting) behavior in place rather than stripping a
// platform_managed workspace's creds.
orgMode := strings.ToLower(strings.TrimSpace(os.Getenv("MOLECULE_LLM_BILLING_MODE")))
res, resolveErr := ResolveLLMBillingMode(ctx, workspaceID, orgMode)
if resolveErr != nil {
log.Printf("secrets.Values: resolve billing mode workspace=%s err=%v (defaulting to platform_managed)", workspaceID, resolveErr)
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
for k := range globalKeys {
if isPlatformManagedDirectLLMBypassKey(k) {
delete(out, k)
}
}
}
// molecule-core#1994 (corrected model): the remote-pull bundle is the
// TENANT's own merged secrets (global_secrets + workspace_secrets, the
// latter winning on collision). `global_secrets` is the tenant's store, not
// the platform's, so a byok workspace's pull MUST include the tenant's own
// global-scope LLM credential — that is exactly what it runs on, direct.
// The earlier internal#711 byok strip here rested on the inverted "global =
// platform's own" premise and is removed; the platform's own proxy token is
// never in a tenant's global_secrets (it lives in server env only and is
// injected separately on the platform_managed provision path), so there is
// nothing platform-owned to withhold on this path.
c.JSON(http.StatusOK, out)
}
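The merge rule described in the comment (global_secrets first, workspace_secrets winning on collision) reduces to two ordered map writes. A minimal sketch, assuming already-decrypted values; `mergeSecrets` and the key names are hypothetical and only illustrate the precedence.

```go
package main

import "fmt"

// mergeSecrets copies global values first, then workspace values, so a
// workspace row overrides a global row with the same key.
func mergeSecrets(global, workspace map[string]string) map[string]string {
	out := make(map[string]string, len(global)+len(workspace))
	for k, v := range global {
		out[k] = v
	}
	for k, v := range workspace {
		out[k] = v // workspace override wins over global
	}
	return out
}

func main() {
	global := map[string]string{"EXAMPLE_API_KEY": "tenant-global", "REGION": "eu"}
	workspace := map[string]string{"EXAMPLE_API_KEY": "workspace-own"}
	merged := mergeSecrets(global, workspace)
	fmt.Println(merged["EXAMPLE_API_KEY"], merged["REGION"])
}
```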
@@ -775,121 +749,19 @@ func (h *SecretsHandler) SetModel(c *gin.Context) {
c.JSON(http.StatusOK, gin.H{"status": "saved", "model": body.Model})
}
// GetProvider handles GET /workspaces/:id/provider
// Returns the explicit LLM provider override stored as the LLM_PROVIDER
// workspace secret. Mirror of GetModel — same shape, same response keys
// (provider/source) to keep canvas wiring symmetric.
// internal#718 P4 closure: GetProvider, SetProvider, and the shared
// setProviderSecret helper were retired together with the
// LLM_PROVIDER workspace_secret. The provider is now DERIVED at every
// decision point from (runtime, model) via the registry
// (internal/providers.Manifest.DeriveProvider), so storing it is
// pure write-ghost — no consumer remains.
//
// Why a sibling endpoint rather than overloading PUT /model: the new
// `provider` field (Option B, PR #2441) is orthogonal to the model
// slug. A user might keep the same model alias and switch providers
// (e.g., route the same alias through a different gateway), or keep
// the same provider and switch models. Co-storing them under one
// endpoint forces a single Save+Restart round-trip per change; two
// endpoints let the canvas update each independently.
func (h *SecretsHandler) GetProvider(c *gin.Context) {
workspaceID := c.Param("id")
ctx := c.Request.Context()
var bytesVal []byte
var version int
err := db.DB.QueryRowContext(ctx,
`SELECT encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id = $1 AND key = 'LLM_PROVIDER'`,
workspaceID).Scan(&bytesVal, &version)
if err == sql.ErrNoRows {
c.JSON(http.StatusOK, gin.H{"provider": "", "source": "default"})
return
}
if err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": "query failed"})
return
}
decrypted, err := crypto.DecryptVersioned(bytesVal, version)
if err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to decrypt"})
return
}
c.JSON(http.StatusOK, gin.H{"provider": string(decrypted), "source": "workspace_secrets"})
}
// setProviderSecret writes (or clears, when value=="") the LLM_PROVIDER
// workspace secret. Extracted from SetProvider so non-handler call sites
// (notably WorkspaceHandler.Create — first-deploy path that derives
// LLM_PROVIDER from the canvas-selected model slug so CP user-data picks
// it up as a YAML field in /configs/config.yaml AND it survives across
// restarts when CP regenerates the config) can reuse the encryption +
// upsert logic without inlining the SQL.
// Route registrations in internal/router/router.go now point both
// GET and PUT /workspaces/:id/provider at ProviderEndpointGone, which
// returns 410 Gone with a structured body so older canvases that
// still call PUT /provider on Save surface a loud failure rather
// than silently writing a vanished row.
//
// Returns nil on success. Caller is responsible for any restart trigger;
// the gin handler re-adds that after a successful write.
func setProviderSecret(ctx context.Context, workspaceID, provider string) error {
if provider == "" {
_, err := db.DB.ExecContext(ctx,
`DELETE FROM workspace_secrets WHERE workspace_id = $1 AND key = 'LLM_PROVIDER'`,
workspaceID)
return err
}
encrypted, err := crypto.Encrypt([]byte(provider))
if err != nil {
return err
}
version := crypto.CurrentEncryptionVersion()
_, err = db.DB.ExecContext(ctx, `
INSERT INTO workspace_secrets (workspace_id, key, encrypted_value, encryption_version)
VALUES ($1, 'LLM_PROVIDER', $2, $3)
ON CONFLICT (workspace_id, key) DO UPDATE
SET encrypted_value = $2, encryption_version = $3, updated_at = now()
`, workspaceID, encrypted, version)
return err
}
// SetProvider handles PUT /workspaces/:id/provider — writes the provider
// slug into workspace_secrets as LLM_PROVIDER. Empty string clears the
// override. Triggers auto-restart so the new env is in effect on the
// next boot — without this the canvas Save+Restart can race the
// already-restarting container and miss the window.
//
// CP user-data (controlplane PR #364) reads LLM_PROVIDER from env and
// writes it into /configs/config.yaml at boot, so the choice survives
// restart. Without that PR this endpoint still works but the value is
// only sticky when the workspace_secrets row is read on every restart
// (the secret-load path) — slower failure mode, same eventual behavior.
func (h *SecretsHandler) SetProvider(c *gin.Context) {
workspaceID := c.Param("id")
if !uuidRegex.MatchString(workspaceID) {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid workspace ID"})
return
}
ctx := c.Request.Context()
var body struct {
Provider string `json:"provider"`
}
if err := c.ShouldBindJSON(&body); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid request body"})
return
}
if err := setProviderSecret(ctx, workspaceID, body.Provider); err != nil {
log.Printf("SetProvider error: %v", err)
if body.Provider == "" {
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to clear provider"})
} else {
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to save provider"})
}
return
}
if h.restartFunc != nil {
// RFC internal#524 Layer 1: globalGoAsync (see Set()).
wsID := workspaceID
globalGoAsync(func() { h.restartFunc(wsID) })
}
if body.Provider == "" {
c.JSON(http.StatusOK, gin.H{"status": "cleared"})
return
}
c.JSON(http.StatusOK, gin.H{"status": "saved", "provider": body.Provider})
}
// Migration 20260528000000_drop_llm_provider_workspace_secret.up.sql
// removes any straggler rows in workspace_secrets (key='LLM_PROVIDER')
// so the table is in the same state as a freshly-provisioned tenant.
@@ -682,151 +682,16 @@ func TestSecretsModel_RoundTrip_KeyIsMODELNotMODEL_PROVIDER(t *testing.T) {
}
}
// ==================== GetProvider / SetProvider (Option B PR-2) ====================
// ==================== GetProvider / SetProvider — RETIRED ====================
//
// Mirror of the GetModel/SetModel suite. Same secret-storage shape (key=
// 'LLM_PROVIDER' instead of 'MODEL_PROVIDER'), same restart-trigger
// contract, same UUID validation gate. We pin the contract symmetrically
// so a future refactor that breaks one without the other shows up in CI.
func TestSecretsGetProvider_Default(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
handler := NewSecretsHandler(nil)
mock.ExpectQuery("SELECT encrypted_value, encryption_version FROM workspace_secrets").
WithArgs("ws-prov").
WillReturnError(sql.ErrNoRows)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "ws-prov"}}
c.Request = httptest.NewRequest("GET", "/workspaces/ws-prov/provider", nil)
handler.GetProvider(c)
if w.Code != http.StatusOK {
t.Errorf("expected status 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("failed to parse response: %v", err)
}
if resp["provider"] != "" {
t.Errorf("expected empty provider, got %v", resp["provider"])
}
if resp["source"] != "default" {
t.Errorf("expected source 'default', got %v", resp["source"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
func TestSecretsGetProvider_DBError(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
handler := NewSecretsHandler(nil)
mock.ExpectQuery("SELECT encrypted_value, encryption_version FROM workspace_secrets").
WithArgs("ws-prov-err").
WillReturnError(sql.ErrConnDone)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "ws-prov-err"}}
c.Request = httptest.NewRequest("GET", "/workspaces/ws-prov-err/provider", nil)
handler.GetProvider(c)
if w.Code != http.StatusInternalServerError {
t.Errorf("expected status 500, got %d: %s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
func TestSecretsSetProvider_Upsert(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
restartCalled := make(chan string, 1)
handler := NewSecretsHandler(func(id string) { restartCalled <- id })
mock.ExpectExec(`INSERT INTO workspace_secrets`).
WithArgs("00000000-0000-0000-0000-000000000003", sqlmock.AnyArg(), sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(1, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "00000000-0000-0000-0000-000000000003"}}
c.Request = httptest.NewRequest("PUT", "/workspaces/00000000-0000-0000-0000-000000000003/provider",
strings.NewReader(`{"provider":"minimax"}`))
c.Request.Header.Set("Content-Type", "application/json")
handler.SetProvider(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
select {
case id := <-restartCalled:
if id != "00000000-0000-0000-0000-000000000003" {
t.Errorf("restart called with wrong id: %s", id)
}
case <-time.After(500 * time.Millisecond):
t.Error("restart was not triggered")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
func TestSecretsSetProvider_EmptyClears(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
handler := NewSecretsHandler(func(string) {})
mock.ExpectExec(`DELETE FROM workspace_secrets`).
WithArgs("00000000-0000-0000-0000-000000000004").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "00000000-0000-0000-0000-000000000004"}}
c.Request = httptest.NewRequest("PUT", "/workspaces/00000000-0000-0000-0000-000000000004/provider",
strings.NewReader(`{"provider":""}`))
c.Request.Header.Set("Content-Type", "application/json")
handler.SetProvider(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
func TestSecretsSetProvider_InvalidID(t *testing.T) {
setupTestDB(t)
setupTestRedis(t)
handler := NewSecretsHandler(nil)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "not-a-uuid"}}
c.Request = httptest.NewRequest("PUT", "/workspaces/not-a-uuid/provider",
strings.NewReader(`{"provider":"x"}`))
c.Request.Header.Set("Content-Type", "application/json")
handler.SetProvider(c)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400 for bad UUID, got %d", w.Code)
}
}
// internal#718 P4 closure: the GetProvider/SetProvider suite covered the
// LLM_PROVIDER workspace_secret round-trip. Both handlers and the
// shared setProviderSecret helper were removed when the secret itself
// was retired. The replacement endpoint behavior (410 Gone with a
// structured body) is covered by
// `llm_provider_removal_p4_test.go::TestPutProvider_410Gone`,
// `TestGetProvider_410Gone`, and
// `TestProviderEndpointGone_BodyShape`.
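A minimal sketch of the replacement behavior described above, assuming a plain net/http handler rather than the gin handler the PR registers. The name providerEndpointGoneSketch and the goneBody fields are hypothetical; the real structured body is not shown in this hunk.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// goneBody is a HYPOTHETICAL response shape used only for illustration.
// The actual body served by providerEndpointGone is defined elsewhere in the PR.
type goneBody struct {
	Error  string `json:"error"`
	Detail string `json:"detail"`
}

// providerEndpointGoneSketch answers any method with 410 Gone plus a JSON
// body, so an older client that still calls PUT /provider fails loudly
// instead of appearing to save.
func providerEndpointGoneSketch(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusGone)
	_ = json.NewEncoder(w).Encode(goneBody{
		Error:  "gone",
		Detail: "the LLM_PROVIDER workspace secret was retired",
	})
}

// callGone sends one request to the sketch handler and returns the status
// code together with the decoded body.
func callGone(method string) (int, goneBody) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(method, "/workspaces/ws-1/provider", nil)
	providerEndpointGoneSketch(rec, req)
	var body goneBody
	_ = json.Unmarshal(rec.Body.Bytes(), &body)
	return rec.Code, body
}

func main() {
	for _, method := range []string{http.MethodGet, http.MethodPut} {
		code, body := callGone(method)
		fmt.Println(method, code, body.Error)
	}
}
```

Routing both GET and PUT to one handler keeps the failure mode identical for readers and writers of the retired secret.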
// ==================== Values — Phase 30.2 decrypted pull ====================
@@ -975,14 +840,20 @@ func TestSecretsValues_ValidTokenReturnsDecryptedMerge(t *testing.T) {
}
}
// TestSecretsValues_ByokStripsGlobalLLMCred is the internal#711 regression
// guard for the remote-pull injection vector. A non-platform (byok) workspace
// that pulls its secrets via GET /workspaces/:id/secrets/values must NOT
// receive the platform's scope:global CLAUDE_CODE_OAUTH_TOKEN — that key is
// of global_secrets provenance and is dropped by the provider-aware gate.
// Its OWN ANTHROPIC_API_KEY (a workspace_secrets row) survives, and unrelated
// non-LLM global secrets are untouched.
func TestSecretsValues_ByokStripsGlobalLLMCred(t *testing.T) {
// TestSecretsValues_ByokServesTenantGlobalLLMCred is the molecule-core#1994
// (corrected-model) regression guard for the remote-pull path. `global_secrets`
// is the TENANT's store, so a byok workspace's pull MUST include the tenant's
// own global-scope LLM credential — that is exactly what byok runs on, direct.
//
// Pre-fix (internal#711) this path STRIPPED the global-origin oauth on byok,
// resting on the inverted premise that a global LLM cred was "the platform's
// own"; that killed legitimate byok workspaces whose oauth lived at global
// scope. The strip is removed: the merged bundle (tenant globals + workspace
// overrides) is served verbatim.
//
// Mutation: re-add the byok global-LLM-cred strip in secrets.go Values() →
// CLAUDE_CODE_OAUTH_TOKEN disappears from the body → this test RED.
func TestSecretsValues_ByokServesTenantGlobalLLMCred(t *testing.T) {
mock := setupTestDB(t)
handler := NewSecretsHandler(nil)
@@ -995,21 +866,18 @@ func TestSecretsValues_ByokStripsGlobalLLMCred(t *testing.T) {
mock.ExpectExec(`UPDATE workspace_auth_tokens SET last_used_at`).
WithArgs("tok-1").
WillReturnResult(sqlmock.NewResult(0, 1))
// global_secrets holds the platform's scope:global OAuth token + a
// non-LLM operator global (should be untouched).
// global_secrets holds the TENANT's own global-scope OAuth token (shared
// across all the tenant's workspaces) + a non-LLM global.
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM global_secrets`).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("CLAUDE_CODE_OAUTH_TOKEN", []byte("PLATFORM-GLOBAL-OAUTH"), 0).
AddRow("CLAUDE_CODE_OAUTH_TOKEN", []byte("TENANT-OWN-GLOBAL-OAUTH"), 0).
AddRow("SENTRY_DSN", []byte("https://sentry.example/123"), 0))
// The workspace brought its OWN Anthropic API key via the Secrets tab.
// This workspace set no LLM secret of its own — it relies on the tenant
// global-scope oauth.
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id`).
WithArgs(testWsID).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("ANTHROPIC_API_KEY", []byte("CUSTOMER-OWN-ANTHROPIC-KEY"), 0))
// Resolver: this workspace is byok.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(testWsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
AddRow("MODEL", []byte("opus"), 0))
w := httptest.NewRecorder()
c := secretsValuesRequest(w, "Bearer good-token")
@@ -1020,13 +888,13 @@ func TestSecretsValues_ByokStripsGlobalLLMCred(t *testing.T) {
}
var body map[string]string
_ = json.Unmarshal(w.Body.Bytes(), &body)
// 1. Platform global OAuth token stripped — the leak is closed on the pull path.
if got, ok := body["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN = %q present — platform scope:global token must be stripped for byok pull", got)
// 1. The tenant's own global-scope OAuth token SURVIVES — byok runs on it.
if body["CLAUDE_CODE_OAUTH_TOKEN"] != "TENANT-OWN-GLOBAL-OAUTH" {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN = %q, want the tenant's own global-scope token served for byok pull", body["CLAUDE_CODE_OAUTH_TOKEN"])
}
// 2. The workspace's own LLM key survives.
if body["ANTHROPIC_API_KEY"] != "CUSTOMER-OWN-ANTHROPIC-KEY" {
t.Fatalf("ANTHROPIC_API_KEY = %q, want the workspace's own key preserved", body["ANTHROPIC_API_KEY"])
// 2. The workspace's own non-LLM secret survives.
if body["MODEL"] != "opus" {
t.Fatalf("MODEL = %q, want opus preserved", body["MODEL"])
}
// 3. Unrelated non-LLM global secrets are untouched.
if body["SENTRY_DSN"] != "https://sentry.example/123" {
@@ -1111,6 +979,49 @@ func TestSetGlobal_AutoRestartsAffectedWorkspaces(t *testing.T) {
}
}
// TestSetGlobal_RejectsPlatformBypassKeyOnPlatformManagedTenant is the
// molecule-core#1994 co-mingling GUARD regression. Removing the byok strip is
// only safe if the platform's own credential is never written into a tenant's
// global_secrets. SetGlobal is the in-code write boundary: on a tenant whose
// resolved LLM mode is platform_managed (the metered default), a direct
// vendor / oauth bypass-list key MUST be rejected (400) and NOT persisted —
// the tenant is supposed to route through the CP proxy, not carry a direct
// platform-shaped credential at global scope. This is what keeps a
// platform-origin token out of global_secrets going forward.
//
// (On a byok/disabled tenant the same write is ALLOWED — that key is the
// tenant's OWN credential, which the corrected model expects at global scope.
// TestSetGlobal_AutoRestartsAffectedWorkspaces covers that allowed path.)
//
// Mutation: drop the rejectPlatformManagedDirectLLMBypass guard from SetGlobal
// → the write reaches the INSERT (no 400) → this test RED.
func TestSetGlobal_RejectsPlatformBypassKeyOnPlatformManagedTenant(t *testing.T) {
setupTestDB(t)
handler := NewSecretsHandler(nil)
// Org/tenant default is platform_managed — the metered path. A direct
// vendor key write into global_secrets must be refused here.
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"key":"CLAUDE_CODE_OAUTH_TOKEN","value":"sk-ant-oat01-platform-shaped"}`
c.Request = httptest.NewRequest("POST", "/admin/secrets", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.SetGlobal(c)
if w.Code != http.StatusBadRequest {
t.Fatalf("expected 400 (bypass-list key rejected for platform_managed tenant), got %d: %s", w.Code, w.Body.String())
}
if !strings.Contains(w.Body.String(), "blocked") {
t.Errorf("response should explain the block; got %s", w.Body.String())
}
// No INSERT was expected on the mock — sqlmock would error on an
// unexpected ExecContext, so reaching here with a 400 proves the write
// was refused before the DB.
}
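The guard this test pins can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's rejectPlatformManagedDirectLLMBypass: the function name, the contents of the bypass list beyond CLAUDE_CODE_OAUTH_TOKEN, and the error wording are all hypothetical.

```go
package main

import "fmt"

const (
	modePlatformManaged = "platform_managed"
	modeBYOK            = "byok"
)

// directLLMBypassKeys is an ASSUMED bypass list for illustration. Only
// CLAUDE_CODE_OAUTH_TOKEN is confirmed by the test above; ANTHROPIC_API_KEY
// is included as a plausible direct vendor key.
var directLLMBypassKeys = map[string]bool{
	"CLAUDE_CODE_OAUTH_TOKEN": true,
	"ANTHROPIC_API_KEY":       true,
}

// rejectDirectLLMKeySketch returns an error when a platform_managed tenant
// tries to store a direct vendor / oauth credential at global scope. Any
// other billing mode, and any key off the bypass list, passes through.
func rejectDirectLLMKeySketch(billingMode, key string) error {
	if billingMode == modePlatformManaged && directLLMBypassKeys[key] {
		return fmt.Errorf("key %s is blocked for a platform_managed tenant", key)
	}
	return nil
}

func main() {
	fmt.Println(rejectDirectLLMKeySketch(modePlatformManaged, "CLAUDE_CODE_OAUTH_TOKEN"))
	fmt.Println(rejectDirectLLMKeySketch(modeBYOK, "CLAUDE_CODE_OAUTH_TOKEN"))
	fmt.Println(rejectDirectLLMKeySketch(modePlatformManaged, "SENTRY_DSN"))
}
```

Running the check before any database call is what lets the test assert refusal purely from the 400 response, with no INSERT expectation on the mock.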
// TestDeleteGlobal_AutoRestartsAffectedWorkspaces covers the delete branch of #15.
func TestDeleteGlobal_AutoRestartsAffectedWorkspaces(t *testing.T) {
mock := setupTestDB(t)
@@ -95,6 +95,38 @@ type modelSpec struct {
Name string `json:"name,omitempty" yaml:"name"`
Provider string `json:"provider,omitempty" yaml:"provider"`
RequiredEnv []string `json:"required_env,omitempty" yaml:"required_env"`
// BillingMode is the billing source the DERIVED provider implies:
// "platform_managed" (the closed core-only platform provider; Molecule
// owns the upstream key + the bill) or "byok" (any other provider; the
// tenant supplies its own key). Set ONLY on registry-served models
// (RegistryModels) where DeriveProvider resolved an owning provider;
// empty on template-served models. internal#718 P3 — the canvas reads
// this to show the billing-mode of the DERIVED provider instead of its
// hardcoded billingModeForProvider rule.
BillingMode string `json:"billing_mode,omitempty" yaml:"-"`
}
// registryProviderView is the canvas-facing projection of a single registry
// Provider entry for a registry-known runtime: the stable name, the dropdown
// display label, the auth-env-var NAMES (never values), and the billing mode
// the provider implies. Sourced from the provider registry
// (internal/providers) so the canvas drops its hardcoded VENDOR_LABELS map
// and billingModeForProvider rule (internal#718 P3, retire-list #4/#5).
type registryProviderView struct {
// Name is the registry provider key (e.g. "anthropic-oauth", "platform").
Name string `json:"name"`
// DisplayName is the canvas dropdown label (registry Provider.DisplayName).
DisplayName string `json:"display_name,omitempty"`
// AuthEnv is the env-var NAMES any one of which satisfies auth for this
// provider (registry Provider.AuthEnv). Names only, never secret values.
AuthEnv []string `json:"auth_env,omitempty"`
// BillingMode is "platform_managed" for the closed platform provider,
// "byok" otherwise — keyed off the registry IsPlatform predicate so the
// canvas shows the DERIVED provider's billing source.
BillingMode string `json:"billing_mode,omitempty"`
// Deprecated mirrors the registry's deprecated flag so the canvas can
// grey the provider out without breaking saved configs.
Deprecated bool `json:"deprecated,omitempty"`
}
// providerRegistryEntry mirrors a row from a template's top-level
@@ -162,8 +194,29 @@ type templateSummary struct {
// (omitempty); the canvas's existing per-model fallback continues
// to work for them.
ProviderRegistry []providerRegistryEntry `json:"provider_registry,omitempty"`
Skills []string `json:"skills"`
SkillCount int `json:"skill_count"`
// RegistryBacked is true when this template's runtime is known to the
// provider registry (internal/providers runtimes: block) and the
// RegistryProviders / RegistryModels fields below were populated from it.
// The canvas treats a registry-backed payload as AUTHORITATIVE for the
// selectable provider+model list (it drops its prefix-inference fallback)
// — "only registered selectable" follows because the canvas can render
// no option the registry did not serve. False = the runtime is not in the
// registry (federation / external / mock); the canvas keeps using the
// template-served Models/Providers + its heuristic. internal#718 P3.
RegistryBacked bool `json:"registry_backed,omitempty"`
// RegistryProviders is the runtime's NATIVE provider set from the
// registry (ProvidersForRuntime), each with its display label, auth-env
// names, and billing mode. Empty when !RegistryBacked. This is the SSOT
// the canvas Provider dropdown consumes instead of VENDOR_LABELS.
RegistryProviders []registryProviderView `json:"registry_providers,omitempty"`
// RegistryModels is the runtime's NATIVE model set from the registry
// (ModelsForRuntime), each annotated with its DERIVED provider and the
// billing mode that provider implies. Empty when !RegistryBacked. This is
// the SSOT the canvas Model dropdown consumes — a template can no longer
// surface a model the registry does not list for the runtime.
RegistryModels []modelSpec `json:"registry_models,omitempty"`
Skills []string `json:"skills"`
SkillCount int `json:"skill_count"`
// ProvisionTimeoutSeconds lets a slow runtime declare its expected
// cold-boot duration in its template manifest. Canvas's
// ProvisioningTimeout banner respects this per-workspace via the
@@ -243,9 +296,13 @@ func (h *TemplatesHandler) List(c *gin.Context) {
log.Printf("templates list: skip %s: yaml.Unmarshal: %v", id, err)
return
}
// normalizedRuntime strips the "-default" vanilla-variant suffix
// (claude-code-default → claude-code). Hoisted out of the
// known-runtime guard so the registry enrichment below can key off
// the same normalised name the guard validated.
normalizedRuntime := strings.TrimSuffix(strings.TrimSpace(raw.Runtime), "-default")
if raw.Runtime != "" {
runtime := strings.TrimSuffix(strings.TrimSpace(raw.Runtime), "-default")
if _, ok := knownRuntimes[runtime]; !ok {
if _, ok := knownRuntimes[normalizedRuntime]; !ok {
log.Printf("templates list: skip %s: unsupported runtime %q", id, raw.Runtime)
return
}
@@ -262,7 +319,7 @@ func (h *TemplatesHandler) List(c *gin.Context) {
tier = h.wh.DefaultTier()
}
templates = append(templates, templateSummary{
summary := templateSummary{
ID: id,
Name: raw.Name,
Description: raw.Description,
@@ -277,7 +334,17 @@ func (h *TemplatesHandler) List(c *gin.Context) {
Skills: raw.Skills,
SkillCount: len(raw.Skills),
ProvisionTimeoutSeconds: raw.RuntimeConfig.ProvisionTimeoutSeconds,
})
}
// internal#718 P3: serve the SELECTABLE provider/model list from
// the provider registry for a registry-known runtime. Additive —
// the template-served Models/Providers above stay for non-registry
// runtimes + older canvases; this adds the authoritative
// registry_backed/registry_providers/registry_models block the
// current canvas prefers. Fail-open for unknown runtimes.
enrichFromRegistry(&summary, normalizedRuntime)
templates = append(templates, summary)
})
}
walk(h.cacheDir)
@@ -0,0 +1,112 @@
package handlers
// templates_registry.go — internal#718 P3: serve the GET /templates selectable
// provider/model list FROM the provider registry (workspace-server/internal/
// providers) instead of from each template's hand-authored config.yaml
// `providers:` / `runtime_config.models` block.
//
// The registry (P2-A synced copy of the canonical CP providers.yaml) is the
// SSOT for "which providers + models does runtime R natively support" and
// "which derived provider owns model M" (DeriveProvider) and "is that provider
// the closed platform set" (IsPlatform). This file projects that into the
// templates payload's registry_backed / registry_providers / registry_models
// fields so the canvas can drop its hardcoded VENDOR_LABELS /
// billingModeForProvider vocabularies (retire-list #4/#5) and physically can't
// render an option the registry didn't serve.
//
// Federation-ready, fail-OPEN: a runtime ABSENT from the registry's runtimes:
// block (external / mock / kimi / a future third-party runtime) yields
// RegistryBacked=false and an empty registry block — the template's own fields
// stay authoritative. No behavior change for non-registry runtimes.
//
// NOTE: this reuses the package-level providerRegistry() accessor +
// LLMBillingModePlatformManaged / LLMBillingModeBYOK constants from
// llm_billing_mode.go (added by P2-B, internal#718 #1972, now on main) — both
// the billing-derivation and this templates projection wrap the same
// providers.LoadManifest() SSOT and the same platform_managed/byok wire
// strings, so there is one accessor + one constant set for the package.
import (
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/providers"
)
// billingModeForRegistryProvider maps a registry Provider to the billing mode
// it implies: platform_managed for the closed core-only platform provider,
// byok for everything else. It keys off the registry IsPlatform predicate,
// the same predicate billing/credential emission (llm_billing_mode.go)
// applies to the DERIVED provider, so the canvas shows the true billing
// source of the resolved provider. Returns the same LLMBillingMode* wire
// strings the Config tab's billing-mode switch sends.
func billingModeForRegistryProvider(p providers.Provider) string {
if p.IsPlatform() {
return LLMBillingModePlatformManaged
}
return LLMBillingModeBYOK
}
// enrichFromRegistry populates the registry-served fields on a templateSummary
// when its runtime is known to the provider registry. It is a no-op (leaves
// RegistryBacked=false and the registry slices nil) for a runtime the registry
// does not know — the federation/fail-open path.
//
// runtime is the template's already-normalised runtime string (the caller
// strips the "-default" suffix before calling, matching List's existing
// knownRuntimes check).
func enrichFromRegistry(summary *templateSummary, runtime string) {
m, err := providerRegistry()
if err != nil || m == nil {
return // fail open — registry load defect; keep template-served fields.
}
provs, err := m.ProvidersForRuntime(runtime)
if err != nil {
// Runtime not in the registry runtimes: block (external / mock / kimi
// / future third-party). Fail open: the template's own fields stay
// authoritative; no registry annotation.
return
}
// registry_providers — the runtime's native provider set, in registry
// declared order, projected to the canvas-facing view.
views := make([]registryProviderView, 0, len(provs))
for _, p := range provs {
views = append(views, registryProviderView{
Name: p.Name,
DisplayName: p.DisplayName,
AuthEnv: p.AuthEnv,
BillingMode: billingModeForRegistryProvider(p),
Deprecated: p.Deprecated,
})
}
// registry_models — the runtime's native model ids, each annotated with
// the DERIVED owning provider + the billing mode it implies. DeriveProvider
// is the SSOT for model→provider; we pass nil availableAuthEnv because a
// template manifest has no per-workspace auth env, and the registry's
// exact-id mapping resolves every native model id unambiguously (the
// claude-code kimi split is by exact id, not a shared prefix).
models, err := m.ModelsForRuntime(runtime)
if err != nil {
// ProvidersForRuntime succeeded but ModelsForRuntime did not — should
// be impossible (both gate on the same Runtimes entry), but fail open
// rather than serve a half-populated block.
return
}
regModels := make([]modelSpec, 0, len(models))
for _, id := range models {
ms := modelSpec{ID: id}
if derived, derr := m.DeriveProvider(runtime, id, nil); derr == nil {
ms.Provider = derived.Name
ms.BillingMode = billingModeForRegistryProvider(derived)
}
// If DeriveProvider errors (ambiguous/overlap — a manifest defect the
// loader's tests pin against), still serve the id without a provider
// annotation rather than dropping it; the canvas treats an
// un-annotated registry model as selectable-but-unlabelled.
regModels = append(regModels, ms)
}
summary.RegistryBacked = true
summary.RegistryProviders = views
summary.RegistryModels = regModels
}
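The fail-open projection in enrichFromRegistry can be exercised in isolation with a stub registry. The sketch below is a simplified stand-in: the registry, provider, summary, and modelView types are hypothetical and much smaller than the real internal/providers manifest, and the owner lookup replaces DeriveProvider with a plain map.

```go
package main

import (
	"errors"
	"fmt"
)

// provider is a HYPOTHETICAL, reduced form of the registry Provider.
type provider struct {
	Name     string
	Platform bool
}

// registry is a stub for the provider manifest: native model ids per runtime
// and the owning provider per model id.
type registry struct {
	modelsByRuntime map[string][]string
	ownerByModel    map[string]provider
}

var errUnknownRuntime = errors.New("runtime not in registry")

func (r registry) modelsForRuntime(runtime string) ([]string, error) {
	ids, ok := r.modelsByRuntime[runtime]
	if !ok {
		return nil, errUnknownRuntime
	}
	return ids, nil
}

type modelView struct {
	ID, Provider, BillingMode string
}

type summary struct {
	RegistryBacked bool
	RegistryModels []modelView
}

// billingMode mirrors the platform_managed / byok split described above.
func billingMode(p provider) string {
	if p.Platform {
		return "platform_managed"
	}
	return "byok"
}

// enrichSketch fills the registry fields for a known runtime and leaves the
// summary untouched (fail open) for an unknown one.
func enrichSketch(s *summary, r registry, runtime string) {
	ids, err := r.modelsForRuntime(runtime)
	if err != nil {
		return // unknown runtime: template-served fields stay authoritative
	}
	views := make([]modelView, 0, len(ids))
	for _, id := range ids {
		v := modelView{ID: id}
		if p, ok := r.ownerByModel[id]; ok {
			v.Provider = p.Name
			v.BillingMode = billingMode(p)
		}
		// A model without a resolvable owner is still served, unannotated.
		views = append(views, v)
	}
	s.RegistryBacked = true
	s.RegistryModels = views
}

// demoRegistry builds a two-model registry for one runtime.
func demoRegistry() registry {
	return registry{
		modelsByRuntime: map[string][]string{
			"claude-code": {"claude-opus-4-7", "anthropic/claude-opus-4-7"},
		},
		ownerByModel: map[string]provider{
			"claude-opus-4-7":           {Name: "anthropic-api"},
			"anthropic/claude-opus-4-7": {Name: "platform", Platform: true},
		},
	}
}

func main() {
	var known, unknown summary
	enrichSketch(&known, demoRegistry(), "claude-code")
	enrichSketch(&unknown, demoRegistry(), "mock")
	fmt.Printf("%+v\n%+v\n", known, unknown)
}
```

Setting RegistryBacked only after every lookup has succeeded is the detail that avoids serving a half-populated block.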
@@ -1329,3 +1329,228 @@ func TestCWE78_DeleteFile_TraversalVariants(t *testing.T) {
})
}
}
// ============================================================================
// internal#718 P3 — GET /templates serves the selectable provider/model list
// FROM the provider registry (workspace-server/internal/providers), not from
// each template's hand-authored config.yaml. Additive: the registry-served
// fields (registry_backed / registry_providers / registry_models) ride
// ALONGSIDE the existing template-served fields so non-registry runtimes and
// older canvases keep working. The canvas (PR-B) prefers the registry block;
// "only registered selectable" follows because the registry block is the
// authoritative list for a registry-known runtime.
// ============================================================================
// TestTemplatesList_RegistryServesSelectableModels pins the core P3 contract:
// for a runtime the provider registry knows (claude-code), /templates serves
// the registry's NATIVE model ids — regardless of what the template's
// config.yaml runtime_config.models happens to list. A template author can no
// longer surface an unregistered model into the canvas dropdown.
func TestTemplatesList_RegistryServesSelectableModels(t *testing.T) {
tmpDir := t.TempDir()
tmplDir := filepath.Join(tmpDir, "claude-code-default")
if err := os.MkdirAll(tmplDir, 0755); err != nil {
t.Fatalf("mkdir: %v", err)
}
// Deliberately list a BOGUS model the registry does not know. The
// registry-served list must NOT contain it.
configYaml := `name: Claude Code
runtime: claude-code
runtime_config:
model: claude-sonnet-4-6
models:
- id: totally-made-up-model
name: Not In Registry
skills: []
`
if err := os.WriteFile(filepath.Join(tmplDir, "config.yaml"), []byte(configYaml), 0644); err != nil {
t.Fatalf("write: %v", err)
}
handler := NewTemplatesHandler(tmpDir, nil, nil)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = httptest.NewRequest("GET", "/templates", nil)
handler.List(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d", w.Code)
}
var resp []templateSummary
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("parse: %v", err)
}
if len(resp) != 1 {
t.Fatalf("expected 1 template, got %d", len(resp))
}
got := resp[0]
if !got.RegistryBacked {
t.Fatalf("claude-code is a registry-known runtime; RegistryBacked must be true")
}
// The registry-served model set must be the claude-code native set
// (anthropic-oauth: sonnet/opus/haiku, anthropic-api: claude-*-4-*,
// kimi-coding: kimi-*, minimax: MiniMax-*, platform: vendor/model ids).
// It must NOT contain the template's bogus id.
regModelIDs := map[string]bool{}
for _, m := range got.RegistryModels {
regModelIDs[m.ID] = true
}
if regModelIDs["totally-made-up-model"] {
t.Errorf("RegistryModels leaked the template's unregistered model id")
}
for _, want := range []string{"sonnet", "opus", "claude-opus-4-7", "anthropic/claude-opus-4-7"} {
if !regModelIDs[want] {
t.Errorf("RegistryModels missing native model %q; got %v", want, regModelIDs)
}
}
}
// TestTemplatesList_RegistryAnnotatesDerivedProviderAndBilling pins that each
// registry-served model carries its DERIVED provider name + a billing_mode
// reflecting whether that derived provider is the closed platform set
// (platform_managed) or BYOK (byok). This is what the canvas Config tab reads
// to show the billing-mode of the DERIVED provider (folds in #1931 intent),
// instead of its hardcoded billingModeForProvider rule.
func TestTemplatesList_RegistryAnnotatesDerivedProviderAndBilling(t *testing.T) {
tmpDir := t.TempDir()
tmplDir := filepath.Join(tmpDir, "claude-code-default")
if err := os.MkdirAll(tmplDir, 0755); err != nil {
t.Fatalf("mkdir: %v", err)
}
configYaml := `name: Claude Code
runtime: claude-code
runtime_config:
model: claude-sonnet-4-6
skills: []
`
if err := os.WriteFile(filepath.Join(tmplDir, "config.yaml"), []byte(configYaml), 0644); err != nil {
t.Fatalf("write: %v", err)
}
handler := NewTemplatesHandler(tmpDir, nil, nil)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = httptest.NewRequest("GET", "/templates", nil)
handler.List(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d", w.Code)
}
var resp []templateSummary
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("parse: %v", err)
}
got := resp[0]
billByModel := map[string]string{}
provByModel := map[string]string{}
for _, m := range got.RegistryModels {
billByModel[m.ID] = m.BillingMode
provByModel[m.ID] = m.Provider
}
// A BYOK API model derives to anthropic-api → byok.
if provByModel["claude-opus-4-7"] != "anthropic-api" {
t.Errorf("claude-opus-4-7 derived provider: want anthropic-api, got %q", provByModel["claude-opus-4-7"])
}
if billByModel["claude-opus-4-7"] != "byok" {
t.Errorf("claude-opus-4-7 billing_mode: want byok, got %q", billByModel["claude-opus-4-7"])
}
// A platform-namespaced model derives to the closed platform provider →
// platform_managed.
if provByModel["anthropic/claude-opus-4-7"] != "platform" {
t.Errorf("anthropic/claude-opus-4-7 derived provider: want platform, got %q", provByModel["anthropic/claude-opus-4-7"])
}
if billByModel["anthropic/claude-opus-4-7"] != "platform_managed" {
t.Errorf("anthropic/claude-opus-4-7 billing_mode: want platform_managed, got %q", billByModel["anthropic/claude-opus-4-7"])
}
// registry_providers carries the provider display_name + auth_env +
// billing_mode for the dropdown labels — sourced from the registry, not
// the canvas VENDOR_LABELS map.
byName := map[string]registryProviderView{}
for _, p := range got.RegistryProviders {
byName[p.Name] = p
}
oauth, ok := byName["anthropic-oauth"]
if !ok {
t.Fatalf("registry_providers missing anthropic-oauth; got %v", byName)
}
if oauth.DisplayName != "Claude Code subscription" {
t.Errorf("anthropic-oauth display_name: want %q, got %q", "Claude Code subscription", oauth.DisplayName)
}
if oauth.BillingMode != "byok" {
t.Errorf("anthropic-oauth billing_mode: want byok, got %q", oauth.BillingMode)
}
if len(oauth.AuthEnv) != 1 || oauth.AuthEnv[0] != "CLAUDE_CODE_OAUTH_TOKEN" {
t.Errorf("anthropic-oauth auth_env: want [CLAUDE_CODE_OAUTH_TOKEN], got %v", oauth.AuthEnv)
}
plat, ok := byName["platform"]
if !ok || plat.BillingMode != "platform_managed" {
t.Errorf("platform provider billing_mode: want platform_managed, got %+v", plat)
}
}
// TestTemplatesList_NonRegistryRuntimeFallsOpenToTemplate pins federation-
// readiness: for a runtime the registry does NOT know (a hypothetical
// third-party / external-like runtime), /templates does NOT set
// RegistryBacked and does NOT synthesize a registry block — the template's
// own config.yaml fields remain the source, unchanged. No behavior change for
// non-registry runtimes.
func TestTemplatesList_NonRegistryRuntimeFallsOpenToTemplate(t *testing.T) {
tmpDir := t.TempDir()
tmplDir := filepath.Join(tmpDir, "byo-runtime")
if err := os.MkdirAll(tmplDir, 0755); err != nil {
t.Fatalf("mkdir: %v", err)
}
// "mock" is on the manifest's known-runtime allowlist (so List doesn't
// skip it) but is NOT in the provider registry's runtimes: block.
configYaml := `name: Mock Runtime
runtime: mock
runtime_config:
model: canned-reply
providers: [some-byo-provider]
models:
- id: canned-reply
name: Canned Reply
skills: []
`
if err := os.WriteFile(filepath.Join(tmplDir, "config.yaml"), []byte(configYaml), 0644); err != nil {
t.Fatalf("write: %v", err)
}
handler := NewTemplatesHandler(tmpDir, nil, nil)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = httptest.NewRequest("GET", "/templates", nil)
handler.List(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d", w.Code)
}
var resp []templateSummary
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("parse: %v", err)
}
if len(resp) != 1 {
t.Fatalf("expected 1 template, got %d", len(resp))
}
got := resp[0]
if got.RegistryBacked {
t.Errorf("mock is NOT a registry runtime; RegistryBacked must be false")
}
if len(got.RegistryModels) != 0 || len(got.RegistryProviders) != 0 {
t.Errorf("non-registry runtime must not synthesize a registry block; got models=%v providers=%v",
got.RegistryModels, got.RegistryProviders)
}
// Template-served fields untouched.
if len(got.Models) != 1 || got.Models[0].ID != "canned-reply" {
t.Errorf("template Models must be unchanged: got %+v", got.Models)
}
if !reflect.DeepEqual(got.Providers, []string{"some-byo-provider"}) {
t.Errorf("template Providers must be unchanged: got %v", got.Providers)
}
}
+76 -27
@@ -428,6 +428,54 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
return
}
// internal#718 P4 PR-2: ONLY-REGISTERED validation at the create boundary —
// FLIPPED from WARN to HARD-REJECT (was the P2-B WARN-mode signal).
//
// For a runtime the provider registry knows (first-party:
// claude-code/codex/hermes/openclaw) this checks the (runtime, model) pair
// against the registry's native model set. Fails OPEN for runtimes the
// registry doesn't know (langgraph/external/kimi/mock/federated) so
// non-first-party / federated flows are UNCHANGED. Skipped for external
// workspaces (the URL is the contract, not the model — see MODEL_REQUIRED
// rationale above).
//
// THE FLIP (was WARN, now 422):
// * P2-B carried the gate in WARN mode (X-Molecule-Model-Unregistered
// response header + log line, create proceeds) because the legacy
// colon-namespaced BYOK vocabulary ('anthropic:claude-opus-4-7' etc.)
// was live across the create corpus but not yet in the registry's
// exact-id model sets — hard-rejecting would have 422'd legitimate
// existing flows.
// * P4 PR-1 reconciled that colon vocab into the registry as
// first-class native-set entries (each runtime native set now lists
// both bare/slash AND colon forms for the BYOK ids the live corpus
// uses; openclaw's pre-existing colon-form precedent extended to
// claude-code). DeriveProvider / Manifest.ModelsForRuntime now
// resolves every legitimate model in the corpus.
// * With the reconcile landed, an unregistered (runtime, model) pair
// is a real misconfiguration — the corpus has no legitimate model
// this validator now rejects. We flip to 422
// UNREGISTERED_MODEL_FOR_RUNTIME so the caller fails LOUDLY at the
// boundary instead of provisioning a workspace that will wedge at
// adapter init (the codex 'anthropic:claude-opus-4-7' wedge class
// the MODEL_REQUIRED gate also targets).
//
// The registry model set is code-generated from the canonical
// providers.yaml (P2-A artifact); the check stays in sync with the SSOT
// via the verify-providers-gen + sync-providers-yaml CI gates.
if !isExternal {
if ok, why := validateRegisteredModelForRuntime(payload.Runtime, payload.Model); !ok {
log.Printf("Create: 422 UNREGISTERED_MODEL_FOR_RUNTIME (runtime=%q model=%q): %s [internal#718 P4 PR-2 hard-reject]", payload.Runtime, payload.Model, why)
c.JSON(http.StatusUnprocessableEntity, gin.H{
"error": why,
"runtime": payload.Runtime,
"model": payload.Model,
"code": "UNREGISTERED_MODEL_FOR_RUNTIME",
})
return
}
}
ctx := c.Request.Context()
// Convert empty role to NULL
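The create-boundary check described in this hunk can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's validateRegisteredModelForRuntime: the `nativeModels` map, its two entries, and the `validateModel` name are hypothetical stand-ins for the code-generated registry model sets.

```go
package main

import "fmt"

// nativeModels is a HYPOTHETICAL stand-in for the registry's per-runtime
// native model sets. The real sets are code-generated from providers.yaml.
var nativeModels = map[string]map[string]struct{}{
	"claude-code": {
		"claude-opus-4-7":           {}, // bare form
		"anthropic:claude-opus-4-7": {}, // legacy colon-namespaced form
	},
}

// validateModel reports whether the (runtime, model) pair is acceptable and,
// when it is not, a human-readable reason.
func validateModel(runtime, model string) (bool, string) {
	models, known := nativeModels[runtime]
	if !known {
		// Fail open: a runtime the registry does not know is left unchanged.
		return true, ""
	}
	if _, ok := models[model]; ok {
		return true, ""
	}
	return false, fmt.Sprintf("model %q is not registered for runtime %q", model, runtime)
}

func main() {
	cases := []struct{ runtime, model string }{
		{"claude-code", "anthropic:claude-opus-4-7"}, // registered pair
		{"claude-code", "gpt-4"},                     // known runtime, unregistered model
		{"mock", "canned-reply"},                     // runtime unknown to the registry
	}
	for _, tc := range cases {
		ok, why := validateModel(tc.runtime, tc.model)
		fmt.Printf("runtime=%s model=%s ok=%v why=%q\n", tc.runtime, tc.model, ok, why)
	}
}
```

In the handler, a false result would map to the 422 UNREGISTERED_MODEL_FOR_RUNTIME response, while the fail-open branch keeps federated and third-party runtimes on their existing path.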
@@ -599,38 +647,39 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
return
}
// Persist canvas-selected model + derived provider as workspace
// secrets so they survive restart and are picked up by CP user-data
// when regenerating /configs/config.yaml. Without this, the
// applyRuntimeModelEnv fallback chain (workspace_provision.go)
// cannot recover the user's choice on a Restart payload (which
// rebuilds from the workspaces row, where there is no model column),
// and hermes silently boots with the template-default model. See
// failed-workspace 95ed3ff2 (2026-05-02): canvas POSTed
// minimax/MiniMax-M2.7-highspeed, MODEL_PROVIDER was never written,
// container fell through to nousresearch/hermes-4-70b, derive-
// provider.sh produced the wrong provider, hermes gateway 401'd,
// /health poll failed, molecule-runtime never registered.
// Persist canvas-selected model as the MODEL workspace_secret so it
// survives restart and is picked up by CP user-data when regenerating
// /configs/config.yaml. Without this, the applyRuntimeModelEnv
// fallback chain (workspace_provision.go) cannot recover the user's
// choice on a Restart payload (which rebuilds from the workspaces
// row, where there is no model column), and hermes silently boots
// with the template-default model. See failed-workspace 95ed3ff2
// (2026-05-02): canvas POSTed minimax/MiniMax-M2.7-highspeed,
// MODEL_PROVIDER was never written, container fell through to
// nousresearch/hermes-4-70b, derive-provider.sh produced the wrong
// provider, hermes gateway 401'd, /health poll failed,
// molecule-runtime never registered.
//
// Both writes are non-fatal: a failure here logs and continues so
// the workspace row stays consistent. The runtime can still boot
// (with the template default) and a later Save+Restart will re-
// persist via the SecretsHandler endpoints. The DB error path here
// is rare (the same DB just committed a workspace row a microsecond
// ago) so failing the create response would be unfriendly.
// internal#718 P4 closure: the prior `setProviderSecret` write
// (LLM_PROVIDER row, derived from the canvas-supplied
// payload.LLMProvider OR from deriveProviderFromModelSlug) has been
// REMOVED. The provider is now DERIVED at every decision point from
// (runtime, model) via the registry — billing (P2-B), CP user-data
// (this PR's CP-side commit replaces resolveModelAndProvider's
// env["LLM_PROVIDER"] read with a DeriveProvider call), and
// validation (P3 PR-C provisioner). Storing it would be a dead write
// with no remaining consumer. `payload.LLMProvider` is preserved on
// the request struct for backward compatibility with older canvases
// that still send it; the value is intentionally ignored here.
//
// The setModelSecret write is non-fatal: a failure here logs and
// continues so the workspace row stays consistent. The runtime can
// still boot (with the template default) and a later
// Save+Restart will re-persist via the SecretsHandler endpoints.
if payload.Model != "" {
if err := setModelSecret(ctx, id, payload.Model); err != nil {
log.Printf("Create workspace %s: failed to persist MODEL %q: %v (non-fatal)", id, payload.Model, err)
}
if explicitProvider := strings.TrimSpace(payload.LLMProvider); explicitProvider != "" {
if err := setProviderSecret(ctx, id, explicitProvider); err != nil {
log.Printf("Create workspace %s: failed to persist LLM_PROVIDER %q: %v (non-fatal)", id, explicitProvider, err)
}
} else if derived := deriveProviderFromModelSlug(payload.Model); derived != "" {
if err := setProviderSecret(ctx, id, derived); err != nil {
log.Printf("Create workspace %s: failed to persist LLM_PROVIDER %q: %v (non-fatal)", id, derived, err)
}
}
}
// Insert canvas layout — non-fatal: workspace can be dragged into position later
@@ -126,7 +126,7 @@ func TestWorkspaceCreate_WithInvalidCompute_ReturnsBadRequest(t *testing.T) {
c, _ := gin.CreateTestContext(w)
body := `{
"name":"Oversized Agent",
"model":"gpt-4",
"model":"claude-opus-4-7",
"compute":{"instance_type":"p4d.24xlarge"}
}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
@@ -710,131 +710,21 @@ func (h *WorkspaceHandler) defaultTemplateProvidersYAML(runtime string) string {
return ""
}
// deriveProviderFromModelSlug maps a hermes-agent model slug prefix to
// its provider name — a Go translation of the case statement in
// workspace-configs-templates/hermes/scripts/derive-provider.sh that we
// can run at provision time so LLM_PROVIDER lands in workspace_secrets
// (and from there, into /configs/config.yaml via CP user-data) before
// the container ever boots.
// internal#718 P4 closure — `deriveProviderFromModelSlug` (retire-list #3)
// has been removed together with its only caller (WorkspaceHandler.Create's
// setProviderSecret write) and the LLM_PROVIDER workspace_secret it
// populated.
//
// Returns "" when the prefix isn't recognized OR when the runtime-only
// override would be needed to pick a provider — the caller skips the
// LLM_PROVIDER write in that case so derive-provider.sh keeps the final
// say at boot. derive-provider.sh remains the source of truth: this is
// strictly a *gating* hint that survives restarts and gives CP a YAML
// field to populate. Without it, "Save+Restart" would lose the user's
// provider choice every time CP regenerates the config.
//
// Two intentional differences from the shell version:
//
// 1. nousresearch/* and openai/* both return "openrouter" here. The
// shell script special-cases "prefer nous if HERMES_API_KEY set" /
// "prefer custom if OPENAI_API_KEY set", but those depend on
// runtime env that may not yet be loaded at provision time. We pick
// the safe default ("openrouter" reaches both Hermes 3 and OpenAI
// models without extra config); derive-provider.sh's runtime check
// can still upgrade to nous/custom when the keys are present.
//
// 2. Unknown prefixes return "" instead of "auto". Persisting "auto"
// would block a future "Save+Restart" with a known prefix from
// re-deriving — the CP YAML field is sticky once written. Returning
// "" means the caller skips the write and the runtime falls through
// to derive-provider.sh's *=auto branch on its own.
//
// Cover the same prefix list as derive-provider.sh's case statement;
// keep both files in sync when a new provider is added (table-driven
// test in workspace_provision_shared_test.go pins the mapping).
func deriveProviderFromModelSlug(model string) string {
if model == "" {
return ""
}
idx := strings.Index(model, "/")
if idx <= 0 {
return ""
}
prefix := model[:idx]
switch prefix {
// Direct-SDK providers (clean 1:1 prefix→provider mapping).
case "minimax":
return "minimax"
case "minimax-cn":
return "minimax-cn"
case "anthropic":
return "anthropic"
case "gemini":
return "gemini"
case "deepseek":
return "deepseek"
case "zai":
return "zai"
case "kimi-coding":
return "kimi-coding"
case "kimi-coding-cn":
return "kimi-coding-cn"
case "alibaba", "dashscope", "qwen":
return "alibaba"
case "xiaomi", "mimo":
return "xiaomi"
case "arcee", "arcee-ai":
return "arcee"
case "nvidia", "nim":
return "nvidia"
case "ollama-cloud":
return "ollama-cloud"
case "huggingface", "hf":
return "huggingface"
case "ai-gateway", "aigateway":
return "ai-gateway"
case "kilocode":
return "kilocode"
case "opencode-zen":
return "opencode-zen"
case "opencode-go":
return "opencode-go"
// Aggregator + explicit catch-alls.
case "openrouter":
return "openrouter"
case "custom":
return "custom"
// Runtime-only override candidates. derive-provider.sh's
// HERMES_API_KEY / OPENAI_API_KEY checks happen at boot; we pick the
// safe default (openrouter reaches both Hermes 3 and OpenAI without
// extra config) and let the script upgrade to nous/custom at runtime.
case "nousresearch", "openai":
return "openrouter"
// Additional 1:1 prefix→provider mappings — kept aligned with upstream's
// HERMES_INFERENCE_PROVIDER list (NousResearch/hermes-agent v0.12.0,
// 2026-04-30) and the additional case clauses in derive-provider.sh.
// The drift gate in derive_provider_drift_test.go enforces parity.
case "xai", "grok":
return "xai"
case "bedrock", "aws":
return "bedrock"
case "tencent", "tencent-tokenhub":
return "tencent-tokenhub"
case "gmi":
return "gmi"
case "qwen-oauth":
return "qwen-oauth"
case "lmstudio", "lm-studio":
return "lmstudio"
case "minimax-oauth":
return "minimax-oauth"
case "alibaba-coding-plan":
return "alibaba-coding-plan"
case "google-gemini-cli":
return "google-gemini-cli"
case "openai-codex":
return "openai-codex"
case "copilot-acp":
return "copilot-acp"
case "copilot":
return "copilot"
}
// Unknown prefix → don't persist a guess. derive-provider.sh's
// *=auto fallback handles it at runtime.
return ""
}
// The hand-rolled prefix switch was a Go mirror of
// workspace-configs-templates/hermes/scripts/derive-provider.sh kept in
// sync via a drift test. The replacement is providers.Manifest.DeriveProvider
// (synced in P2-A), which derives the provider from (runtime, model)
// against the registry SSOT at every decision point — billing (P2-B),
// CP user-data emission (this PR's CP-side commit), validation
// (P3 PR-C). The shell script in the hermes template continues to be the
// runtime fallback for unregistered models; codegen of the template's
// providers block from the registry is the P4 follow-up gated on
// registry data growth.
// applyRuntimeModelEnv exposes the workspace's selected model via an
// env var the target runtime's install.sh / start.sh knows to read.
@@ -883,12 +773,7 @@ func applyRuntimeModelEnv(envVars map[string]string, runtime, model string) {
// can no longer confuse a provider slug for a model id. CP-side
// slot-separation (cp#213 + cp#220) merged the analogous fix on
// the CP side; this is the workspace-server companion.
if model == "" {
model = envVars["MOLECULE_MODEL"]
}
if model == "" {
model = envVars["MODEL"]
}
model = effectiveModelForBilling(model, envVars)
if model == "" {
return
}
@@ -921,6 +806,31 @@ func applyRuntimeModelEnv(envVars map[string]string, runtime, model string) {
}
}
// effectiveModelForBilling resolves the picked model id from an explicit
// argument with the SAME fallback chain applyRuntimeModelEnv uses to set the
// container MODEL env: explicit arg → envVars["MOLECULE_MODEL"] →
// envVars["MODEL"] (the workspace_secret). It is the single source of truth
// for "what model is this workspace going to run", shared by both
// applyRuntimeModelEnv (which exports it to the container) and
// applyPlatformManagedLLMEnv (which derives the billing mode from it).
//
// molecule-core#1994: the billing resolver MUST consult the same effective
// model the container will actually run. Pre-fix it used the raw payload.Model
// only, which is "" on a re-provision (the payload is rebuilt from the DB with
// no Model), so it derived from an empty model → defaulted closed to
// platform_managed and diverged from the read endpoint (which reads the stored
// MODEL secret). Returns "" only when no model is resolvable anywhere — the
// legitimate "unset → platform default" case the resolver fails closed on.
func effectiveModelForBilling(model string, envVars map[string]string) string {
if model == "" {
model = envVars["MOLECULE_MODEL"]
}
if model == "" {
model = envVars["MODEL"]
}
return model
}
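A small, self-contained run of the fallback chain may make the re-provision case easier to follow. The function body is copied from the hunk above; the env maps and model ids in `main` are illustrative values chosen for this sketch, not taken from a real workspace.

```go
package main

import "fmt"

// effectiveModelForBilling is copied from the hunk above: explicit argument
// first, then MOLECULE_MODEL, then the stored MODEL secret.
func effectiveModelForBilling(model string, envVars map[string]string) string {
	if model == "" {
		model = envVars["MOLECULE_MODEL"]
	}
	if model == "" {
		model = envVars["MODEL"]
	}
	return model
}

func main() {
	// Hypothetical merged env for a re-provisioned workspace: the payload
	// carries no model, so only the stored MODEL secret is available.
	stored := map[string]string{"MODEL": "minimax/MiniMax-M2.7-highspeed"}

	fmt.Println(effectiveModelForBilling("", stored))                // falls back to the stored secret
	fmt.Println(effectiveModelForBilling("claude-opus-4-7", stored)) // explicit argument wins
	fmt.Printf("%q\n", effectiveModelForBilling("", nil))            // nothing resolvable anywhere
}
```

The last call returns the empty string, which is the "unset, platform default" case the resolver is described as failing closed on.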
// applyPlatformManagedLLMEnv wires the control-plane LLM proxy into a
// workspace only when the RESOLVED billing mode for this workspace is
// platform_managed. "Resolved" means: the workspace-level override (if any)
@@ -944,55 +854,93 @@ func applyRuntimeModelEnv(envVars map[string]string, runtime, model string) {
// answer "what mode is this workspace running under" without DB queries
// (RFC Observability hot-spot).
//
// internal#711 — PROVIDER-AWARE GLOBAL-LLM-CRED GATE. The platform's
// LLM credentials (CLAUDE_CODE_OAUTH_TOKEN + the rest of the
// platformManagedDirectLLMBypassKeys set) live in `global_secrets` and
// are merged into EVERY workspace's env by loadWorkspaceSecrets — that
// merge is provenance-blind. Pre-fix, the non-platform (byok/disabled)
// early-return left envVars untouched, so a BYOK / subscription
// workspace that brought NO LLM credential of its own still inherited
// the platform's scope:global CLAUDE_CODE_OAUTH_TOKEN and ran Opus on
// the platform's (Molecule's) Anthropic credits (Reno Stars SEO +
// Marketing agents, confirmed live 2026-05-27).
// molecule-core#1994 (credential-handling follow-on, CTO-confirmed model).
// `global_secrets` is the TENANT's own secret store, shared across all of
// that tenant's workspaces — it is NOT the platform's. The platform's own
// LLM credential is the CP proxy usage token (MOLECULE_LLM_USAGE_TOKEN),
// injected SEPARATELY on the platform_managed path below; it is never stored
// in any tenant's global_secrets.
//
// The gate: on the non-platform path we strip every platform-managed
// LLM key whose PROVENANCE is `global_secrets` (the globalKeys set).
// A workspace's OWN LLM credential — set via the canvas Secrets tab,
// i.e. a `workspace_secrets` row — has had its global provenance flag
// dropped by loadWorkspaceSecrets, so it is NOT in globalKeys and
// survives. Net effect: platform global LLM creds reach a workspace
// ONLY when its resolved mode is platform_managed; a non-platform
// workspace resolves to its own (workspace-scoped) credential or none.
// Consequently the byok/disabled branch does NOT strip the tenant's
// global-origin LLM creds. Under the corrected model the tenant's own
// credential — whether at global scope (a global_secrets row, e.g. the key
// they configured via the org-import required-env preflight / the settings
// Secrets tab) or at workspace scope (a workspace_secrets row) — is exactly
// what byok must run on, direct. The earlier internal#711 strip rested on the
// inverted premise that a global-scope LLM cred was "the platform's own"; it
// was wrong and it killed legitimate byok workspaces (MISSING_BYOK_CREDENTIAL
// for tenants whose oauth lived at global scope — Reno Stars Marketing agent,
// confirmed live 2026-05-28). Removing the strip is only safe because the
// platform's own credential is never co-mingled into a tenant's global_secrets
// (guarded at the write boundary: SetGlobal rejects bypass-list keys for a
// platform-managed tenant; the platform proxy token is read from server env
// only, never persisted to a tenant store).
//
// The boolean return reports whether, after the gate, the workspace
// still has at least one usable LLM credential. The caller
// (prepareProvisionContext) uses it to FAIL CLOSED — a non-platform
// workspace with no usable LLM credential is aborted with a clear
// MISSING_BYOK_CREDENTIAL error at provision time rather than being
// started on (now-stripped) platform creds.
// The boolean return still reports whether the workspace has at least one
// usable LLM credential. The caller (prepareProvisionContext) uses it to FAIL
// CLOSED — a byok workspace with no usable LLM credential at ANY scope is
// aborted with a clear MISSING_BYOK_CREDENTIAL error at provision time rather
// than started credential-less.
// platformLLMEnvResult is the structured outcome of applyPlatformManagedLLMEnv.
// ResolvedMode is the per-workspace billing/provider mode the resolver
// landed on. HasUsableLLMCred reports whether — AFTER the provider-aware
// global-cred gate — the workspace still has at least one platform-managed
// LLM credential key in its env (its own, workspace-scoped one). Only the
// non-platform path consults HasUsableLLMCred for the fail-closed decision;
// the platform_managed path always returns true (it forces the CP proxy
// usage token, which IS the usable credential).
// landed on. HasUsableLLMCred reports whether the workspace has at least one
// platform-managed-shaped LLM credential key in its env — the tenant's own,
// at global or workspace scope. Only the non-platform (byok) path consults
// HasUsableLLMCred for the fail-closed decision; the platform_managed path
// always returns true (it forces the CP proxy usage token, which IS the
// usable credential).
type platformLLMEnvResult struct {
ResolvedMode string
HasUsableLLMCred bool
// Source records which layer decided the mode (internal#718 P2-B):
// derived_provider (registry derivation), derived_default (derive failed →
// platform default), workspace_override (explicit operator pin), or
// constant_fallback (DB error). Surfaced for observability + asserted by the
// behavior-delta tests so a regression of "derived, not stored" flips red.
Source BillingModeSource
}
func applyPlatformManagedLLMEnv(ctx context.Context, envVars map[string]string, globalKeys map[string]struct{}, workspaceID, runtime, model string) platformLLMEnvResult {
orgMode := strings.ToLower(strings.TrimSpace(os.Getenv("MOLECULE_LLM_BILLING_MODE")))
res, resolveErr := ResolveLLMBillingMode(ctx, workspaceID, orgMode)
// globalKeys is the provenance side-channel from loadWorkspaceSecrets: the set
// of env keys that originated from the operator-controlled global_secrets table
// (a workspace_secrets row of the same name overrides and clears the flag). It
// is consumed ONLY by the byok/disabled branch's provider-matched strip
// (internal#728 Bug 1): a global-origin LLM bypass cred that does NOT match the
// resolved provider's auth_env is stripped so a greedy runtime (claude-code
// prefers CLAUDE_CODE_OAUTH_TOKEN) cannot route a non-anthropic model to the
// wrong upstream. May be nil (no global-origin keys / unknown provenance) — a
// nil set strips nothing, preserving the pre-#728 behavior for callers that do
// not thread provenance.
func applyPlatformManagedLLMEnv(ctx context.Context, envVars map[string]string, workspaceID, runtime, model string, globalKeys map[string]struct{}) platformLLMEnvResult {
// internal#718 P2-B: the platform-vs-byok decision now DERIVES the provider
// from (runtime, model) via the registry and keys off IsPlatform(derived) —
// NOT a stored LLM_PROVIDER and NOT the org rung. This path already carries
// runtime + model + the workspace env, so it calls the DERIVED resolver
// directly (no DB round-trip for runtime/model). availableAuthEnv is the set
// of recognized provider auth-env-var NAMES present in envVars (the same
// disambiguation input the registry uses to split oauth-vs-api). The org-env
// MOLECULE_LLM_BILLING_MODE is NO LONGER read into the decision (retired).
availableAuthEnv := availableAuthEnvNames(envVars)
// molecule-core#1994: derive billing mode from the EFFECTIVE model, not the
// raw payload.Model. On a re-provision (restart/resume/auto-restart) the
// payload is rebuilt from the DB with Name+Tier+Runtime only — payload.Model
// is "" (workspace_restart.go via withStoredCompute, which backfills Compute
// but NOT Model). With an empty model DeriveProvider errors → the resolver
// defaults closed to platform_managed and bakes the CP proxy, DIVERGING from
// the read endpoint (which reads the stored MODEL workspace_secret and derives
// byok). The stored model already lives in the merged envVars (loaded by
// loadWorkspaceSecrets); resolve it with the SAME fallback chain
// applyRuntimeModelEnv uses so the provision-path derive inputs match the
// read-path's — keeping the two resolvers in parity (the #1994 regression
// guard test asserts this).
effectiveModel := effectiveModelForBilling(model, envVars)
res, resolveErr := ResolveLLMBillingModeDerived(ctx, workspaceID, runtime, effectiveModel, availableAuthEnv)
if resolveErr != nil {
// resolveErr != nil ⇒ resolver hit a DB error AND already defaulted
// res.ResolvedMode to platform_managed. Log + proceed; the safe default
// is already in place, no early return needed.
log.Printf("workspace_provision: resolve billing mode workspace=%s err=%v (defaulting to platform_managed)", workspaceID, resolveErr)
}
log.Printf("workspace_provision: billing mode workspace=%s resolved=%s source=%s org_default=%s", workspaceID, res.ResolvedMode, res.Source, res.OrgDefault)
log.Printf("workspace_provision: billing mode workspace=%s resolved=%s source=%s derived_provider=%s", workspaceID, res.ResolvedMode, res.Source, derefOrEmpty(res.ProviderSelection))
// internal#703: MOLECULE_LLM_BILLING_MODE in the container must reflect the
// RESOLVED per-workspace mode, not a hardcoded literal. Pre-fix this var was
// only emitted (hardcoded "platform_managed") on the strip path below, so a
@@ -1007,22 +955,47 @@ func applyPlatformManagedLLMEnv(ctx context.Context, envVars map[string]string,
envVars["MOLECULE_LLM_BILLING_MODE_RESOLVED"] = res.ResolvedMode
if res.ResolvedMode != LLMBillingModePlatformManaged {
// byok or disabled — DO NOT force-route to CP, DO NOT override the
// workspace's own ANTHROPIC_BASE_URL / OAuth token.
// workspace's own ANTHROPIC_BASE_URL, and DO NOT strip the tenant's own
// (provider-matching) LLM credentials.
//
// internal#711: but DO strip platform-origin LLM credentials. The
// platform's scope:global CLAUDE_CODE_OAUTH_TOKEN (+ the rest of the
// bypass-key set) was merged into envVars by loadWorkspaceSecrets
// from global_secrets; without this strip a BYOK workspace that
// brought no LLM credential of its own would inherit the platform's
// global token and bill the platform's Anthropic credits. The strip
// is PROVENANCE-AWARE: only keys still flagged as global_secrets
// origin are removed; a workspace's own LLM cred (a workspace_secrets
// row — provenance flag already dropped by loadWorkspaceSecrets)
// survives so the workspace talks to its own provider directly.
stripGlobalOriginLLMCreds(envVars, globalKeys)
// molecule-core#1994 (corrected model): `global_secrets` is the
// TENANT's store, not the platform's. The tenant's own credential —
// at global OR workspace scope — is exactly what byok runs on, direct.
// The platform's own credential is never in a tenant's global_secrets
// (guarded at the SetGlobal write boundary + the proxy token is
// server-env-only), so leaving the tenant's globals in place cannot
// re-open the platform-credit drain.
//
// internal#728 Bug 1 (provider-matched credential injection): #1994
// removed the BLANKET strip, which was correct for the platform-key
// co-mingling it targeted but left EVERY claude-code workspace
// inheriting the tenant-global CLAUDE_CODE_OAUTH_TOKEN. A claude-code
// runtime greedily prefers that oauth (`llm-auth: detected oauth` →
// api.anthropic.com), so a workspace whose RESOLVED provider is NOT
// anthropic-oauth (minimax, kimi-byok, …) routes its non-Anthropic
// model to Anthropic and errors (`Claude Code returned an error
// result`; DevB MiniMax-M2.7 live-confirmed 2026-05-28).
//
// The precise, provider-AWARE replacement for the over-removed strip:
// keep ONLY the global-origin bypass creds whose env-var name is in the
// RESOLVED provider's auth_env; strip the rest. This is NOT a return to
// the blanket strip — it is keyed off the derived provider:
// - minimax (auth_env: MINIMAX_API_KEY, ANTHROPIC_AUTH_TOKEN,
// ANTHROPIC_API_KEY) → global-origin CLAUDE_CODE_OAUTH_TOKEN is
// NOT a match → stripped (fixes DevB).
// - anthropic-oauth (auth_env: CLAUDE_CODE_OAUTH_TOKEN) → the
// global-origin oauth IS a match → kept (PM/reno opus byok NOT
// regressed — the #1994 ByokGlobalScopeOAuthSurvives guard holds).
// WORKSPACE-origin creds (the user explicitly set them via the canvas
// Secrets tab → NOT in globalKeys) are NEVER stripped here, even when
// they don't match: the user authored them deliberately (JRS kimi
// workspace-key, reno's own oauth). Only the inherited operator-store
// channel is provider-gated.
stripNonMatchingGlobalOriginLLMCreds(envVars, globalKeys, runtime, effectiveModel, availableAuthEnv)
return platformLLMEnvResult{
ResolvedMode: res.ResolvedMode,
HasUsableLLMCred: hasAnyPlatformManagedLLMKey(envVars),
Source: res.Source,
}
}
baseURL := firstNonEmptyEnv("MOLECULE_LLM_BASE_URL", "OPENAI_BASE_URL")
@@ -1034,7 +1007,7 @@ func applyPlatformManagedLLMEnv(ctx context.Context, envVars map[string]string,
// here — but we report HasUsableLLMCred from whatever survived so the
// caller's fail-closed branch (non-platform only) is never reached on
// this path.
return platformLLMEnvResult{ResolvedMode: res.ResolvedMode, HasUsableLLMCred: true}
return platformLLMEnvResult{ResolvedMode: res.ResolvedMode, HasUsableLLMCred: true, Source: res.Source}
}
stripPlatformManagedLLMBypassEnv(envVars)
@@ -1066,7 +1039,7 @@ func applyPlatformManagedLLMEnv(ctx context.Context, envVars map[string]string,
// platform_managed: the CP proxy usage token (injected as ANTHROPIC_API_KEY
// / OPENAI_API_KEY above) IS the usable credential, so the workspace is
// never fail-closed on this path.
return platformLLMEnvResult{ResolvedMode: res.ResolvedMode, HasUsableLLMCred: true}
return platformLLMEnvResult{ResolvedMode: res.ResolvedMode, HasUsableLLMCred: true, Source: res.Source}
}
func stripPlatformManagedLLMBypassEnv(envVars map[string]string) {
@@ -1075,32 +1048,11 @@ func stripPlatformManagedLLMBypassEnv(envVars map[string]string) {
}
}
// stripGlobalOriginLLMCreds removes platform-managed LLM credential keys
// (CLAUDE_CODE_OAUTH_TOKEN + the rest of platformManagedDirectLLMBypassKeys)
// from envVars ONLY when they originated from the operator-controlled
// `global_secrets` table (i.e. their key is present in globalKeys).
//
// internal#711 provider-aware gate. A platform global LLM credential is the
// platform's own credential and must never be the credential a non-platform
// (byok / subscription) workspace runs on. loadWorkspaceSecrets drops the
// global-provenance flag for any key the workspace re-set via the canvas
// Secrets tab (a workspace_secrets row), so a workspace's OWN LLM credential
// is NOT in globalKeys and survives this strip — only the inherited platform
// global creds are removed.
func stripGlobalOriginLLMCreds(envVars map[string]string, globalKeys map[string]struct{}) {
for key := range platformManagedDirectLLMBypassKeys {
if _, fromGlobal := globalKeys[key]; fromGlobal {
delete(envVars, key)
}
}
}
// hasAnyPlatformManagedLLMKey reports whether envVars still carries at least
// one non-empty platform-managed LLM credential key after the provider-aware
// gate. Used by the non-platform fail-closed branch: a byok/subscription
// workspace with no surviving (workspace-scoped) LLM credential must be
// aborted with MISSING_BYOK_CREDENTIAL rather than started credential-less or
// on stripped platform creds.
// hasAnyPlatformManagedLLMKey reports whether envVars carries at least one
// non-empty platform-managed-shaped LLM credential key (the tenant's own, at
// global or workspace scope). Used by the byok fail-closed branch: a byok
// workspace with no LLM credential at ANY scope must be aborted with
// MISSING_BYOK_CREDENTIAL rather than started credential-less.
func hasAnyPlatformManagedLLMKey(envVars map[string]string) bool {
for key := range platformManagedDirectLLMBypassKeys {
if strings.TrimSpace(envVars[key]) != "" {
@@ -1110,6 +1062,66 @@ func hasAnyPlatformManagedLLMKey(envVars map[string]string) bool {
return false
}
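The fail-closed decision that the caller builds on this helper might look roughly like the sketch below. It is written under assumptions: `llmCredKeys` is a hypothetical subset standing in for platformManagedDirectLLMBypassKeys, and `checkProvisionable` and `errMissingByokCredential` are illustrative names, not functions from this PR.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// llmCredKeys is a HYPOTHETICAL subset standing in for the real
// platformManagedDirectLLMBypassKeys set.
var llmCredKeys = map[string]struct{}{
	"CLAUDE_CODE_OAUTH_TOKEN": {},
	"ANTHROPIC_API_KEY":       {},
}

// errMissingByokCredential is an illustrative error value; the source only
// names the MISSING_BYOK_CREDENTIAL code.
var errMissingByokCredential = errors.New("MISSING_BYOK_CREDENTIAL")

// hasAnyLLMKey mirrors the helper above: a key counts only when its value is
// non-empty after trimming whitespace.
func hasAnyLLMKey(env map[string]string) bool {
	for key := range llmCredKeys {
		if strings.TrimSpace(env[key]) != "" {
			return true
		}
	}
	return false
}

// checkProvisionable fails closed for a non-platform workspace that has no
// LLM credential at any scope. The platform_managed path is never blocked
// because the proxy usage token is injected for it.
func checkProvisionable(resolvedMode string, env map[string]string) error {
	if resolvedMode == "platform_managed" {
		return nil
	}
	if !hasAnyLLMKey(env) {
		return errMissingByokCredential
	}
	return nil
}

func main() {
	blank := map[string]string{"CLAUDE_CODE_OAUTH_TOKEN": "   "}
	own := map[string]string{"CLAUDE_CODE_OAUTH_TOKEN": "tenant-token"}

	fmt.Println(checkProvisionable("byok", blank))             // blank value does not count
	fmt.Println(checkProvisionable("byok", own))               // tenant credential present
	fmt.Println(checkProvisionable("platform_managed", blank)) // never fail-closed
}
```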
// stripNonMatchingGlobalOriginLLMCreds implements the byok-branch
// provider-matched credential gate (internal#728 Bug 1). It removes from
// envVars every platform-managed LLM bypass key that:
//
// 1. originated from the operator-controlled global_secrets store
// (present in globalKeys — a workspace_secrets row of the same name
// overrides + clears the flag, so user-authored creds are exempt), AND
// 2. is NOT in the RESOLVED provider's auth_env set.
//
// The motivating regression: #1994 dropped the blanket strip, so a claude-code
// workspace resolving to `minimax` still inherited the tenant-global
// CLAUDE_CODE_OAUTH_TOKEN; the runtime prefers that oauth and routes the
// MiniMax model to api.anthropic.com → error. Keeping only the resolved
// provider's own auth_env keys (minimax: MINIMAX_API_KEY/ANTHROPIC_AUTH_TOKEN/
// ANTHROPIC_API_KEY — not the oauth) removes the stray oauth while preserving
// anthropic-oauth's CLAUDE_CODE_OAUTH_TOKEN for an opus byok workspace.
//
// Fail-OPEN by design: if the provider cannot be derived (empty model /
// unknown runtime / ambiguous) or the registry is unavailable, we strip
// NOTHING. We never strip a credential we cannot prove is non-matching, so a
// derive miss can never fail-close a legitimate byok workspace. As with the
// resolver's default-closed-to-platform contract, the failure direction is
// chosen so the worst case is benign: we may keep a stray cred, but we never
// remove the only usable one. The earlier internal#711 blanket strip failed
// in the remove-first direction, which was the bug; this strip fails
// keep-first.
func stripNonMatchingGlobalOriginLLMCreds(envVars map[string]string, globalKeys map[string]struct{}, runtime, model string, availableAuthEnv []string) {
if len(globalKeys) == 0 {
return // no operator-store-origin keys to consider — nothing to strip.
}
manifest, err := providerRegistry()
if err != nil || manifest == nil {
return // registry unavailable — fail open, strip nothing.
}
provider, dErr := manifest.DeriveProvider(runtime, model, availableAuthEnv)
if dErr != nil {
return // underivable provider — fail open, strip nothing.
}
// The resolved provider's accepted auth-env-var NAMES (case-insensitive
// for parity with isPlatformManagedDirectLLMBypassKey, which upper-cases).
keep := make(map[string]struct{}, len(provider.AuthEnv))
for _, e := range provider.AuthEnv {
keep[strings.ToUpper(strings.TrimSpace(e))] = struct{}{}
}
for key := range globalKeys {
upper := strings.ToUpper(strings.TrimSpace(key))
if _, isBypass := platformManagedDirectLLMBypassKeys[upper]; !isBypass {
continue // not an LLM bypass cred (e.g. a non-LLM operator secret) — leave it.
}
if _, matches := keep[upper]; matches {
continue // matches the resolved provider's auth_env — this is what byok runs on.
}
// Global-origin LLM bypass cred that does NOT match the resolved
// provider — the stray that a greedy runtime would mis-prefer. Strip.
if _, present := envVars[key]; present {
log.Printf("workspace_provision: byok provider-matched strip — removing global-origin LLM cred %s (resolved provider=%s does not accept it)", key, provider.Name)
delete(envVars, key)
}
}
}
func runtimeUsesAnthropicNativeProxy(runtime string) bool {
return strings.EqualFold(strings.TrimSpace(runtime), "claude-code")
}
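The keep-first strip described in the comment above can be sketched in isolation. This is a minimal illustration, not the repository's implementation: `bypassKeys`, `stripNonMatching`, and `EXAMPLE_OPERATOR_SECRET` are hypothetical names, the provider registry is replaced by a plain `authEnv` slice, and a nil slice stands in for "provider could not be derived".

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// bypassKeys is a hypothetical stand-in for platformManagedDirectLLMBypassKeys.
var bypassKeys = map[string]struct{}{
	"CLAUDE_CODE_OAUTH_TOKEN": {},
	"ANTHROPIC_API_KEY":       {},
	"ANTHROPIC_AUTH_TOKEN":    {},
	"MINIMAX_API_KEY":         {},
}

// stripNonMatching deletes global-origin LLM bypass keys that are absent from
// the resolved provider's authEnv and returns the removed key names, sorted.
// A nil authEnv models an underivable provider: fail open, strip nothing.
func stripNonMatching(envVars map[string]string, globalKeys map[string]struct{}, authEnv []string) []string {
	if len(globalKeys) == 0 || authEnv == nil {
		return nil
	}
	keep := make(map[string]struct{}, len(authEnv))
	for _, e := range authEnv {
		keep[strings.ToUpper(strings.TrimSpace(e))] = struct{}{}
	}
	var removed []string
	for key := range globalKeys {
		upper := strings.ToUpper(strings.TrimSpace(key))
		if _, isBypass := bypassKeys[upper]; !isBypass {
			continue // not an LLM credential, leave it alone
		}
		if _, matches := keep[upper]; matches {
			continue // the resolved provider accepts this key
		}
		if _, present := envVars[key]; present {
			delete(envVars, key)
			removed = append(removed, key)
		}
	}
	sort.Strings(removed)
	return removed
}

func main() {
	envVars := map[string]string{
		"CLAUDE_CODE_OAUTH_TOKEN": "tenant-global-oauth",
		"MINIMAX_API_KEY":         "tenant-global-minimax",
		"EXAMPLE_OPERATOR_SECRET": "not-an-llm-cred", // hypothetical non-LLM secret
	}
	globalKeys := map[string]struct{}{
		"CLAUDE_CODE_OAUTH_TOKEN": {},
		"MINIMAX_API_KEY":         {},
		"EXAMPLE_OPERATOR_SECRET": {},
	}
	// auth_env names for minimax as listed in the comment above.
	minimax := []string{"MINIMAX_API_KEY", "ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_API_KEY"}
	fmt.Println("removed:", stripNonMatching(envVars, globalKeys, minimax))
	fmt.Println("remaining keys:", len(envVars))
}
```

Returning the removed names instead of logging them is a choice made only to keep the sketch easy to assert on; the function in the diff logs each removal instead.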
@@ -1161,6 +1173,14 @@ func loadWorkspaceSecrets(ctx context.Context, workspaceID string) (map[string]s
var v []byte
var ver int
if globalRows.Scan(&k, &v, &ver) == nil {
// internal#718 P4 closure: LLM_PROVIDER is retired even
// at the global rung. The same provider-from-(runtime,model)
// derivation runs per-workspace, so a global default
// would be a ghost value that nothing consumes. Symmetric
// with the workspace_secrets drop below.
if k == "LLM_PROVIDER" {
continue
}
decrypted, decErr := crypto.DecryptVersioned(v, ver)
if decErr != nil {
log.Printf("Provisioner: FATAL — failed to decrypt global secret %s (version=%d): %v — aborting provision of workspace %s", k, ver, decErr, workspaceID)
@@ -1183,6 +1203,18 @@ func loadWorkspaceSecrets(ctx context.Context, workspaceID string) (map[string]s
var v []byte
var ver int
if wsRows.Scan(&k, &v, &ver) == nil {
// internal#718 P4 closure: LLM_PROVIDER is a retired
// secret key. Migration 20260528000000 deletes any
// straggler rows; this drop is defence-in-depth so a
// rolling deploy (new code, old DB) never re-emits the
// retired key into the provisioner env (which would
// reach the CP-side resolveModelAndProvider — now
// itself retired, but the env contract belongs to
// core). Idempotent: a fresh tenant has zero
// LLM_PROVIDER rows and this branch is unreached.
if k == "LLM_PROVIDER" {
continue
}
decrypted, decErr := crypto.DecryptVersioned(v, ver)
if decErr != nil {
log.Printf("Provisioner: FATAL — failed to decrypt workspace secret %s (version=%d) for %s: %v — aborting provision", k, ver, workspaceID, decErr)
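The two load loops in this hunk share one rule: the retired `LLM_PROVIDER` key is skipped at both scopes, and a workspace row of the same name overrides a global row and clears its global-origin flag. A minimal sketch of that merge, assuming plain string maps in place of the encrypted rows (decryption is omitted, and `mergeSecrets` and `retiredKey` are hypothetical names):

```go
package main

import "fmt"

// retiredKey is the secret name that must never reach the provisioner env.
const retiredKey = "LLM_PROVIDER"

// mergeSecrets layers workspace secrets over global secrets. It returns the
// merged env plus the set of keys whose value still comes from the global
// store (the provenance set a later strip step would consult).
func mergeSecrets(global, workspace map[string]string) (map[string]string, map[string]struct{}) {
	env := make(map[string]string, len(global)+len(workspace))
	globalKeys := make(map[string]struct{}, len(global))
	for k, v := range global {
		if k == retiredKey {
			continue // retired at the global rung
		}
		env[k] = v
		globalKeys[k] = struct{}{}
	}
	for k, v := range workspace {
		if k == retiredKey {
			continue // defence-in-depth for rows the migration has not deleted yet
		}
		env[k] = v
		delete(globalKeys, k) // workspace override clears the global-origin flag
	}
	return env, globalKeys
}

func main() {
	global := map[string]string{"CLAUDE_CODE_OAUTH_TOKEN": "tenant-global", "LLM_PROVIDER": "minimax"}
	workspace := map[string]string{"MODEL": "opus", "LLM_PROVIDER": "anthropic"}
	env, globalKeys := mergeSecrets(global, workspace)
	fmt.Println(env)
	fmt.Println(len(globalKeys))
}
```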
@@ -193,33 +193,38 @@ func (h *WorkspaceHandler) prepareProvisionContext(
// continue to rely on workspace_secrets / org-import persona-env
// merge for their git auth.
applyAgentGitHTTPCreds(envVars, payload.Role)
// internal#711: provider-aware LLM-credential resolution. On a non-platform
// (byok/subscription) workspace this strips the platform's scope:global LLM
// creds inherited from global_secrets and reports whether the workspace
// still has a usable (workspace-scoped) LLM credential of its own.
llmRes := applyPlatformManagedLLMEnv(ctx, envVars, globalSecretKeys, workspaceID, payload.Runtime, payload.Model)
// Fail closed for a BYOK workspace with no usable LLM credential: do NOT
// start it on the platform's (now-stripped) global creds. Mirror the
// "model+provider+credential REQUIRED at create" spirit (internal#711)
// with an actionable error surfaced at provision time.
// molecule-core#1994: per-workspace LLM billing-mode resolution + env wiring.
// On platform_managed it forces the CP proxy usage token; on byok/disabled
// it keeps the tenant's own provider-MATCHING creds (global OR workspace
// scope) and reports whether a usable LLM credential is present.
//
// internal#728 Bug 1: globalSecretKeys (loadWorkspaceSecrets provenance)
// lets the byok branch strip ONLY operator-store-origin LLM creds that do
// NOT match the resolved provider's auth_env — so a non-anthropic-oauth
// claude-code workspace no longer inherits the stray tenant-global
// CLAUDE_CODE_OAUTH_TOKEN the runtime would greedily prefer. User-authored
// workspace_secrets (provenance flag cleared) are exempt.
llmRes := applyPlatformManagedLLMEnv(ctx, envVars, workspaceID, payload.Runtime, payload.Model, globalSecretKeys)
// Fail closed for a BYOK workspace with no usable LLM credential at ANY
// scope: do NOT start it credential-less. Mirror the "model+provider+
// credential REQUIRED at create" spirit with an actionable error surfaced
// at provision time.
//
// Scoped to byok specifically (NOT disabled): "byok" means "the user
// intends to run an LLM on their own credential" — a missing one is a
// misconfiguration worth surfacing loudly. "disabled" means "this
// workspace runs no platform-billed LLM at all" (terminal / file work, or
// a runtime that talks to a non-bypass-key endpoint); stripping the
// inherited platform globals is sufficient there and aborting would
// regress a legitimate no-LLM workspace. The strip above already ran for
// both non-platform modes.
// a runtime that talks to a non-bypass-key endpoint), so aborting would
// regress a legitimate no-LLM workspace.
//
// The bypass-key check is intentionally broad — any surviving bypass key
// (the workspace's own, of workspace_secrets provenance) clears it.
// The bypass-key check is intentionally broad — any present bypass key
// (the tenant's own, at global or workspace scope) clears it.
if llmRes.ResolvedMode == LLMBillingModeBYOK && !llmRes.HasUsableLLMCred {
msg := formatMissingBYOKCredentialError(llmRes.ResolvedMode)
log.Printf("Provisioner: ABORT workspace=%s — byok billing mode has no usable LLM credential (MISSING_BYOK_CREDENTIAL, internal#711)", workspaceID)
log.Printf("Provisioner: ABORT workspace=%s — byok billing mode has no usable LLM credential (MISSING_BYOK_CREDENTIAL, molecule-core#1994)", workspaceID)
return nil, &provisionAbort{
Msg: msg,
Extra: map[string]interface{}{"error": msg, "code": "MISSING_BYOK_CREDENTIAL", "billing_mode": llmRes.ResolvedMode, "issue": "711"},
Extra: map[string]interface{}{"error": msg, "code": "MISSING_BYOK_CREDENTIAL", "billing_mode": llmRes.ResolvedMode, "issue": "1994"},
}
}
applyRuntimeModelEnv(envVars, payload.Runtime, payload.Model)
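The abort condition above reduces to a small decision table: only byok with no LLM credential at any scope aborts, while disabled and platform_managed proceed. A sketch under stated assumptions: the mode strings, `llmKeys`, `hasUsableLLMCred`, and `checkProvision` are illustrative names, not the handler's actual API.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Hypothetical billing-mode values, modelled on the names used in the diff.
const (
	modeBYOK            = "byok"
	modeDisabled        = "disabled"
	modePlatformManaged = "platform_managed"
)

var errMissingBYOKCredential = errors.New("MISSING_BYOK_CREDENTIAL")

// llmKeys is an illustrative subset of LLM credential names.
var llmKeys = []string{"CLAUDE_CODE_OAUTH_TOKEN", "ANTHROPIC_API_KEY", "MINIMAX_API_KEY"}

// hasUsableLLMCred is intentionally broad: any non-empty LLM key, whatever
// scope it came from, counts as usable.
func hasUsableLLMCred(envVars map[string]string) bool {
	for _, k := range llmKeys {
		if strings.TrimSpace(envVars[k]) != "" {
			return true
		}
	}
	return false
}

// checkProvision fails closed only for byok without a credential. A disabled
// workspace legitimately runs with no LLM credential, so it is not aborted.
func checkProvision(mode string, envVars map[string]string) error {
	if mode == modeBYOK && !hasUsableLLMCred(envVars) {
		return errMissingBYOKCredential
	}
	return nil
}

func main() {
	empty := map[string]string{}
	withGlobal := map[string]string{"CLAUDE_CODE_OAUTH_TOKEN": "tenant-global-oauth"}
	fmt.Println(checkProvision(modeBYOK, empty))
	fmt.Println(checkProvision(modeBYOK, withGlobal))
	fmt.Println(checkProvision(modeDisabled, empty))
	fmt.Println(checkProvision(modePlatformManaged, empty))
}
```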
@@ -494,32 +494,34 @@ func TestPrepareProvisionContext_WorkspaceSecretWinsOverPersonaToken(t *testing.
}
}
// TestPrepareProvisionContext_ByokWithOnlyGlobalOAuthFailsClosed is the
// internal#711 end-to-end guard for the live Reno Stars leak. A byok
// workspace whose ONLY LLM credential is the platform's scope:global
// CLAUDE_CODE_OAUTH_TOKEN (inherited from global_secrets, no workspace
// override) must:
// TestPrepareProvisionContext_ByokWithTenantGlobalOAuthSucceeds is the
// molecule-core#1994 (corrected-model) end-to-end inversion of the former
// internal#711 fail-closed test, for the live Reno Stars byok agents. A byok
// workspace whose LLM credential is the TENANT's own scope:global
// CLAUDE_CODE_OAUTH_TOKEN (a global_secrets row, no workspace override) must:
//
// 1. have that platform token STRIPPED from the prepared env (no leak), and
// 2. ABORT the provision with the MISSING_BYOK_CREDENTIAL code rather than
// start the workspace on the platform's credits.
// 1. KEEP that oauth in the prepared container env (it is the tenant's own
// credential — exactly what byok runs on, direct), and
// 2. NOT abort: the provision proceeds.
//
// This is the discriminating end-to-end test: pre-fix prepared.EnvVars would
// carry CLAUDE_CODE_OAUTH_TOKEN=<platform token> and the provision would
// succeed, running Opus on Molecule's Anthropic credits.
func TestPrepareProvisionContext_ByokWithOnlyGlobalOAuthFailsClosed(t *testing.T) {
// Pre-fix (internal#711) prepared.EnvVars stripped the global oauth and the
// provision aborted MISSING_BYOK_CREDENTIAL → the agent was dead. This is the
// discriminating end-to-end guard for the fix.
func TestPrepareProvisionContext_ByokWithTenantGlobalOAuthSucceeds(t *testing.T) {
const wsID = "352e3c2b-0546-4e9c-b487-1e2ff1cf29fc" // Reno Stars SEO agent
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
mock := setupTestDB(t)
// global_secrets carries the platform's scope:global OAuth token.
// global_secrets carries the TENANT's own scope:global OAuth token. The
// stored MODEL (opus) is supplied by the workspace_secrets rows below.
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM global_secrets`).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("CLAUDE_CODE_OAUTH_TOKEN", []byte("PLATFORM-GLOBAL-OAUTH"), 0))
// Workspace set NO secrets of its own.
AddRow("CLAUDE_CODE_OAUTH_TOKEN", []byte("TENANT-OWN-GLOBAL-OAUTH"), 0))
// Workspace set its own MODEL (no LLM cred of its own — relies on global).
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}))
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("MODEL", []byte("opus"), 0))
// Resolver: workspace override = byok.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
@@ -534,8 +536,57 @@ func TestPrepareProvisionContext_ByokWithOnlyGlobalOAuthFailsClosed(t *testing.T
prepared, abort := handler.prepareProvisionContext(
context.Background(), wsID, "/nonexistent", nil, payload, false)
if abort != nil {
t.Fatalf("expected provision to proceed (byok on tenant's own global oauth), got abort=%v", abort.Extra)
}
if prepared == nil {
t.Fatalf("prepared context is nil despite no abort")
}
// The tenant's own global oauth must be present in the container env.
if prepared.EnvVars["CLAUDE_CODE_OAUTH_TOKEN"] != "TENANT-OWN-GLOBAL-OAUTH" {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN = %q, want the tenant's own global oauth preserved for byok",
prepared.EnvVars["CLAUDE_CODE_OAUTH_TOKEN"])
}
// byok must not have been routed through the platform proxy.
if _, ok := prepared.EnvVars["MOLECULE_LLM_USAGE_TOKEN"]; ok {
t.Fatalf("byok provision must NOT inject the platform usage token")
}
if got := prepared.EnvVars["MOLECULE_LLM_BILLING_MODE_RESOLVED"]; got != LLMBillingModeBYOK {
t.Fatalf("MOLECULE_LLM_BILLING_MODE_RESOLVED = %q, want byok", got)
}
}
// TestPrepareProvisionContext_ByokNoCredentialAtAnyScopeFailsClosed is the
// companion: the fail-closed abort is UNCHANGED for a byok workspace with no
// LLM credential at ANY scope (no global row, no workspace row). It still
// aborts MISSING_BYOK_CREDENTIAL rather than starting credential-less.
func TestPrepareProvisionContext_ByokNoCredentialAtAnyScopeFailsClosed(t *testing.T) {
const wsID = "352e3c2b-0546-4e9c-b487-1e2ff1cf29fc"
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
mock := setupTestDB(t)
// No global LLM cred — only the stored MODEL so the resolver derives byok.
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM global_secrets`).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}))
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("MODEL", []byte("opus"), 0))
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
handler := NewWorkspaceHandler(&captureBroadcaster{}, nil, "http://localhost:8080", t.TempDir())
payload := models.CreateWorkspacePayload{
Name: "Reno Stars SEO",
Runtime: "claude-code",
Tier: 1,
}
prepared, abort := handler.prepareProvisionContext(
context.Background(), wsID, "/nonexistent", nil, payload, false)
if abort == nil {
t.Fatalf("expected MISSING_BYOK_CREDENTIAL abort, got success (prepared=%v) — the leak would still ship", prepared)
t.Fatalf("expected MISSING_BYOK_CREDENTIAL abort, got success (prepared=%v)", prepared)
}
if code, _ := abort.Extra["code"].(string); code != "MISSING_BYOK_CREDENTIAL" {
t.Fatalf("abort.Extra[code] = %v, want MISSING_BYOK_CREDENTIAL", abort.Extra["code"])
@@ -646,103 +697,49 @@ func TestReadOrLazyHealInboundSecret(t *testing.T) {
})
}
// TestDeriveProviderFromModelSlug pins the slug→provider mapping shared
// with workspace-configs-templates/hermes/scripts/derive-provider.sh.
// Sync-test: when a new prefix is added to the shell script, add it
// here too. The two intentional differences from the shell version
// (nousresearch/openai both → "openrouter" at provision time;
// unknown/no-prefix → "" instead of "auto") are exercised explicitly.
func TestDeriveProviderFromModelSlug(t *testing.T) {
t.Parallel()
cases := []struct {
name string
model string
want string
}{
{"minimax", "minimax/MiniMax-M2.7-highspeed", "minimax"},
{"minimax-cn keeps cn suffix", "minimax-cn/MiniMax-M2.7", "minimax-cn"},
{"anthropic", "anthropic/claude-sonnet-4-6", "anthropic"},
{"gemini", "gemini/gemini-2.5-pro", "gemini"},
{"deepseek", "deepseek/deepseek-v3", "deepseek"},
{"zai", "zai/glm-4.6", "zai"},
{"kimi-coding", "kimi-coding/kimi-k2", "kimi-coding"},
{"kimi-coding-cn keeps cn suffix", "kimi-coding-cn/kimi-k2", "kimi-coding-cn"},
{"alibaba via dashscope alias", "dashscope/qwen3", "alibaba"},
{"alibaba via qwen alias", "qwen/qwen3-coder", "alibaba"},
{"xiaomi via mimo alias", "mimo/mimo-vl", "xiaomi"},
{"arcee via arcee-ai alias", "arcee-ai/arcee-blitz", "arcee"},
{"nvidia via nim alias", "nim/llama-3.3-nemotron-super", "nvidia"},
{"ollama-cloud", "ollama-cloud/qwen3", "ollama-cloud"},
{"huggingface via hf alias", "hf/Qwen/Qwen3", "huggingface"},
{"ai-gateway", "ai-gateway/anthropic-claude-sonnet-4-6", "ai-gateway"},
{"kilocode", "kilocode/kilo-1", "kilocode"},
{"opencode-zen", "opencode-zen/zen-1", "opencode-zen"},
{"opencode-go", "opencode-go/code-1", "opencode-go"},
{"openrouter passthrough", "openrouter/anthropic/claude-sonnet-4-6", "openrouter"},
{"custom passthrough", "custom/my-private-endpoint", "custom"},
// Runtime-only override candidates default to openrouter at
// provision time (derive-provider.sh upgrades to nous/custom at
// boot if HERMES_API_KEY/OPENAI_API_KEY are present).
{"nousresearch defaults to openrouter at provision time", "nousresearch/hermes-4-70b", "openrouter"},
{"openai defaults to openrouter at provision time", "openai/gpt-5", "openrouter"},
// hermes-agent v0.12.0 / 2026-04-30 provider list — the drift gate
// in derive_provider_drift_test.go pins parity with the shell case
// statement.
{"xai", "xai/grok-4", "xai"},
{"xai via grok alias", "grok/grok-4", "xai"},
{"bedrock", "bedrock/anthropic.claude-sonnet-4-6", "bedrock"},
{"bedrock via aws alias", "aws/anthropic.claude-sonnet-4-6", "bedrock"},
{"tencent", "tencent/hunyuan-coder", "tencent-tokenhub"},
{"tencent-tokenhub passthrough", "tencent-tokenhub/hunyuan-coder", "tencent-tokenhub"},
{"gmi", "gmi/gmi-coder-1", "gmi"},
{"qwen-oauth", "qwen-oauth/qwen3-coder", "qwen-oauth"},
{"lmstudio", "lmstudio/qwen3-coder", "lmstudio"},
{"lmstudio via lm-studio alias", "lm-studio/qwen3-coder", "lmstudio"},
{"minimax-oauth", "minimax-oauth/MiniMax-M2.7", "minimax-oauth"},
{"alibaba-coding-plan", "alibaba-coding-plan/qwen3-coder", "alibaba-coding-plan"},
{"google-gemini-cli", "google-gemini-cli/gemini-2.5-pro", "google-gemini-cli"},
{"openai-codex", "openai-codex/gpt-5-codex", "openai-codex"},
{"copilot-acp", "copilot-acp/claude-sonnet-4-6", "copilot-acp"},
{"copilot", "copilot/claude-sonnet-4-6", "copilot"},
// Unknowns return "" so the caller skips the LLM_PROVIDER write
// and lets derive-provider.sh's *=auto branch decide at runtime.
{"unknown prefix returns empty", "totally-unknown-model/foo", ""},
{"empty input returns empty", "", ""},
{"no slash returns empty", "no-slash-here", ""},
{"leading slash returns empty", "/leading-slash", ""},
}
for _, tc := range cases {
tc := tc
t.Run(tc.name, func(t *testing.T) {
t.Parallel()
got := deriveProviderFromModelSlug(tc.model)
if got != tc.want {
t.Errorf("deriveProviderFromModelSlug(%q) = %q, want %q", tc.model, got, tc.want)
}
})
}
}
// internal#718 P4 closure: TestDeriveProviderFromModelSlug was the
// table-driven sync test that pinned deriveProviderFromModelSlug
// (retire-list #3) against
// workspace-configs-templates/hermes/scripts/derive-provider.sh.
//
// Both the Go function and this test (with its 35+ slug→provider
// cases) are retired. The slug→provider mapping is now covered by
// providers.Manifest.DeriveProvider against the registry SSOT
// (TestDeriveProvider_RealManifest in
// internal/providers/derive_provider_test.go). The shell script
// remains the in-container fallback; its byte-identity with the
// registry view of hermes is a P4 follow-up gated on registry data
// growth (see PR-2 codegen of hermes config.yaml from the registry).
//
// TestWorkspaceCreate_FirstDeploy_PersistsModelAndProvider, which
// asserted that Create writes BOTH MODEL and LLM_PROVIDER rows, is
// replaced by TestWorkspaceCreate_FirstDeploy_OnlyPersistsMODEL
// below — the LLM_PROVIDER half of the contract is retired.
//
// TestWorkspaceCreate_FirstDeploy_UnknownModel_OnlyMintModelProvider
// is subsumed by the same: with LLM_PROVIDER never written, the
// known-vs-unknown distinction at Create disappears.
// TestWorkspaceCreate_FirstDeploy_PersistsModelAndProvider pins the
// fix for failed-workspace 95ed3ff2 (2026-05-02). Pre-fix: the canvas
// POSTed minimax/MiniMax-M2.7 in payload.Model, the workspace row was
// created, but neither the model nor the derived provider was ever
// written to workspace_secrets. On any subsequent restart, the
// applyRuntimeModelEnv fallback found nothing and hermes booted with
// the template default (nousresearch/hermes-4-70b) → wrong provider
// keys → /health poll failed → never registered.
// TestWorkspaceCreate_FirstDeploy_OnlyPersistsMODEL pins the post-P4
// contract: WorkspaceHandler.Create writes the MODEL workspace_secret
// (so the canvas-picked model survives restart and applyRuntimeModelEnv
// finds it via the fallback chain) and writes NOTHING ELSE in the
// secret-mint window. Specifically: NO LLM_PROVIDER row is written,
// regardless of payload.LLMProvider or the slug-prefix.
//
// Post-fix: the create handler writes both rows after committing the
// workspace row. This test asserts the SQL writes happen with the
// correct keys + values.
// Pre-P4 the create handler also wrote LLM_PROVIDER via setProviderSecret
// — either from payload.LLMProvider verbatim or from
// deriveProviderFromModelSlug(payload.Model). Both code paths were
// retired in internal#718 P4 closure together with the LLM_PROVIDER
// workspace_secret itself (no consumer remains; the provider is derived
// at every decision point from (runtime, model) via the registry).
//
// 2026-05-19 follow-up: the workspace_secrets row that holds the
// picked model id was renamed MODEL_PROVIDER → MODEL (the column name
// was misleading and bled into applyRuntimeModelEnv as a slug
// fallback). The sqlmock regex below now anchors on 'MODEL' instead
// of 'MODEL_PROVIDER'. See fix/workspace-server-rename-
// MODEL_PROVIDER-to-MODEL + the 20260519000000 rename migration.
func TestWorkspaceCreate_FirstDeploy_PersistsModelAndProvider(t *testing.T) {
// sqlmock failure on this expectation set is the canonical regression
// signal: if a future PR re-introduces an LLM_PROVIDER write at create,
// sqlmock surfaces "ExpectExec was not called" for any added insert.
// The "MODEL anchor uses no LLM_PROVIDER" assertion below is the
// stronger version of the same gate.
func TestWorkspaceCreate_FirstDeploy_OnlyPersistsMODEL(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
@@ -757,43 +754,35 @@ func TestWorkspaceCreate_FirstDeploy_PersistsModelAndProvider(t *testing.T) {
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
// The fix: MODEL is upserted with the verbatim model slug
// (renamed from MODEL_PROVIDER on 2026-05-19 — see file-level
// docstring). SQL has 3 placeholders ($1=workspace_id, $2=
// encrypted_value reused in the conflict-update, $3=version
// reused in the conflict-update), so sqlmock sees 3 args. The
// 'MODEL' / 'LLM_PROVIDER' key is a literal in the SQL — we
// distinguish the two writes with the regex match below. The
// 'MODEL' anchor uses a word boundary (`[^_A-Z]`) so it does
// NOT silently match the legacy 'MODEL_PROVIDER' name.
// MODEL upsert — the only post-commit workspace_secrets write that
// survived the P4 closure. The 'MODEL' key is literal in the SQL.
mock.ExpectExec(`INSERT INTO workspace_secrets[\s\S]*'MODEL'`).
WithArgs(sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
// The fix: LLM_PROVIDER is upserted with the derived provider name.
mock.ExpectExec(`INSERT INTO workspace_secrets[\s\S]*'LLM_PROVIDER'`).
WithArgs(sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
// Post-mint side effects (canvas layout + structure_events broadcast
// + the external-workspace UPDATE/IssueToken chain). Order matches
// workspace.go.
// workspace.go. CRITICALLY: no second `INSERT INTO workspace_secrets`
// is expected — sqlmock fails if Create attempts an LLM_PROVIDER
// write.
mock.ExpectExec("INSERT INTO canvas_layouts").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
// External branch with no URL: status → awaiting_agent + IssueToken.
mock.ExpectExec(`UPDATE workspaces SET status =`).
WillReturnResult(sqlmock.NewResult(0, 1))
// wsauth.IssueToken inserts into workspace_auth_tokens.
mock.ExpectExec("INSERT INTO workspace_auth_tokens").
WillReturnResult(sqlmock.NewResult(0, 1))
// awaiting_agent broadcast.
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"External Minimax Agent","runtime":"external","external":true,"model":"minimax/MiniMax-M2.7"}`
// Body carries an explicit llm_provider AND a slug-prefixed model — both
// of which would have triggered an LLM_PROVIDER write pre-P4. The
// payload field is preserved for backward-compat (older canvases
// still send it) but the value is intentionally ignored by Create.
body := `{"name":"External Minimax Agent","runtime":"external","external":true,"model":"minimax/MiniMax-M2.7","llm_provider":"minimax"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -803,7 +792,7 @@ func TestWorkspaceCreate_FirstDeploy_PersistsModelAndProvider(t *testing.T) {
t.Fatalf("expected status 201, got %d: %s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations not met — first-deploy did NOT persist MODEL + LLM_PROVIDER (this is the prod bug recurrence): %v", err)
t.Errorf("sqlmock expectations not met — Create wrote an unexpected workspace_secrets row (likely a re-introduced LLM_PROVIDER write): %v", err)
}
}
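The `'MODEL'` expectation relies on the closing quote in the pattern to distinguish the current key from the legacy `'MODEL_PROVIDER'` name and from `'LLM_PROVIDER'`. A quick check with Go's standard `regexp` package, assuming sqlmock's default regexp query matcher; the SQL strings below are hypothetical shapes, since the real statement text is not shown in this diff:

```go
package main

import (
	"fmt"
	"regexp"
)

// modelUpsert is the same pattern the test passes to mock.ExpectExec.
var modelUpsert = regexp.MustCompile(`INSERT INTO workspace_secrets[\s\S]*'MODEL'`)

// matchesModelUpsert reports whether sql would satisfy the MODEL expectation.
func matchesModelUpsert(sql string) bool {
	return modelUpsert.MatchString(sql)
}

func main() {
	// Hypothetical statements that differ only in the literal key.
	const tmpl = "INSERT INTO workspace_secrets (workspace_id, key, encrypted_value)\nVALUES ($1, '%s', $2)"
	for _, key := range []string{"MODEL", "MODEL_PROVIDER", "LLM_PROVIDER"} {
		fmt.Printf("%-15s matches=%v\n", key, matchesModelUpsert(fmt.Sprintf(tmpl, key)))
	}
}
```

Only the statement whose key literal is exactly `'MODEL'` matches, because the pattern requires a single quote immediately after `MODEL`.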
@@ -859,56 +848,12 @@ func TestWorkspaceCreate_FirstDeploy_NoModel_Returns422(t *testing.T) {
}
}
// TestWorkspaceCreate_FirstDeploy_UnknownModel_OnlyMintModelProvider
// asserts the asymmetric case: an unknown model prefix still gets
// MODEL persisted (so the user's exact slug survives restart and
// applyRuntimeModelEnv finds it), but LLM_PROVIDER is skipped (so
// derive-provider.sh's *=auto branch can decide at runtime instead of
// being pre-empted by a guess). The MODEL key was renamed from
// MODEL_PROVIDER on 2026-05-19 — see file-level docstring.
func TestWorkspaceCreate_FirstDeploy_UnknownModel_OnlyMintModelProvider(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
// Only MODEL — LLM_PROVIDER must NOT be written for unknown
// prefixes. Same 3-arg shape as above; key is literal in SQL.
mock.ExpectExec(`INSERT INTO workspace_secrets[\s\S]*'MODEL'`).
WithArgs(sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO canvas_layouts").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec(`UPDATE workspaces SET status =`).
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO workspace_auth_tokens").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Unknown Model Agent","runtime":"external","external":true,"model":"totally-unknown-model/foo"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusCreated {
t.Fatalf("expected status 201, got %d: %s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations not met — unknown-prefix model should mint MODEL but skip LLM_PROVIDER: %v", err)
}
}
// internal#718 P4 closure: the asymmetric "known prefix → both
// MODEL+LLM_PROVIDER; unknown prefix → MODEL only" contract is moot —
// Create never writes LLM_PROVIDER for ANY model now. The equivalent
// coverage is TestWorkspaceCreate_FirstDeploy_OnlyPersistsMODEL above
// (uses a slug-prefixed model that pre-P4 WOULD have triggered an
// LLM_PROVIDER write; sqlmock fails if Create attempts one).
// TestApplyRuntimeModelEnv_SetsUniversalMODELForAllRuntimes pins the
// fix for Bug B (2026-05-02): canvas-selected model was silently dropped
@@ -1023,7 +968,7 @@ func TestApplyPlatformManagedLLMEnv_NonClaudeRuntimeDefaultsOpenAIProxyWhenNoWor
t.Setenv("MOLECULE_LLM_DEFAULT_MODEL", "moonshot/kimi-k2.6")
envVars := map[string]string{}
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "codex", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "codex", "", nil)
applyRuntimeModelEnv(envVars, "codex", "")
if got := envVars["OPENAI_BASE_URL"]; got != "https://api.example.test/api/v1/internal/llm/openai/v1" {
@@ -1053,7 +998,7 @@ func TestApplyPlatformManagedLLMEnv_StripsWorkspaceOpenAIKeyForClaudeCode(t *tes
"OPENAI_BASE_URL": "https://api.openai.com/v1",
"MODEL": "openai/gpt-5.5",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "", nil)
if _, ok := envVars["OPENAI_API_KEY"]; ok {
t.Fatalf("OPENAI_API_KEY should be stripped for claude-code platform-managed mode")
@@ -1079,7 +1024,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeUsesAnthropicProxyOverOAuth(t *tes
"CLAUDE_CODE_OAUTH_TOKEN": "user-oauth-token",
"MODEL": "sonnet",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "", nil)
if _, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN should be stripped in platform-managed mode")
@@ -1102,7 +1047,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeInjectsAnthropicProxyWhenNoWorkspa
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{}
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "claude-code", "minimax/MiniMax-M2.7")
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "minimax/MiniMax-M2.7", nil)
if got := envVars["ANTHROPIC_BASE_URL"]; got != "https://api.example.test/api/v1/internal/llm/anthropic/v1" {
t.Fatalf("ANTHROPIC_BASE_URL = %q", got)
@@ -1125,7 +1070,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeStripsVendorBYOK(t *testing.T) {
"MINIMAX_API_KEY": "user-minimax-key",
"MODEL": "MiniMax-M2.7",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "", nil)
if _, ok := envVars["MINIMAX_API_KEY"]; ok {
t.Fatalf("MINIMAX_API_KEY should be stripped in platform-managed mode")
@@ -1141,20 +1086,38 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeStripsVendorBYOK(t *testing.T) {
}
}
// internal#718 P2-B: byok is now DERIVED, not org-env-driven. A claude-code
// workspace with NO explicit override + a non-platform-deriving model
// (kimi-for-coding → kimi-coding) resolves byok and must NOT get the CP proxy
// creds injected. (Pre-P2 this was driven by the org env MOLECULE_LLM_BILLING_MODE
// with an empty workspace id; that mechanism is retired.)
func TestApplyPlatformManagedLLMEnv_NoopsOutsidePlatformManaged(t *testing.T) {
t.Setenv("MOLECULE_LLM_BILLING_MODE", "byok")
const wsID = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
mock := setupTestDB(t)
// No explicit override → derive from (claude-code, kimi-for-coding) → byok.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
t.Setenv("MOLECULE_LLM_BILLING_MODE", "platform_managed") // org env ignored now
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{}
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "claude-code", "")
res := applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "kimi-for-coding", nil)
if res.ResolvedMode != LLMBillingModeBYOK {
t.Fatalf("resolved mode = %q, want byok (derived from non-platform model)", res.ResolvedMode)
}
if _, ok := envVars["OPENAI_API_KEY"]; ok {
t.Fatalf("OPENAI_API_KEY should not be set outside platform-managed mode")
}
if _, ok := envVars["MOLECULE_LLM_USAGE_TOKEN"]; ok {
t.Fatalf("MOLECULE_LLM_USAGE_TOKEN should not be set outside platform-managed mode")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_ClaudeCodeByokKeepsOwnProviderEnv is the
@@ -1188,7 +1151,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeByokKeepsOwnProviderEnv(t *testing
"CLAUDE_CODE_OAUTH_TOKEN": "user-oauth-token",
"MODEL": "sonnet",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, wsID, "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "", nil)
// 1. OAuth token intact — not stripped.
if got := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; got != "user-oauth-token" {
@@ -1219,19 +1182,19 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeByokKeepsOwnProviderEnv(t *testing
}
}
// TestApplyPlatformManagedLLMEnv_ByokStripsGlobalOriginOAuthToken is the
// internal#711 regression guard for the live 2026-05-27 leak (Reno Stars SEO
// + Marketing claude-code agents). A non-platform (byok) workspace that
// brought NO LLM credential of its own, but which inherited the platform's
// scope:global CLAUDE_CODE_OAUTH_TOKEN from global_secrets (provenance =
// globalKeys), must have that platform token STRIPPED — not run on it.
// TestApplyPlatformManagedLLMEnv_ByokGlobalScopeOAuthSurvivesAndRunsDirect is
// the molecule-core#1994 (corrected-model) inversion of the former
// internal#711 strip test, exercised through applyPlatformManagedLLMEnv. The
// live failure this guards: the Reno Stars Marketing/SEO byok agents whose
// Claude oauth lives at GLOBAL scope (the tenant's own credential, shared
// across the tenant's workspaces) were stripped + failed-closed under the
// inverted "global == platform's own" premise → MISSING_BYOK_CREDENTIAL →
// dead. Under the corrected model `global_secrets` is the TENANT's store, so
// that oauth is exactly what byok runs on: it must SURVIVE and route direct.
//
// Pre-fix the byok early-return left envVars untouched, so the platform's
// global OAuth token survived into the container and the agent ran Opus on
// the platform's Anthropic credits. The fix gates the global-cred merge on
// provider==platform: a non-platform workspace keeps only its own
// (workspace_secrets) creds, of which there are none here.
func TestApplyPlatformManagedLLMEnv_ByokStripsGlobalOriginOAuthToken(t *testing.T) {
// Mutation (load-bearing): re-add stripGlobalOriginLLMCreds on the byok branch
// → the oauth disappears → this test goes RED on both the survival and HasUsableLLMCred assertions.
func TestApplyPlatformManagedLLMEnv_ByokGlobalScopeOAuthSurvivesAndRunsDirect(t *testing.T) {
const wsID = "352e3c2b-0546-4e9c-b487-1e2ff1cf29fc" // Reno Stars SEO agent
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
@@ -1243,45 +1206,169 @@ func TestApplyPlatformManagedLLMEnv_ByokStripsGlobalOriginOAuthToken(t *testing.
t.Setenv("MOLECULE_LLM_ANTHROPIC_BASE_URL", "https://api.example.test/api/v1/internal/llm/anthropic")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
// The ONLY LLM credential in env is the platform's scope:global OAuth
// token, merged from global_secrets (so its key is in globalKeys). The
// workspace set none of its own.
// The tenant's own oauth at GLOBAL scope (a global_secrets row). The
// workspace set no separate row of its own; it relies on the tenant global.
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "PLATFORM-GLOBAL-OAUTH-TOKEN",
"CLAUDE_CODE_OAUTH_TOKEN": "TENANT-OWN-GLOBAL-OAUTH",
"MODEL": "opus",
}
globalKeys := map[string]struct{}{"CLAUDE_CODE_OAUTH_TOKEN": {}}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, globalKeys, wsID, "claude-code", "")
res := applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "", nil)
// 1. The platform global OAuth token must be STRIPPED — the leak is closed.
if got, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN = %q present — platform scope:global token must be stripped for a byok workspace", got)
// 1. The tenant's own global-scope oauth SURVIVES — byok runs on it.
if envVars["CLAUDE_CODE_OAUTH_TOKEN"] != "TENANT-OWN-GLOBAL-OAUTH" {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN = %q, want the tenant's own global-scope token preserved for byok", envVars["CLAUDE_CODE_OAUTH_TOKEN"])
}
// 2. No CP proxy creds forced (byok = workspace talks to its own provider).
if got, ok := envVars["ANTHROPIC_API_KEY"]; ok {
t.Fatalf("ANTHROPIC_API_KEY must NOT be injected for byok, got %q", got)
}
// 3. Resolver reports byok with NO usable LLM credential → caller fails closed.
if _, ok := envVars["MOLECULE_LLM_USAGE_TOKEN"]; ok {
t.Fatalf("MOLECULE_LLM_USAGE_TOKEN must NOT be injected for byok")
}
// 3. byok WITH a usable credential → caller does NOT fail closed.
if res.ResolvedMode != LLMBillingModeBYOK {
t.Fatalf("ResolvedMode = %q, want %q", res.ResolvedMode, LLMBillingModeBYOK)
}
if res.HasUsableLLMCred {
t.Fatalf("HasUsableLLMCred = true, want false (only the stripped platform global token was present)")
if !res.HasUsableLLMCred {
t.Fatalf("HasUsableLLMCred = false, want true (tenant's own global-scope oauth is the usable credential)")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_ByokKeepsWorkspaceOwnOAuthEvenWithGlobal is
// the discriminating companion to the strip test: a byok workspace that DID
// set its own CLAUDE_CODE_OAUTH_TOKEN via the canvas Secrets tab (a
// workspace_secrets row) keeps it. loadWorkspaceSecrets drops the global
// provenance flag on a workspace override, so the key is NOT in globalKeys
// and the provenance-aware strip leaves it alone. Proves the fix strips only
// platform-origin creds, never the customer's own.
func TestApplyPlatformManagedLLMEnv_ByokKeepsWorkspaceOwnOAuthEvenWithGlobal(t *testing.T) {
// =========================================================================
// internal#718 P2-B BEHAVIOR DELTA — billing/credential decision DERIVES the
// provider (no stored LLM_PROVIDER, no override). These three tests are the
// explicit delta the RFC calls out, exercised through the real provision path
// (applyPlatformManagedLLMEnv) with the registry derivation driving the mode:
// - platform-derived → platform_managed → platform creds (UNCHANGED)
// - non-platform-derived → byok → #1963 strip + fail-closed (THE FIX)
// - unset model → platform default (CTO-confirmed)
// All use NO explicit override (override read returns NULL) so the DERIVATION
// is what decides — this is what supersedes #1966's stored-LLM_PROVIDER read.
// =========================================================================
// PLATFORM-DERIVED → UNCHANGED. A claude-code workspace with a platform-
// namespaced model (anthropic/claude-opus-4-7) derives to the closed `platform`
// provider → platform_managed → CP proxy creds injected, exactly as before.
func TestApplyPlatformManagedLLMEnv_DERIVED_PlatformModelKeepsPlatformCreds(t *testing.T) {
const wsID = "11111111-2222-3333-4444-555555555555"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil)) // NO override → derive
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModeBYOK) // org env IGNORED now
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_ANTHROPIC_BASE_URL", "https://api.example.test/api/v1/internal/llm/anthropic")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "anthropic/claude-opus-4-7", nil)
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Fatalf("platform-derived model must resolve platform_managed, got %q (source=%s)", res.ResolvedMode, res.Source)
}
if res.Source != BillingModeSourceDerivedProvider {
t.Errorf("source: got %q want derived_provider", res.Source)
}
// Platform path injects the CP proxy creds (UNCHANGED behavior).
if got := envVars["ANTHROPIC_API_KEY"]; got != "tenant-admin-token" {
t.Errorf("platform path must inject the CP proxy token as ANTHROPIC_API_KEY, got %q", got)
}
if !res.HasUsableLLMCred {
t.Errorf("platform path always has a usable cred (the proxy token)")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// NON-PLATFORM-DERIVED + NO CREDENTIAL AT ALL → byok + FAIL-CLOSED. This is
// the legitimate remaining fail-closed path under the corrected model
// (molecule-core#1994): a claude-code workspace with a non-platform model
// (kimi-for-coding → byok) and NO override and NO LLM credential at ANY scope
// (no global row, no workspace row) has nothing to run on → HasUsableLLMCred=
// false → caller (prepareProvisionContext) aborts MISSING_BYOK_CREDENTIAL. The
// fail-closed branch is unchanged by the strip removal; only its trigger
// narrowed from "no workspace-scoped cred" to "no cred at any scope".
func TestApplyPlatformManagedLLMEnv_DERIVED_ByokNoCredentialFailsClosed(t *testing.T) {
const wsID = "99999999-8888-7777-6666-555555555555"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil)) // NO override → derive
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged) // org env IGNORED now
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
// No LLM credential at all — neither global nor workspace scope.
envVars := map[string]string{}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "kimi-for-coding", nil)
// 1. DERIVED byok (NOT the old platform_managed default).
if res.ResolvedMode != LLMBillingModeBYOK {
t.Fatalf("non-platform-derived model must resolve byok, got %q (source=%s)", res.ResolvedMode, res.Source)
}
if res.Source != BillingModeSourceDerivedProvider {
t.Errorf("source: got %q want derived_provider", res.Source)
}
// 2. No CP proxy creds forced.
if got, ok := envVars["ANTHROPIC_API_KEY"]; ok {
t.Fatalf("ANTHROPIC_API_KEY must NOT be injected for byok, got %q", got)
}
// 3. No usable cred at any scope → caller fails closed.
if res.HasUsableLLMCred {
t.Fatalf("HasUsableLLMCred = true, want false (no LLM credential present at any scope)")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// UNSET model → PLATFORM DEFAULT (CTO-confirmed "unset → platform default").
// No model means nothing to derive; the workspace defaults closed to
// platform_managed and keeps the platform creds (UNCHANGED for the no-model case).
func TestApplyPlatformManagedLLMEnv_DERIVED_UnsetModelPlatformDefault(t *testing.T) {
const wsID = "00000000-1111-2222-3333-444444444444"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil)) // NO override
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModeBYOK) // org env IGNORED now
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_ANTHROPIC_BASE_URL", "https://api.example.test/api/v1/internal/llm/anthropic")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "", nil)
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Fatalf("unset model must default platform_managed, got %q (source=%s)", res.ResolvedMode, res.Source)
}
if res.Source != BillingModeSourceDerivedDefault {
t.Errorf("source: got %q want derived_default", res.Source)
}
if got := envVars["ANTHROPIC_API_KEY"]; got != "tenant-admin-token" {
t.Errorf("unset-model platform default must inject the CP proxy token, got %q", got)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_ByokKeepsWorkspaceOwnOAuth is the
// workspace-scope companion to the global-scope survival test: a byok
// workspace that set its own CLAUDE_CODE_OAUTH_TOKEN via the canvas Secrets
// tab (a workspace_secrets row) keeps it and runs direct. Under the corrected
// model (molecule-core#1994) the tenant's credential survives at EITHER scope;
// this pins the workspace-scope half.
func TestApplyPlatformManagedLLMEnv_ByokKeepsWorkspaceOwnOAuth(t *testing.T) {
const wsID = "6b66de8d-9337-4fb4-be8d-6d49dca0d809" // Reno Stars Marketing agent
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
@@ -1292,15 +1379,13 @@ func TestApplyPlatformManagedLLMEnv_ByokKeepsWorkspaceOwnOAuthEvenWithGlobal(t *
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
// Workspace set its OWN OAuth token — loadWorkspaceSecrets would have
// dropped its global provenance flag, so globalKeys does NOT contain it.
// Workspace set its OWN OAuth token (a workspace_secrets row).
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "CUSTOMER-OWN-OAUTH-TOKEN",
"MODEL": "opus",
}
globalKeys := map[string]struct{}{} // not from global_secrets
res := applyPlatformManagedLLMEnv(context.Background(), envVars, globalKeys, wsID, "claude-code", "")
res := applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "", nil)
if got := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; got != "CUSTOMER-OWN-OAUTH-TOKEN" {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN = %q, want the workspace's own token left intact", got)
@@ -1316,13 +1401,18 @@ func TestApplyPlatformManagedLLMEnv_ByokKeepsWorkspaceOwnOAuthEvenWithGlobal(t *
}
}
// TestApplyPlatformManagedLLMEnv_DisabledStripsGlobalButReportsNoCred proves
// that "disabled" mode also strips the platform's global LLM creds (the leak
// is closed for disabled too), and reports HasUsableLLMCred=false. The
// caller's fail-closed abort is scoped to byok only, so a disabled workspace
// with no LLM cred still boots (for terminal / non-LLM work); here we pin the
// function-level strip + report.
func TestApplyPlatformManagedLLMEnv_DisabledStripsGlobalButReportsNoCred(t *testing.T) {
// TestApplyPlatformManagedLLMEnv_DisabledKeepsTenantGlobalNoProxy proves the
// corrected-model behavior for "disabled": the tenant's own global-scope LLM
// cred is NOT stripped and the CP proxy is NOT forced. "disabled" means the
// workspace runs no platform-billed LLM, but the tenant's own credential is
// still the tenant's to keep; the caller's fail-closed abort is byok-only so a
// disabled workspace boots regardless. The previous internal#711 behavior
// stripped the global cred here on the same inverted premise; that strip is
// removed.
//
// Mutation (load-bearing): re-add stripGlobalOriginLLMCreds on the non-platform
// branch → the oauth disappears → this test goes RED on the survival assertion.
func TestApplyPlatformManagedLLMEnv_DisabledKeepsTenantGlobalNoProxy(t *testing.T) {
const wsID = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
@@ -1332,31 +1422,33 @@ func TestApplyPlatformManagedLLMEnv_DisabledStripsGlobalButReportsNoCred(t *test
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "PLATFORM-GLOBAL-OAUTH-TOKEN",
"CLAUDE_CODE_OAUTH_TOKEN": "TENANT-OWN-GLOBAL-OAUTH",
}
globalKeys := map[string]struct{}{"CLAUDE_CODE_OAUTH_TOKEN": {}}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, globalKeys, wsID, "claude-code", "")
res := applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "", nil)
if _, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN must be stripped for disabled mode too")
// The tenant's own global cred survives (not stripped).
if envVars["CLAUDE_CODE_OAUTH_TOKEN"] != "TENANT-OWN-GLOBAL-OAUTH" {
t.Fatalf("tenant's own global cred must survive for disabled mode; got %q", envVars["CLAUDE_CODE_OAUTH_TOKEN"])
}
// No proxy forced for disabled.
if _, ok := envVars["MOLECULE_LLM_USAGE_TOKEN"]; ok {
t.Fatalf("disabled must not inject the platform usage token")
}
if res.ResolvedMode != LLMBillingModeDisabled {
t.Fatalf("ResolvedMode = %q, want %q", res.ResolvedMode, LLMBillingModeDisabled)
}
if res.HasUsableLLMCred {
t.Fatalf("HasUsableLLMCred = true, want false")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_PlatformManagedStillReceivesGlobalCreds is
// the no-regression guard for the OTHER side of the gate (internal#711): a
// platform-managed workspace MUST still receive the platform's creds. Here
// the proxy IS configured, so the contract is the existing one — the global
// OAuth token is replaced by the proxy usage token (HasUsableLLMCred=true).
// the no-regression guard for the metered platform_managed path
// (molecule-core#1994): a platform-managed workspace MUST still strip any
// direct oauth and route through the CP proxy. The direct OAuth token is
// replaced by the proxy usage token (HasUsableLLMCred=true). This path is
// UNCHANGED by the byok strip removal — only the byok/disabled branch changed.
func TestApplyPlatformManagedLLMEnv_PlatformManagedStillReceivesGlobalCreds(t *testing.T) {
const wsID = "99999999-9999-9999-9999-999999999999"
mock := setupTestDB(t)
@@ -1370,12 +1462,11 @@ func TestApplyPlatformManagedLLMEnv_PlatformManagedStillReceivesGlobalCreds(t *t
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "PLATFORM-GLOBAL-OAUTH-TOKEN",
"CLAUDE_CODE_OAUTH_TOKEN": "DIRECT-OAUTH-TOKEN",
"MODEL": "opus",
}
globalKeys := map[string]struct{}{"CLAUDE_CODE_OAUTH_TOKEN": {}}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, globalKeys, wsID, "claude-code", "")
res := applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "", nil)
// Platform-managed routes through the CP proxy: OAuth stripped, proxy creds forced.
if _, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
@@ -1416,7 +1507,7 @@ func TestApplyPlatformManagedLLMEnv_PlatformManagedStillEmitsResolvedMode(t *tes
"CLAUDE_CODE_OAUTH_TOKEN": "user-oauth-token",
"MODEL": "sonnet",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, wsID, "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "", nil)
// OAuth stripped, proxy forced — unchanged platform_managed contract.
if _, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
@@ -501,10 +501,12 @@ func TestWorkspaceCreate_WithSecrets_Persists(t *testing.T) {
// while persisting a secret causes the entire transaction to roll back and
// the handler to return 500. The workspace row must NOT be committed.
func TestWorkspaceCreate_SecretPersistFails_RollsBack(t *testing.T) {
// internal#691: see TestExtended_SecretsSet — same default-closed reasoning.
// This test is asserting the rollback path on DB failure, not the strip gate;
// keep the org in byok so the OPENAI_API_KEY write reaches the INSERT.
t.Setenv("MOLECULE_LLM_BILLING_MODE", "byok")
// internal#718 P2-B: this test asserts the rollback path on DB failure, not
// the strip gate. The create-time secret gate keys off the DERIVED mode now
// (org rung retired). An explicit byok override makes the workspace byok in a
// single resolver read (precedence-1 short-circuit), so the OPENAI_API_KEY
// write is allowed and reaches the INSERT-and-fail path this test exercises.
t.Setenv("MOLECULE_LLM_BILLING_MODE", "platform_managed") // org env ignored now
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
@@ -513,14 +515,11 @@ func TestWorkspaceCreate_SecretPersistFails_RollsBack(t *testing.T) {
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WillReturnResult(sqlmock.NewResult(0, 1))
// internal#691: Create() now resolves billing mode per-workspace before
// the secret-strip gate. The workspace row was just inserted in the same
// transaction so it isn't readable from a separate query yet; the
// resolver expects the SELECT and the mock returns no row → falls back
// to the org default (byok, set above) so the OPENAI_API_KEY write
// reaches the INSERT-and-fail path this test exercises.
// Create() resolves billing mode per-workspace before the secret-strip gate.
// An explicit byok override short-circuits the resolver (precedence 1) so the
// OPENAI_API_KEY write is allowed and reaches the INSERT-and-fail path.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}))
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
mock.ExpectExec("INSERT INTO workspace_secrets").
WillReturnError(sql.ErrConnDone) // DB failure while writing secret
mock.ExpectRollback() // workspace insert must be rolled back
@@ -1787,7 +1786,7 @@ func TestWorkspaceCreate_TemplateDefaultsMissingRuntimeAndModel(t *testing.T) {
tier: 2
runtime: hermes
runtime_config:
model: nousresearch/hermes-4-70b
model: moonshot/kimi-k2.6
`)
if err := os.WriteFile(filepath.Join(templateDir, "config.yaml"), cfg, 0o644); err != nil {
t.Fatalf("write cfg: %v", err)
@@ -1842,7 +1841,7 @@ func TestWorkspaceCreate_TemplateDefaultsLegacyTopLevelModel(t *testing.T) {
cfg := []byte(`name: Legacy Agent
tier: 1
runtime: hermes
model: anthropic:claude-sonnet-4-5
model: moonshot/kimi-k2.5
`)
if err := os.WriteFile(filepath.Join(templateDir, "config.yaml"), cfg, 0o644); err != nil {
t.Fatalf("write cfg: %v", err)
@@ -1897,7 +1896,7 @@ func TestWorkspaceCreate_CallerModelOverridesTemplateDefault(t *testing.T) {
}
cfg := []byte(`runtime: hermes
runtime_config:
model: nousresearch/hermes-4-70b
model: moonshot/kimi-k2.6
`)
if err := os.WriteFile(filepath.Join(templateDir, "config.yaml"), cfg, 0o644); err != nil {
t.Fatalf("write cfg: %v", err)
@@ -1924,7 +1923,11 @@ runtime_config:
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Custom Hermes","template":"hermes-template","model":"minimax/MiniMax-M2.7"}`
// Caller overrides with a different hermes-valid model — registry permits
// both moonshot/kimi-k2.5 and moonshot/kimi-k2.6 for hermes (P4 PR-1 native
// set). The template default would have been moonshot/kimi-k2.6; caller
// picks kimi-k2.5 explicitly to prove the override actually fires.
body := `{"name":"Custom Hermes","template":"hermes-template","model":"moonshot/kimi-k2.5"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -2048,6 +2051,152 @@ func TestWorkspaceCreate_188_NoTemplateNoRuntime_NowMODEL_REQUIRED(t *testing.T)
}
}
// internal#718 P4 PR-2: only-registered validation HARD-REJECT. A known
// (registry) runtime with a model NOT in its registered set is rejected at the
// create boundary with 422 UNREGISTERED_MODEL_FOR_RUNTIME — no DB rows touched,
// no provisioning attempt, no wedged workspace. Replaces P2-B's WARN-mode
// header.
func TestWorkspaceCreate_718_P4_UnregisteredModelHardReject422(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
// No DB expectations: the 422 fires BEFORE BeginTx. sqlmock rejects any call
// that has no matching expectation, so an unexpected INSERT would return an
// error to the handler at call time instead of passing silently.
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Bad Model","runtime":"claude-code","model":"totally-made-up-xyz"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusUnprocessableEntity {
t.Fatalf("unregistered-model create: expected 422, got %d: %s", w.Code, w.Body.String())
}
if !bytes.Contains(w.Body.Bytes(), []byte(`"code":"UNREGISTERED_MODEL_FOR_RUNTIME"`)) {
t.Errorf("expected code=UNREGISTERED_MODEL_FOR_RUNTIME in 422 body, got %s", w.Body.String())
}
if !bytes.Contains(w.Body.Bytes(), []byte(`"runtime":"claude-code"`)) {
t.Errorf("expected runtime=claude-code echoed in 422 body, got %s", w.Body.String())
}
if !bytes.Contains(w.Body.Bytes(), []byte(`"model":"totally-made-up-xyz"`)) {
t.Errorf("expected model echoed in 422 body, got %s", w.Body.String())
}
// The legacy WARN header must NOT fire — there is no "proceeded with
// warning" path anymore.
if w.Header().Get("X-Molecule-Model-Unregistered") != "" {
t.Errorf("P4 hard-reject must not emit the legacy WARN header, got %q", w.Header().Get("X-Molecule-Model-Unregistered"))
}
// No expectations were registered, so this only confirms none are left
// pending; sqlmock rejects unexpected DB calls at call time.
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unexpected DB activity on hard-reject path: %v", err)
}
}
// A REGISTERED model on a registry runtime proceeds with 201 and no unregistered header.
func TestWorkspaceCreate_718_P4_RegisteredModelProceeds(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
mock.ExpectExec("INSERT INTO workspace_secrets").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO canvas_layouts").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
// claude-opus-4-7 IS a registered claude-code model (anthropic-api).
body := `{"name":"Good Model","runtime":"claude-code","model":"claude-opus-4-7"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusCreated {
t.Fatalf("registered-model create: expected 201, got %d: %s", w.Code, w.Body.String())
}
if w.Header().Get("X-Molecule-Model-Unregistered") != "" {
t.Errorf("registered model must NOT set the legacy unregistered header, got %q", w.Header().Get("X-Molecule-Model-Unregistered"))
}
}
// internal#718 P4 PR-2: the legacy colon-namespaced BYOK vocabulary
// 'anthropic:claude-opus-4-7' is now a FIRST-CLASS registered claude-code model
// (P4 PR-1 reconciled the colon-vocab into the registry). The hard-reject must
// NOT 422 this legitimate live-corpus form — verifying the reconcile + flip work
// together. This is the canonical regression guard for the colon-vocab path.
func TestWorkspaceCreate_718_P4_LegacyColonVocabAccepted(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
mock.ExpectExec("INSERT INTO workspace_secrets").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO canvas_layouts").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Legacy Colon","runtime":"claude-code","model":"anthropic:claude-opus-4-7"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusCreated {
t.Fatalf("legacy colon-form create (P4 PR-1 reconciled): expected 201, got %d: %s", w.Code, w.Body.String())
}
}
// internal#718 P2-B: a runtime NOT in the registry (mock — a known core runtime
// absent from the first-party provider registry) fails OPEN — the
// only-registered gate does not block it (federation / non-first-party path
// unchanged). It proceeds past the gate to the normal create flow.
func TestWorkspaceCreate_718_NonRegistryRuntimeFailsOpen(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
mock.ExpectExec("INSERT INTO canvas_layouts").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
// "mock" is a known core runtime but NOT in the first-party registry;
// any model passes the only-registered gate (fail-open).
body := `{"name":"Mock Agent","runtime":"mock","model":"canned-replies"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusCreated {
t.Fatalf("non-registry runtime should fail open (201), got %d: %s", w.Code, w.Body.String())
}
}
// Explicit runtime, no template → honored, 201 (no template resolution
// needed; runtimeExplicitlyRequested true but already resolved).
func TestWorkspaceCreate_188_ExplicitRuntimeNoTemplate_OK(t *testing.T) {
@@ -0,0 +1,259 @@
package providers
import (
"fmt"
"sort"
"strings"
)
// PlatformProviderName is the single, closed, core-only provider key that
// denotes Molecule-managed billing (no tenant key; the platform owns the
// upstream credential + the bill). It is a CLOSED set BY CONSTRUCTION: a
// third-party / contributed runtime manifest can introduce its own providers
// (BYOK by definition), but it can never name one `platform` and thereby
// forge platform billing — the merge/validation layer reserves this key for
// the core catalog (internal#718 federation refinement, CTO 2026-05-27).
// DeriveProvider treats it like any other native provider for resolution;
// the closed-set guarantee is enforced at manifest registration/merge, not
// here. Provider.IsPlatform is the single predicate that billing/credential
// emission applies to the DERIVED provider (P2; not wired in P0).
const PlatformProviderName = "platform"
// IsPlatform reports whether this provider is the closed, core-only
// platform-managed provider. Billing + credential-emission decisions key off
// this predicate applied to a DERIVED provider (P2), so a model can never be
// platform-billed unless DeriveProvider resolves it to the closed platform
// entry. Any BYOK / third-party provider returns false -> fail-closed
// without the tenant's own key.
func (p Provider) IsPlatform() bool {
return p.Name == PlatformProviderName
}
// DeriveProvider resolves the SINGLE owning Provider for a (runtime, model)
// pair against the merged registry Manifest. It is the P0 foundation of
// internal#718: every model->provider decision point will eventually derive
// through this one function instead of one of the ~9 hardcoded, disagreeing
// vocabularies. In P0 NOTHING in production calls it (additive, zero behavior
// change) — it is exercised only by tests + the codegen artifact.
//
// It is written as a method on Manifest (a pure function of the merged
// registry) so a future FEDERATED registry — core catalog UNION validated
// per-runtime contributed manifests — works through the identical code path:
// DeriveProvider neither knows nor cares whether a runtime/provider is
// first-party or contributed; it only sees the merged Manifest.
//
// Resolution (fail-closed at every step — never silently default):
//
// 1. The runtime must be known. An unknown runtime errors (it never falls
// through to "any provider in the catalog").
// 2. The candidate set is the runtime's NATIVE provider set ONLY (the
// `runtimes:` block). A provider absent from the runtime's native set is
// never selectable for that runtime, even if its catalog regex matches.
// 3. EXACT model-id match is authoritative (CTO 2026-05-27 "disambiguate by
// exact model id"): if the model id appears verbatim in exactly one
// native provider ref's Models list, that provider wins outright — this
// resolves the kimi namespace split (moonshot/kimi-k2.6 -> platform vs
// bare kimi-for-coding -> kimi-coding) deterministically and overrides
// any broader prefix match.
// 4. Otherwise, fall back to model_prefix_match among the native providers.
// 5. If >1 native provider still matches, disambiguate by auth env: keep
// only the providers whose auth_env intersects availableAuthEnv. If
// exactly one survives, it wins.
// 6. If >1 still match, or the auth-env filter leaves none -> overlap error:
// the registry data must resolve the ambiguity. A model that NO native
// provider matches is unregistered (unselectable) and already errored at
// step 4. Both cases fail closed with a zero-value Provider.
//
// availableAuthEnv is the set of auth-env-var NAMES (never secret values)
// present for the workspace — exactly the disambiguation input the canvas
// uses today to split anthropic-oauth (CLAUDE_CODE_OAUTH_TOKEN) from
// anthropic-api (ANTHROPIC_API_KEY). It may be nil; nil simply means the
// auth-env tie-break cannot fire (an overlap then errors rather than guesses).
func (m *Manifest) DeriveProvider(runtime, model string, availableAuthEnv []string) (Provider, error) {
model = strings.TrimSpace(model)
if model == "" {
return Provider{}, fmt.Errorf("providers: model is required")
}
native, ok := m.Runtimes[runtime]
if !ok {
return Provider{}, fmt.Errorf("providers: unknown runtime %q", runtime)
}
byName := make(map[string]Provider, len(m.Providers))
for _, p := range m.Providers {
byName[p.Name] = p
}
// Step 3: exact model-id match against each native provider ref's Models.
// Authoritative — a verbatim id beats any prefix. If two native refs both
// list the same id, that is a manifest ambiguity we surface rather than
// silently pick (LoadManifest already forbids a provider ref appearing
// twice in one runtime, but two DIFFERENT providers listing the same id
// is not load-rejected, so guard it here).
var exact []Provider
for _, ref := range native.Providers {
for _, mid := range ref.Models {
if mid == model {
if p, ok := byName[ref.Name]; ok {
exact = append(exact, p)
}
break
}
}
}
if len(exact) == 1 {
return exact[0], nil
}
if len(exact) > 1 {
return Provider{}, fmt.Errorf(
"providers: model %q for runtime %q is exact-listed by %d native providers (%s) — manifest ambiguity",
model, runtime, len(exact), strings.Join(providerNames(exact), ", "))
}
// Step 4: prefix match among native providers only.
var matched []Provider
for _, ref := range native.Providers {
p, ok := byName[ref.Name]
if !ok {
continue
}
if p.MatchesModel(model) {
matched = append(matched, p)
}
}
switch len(matched) {
case 1:
return matched[0], nil
case 0:
return Provider{}, fmt.Errorf(
"providers: no native provider for runtime %q owns model %q (unregistered/unselectable)",
runtime, model)
}
// Step 5: >1 prefix match — disambiguate by available auth env.
if len(availableAuthEnv) > 0 {
avail := make(map[string]struct{}, len(availableAuthEnv))
for _, e := range availableAuthEnv {
avail[e] = struct{}{}
}
var byAuth []Provider
for _, p := range matched {
for _, want := range p.AuthEnv {
if _, ok := avail[want]; ok {
byAuth = append(byAuth, p)
break
}
}
}
if len(byAuth) == 1 {
return byAuth[0], nil
}
if len(byAuth) > 1 {
matched = byAuth // narrowed but still ambiguous; report the narrowed set
}
}
// Step 6: still ambiguous -> error (never silently pick).
return Provider{}, fmt.Errorf(
"providers: model %q for runtime %q overlaps %d providers (%s) and auth env did not disambiguate — resolve in the registry",
model, runtime, len(matched), strings.Join(providerNames(matched), ", "))
}
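The resolution order described in the DeriveProvider comment can be sketched in isolation. The following is a minimal, self-contained illustration, not the package's implementation: the `provider` type, `derive`, `sampleNative`, and `errUnresolved` are hypothetical stand-ins, and the two sample providers only mirror the shape of the crafted overlap fixture used by the tests.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// provider is a hypothetical, trimmed-down stand-in for a catalog entry
// already joined with the runtime's native ref (an assumption of this sketch).
type provider struct {
	name    string
	prefix  *regexp.Regexp
	authEnv []string
	models  []string // ids exact-listed for the runtime
}

var errUnresolved = errors.New("providers: unresolved")

// derive applies exact-id match, then prefix match, then the auth-env
// tie-break, and returns an error rather than guessing.
func derive(native []provider, model string, availableAuthEnv []string) (string, error) {
	var exact []string
	for _, p := range native {
		for _, id := range p.models {
			if id == model {
				exact = append(exact, p.name)
				break
			}
		}
	}
	if len(exact) == 1 {
		return exact[0], nil
	}
	if len(exact) > 1 {
		return "", fmt.Errorf("%w: %q exact-listed by %v", errUnresolved, model, exact)
	}

	var matched []provider
	for _, p := range native {
		if p.prefix.MatchString(model) {
			matched = append(matched, p)
		}
	}
	switch len(matched) {
	case 1:
		return matched[0].name, nil
	case 0:
		return "", fmt.Errorf("%w: no native provider owns %q", errUnresolved, model)
	}

	// More than one prefix match: keep only providers whose auth env is available.
	avail := make(map[string]bool, len(availableAuthEnv))
	for _, e := range availableAuthEnv {
		avail[e] = true
	}
	var byAuth []string
	for _, p := range matched {
		for _, want := range p.authEnv {
			if avail[want] {
				byAuth = append(byAuth, p.name)
				break
			}
		}
	}
	if len(byAuth) == 1 {
		return byAuth[0], nil
	}
	return "", fmt.Errorf("%w: %q overlaps %d providers", errUnresolved, model, len(matched))
}

// sampleNative builds two providers that share a prefix (hypothetical data).
func sampleNative() []provider {
	shared := regexp.MustCompile(`^shared-`)
	return []provider{
		{name: "prov-a", prefix: shared, authEnv: []string{"A_API_KEY"}, models: []string{"a-only-model"}},
		{name: "prov-b", prefix: shared, authEnv: []string{"B_API_KEY"}, models: []string{"b-only-model"}},
	}
}

func main() {
	got, err := derive(sampleNative(), "shared-x", []string{"B_API_KEY"})
	fmt.Println(got, err) // prints "prov-b <nil>"
	_, err = derive(sampleNative(), "shared-x", nil)
	fmt.Println(err != nil) // prints "true"
}
```

In this sketch every unresolved case returns an error and an empty name, which is the fail-closed behavior the comment describes; the real function additionally narrows the reported set and sorts provider names for its error messages.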
// Upstream is the result of ResolveUpstream: the proxy's upstream-vendor key
// (the 4-name vocabulary {openai, moonshot, anthropic, minimax} the proxy's
// resolveLLMProviderTarget switch dispatches on to pick the upstream base URL +
// key) plus the model id to send upstream (the namespace SUFFIX). Provider is
// the catalog entry the namespace resolved to (its base_url_template /
// base_url_anthropic / auth_env are the SINGLE source for the upstream target).
type Upstream struct {
// Vendor is the proxy upstream-vendor key (Provider.UpstreamVendor). It is
// the axis resolveLLMProviderTarget dispatches on; for "anthropic-api" it is
// "anthropic" (the entry NAME and the upstream VENDOR legitimately differ).
Vendor string
// Model is the id to send upstream — the namespace suffix (e.g. the
// "kimi-k2.6" of "moonshot/kimi-k2.6").
Model string
// Provider is the resolved catalog entry. Its base_url_* / auth_env are the
// one source for the upstream target — there is no parallel routing block.
Provider Provider
}
// ResolveUpstream is the SINGLE registry resolution the LLM proxy uses to pick
// the upstream vendor + base URL + auth for a wire model id (internal#718 P1,
// CONVERGED 2026-05-27). It replaces the proxy's hardcoded inferLLMProvider
// switch AND the earlier two-derivation shape (DeriveUpstreamForModel + a
// separate proxy_routing data block): there is now ONE resolution over the
// EXISTING vendor provider entries — no duplicate routing vocabulary.
//
// Resolution = the platform model id's NAMESPACE. A platform model id is
// `vendor/model` (or the BYOK colon form `vendor:model`); the namespace token
// NAMES the backing provider, whose catalog entry carries the upstream
// base_url_* + auth_env. The upstream vendor key is the entry's UpstreamVendor
// (a property of the entry, recorded once on the entry — NOT a parallel
// routing block). VERIFIED FACT (internal#718, 2026-05-27): all platform model
// ids in providers.yaml are namespaced; ZERO are bare — so namespace
// resolution covers 100% of live proxy traffic.
//
// It is DELIBERATELY separate from DeriveProvider:
// - DeriveProvider is runtime-SCOPED and speaks the REGISTRY vocabulary
// (platform/anthropic-api/kimi-coding/…); for a platform model it returns
// `platform` (the proxy ITSELF), which is useless for upstream routing.
// - ResolveUpstream is runtime-AGNOSTIC (the proxy serves platform models
// across runtimes, with no single runtime) and speaks the proxy's 4-name
// UPSTREAM vocabulary — exactly what selects the upstream base URL + key.
//
// Resolution (fail-closed; never a silent default):
//
// 1. Namespace split: for each separator "/" then ":" (the proxy's loop
// order), cut the id. If the prefix token EQUALS some provider entry's
// UpstreamVendor, that entry wins: Vendor = its UpstreamVendor, Model = the
// SUFFIX. The first separator that yields a known vendor wins ("/" before
// ":"), matching the proxy verbatim.
// 2. Otherwise the id is BARE. Bare ids are VESTIGIAL at the proxy: zero live
// platform traffic is bare (every platform model id is namespaced), so the
// converged path does NOT resolve them — it returns an error and the proxy
// falls back to its documented, retained legacy switch (inferLLMProviderLegacy).
// This is INTENTIONAL: P0 tightened bare `kimi-*` to the kimi-coding
// gateway in the registry, which is NOT a valid proxy upstream, so routing
// bare ids through the shared registry matcher would misroute. Namespace-
// only resolution sidesteps that without a moonshot special-case or a new
// bare→vendor data block.
//
// Callers that need the legacy bare behavior keep the legacy switch as a
// documented vestigial fallback (see internal/handlers/llm_proxy.go).
func (m *Manifest) ResolveUpstream(model string) (Upstream, error) {
// NOTE: every production caller passes a pre-trimmed model
// (resolveLLMProviderTargetForProtocol trims + rejects empty before calling
// inferLLMProvider), so there is no TrimSpace here: the earlier TrimSpace
// call was unreachable in prod and was dropped in the convergence as a
// review nit.
if model == "" {
return Upstream{}, fmt.Errorf("providers: model is required")
}
for _, sep := range []string{"/", ":"} {
before, after, found := strings.Cut(model, sep)
if !found {
continue
}
for _, p := range m.Providers {
if v := p.UpstreamVendor; v != "" && v == before {
return Upstream{Vendor: v, Model: after, Provider: p}, nil
}
}
}
return Upstream{}, fmt.Errorf(
"providers: %q is not an upstream-namespaced model id (vendor/model); bare ids are vestigial at the proxy and resolve via the legacy fallback", model)
}
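The namespace split that ResolveUpstream performs can be illustrated on its own. This is a minimal sketch under stated assumptions: `splitNamespace` and `knownVendors` are hypothetical helpers, and the vendor set is hard-coded here, whereas the real function reads UpstreamVendor from the manifest's provider entries.

```go
package main

import (
	"fmt"
	"strings"
)

// knownVendors is a hypothetical stand-in for the UpstreamVendor values
// carried by catalog entries.
func knownVendors() map[string]bool {
	return map[string]bool{"openai": true, "moonshot": true, "anthropic": true, "minimax": true}
}

// splitNamespace tries "/" before ":" and accepts the first cut whose prefix
// is a known vendor. Anything else is treated as bare and left unresolved.
func splitNamespace(model string, vendors map[string]bool) (vendor, rest string, ok bool) {
	for _, sep := range []string{"/", ":"} {
		before, after, found := strings.Cut(model, sep)
		if found && vendors[before] {
			return before, after, true
		}
	}
	return "", "", false // bare id or unknown namespace: the caller must error
}

func main() {
	ids := []string{"moonshot/foo:bar", "moonshot:kimi-k2.6", "kimi-coding/kimi-k2", "kimi-k2.6"}
	for _, id := range ids {
		v, rest, ok := splitNamespace(id, knownVendors())
		fmt.Printf("%q -> vendor=%q model=%q ok=%v\n", id, v, rest, ok)
	}
}
```

Because `strings.Cut` splits on the first occurrence only, an id containing both separators keeps the remainder intact as the upstream model id.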
// providerNames returns the sorted names of a provider slice for stable,
// deterministic error messages (test assertions + operator readability).
func providerNames(ps []Provider) []string {
out := make([]string, 0, len(ps))
for _, p := range ps {
out = append(out, p.Name)
}
sort.Strings(out)
return out
}
@@ -0,0 +1,520 @@
package providers
import (
"strings"
"testing"
)
// TestDeriveProvider_RealManifest exercises DeriveProvider against the
// embedded baseline manifest — the cases the brief (internal#718 P0)
// enumerates. DeriveProvider resolves the SINGLE owning provider for a
// (runtime, model) pair using the runtime's NATIVE set, restricted by:
// 1. exact model-id match (the runtime native ref's Models list is the
// authoritative disambiguator — CTO 2026-05-27 "disambiguate by exact
// model id"), then
// 2. model_prefix_match among native providers, then
// 3. auth-env disambiguation when >1 native provider still matches.
//
// It ERRORS on overlap (>=2 unresolved) and on none — never silently picks.
func TestDeriveProvider_RealManifest(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
cases := []struct {
name string
runtime string
model string
authEnv []string
expect string // provider name DeriveProvider must return
}{
// --- kimi serving split (the central P0 data fix) ---------------
// Platform/proxy path: the moonshot-namespaced id routes to the
// `platform` provider (proxy -> moonshot upstream) for claude-code.
// This is the "kimi-k2.6 -> moonshot (proxy)" CTO decision expressed
// via the platform namespace.
{"claude-code platform moonshot/kimi-k2.6", "claude-code", "moonshot/kimi-k2.6", []string{"ANTHROPIC_API_KEY"}, "platform"},
// BYOK gateway path: bare kimi ids route to the kimi-coding gateway
// (api.kimi.com/coding) for claude-code — "kimi-for-coding ->
// kimi-coding" CTO decision.
{"claude-code byok kimi-for-coding", "claude-code", "kimi-for-coding", []string{"KIMI_API_KEY"}, "kimi-coding"},
{"claude-code byok kimi-k2.5", "claude-code", "kimi-k2.5", []string{"KIMI_API_KEY"}, "kimi-coding"},
{"claude-code byok kimi-k2", "claude-code", "kimi-k2", []string{"KIMI_API_KEY"}, "kimi-coding"},
// --- platform-model -> platform (closed set) --------------------
{"claude-code platform anthropic ns", "claude-code", "anthropic/claude-opus-4-7", []string{"ANTHROPIC_API_KEY"}, "platform"},
{"codex platform openai ns", "codex", "openai/gpt-5.4", []string{"MOLECULE_LLM_USAGE_TOKEN"}, "platform"},
{"hermes platform moonshot ns", "hermes", "moonshot/kimi-k2.6", []string{"ANTHROPIC_API_KEY"}, "platform"},
// --- anthropic alias + authEnv disambiguation (oauth vs api) -----
// Bare aliases are OAuth-only when the OAuth token is the available
// auth env (matches canvas env-gating). Versioned ids are the API
// provider.
{"claude-code oauth opus", "claude-code", "opus", []string{"CLAUDE_CODE_OAUTH_TOKEN"}, "anthropic-oauth"},
{"claude-code oauth sonnet", "claude-code", "sonnet", []string{"CLAUDE_CODE_OAUTH_TOKEN"}, "anthropic-oauth"},
{"claude-code oauth haiku", "claude-code", "haiku", []string{"CLAUDE_CODE_OAUTH_TOKEN"}, "anthropic-oauth"},
{"claude-code api opus versioned", "claude-code", "claude-opus-4-7", []string{"ANTHROPIC_API_KEY"}, "anthropic-api"},
{"claude-code api sonnet versioned", "claude-code", "claude-sonnet-4-6", []string{"ANTHROPIC_API_KEY"}, "anthropic-api"},
// --- other runtimes' native sets --------------------------------
{"codex byok gpt-5.5", "codex", "gpt-5.5", []string{"OPENAI_API_KEY"}, "openai"},
{"claude-code minimax", "claude-code", "MiniMax-M2.7", []string{"MINIMAX_API_KEY"}, "minimax"},
{"openclaw byok colon", "openclaw", "moonshot:kimi-k2.6", []string{"KIMI_API_KEY"}, "kimi-coding"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got, err := m.DeriveProvider(tc.runtime, tc.model, tc.authEnv)
if err != nil {
t.Fatalf("DeriveProvider(%q, %q, %v) error = %v", tc.runtime, tc.model, tc.authEnv, err)
}
if got.Name != tc.expect {
t.Errorf("DeriveProvider(%q, %q, %v) = %q, want %q", tc.runtime, tc.model, tc.authEnv, got.Name, tc.expect)
}
})
}
}
// TestDeriveProvider_UnregisteredErrors: a model no native provider owns
// for the runtime must ERROR (never silently default). This is the
// "only-registered-selectable" invariant — fail-closed.
func TestDeriveProvider_UnregisteredErrors(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
cases := []struct {
runtime string
model string
}{
// gpt-* is OpenAI — not in claude-code's native set.
{"claude-code", "gpt-5.5"},
// deepseek is a catalog provider but in NO runtime's native set.
{"claude-code", "deepseek-v4-pro"},
// codex is OpenAI-only — a kimi id is unregistered for it.
{"codex", "kimi-for-coding"},
// a slug no provider in the manifest matches at all.
{"claude-code", "totally-made-up-model-xyz"},
}
for _, tc := range cases {
p, err := m.DeriveProvider(tc.runtime, tc.model, nil)
if err == nil {
t.Errorf("DeriveProvider(%q, %q) expected unregistered error, got provider %q", tc.runtime, tc.model, p.Name)
}
if p.Name != "" {
t.Errorf("DeriveProvider(%q, %q) on error must return a zero Provider, got %q", tc.runtime, tc.model, p.Name)
}
}
}
// TestDeriveProvider_UnknownRuntimeErrors: fail-closed on an unknown
// runtime (never falls through to "all providers").
func TestDeriveProvider_UnknownRuntimeErrors(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
p, err := m.DeriveProvider("does-not-exist", "claude-opus-4-7", nil)
if err == nil {
t.Fatalf("DeriveProvider(unknown runtime) expected error, got provider %q", p.Name)
}
if !strings.Contains(strings.ToLower(err.Error()), "runtime") {
t.Errorf("DeriveProvider(unknown runtime) error = %q, want it to name the runtime problem", err.Error())
}
}
// TestDeriveProvider_PlatformIsClosed proves a third-party-style provider
// can never be derived as `platform`. `platform` is a CLOSED core-only set:
// only models a native runtime's `platform` ref lists (vendor-namespaced)
// derive to platform. A BYOK id, even one a runtime natively supports,
// derives to its BYOK provider, never to platform.
func TestDeriveProvider_PlatformIsClosed(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
// kimi-for-coding is a BYOK id natively supported by claude-code; it
// must derive to kimi-coding (BYOK), NOT platform — even though
// `platform` is in claude-code's native set.
got, err := m.DeriveProvider("claude-code", "kimi-for-coding", []string{"KIMI_API_KEY"})
if err != nil {
t.Fatalf("DeriveProvider(claude-code, kimi-for-coding) error = %v", err)
}
if got.Name == "platform" {
t.Fatal("BYOK kimi-for-coding must not derive to the closed platform provider")
}
if got.Name != "kimi-coding" {
t.Errorf("DeriveProvider(claude-code, kimi-for-coding) = %q, want kimi-coding", got.Name)
}
}
// craftedOverlapManifest is a tiny well-formed manifest with a DELIBERATE prefix
// overlap between two native providers, used to exercise DeriveProvider's
// overlap-error path and the auth-env disambiguation path without depending
// on the real manifest staying overlap-free (it is, by the load guard).
const craftedOverlapManifest = `
schema_version: 1
providers:
- name: prov-a
display_name: "Provider A"
protocol: openai
auth_mode: anthropic_api
auth_env: [A_API_KEY]
model_prefix_match: "^shared-"
- name: prov-b
display_name: "Provider B"
protocol: openai
auth_mode: anthropic_api
auth_env: [B_API_KEY]
model_prefix_match: "^shared-"
runtimes:
testrt:
providers:
- name: prov-a
models: [a-only-model]
- name: prov-b
models: [b-only-model]
`
// TestDeriveProvider_OverlapErrors proves DeriveProvider ERRORS when >=2
// native providers match the same slug and auth-env cannot disambiguate —
// it never silently picks one. This is the derivation-time counterpart of the
// load-time overlap guard.
func TestDeriveProvider_OverlapErrors(t *testing.T) {
m, err := parseManifest([]byte(craftedOverlapManifest))
if err != nil {
t.Fatalf("parseManifest(crafted) error = %v", err)
}
// "shared-x" matches BOTH prov-a and prov-b via prefix; no exact-id
// resolves it; no auth env is supplied -> unresolved overlap -> error.
p, err := m.DeriveProvider("testrt", "shared-x", nil)
if err == nil {
t.Fatalf("DeriveProvider expected overlap error, got provider %q", p.Name)
}
if !strings.Contains(strings.ToLower(err.Error()), "overlap") &&
!strings.Contains(strings.ToLower(err.Error()), "ambiguous") {
t.Errorf("overlap error = %q, want it to name overlap/ambiguity", err.Error())
}
if p.Name != "" {
t.Errorf("on overlap error DeriveProvider must return zero Provider, got %q", p.Name)
}
}
// TestDeriveProvider_AuthEnvDisambiguates proves auth-env breaks an
// otherwise-ambiguous prefix overlap: when two native providers match the
// same slug but exactly one's auth_env intersects the available env set,
// DeriveProvider resolves to that one.
func TestDeriveProvider_AuthEnvDisambiguates(t *testing.T) {
m, err := parseManifest([]byte(craftedOverlapManifest))
if err != nil {
t.Fatalf("parseManifest(crafted) error = %v", err)
}
// Only B_API_KEY is available -> the shared prefix resolves to prov-b.
got, err := m.DeriveProvider("testrt", "shared-x", []string{"B_API_KEY"})
if err != nil {
t.Fatalf("DeriveProvider(authEnv=B_API_KEY) error = %v", err)
}
if got.Name != "prov-b" {
t.Errorf("DeriveProvider(authEnv=B_API_KEY) = %q, want prov-b", got.Name)
}
// Only A_API_KEY -> prov-a.
got, err = m.DeriveProvider("testrt", "shared-x", []string{"A_API_KEY"})
if err != nil {
t.Fatalf("DeriveProvider(authEnv=A_API_KEY) error = %v", err)
}
if got.Name != "prov-a" {
t.Errorf("DeriveProvider(authEnv=A_API_KEY) = %q, want prov-a", got.Name)
}
// Both keys available -> still ambiguous -> error (auth env doesn't
// narrow to one).
p, err := m.DeriveProvider("testrt", "shared-x", []string{"A_API_KEY", "B_API_KEY"})
if err == nil {
t.Errorf("DeriveProvider(both keys) expected overlap error, got %q", p.Name)
}
}
// TestDeriveProvider_KimiPrefixFallback proves the kimi serving split holds
// on the PREFIX-FALLBACK path too — not only for exact-listed ids. A bare
// kimi id that is NOT in any runtime's exact Models list (e.g. a new
// kimi-latest the gateway serves but the template hasn't enumerated) must
// still resolve to the kimi-coding gateway for claude-code, NOT error
// "unregistered". This catches the false-overlap data bug: before the YAML
// tightening, kimi-coding's regex was too narrow (coding-suffixed ids only)
// and moonshot's was too broad (claimed bare kimi-k2*), so a bare kimi id
// resolved to NEITHER native provider for claude-code.
func TestDeriveProvider_KimiPrefixFallback(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
for _, model := range []string{"kimi-latest", "kimi-thinking-preview"} {
got, err := m.DeriveProvider("claude-code", model, []string{"KIMI_API_KEY"})
if err != nil {
t.Errorf("DeriveProvider(claude-code, %q) prefix-fallback error = %v; want kimi-coding", model, err)
continue
}
if got.Name != "kimi-coding" {
t.Errorf("DeriveProvider(claude-code, %q) = %q, want kimi-coding (gateway serves any kimi id)", model, got.Name)
}
}
}
// TestDeriveProvider_ExactIdBeatsPrefix proves the exact model-id match in
// the runtime native set is authoritative over a prefix match — the CTO
// "disambiguate by exact model id" rule. A model id listed under provider P
// for runtime R derives to P even if another native provider's prefix would
// also match it.
func TestDeriveProvider_ExactIdBeatsPrefix(t *testing.T) {
const yaml = `
schema_version: 1
providers:
- name: gateway
display_name: "Gateway"
protocol: anthropic
auth_mode: third_party_anthropic_compat
auth_env: [GW_KEY]
model_prefix_match: "^never-matches-anything$"
- name: broad
display_name: "Broad"
protocol: openai
auth_mode: anthropic_api
auth_env: [BROAD_KEY]
model_prefix_match: "^kimi-"
runtimes:
rt:
providers:
- name: gateway
models: [kimi-k2.5]
- name: broad
models: [kimi-other]
`
m, err := parseManifest([]byte(yaml))
if err != nil {
t.Fatalf("parseManifest error = %v", err)
}
// kimi-k2.5 is EXACT-listed under `gateway` for rt, but `broad`'s
// ^kimi- prefix also matches it. Exact id wins -> gateway.
got, err := m.DeriveProvider("rt", "kimi-k2.5", nil)
if err != nil {
t.Fatalf("DeriveProvider error = %v", err)
}
if got.Name != "gateway" {
t.Errorf("exact-id should beat prefix: got %q, want gateway", got.Name)
}
}
// TestResolveUpstream_RealManifest exercises the SINGLE runtime-AGNOSTIC
// proxy-upstream resolution (internal#718 P1, CONVERGED) against the embedded
// baseline. ResolveUpstream is the ONE resolution over the EXISTING vendor
// provider entries (no proxy_routing block): it maps a model id's NAMESPACE
// token to the entry whose upstream_vendor equals it, answering "which UPSTREAM
// vendor owns this wire model id" in the proxy's 4-name vocabulary {openai,
// moonshot, anthropic, minimax}, with NO runtime context. The byte-identical
// equivalence guard lives in the handlers package (against the live
// inferLLMProvider oracle); this test pins the resolution's own semantics:
// namespace split, separator order, suffix-stripping, and the
// bare-id-is-vestigial (errors) contract.
func TestResolveUpstream_RealManifest(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
cases := []struct {
name string
model string
wantVendor string
wantResolved string
wantProvider string // catalog entry the namespace resolved to
wantErr bool
}{
// --- namespace split — the LIVE traffic shape (vendor/model + vendor:model)
// jrs SEO's LIVE platform model + sibling — MUST stay on moonshot.
{"platform moonshot slash", "moonshot/kimi-k2.6", "moonshot", "kimi-k2.6", "moonshot", false},
{"platform moonshot colon (openclaw)", "moonshot:kimi-k2.6", "moonshot", "kimi-k2.6", "moonshot", false},
// anthropic namespace resolves to the anthropic-api ENTRY (name != vendor).
{"platform anthropic ns", "anthropic/claude-opus-4-7", "anthropic", "claude-opus-4-7", "anthropic-api", false},
{"platform openai ns", "openai/gpt-5.4", "openai", "gpt-5.4", "openai", false},
{"platform minimax ns", "minimax/MiniMax-M2.7", "minimax", "MiniMax-M2.7", "minimax", false},
{"openai ns gpt-4o", "openai/gpt-4o", "openai", "gpt-4o", "openai", false},
// --- bare ids are VESTIGIAL at the proxy: ResolveUpstream errors (the
// proxy falls back to its legacy switch for these). No live bare traffic.
{"bare kimi -> err (vestigial, legacy fallback)", "kimi-k2.6", "", "", "", true},
{"bare claude -> err (vestigial)", "claude-3-5-sonnet", "", "", "", true},
{"bare minimax -> err (vestigial)", "minimax-m1", "", "", "", true},
{"bare gpt -> err (vestigial)", "gpt-5.5", "", "", "", true},
{"alias sonnet -> err (vestigial)", "sonnet", "", "", "", true},
{"unknown bare id -> err (vestigial)", "totally-made-up-xyz", "", "", "", true},
// non-allowlisted namespace token ("kimi-coding" is no entry's
// upstream_vendor) does NOT resolve; the whole id is then bare -> err.
// (The proxy's legacy fallback routes "kimi-coding/kimi-k2" to moonshot,
// preserving the prior behavior — proven by the handlers equivalence test.)
{"kimi-coding/ ns not a vendor -> err (legacy fallback)", "kimi-coding/kimi-k2", "", "", "", true},
// --- empty -------------------------------------------------------
{"empty -> err", "", "", "", "", true},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
up, err := m.ResolveUpstream(tc.model)
if tc.wantErr {
if err == nil {
t.Fatalf("ResolveUpstream(%q) = %+v, want error", tc.model, up)
}
if up.Vendor != "" || up.Model != "" || up.Provider.Name != "" {
t.Errorf("ResolveUpstream(%q) on error must return zero Upstream, got %+v", tc.model, up)
}
return
}
if err != nil {
t.Fatalf("ResolveUpstream(%q) error = %v", tc.model, err)
}
if up.Vendor != tc.wantVendor {
t.Errorf("ResolveUpstream(%q) vendor = %q, want %q", tc.model, up.Vendor, tc.wantVendor)
}
if up.Model != tc.wantResolved {
t.Errorf("ResolveUpstream(%q) model = %q, want %q", tc.model, up.Model, tc.wantResolved)
}
if up.Provider.Name != tc.wantProvider {
t.Errorf("ResolveUpstream(%q) provider = %q, want %q", tc.model, up.Provider.Name, tc.wantProvider)
}
})
}
}
// TestResolveUpstream_SeparatorOrder pins the proxy's "/" then ":" separator
// order: an id containing BOTH must split on "/" first (the proxy's loop
// order), so the "/"-prefix vendor wins.
func TestResolveUpstream_SeparatorOrder(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
// "moonshot/foo:bar" cuts on "/" first -> before="moonshot", after="foo:bar".
up, err := m.ResolveUpstream("moonshot/foo:bar")
if err != nil || up.Vendor != "moonshot" || up.Model != "foo:bar" {
t.Fatalf("separator order: got (%+v, err=%v), want vendor=moonshot model=foo:bar", up, err)
}
}
// TestResolveUpstream_ResolvesToProviderEntry proves the SINGLE-SOURCE
// invariant of the convergence: ResolveUpstream returns the EXISTING vendor
// provider entry, and that entry carries the upstream base URLs + auth — there
// is no parallel routing data block. The proxy dials the entry's base_url_*;
// the test pins them so a future entry edit that breaks the live upstream is
// caught here, not in production.
func TestResolveUpstream_ResolvesToProviderEntry(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
cases := []struct {
model string
wantProvider string
wantBaseURL string // base_url_template on the resolved entry
wantBaseURLAnthro string // base_url_anthropic on the resolved entry
wantAuthEnvContain string // an auth_env name the entry must carry
}{
{"moonshot/kimi-k2.6", "moonshot", "https://api.moonshot.ai/v1", "https://api.moonshot.ai/anthropic/v1", "MOONSHOT_API_KEY"},
{"anthropic/claude-opus-4-7", "anthropic-api", "https://api.anthropic.com/v1", "https://api.anthropic.com/v1", "ANTHROPIC_API_KEY"},
{"minimax/MiniMax-M2.7", "minimax", "https://api.minimax.io/v1", "https://api.minimax.io/anthropic/v1", "MINIMAX_API_KEY"},
{"openai/gpt-5.4", "openai", "https://api.openai.com/v1", "", "OPENAI_API_KEY"},
}
for _, tc := range cases {
up, err := m.ResolveUpstream(tc.model)
if err != nil {
t.Fatalf("ResolveUpstream(%q) error = %v", tc.model, err)
}
if up.Provider.Name != tc.wantProvider {
t.Errorf("%q: provider = %q, want %q", tc.model, up.Provider.Name, tc.wantProvider)
}
if up.Provider.BaseURLTemplate != tc.wantBaseURL {
t.Errorf("%q: base_url_template = %q, want %q", tc.model, up.Provider.BaseURLTemplate, tc.wantBaseURL)
}
if up.Provider.BaseURLAnthropic != tc.wantBaseURLAnthro {
t.Errorf("%q: base_url_anthropic = %q, want %q", tc.model, up.Provider.BaseURLAnthropic, tc.wantBaseURLAnthro)
}
found := false
for _, e := range up.Provider.AuthEnv {
if e == tc.wantAuthEnvContain {
found = true
break
}
}
if !found {
t.Errorf("%q: auth_env %v missing %q", tc.model, up.Provider.AuthEnv, tc.wantAuthEnvContain)
}
}
}
// TestParseManifest_RejectsDuplicateUpstreamVendor proves the convergence's
// load-time invariant: two entries cannot claim the same upstream_vendor (the
// namespace token must resolve to exactly one entry). Replaces the prior
// closed-catch-all / vendorless-proxy_routing guards.
func TestParseManifest_RejectsDuplicateUpstreamVendor(t *testing.T) {
const dupVendor = `
schema_version: 1
providers:
- name: prov-a
display_name: "Provider A"
protocol: openai
auth_mode: anthropic_api
auth_env: [A_API_KEY]
model_prefix_match: "^a-"
upstream_vendor: shared-vendor
- name: prov-b
display_name: "Provider B"
protocol: openai
auth_mode: anthropic_api
auth_env: [B_API_KEY]
model_prefix_match: "^b-"
upstream_vendor: shared-vendor
runtimes:
testrt:
providers:
- name: prov-a
models: [a-only]
`
_, err := parseManifest([]byte(dupVendor))
if err == nil {
t.Fatal("manifest with two entries claiming the same upstream_vendor must fail to load")
}
if !strings.Contains(strings.ToLower(err.Error()), "upstream_vendor") &&
!strings.Contains(strings.ToLower(err.Error()), "unique") {
t.Errorf("duplicate-vendor error = %q, want it to name the upstream_vendor uniqueness problem", err.Error())
}
}
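The load-time invariant this test pins (one entry per upstream_vendor) could be enforced with a single pass over the entries. The sketch below is illustrative only: how parseManifest actually performs the check is not shown in this diff, and `entry` and `checkUniqueVendors` are hypothetical names.

```go
package main

import "fmt"

// entry is a hypothetical, minimal view of a provider catalog entry.
type entry struct {
	name           string
	upstreamVendor string // empty when the proxy does not route to this entry
}

// checkUniqueVendors rejects a catalog in which two entries claim the same
// non-empty upstream vendor, so a namespace token maps to exactly one entry.
func checkUniqueVendors(entries []entry) error {
	seen := make(map[string]string, len(entries)) // vendor -> first claiming entry
	for _, e := range entries {
		if e.upstreamVendor == "" {
			continue
		}
		if prev, dup := seen[e.upstreamVendor]; dup {
			return fmt.Errorf("upstream_vendor %q must be unique: claimed by %q and %q",
				e.upstreamVendor, prev, e.name)
		}
		seen[e.upstreamVendor] = e.name
	}
	return nil
}

func main() {
	dup := []entry{{"prov-a", "shared-vendor"}, {"prov-b", "shared-vendor"}}
	fmt.Println(checkUniqueVendors(dup) != nil) // prints "true"
}
```

Entries with an empty vendor are skipped, so any number of non-routing providers can coexist without tripping the check.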
// TestResolveUpstream_OnlyRoutingEntriesCarryVendor documents the data shape:
// in the real manifest, EXACTLY the four upstream entries carry upstream_vendor,
// they are {anthropic, openai, moonshot, minimax}, and each is unique. This is
// the converged single-source-of-truth assertion (was TestProxyRoutingClosedCatchAll).
func TestResolveUpstream_OnlyRoutingEntriesCarryVendor(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
got := map[string]string{} // vendor -> entry name
for _, p := range m.Providers {
if p.UpstreamVendor == "" {
continue
}
if prev, dup := got[p.UpstreamVendor]; dup {
t.Fatalf("upstream_vendor %q claimed by both %q and %q", p.UpstreamVendor, prev, p.Name)
}
got[p.UpstreamVendor] = p.Name
}
want := map[string]string{
"anthropic": "anthropic-api",
"openai": "openai",
"moonshot": "moonshot",
"minimax": "minimax",
}
if len(got) != len(want) {
t.Fatalf("upstream_vendor entries = %v, want exactly %v", got, want)
}
for v, name := range want {
if got[v] != name {
t.Errorf("upstream_vendor %q -> entry %q, want %q", v, got[v], name)
}
}
}
@@ -0,0 +1,96 @@
// Code generated by cmd/gen-providers; DO NOT EDIT.
//
// Source of truth: internal/providers/providers.yaml (schema_version 1).
// Regenerate with: go generate ./... (or: go run ./cmd/gen-providers)
// The verify-providers-gen CI workflow fails RED if this file drifts from
// providers.yaml or is hand-edited. internal#718 P0 — checked-in + drift-
// gated ONLY; no production path imports this package yet (that is P1+).
package gen
// SchemaVersion is the providers.yaml schema this artifact was generated
// against. It is the semver'd contract version (the MAJOR component for the
// public extension contract; see internal/providers/README.md).
const SchemaVersion = 1
// Fingerprint is a stable content hash of the generated projection (schema
// version + provider catalog + runtime native sets). It changes iff the
// registry DATA changes (comment-only YAML edits do not churn it).
const Fingerprint = "cbd39dfe934302e0"
// GenProvider is the generated projection of one provider catalog entry —
// the subset a downstream consumer needs to derive + display a provider.
type GenProvider struct {
Name string
DisplayName string
Protocol string
AuthMode string
AuthEnv []string
ModelPrefixMatch string
// IsPlatform marks the closed, core-only platform-managed provider.
IsPlatform bool
// UpstreamVendor is the proxy's upstream-vendor key for this entry
// (internal#718 P1, CONVERGED); empty for providers the proxy does not
// route to an upstream vendor. ResolveUpstream maps a model id's namespace
// token to the entry whose UpstreamVendor equals it.
UpstreamVendor string
}
// GenRuntimeRef is one native provider a runtime supports + its exact models.
type GenRuntimeRef struct {
Name string
Models []string
}
// Providers is the full provider catalog, in providers.yaml declaration order.
var Providers = []GenProvider{
{Name: "anthropic-api", DisplayName: "Anthropic API", Protocol: "anthropic", AuthMode: "anthropic_api", AuthEnv: []string{"ANTHROPIC_API_KEY", "ANTHROPIC_AUTH_TOKEN"}, ModelPrefixMatch: "^claude", IsPlatform: false, UpstreamVendor: "anthropic"},
{Name: "anthropic-oauth", DisplayName: "Claude Code subscription", Protocol: "anthropic", AuthMode: "oauth", AuthEnv: []string{"CLAUDE_CODE_OAUTH_TOKEN"}, ModelPrefixMatch: "^(sonnet|opus|haiku)$", IsPlatform: false},
{Name: "openai", DisplayName: "OpenAI", Protocol: "openai", AuthMode: "anthropic_api", AuthEnv: []string{"OPENAI_API_KEY"}, ModelPrefixMatch: "^gpt-", IsPlatform: false, UpstreamVendor: "openai"},
{Name: "moonshot", DisplayName: "Moonshot (Kimi)", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"MOONSHOT_API_KEY", "KIMI_API_KEY"}, ModelPrefixMatch: "^moonshot[:/-]", IsPlatform: false, UpstreamVendor: "moonshot"},
{Name: "minimax", DisplayName: "MiniMax", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"MINIMAX_API_KEY", "ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_API_KEY"}, ModelPrefixMatch: "(?i)^minimax-m", IsPlatform: false, UpstreamVendor: "minimax"},
{Name: "platform", DisplayName: "Platform", Protocol: "anthropic", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"ANTHROPIC_API_KEY", "MOLECULE_LLM_USAGE_TOKEN"}, ModelPrefixMatch: "^platform/", IsPlatform: true},
{Name: "xiaomi-mimo", DisplayName: "Xiaomi MiMo", Protocol: "anthropic", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_API_KEY"}, ModelPrefixMatch: "^mimo-", IsPlatform: false},
{Name: "zai", DisplayName: "Z.ai (GLM)", Protocol: "anthropic", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"GLM_API_KEY", "ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_API_KEY"}, ModelPrefixMatch: "(?i)^glm-", IsPlatform: false},
{Name: "kimi-coding", DisplayName: "Moonshot Kimi (coding-tuned)", Protocol: "anthropic", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"KIMI_API_KEY", "ANTHROPIC_API_KEY", "ANTHROPIC_AUTH_TOKEN"}, ModelPrefixMatch: "^kimi-", IsPlatform: false},
{Name: "deepseek", DisplayName: "DeepSeek", Protocol: "anthropic", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"DEEPSEEK_API_KEY", "ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_API_KEY"}, ModelPrefixMatch: "^deepseek-", IsPlatform: false},
{Name: "google", DisplayName: "Google Gemini", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"GEMINI_API_KEY", "GOOGLE_API_KEY"}, ModelPrefixMatch: "^gemini-", IsPlatform: false},
{Name: "alibaba", DisplayName: "Alibaba Qwen (DashScope)", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"DASHSCOPE_API_KEY", "ALIBABA_API_KEY"}, ModelPrefixMatch: "^qwen-", IsPlatform: false},
{Name: "nousresearch", DisplayName: "Nous Research (Hermes)", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"NOUSRESEARCH_API_KEY"}, ModelPrefixMatch: "^nousresearch/", IsPlatform: false},
{Name: "openrouter", DisplayName: "OpenRouter (any model)", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"OPENROUTER_API_KEY"}, ModelPrefixMatch: "^openrouter/", IsPlatform: false},
{Name: "huggingface", DisplayName: "Hugging Face Inference", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"HUGGINGFACE_API_KEY", "HF_TOKEN"}, ModelPrefixMatch: "^huggingface/", IsPlatform: false},
{Name: "ai-gateway", DisplayName: "Vercel AI Gateway", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"AI_GATEWAY_API_KEY"}, ModelPrefixMatch: "^ai-gateway/", IsPlatform: false},
{Name: "opencode-zen", DisplayName: "OpenCode Zen", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"OPENCODE_ZEN_API_KEY"}, ModelPrefixMatch: "^opencode-zen/", IsPlatform: false},
{Name: "opencode-go", DisplayName: "OpenCode Go", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"OPENCODE_GO_API_KEY"}, ModelPrefixMatch: "^opencode-go/", IsPlatform: false},
{Name: "kilocode", DisplayName: "Kilo Code", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"KILOCODE_API_KEY"}, ModelPrefixMatch: "^kilocode/", IsPlatform: false},
{Name: "minimax-cn", DisplayName: "MiniMax China", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"MINIMAX_API_KEY", "ANTHROPIC_AUTH_TOKEN"}, ModelPrefixMatch: "^minimax-cn/", IsPlatform: false},
{Name: "ollama-cloud", DisplayName: "Ollama Cloud", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"OLLAMA_CLOUD_API_KEY"}, ModelPrefixMatch: "^ollama-cloud/", IsPlatform: false},
{Name: "ollama", DisplayName: "Ollama (self-hosted)", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"OLLAMA_HOST"}, ModelPrefixMatch: "^ollama/", IsPlatform: false},
{Name: "nvidia", DisplayName: "NVIDIA NIM", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"NVIDIA_API_KEY"}, ModelPrefixMatch: "^nvidia/", IsPlatform: false},
{Name: "arcee", DisplayName: "Arcee", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"ARCEE_API_KEY"}, ModelPrefixMatch: "^arcee/", IsPlatform: false},
{Name: "custom", DisplayName: "Custom OpenAI-compat endpoint", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"CUSTOM_API_KEY", "OPENAI_API_KEY"}, ModelPrefixMatch: "^custom/", IsPlatform: false},
}
// Runtimes maps each runtime to its native provider+model set, runtime names
// sorted for a deterministic artifact.
var Runtimes = map[string][]GenRuntimeRef{
"claude-code": {
{Name: "anthropic-oauth", Models: []string{"sonnet", "opus", "haiku", "anthropic:sonnet", "anthropic:opus", "anthropic:haiku"}},
{Name: "anthropic-api", Models: []string{"claude-sonnet-4-6", "claude-opus-4-7", "claude-haiku-4-5", "claude-sonnet-4-5", "anthropic:claude-sonnet-4-6", "anthropic:claude-opus-4-7", "anthropic:claude-haiku-4-5", "anthropic:claude-sonnet-4-5"}},
{Name: "kimi-coding", Models: []string{"kimi-for-coding", "kimi-k2.5", "kimi-k2", "moonshot:kimi-k2.6", "moonshot:kimi-k2.5"}},
{Name: "minimax", Models: []string{"MiniMax-M2", "MiniMax-M2.7", "MiniMax-M2.7-highspeed", "minimax:MiniMax-M2", "minimax:MiniMax-M2.7", "minimax:MiniMax-M2.7-highspeed"}},
{Name: "platform", Models: []string{"anthropic/claude-opus-4-7", "anthropic/claude-sonnet-4-6", "moonshot/kimi-k2.6", "moonshot/kimi-k2.5", "minimax/MiniMax-M2.7", "minimax/MiniMax-M2.7-highspeed"}},
},
"codex": {
{Name: "openai", Models: []string{"gpt-5.5", "gpt-5.4", "gpt-5.4-mini", "gpt-5.3-codex", "gpt-5.3-codex-spark", "gpt-5.2"}},
{Name: "platform", Models: []string{"openai/gpt-5.4", "openai/gpt-5.4-mini"}},
},
"hermes": {
{Name: "kimi-coding", Models: []string{"kimi-coding/kimi-k2"}},
{Name: "platform", Models: []string{"moonshot/kimi-k2.6", "moonshot/kimi-k2.5"}},
},
"openclaw": {
{Name: "kimi-coding", Models: []string{"moonshot:kimi-k2.6", "moonshot:kimi-k2.5"}},
{Name: "platform", Models: []string{"moonshot/kimi-k2.6", "moonshot/kimi-k2.5"}},
},
}
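// Sketch (not the generator's code): one way a content fingerprint like
// Fingerprint above could be computed so that it changes only when registry
// DATA changes. The real hashing scheme of cmd/gen-providers is not shown in
// this diff. SHA-256 truncated to 16 hex characters is an assumption chosen
// only to match the length of the constant, and the row type and fingerprint
// function are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// row is a hypothetical, trimmed-down stand-in for GenProvider.
type row struct {
	Name             string
	ModelPrefixMatch string
}

// fingerprint hashes the schema version plus every row in declaration order
// and keeps the first 8 bytes (16 hex characters). Only parsed data is fed to
// the hash, so comments in the source YAML cannot influence the result.
func fingerprint(schemaVersion int, rows []row) string {
	var b strings.Builder
	fmt.Fprintf(&b, "schema=%d\n", schemaVersion)
	for _, r := range rows {
		// %q quotes each field so separators inside values cannot collide.
		fmt.Fprintf(&b, "%q %q\n", r.Name, r.ModelPrefixMatch)
	}
	sum := sha256.Sum256([]byte(b.String()))
	return hex.EncodeToString(sum[:8])
}

func main() {
	rows := []row{
		{Name: "anthropic-api", ModelPrefixMatch: "^claude"},
		{Name: "openai", ModelPrefixMatch: "^gpt-"},
	}
	fmt.Println(fingerprint(1, rows))
}
```

// Because the input is the parsed projection rather than the raw file bytes,
// reordering rows or editing a pattern changes the value, while a
// comment-only edit to the YAML would not.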
@@ -0,0 +1,85 @@
package gen
import (
"testing"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/providers"
)
// TestGeneratedProjectionMatchesManifest proves the checked-in artifact is a
// FAITHFUL projection of the live manifest — not just byte-stable, but
// semantically correct. The byte-level drift gate (cmd/gen-providers
// TestArtifactInSync) proves "regen produces this file"; this proves "this
// file's DATA equals the loader's data", so a consumer reading the artifact
// (P1+) sees exactly what the loader sees.
func TestGeneratedProjectionMatchesManifest(t *testing.T) {
m, err := providers.LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
if SchemaVersion != providers.SchemaVersion() {
t.Errorf("generated SchemaVersion = %d, manifest = %d", SchemaVersion, providers.SchemaVersion())
}
if len(Providers) != len(m.Providers) {
t.Fatalf("generated %d providers, manifest has %d", len(Providers), len(m.Providers))
}
for i, gp := range Providers {
mp := m.Providers[i]
if gp.Name != mp.Name {
t.Errorf("provider[%d] name: gen=%q manifest=%q", i, gp.Name, mp.Name)
}
if gp.ModelPrefixMatch != mp.ModelPrefixMatch {
t.Errorf("provider %q model_prefix_match: gen=%q manifest=%q", gp.Name, gp.ModelPrefixMatch, mp.ModelPrefixMatch)
}
if gp.AuthMode != mp.AuthMode {
t.Errorf("provider %q auth_mode: gen=%q manifest=%q", gp.Name, gp.AuthMode, mp.AuthMode)
}
if gp.DisplayName != mp.DisplayName {
t.Errorf("provider %q display_name: gen=%q manifest=%q", gp.Name, gp.DisplayName, mp.DisplayName)
}
if gp.Protocol != string(mp.Protocol) {
t.Errorf("provider %q protocol: gen=%q manifest=%q", gp.Name, gp.Protocol, mp.Protocol)
}
if gp.UpstreamVendor != mp.UpstreamVendor {
t.Errorf("provider %q upstream_vendor: gen=%q manifest=%q", gp.Name, gp.UpstreamVendor, mp.UpstreamVendor)
}
if gp.IsPlatform != mp.IsPlatform() {
t.Errorf("provider %q IsPlatform: gen=%v manifest=%v", gp.Name, gp.IsPlatform, mp.IsPlatform())
}
}
if len(Runtimes) != len(m.Runtimes) {
t.Fatalf("generated %d runtimes, manifest has %d", len(Runtimes), len(m.Runtimes))
}
for rt, native := range m.Runtimes {
genRefs, ok := Runtimes[rt]
if !ok {
t.Errorf("runtime %q missing from generated artifact", rt)
continue
}
if len(genRefs) != len(native.Providers) {
t.Errorf("runtime %q: gen has %d refs, manifest has %d", rt, len(genRefs), len(native.Providers))
continue
}
for i, ref := range native.Providers {
if genRefs[i].Name != ref.Name {
t.Errorf("runtime %q ref[%d] name: gen=%q manifest=%q", rt, i, genRefs[i].Name, ref.Name)
}
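if len(genRefs[i].Models) != len(ref.Models) {
t.Errorf("runtime %q ref %q models count: gen=%d manifest=%d", rt, ref.Name, len(genRefs[i].Models), len(ref.Models))
continue
}
// Counts agree, so compare the model ids element-wise: equal lengths
// alone would not catch a renamed or reordered model.
for j, model := range ref.Models {
if genRefs[i].Models[j] != model {
t.Errorf("runtime %q ref %q model[%d]: gen=%q manifest=%q", rt, ref.Name, j, genRefs[i].Models[j], model)
}
}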
}
}
}
// TestExactlyOnePlatformProvider guards the closed-set invariant in the
// generated projection: the platform-managed provider is a single, core-only
// entry. A federation merge that introduced a second IsPlatform=true provider
// (a forged platform) would flip this red.
func TestExactlyOnePlatformProvider(t *testing.T) {
count := 0
for _, p := range Providers {
if p.IsPlatform {
count++
if p.Name != "platform" {
t.Errorf("IsPlatform provider has unexpected name %q (platform is core-only, name must be %q)", p.Name, "platform")
}
}
}
if count != 1 {
t.Errorf("expected exactly 1 platform provider in the generated catalog, got %d", count)
}
}
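// Sketch: the same closed-set invariant expressed as a reusable check that a
// merge step could call before accepting a catalog. The entry type and
// checkSinglePlatform are hypothetical names for illustration; nothing in
// this diff says such a function exists.

```go
package main

import "fmt"

// entry is a hypothetical, trimmed-down stand-in for GenProvider.
type entry struct {
	Name       string
	IsPlatform bool
}

// checkSinglePlatform returns an error unless exactly one entry is marked
// IsPlatform and that entry is named "platform".
func checkSinglePlatform(entries []entry) error {
	count := 0
	for _, e := range entries {
		if !e.IsPlatform {
			continue
		}
		count++
		if e.Name != "platform" {
			return fmt.Errorf("IsPlatform entry has unexpected name %q", e.Name)
		}
	}
	if count != 1 {
		return fmt.Errorf("expected exactly 1 platform entry, got %d", count)
	}
	return nil
}

func main() {
	good := []entry{{Name: "openai"}, {Name: "platform", IsPlatform: true}}
	forged := append([]entry{{Name: "evil", IsPlatform: true}}, good...)
	fmt.Println(checkSinglePlatform(good))
	fmt.Println(checkSinglePlatform(forged))
}
```

// Returning an error rather than calling t.Errorf lets the same rule run at
// load or merge time as well as in tests.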
@@ -0,0 +1,125 @@
package providers
import (
"go/build"
"os"
"path/filepath"
"strings"
"testing"
)
// gen_import_boundary_test.go — arch-lint-equivalent boundary gate
// (internal#718 P2-A, CTO 2026-05-27 "arch-lint so prod doesn't import the raw
// gen package incorrectly").
//
// molecule-controlplane enforces this with go-arch-lint: the
// internal/providers/gen component is absent from every other component's
// mayDependOn list, so a production package importing the raw generated
// projection fails CI. molecule-core has no go-arch-lint regime, so we pin the
// SAME invariant with a behavior-based AST gate (the established core pattern —
// see derive_provider_drift_test.go / class1_ast_gate_test.go).
//
// Invariant: NO production (non-test) Go file in workspace-server may import
// internal/providers/gen. Only test files may import it (for example the
// parity test wiring around internal/providers). The generated
// projection is checked-in + drift-gated DATA; production code derives through
// the loader (internal/providers DeriveProvider / IsPlatform), never the raw
// gen literals. P2-B wires the billing decision onto the loader, not gen.
const genImportPath = "git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/providers/gen"
func TestNoProductionImportOfGenPackage(t *testing.T) {
// Walk up to the workspace-server module root (this test runs with cwd =
// internal/providers).
root := moduleRoot(t)
var offenders []string
walkErr := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}
if info.IsDir() {
base := info.Name()
// Skip vendored / non-source trees.
if base == "vendor" || base == "node_modules" || base == ".git" || base == "testdata" {
return filepath.SkipDir
}
return nil
}
if !strings.HasSuffix(path, ".go") {
return nil
}
// Test files are exempt — the loader's own gen parity test
// (gen/registry_gen_test.go) legitimately imports the loader, and any
// test may cross boundaries to assert on the projection.
if strings.HasSuffix(path, "_test.go") {
return nil
}
// Skip the gen package's own directory. Its files import nothing
// internal, and build.ImportDir reports real imports only (never a path
// that merely appears in a comment), so this is belt-and-braces: the
// generated code can never be flagged for mentioning its own path.
dir := filepath.Dir(path)
if filepath.Base(dir) == "gen" && strings.HasSuffix(filepath.Dir(dir), filepath.Join("internal", "providers")) {
return nil
}
pkg, perr := build.ImportDir(dir, build.ImportComment)
if perr != nil {
// A dir with build-tagged-out files or no buildable package for the
// default tags is not an offender; skip quietly.
return nil //nolint:nilerr // unbuildable dir is not a boundary violation
}
for _, imp := range pkg.Imports {
if imp == genImportPath {
rel, _ := filepath.Rel(root, dir)
offenders = append(offenders, rel)
}
}
return nil
})
if walkErr != nil {
t.Fatalf("walk module tree: %v", walkErr)
}
if len(offenders) > 0 {
t.Errorf("production packages import the raw generated projection %q: %v\n"+
"Production code must derive through the loader (internal/providers "+
"DeriveProvider / IsPlatform), never the raw gen literals. The gen "+
"package is checked-in + drift-gated DATA only (internal#718).",
genImportPath, dedupe(offenders))
}
}
// moduleRoot returns the workspace-server module root by walking up from the
// test's cwd (internal/providers) until it finds go.mod.
func moduleRoot(t *testing.T) string {
t.Helper()
dir, err := os.Getwd()
if err != nil {
t.Fatalf("getwd: %v", err)
}
for {
if _, statErr := os.Stat(filepath.Join(dir, "go.mod")); statErr == nil {
return dir
}
parent := filepath.Dir(dir)
if parent == dir {
t.Fatalf("could not locate go.mod above %s", dir)
}
dir = parent
}
}
func dedupe(in []string) []string {
seen := map[string]struct{}{}
var out []string
for _, s := range in {
if _, ok := seen[s]; ok {
continue
}
seen[s] = struct{}{}
out = append(out, s)
}
return out
}
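// Sketch: the gate above reads imports per directory with go/build. A
// per-file variant using go/parser is shown here as an alternative technique,
// not as what the test does. The forbidden path and the importsForbidden
// helper are hypothetical; the real constant is genImportPath.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// forbidden is a hypothetical import path standing in for genImportPath.
const forbidden = "example.com/app/internal/providers/gen"

// importsForbidden reports whether the Go source in src imports forbidden.
// parser.ImportsOnly stops after the import declarations, and only real
// import specs are inspected, so a comment that merely mentions the path is
// never counted.
func importsForbidden(src string) (bool, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "input.go", src, parser.ImportsOnly)
	if err != nil {
		return false, err
	}
	for _, imp := range f.Imports {
		// Path.Value is the quoted literal as written in the source.
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return false, err
		}
		if path == forbidden {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	bad := "package a\nimport _ \"example.com/app/internal/providers/gen\"\n"
	hit, err := importsForbidden(bad)
	fmt.Println(hit, err)
}
```

// A per-file check can name the exact offending file, at the cost of not
// applying build tags the way go/build does.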
@@ -0,0 +1,364 @@
// Package providers is the molecule-core SIDE of the LLM provider registry
// SSOT (internal#718 P2-A, CTO 2026-05-27 "Distribution = SDK via codegen +
// verify-CI"). It is a load-time mirror of the canonical loader that lives in
// molecule-controlplane internal/providers — same parse, same validation, same
// DeriveProvider/IsPlatform/ResolveUpstream API.
//
// CANONICAL SSOT = molecule-controlplane internal/providers/providers.yaml.
// This package embeds a SYNCED COPY of that file (providers.yaml here is a
// byte-for-byte mirror of the canonical, NOT a second authoring surface). The
// CTO-decided distribution model for a multi-repo registry is
// "codegen-checked-into-each-repo + verify-CI": every consumer repo carries the
// generated projection and a drift gate, so a registry change in CP must be
// re-synced here (the sync-providers-yaml verify gate goes RED if this copy
// drifts from the canonical). molecule-core has no Go module dependency on
// controlplane, so a synced+gated copy is the blessed path (a shared Go module
// is not viable across the two repos today).
//
// P2-A is ADDITIVE, ZERO behavior change (the P0 shape mirrored): the loader +
// DeriveProvider land here, plus the generated artifact (cmd/gen-providers) and
// the verify-providers-gen drift gate, but NO production code path imports this
// package yet. P2-B wires the billing/credential decision onto DeriveProvider.
//
// Distribution model mirrors molecule-controlplane internal/providers: go:embed
// the YAML into the binary so a boot-time Load never touches the network.
package providers
import (
_ "embed"
"fmt"
"regexp"
"gopkg.in/yaml.v3"
)
// schemaVersion is the providers.yaml schema this package knows how to
// parse. It is the MAJOR component of the semver'd extension contract
// (internal#718: the manifest is a first-class versioned public artifact;
// breaking the field set is a governed API break). Bumped only on a breaking
// field-set change; Load fails closed on a mismatch so an older binary cannot
// silently consume a newer manifest (mirrors internal/envs). See
// internal/providers/README.md for the contract + compatibility policy.
const schemaVersion = 1
// SchemaVersion exposes the schema/contract MAJOR version the loader knows
// how to parse. It is the version the codegen artifact (cmd/gen-providers)
// and any future conformance suite pin against. Public so the generator and
// external conformance tooling read the same constant the loader enforces.
func SchemaVersion() int { return schemaVersion }
//go:embed providers.yaml
var embeddedYAML []byte
// Protocol is the wire format the proxy speaks to a provider's upstream.
type Protocol string
const (
// ProtocolOpenAI is the OpenAI chat-completions wire format.
ProtocolOpenAI Protocol = "openai"
// ProtocolAnthropic is the Anthropic messages wire format.
ProtocolAnthropic Protocol = "anthropic"
)
// Provider is one entry in the canonical manifest. It is the superset
// schema from RFC §2 — each consumer reads the subset it needs (the
// proxy reads protocol/base_url/auth_env, the canvas reads
// display_name/vendor_logo/model_prefix_match, the adapter reads
// auth_mode/auth_token_env/base_url). Field names mirror the YAML keys.
type Provider struct {
// Name is the stable key (intended to align with
// llm_price_catalog.provider; see the DRIFT NOTE in providers.yaml).
Name string `yaml:"name"`
// DisplayName is the canvas dropdown label.
DisplayName string `yaml:"display_name"`
// VendorLogo is the canvas asset key.
VendorLogo string `yaml:"vendor_logo"`
// Protocol is the proxy wire format: "openai" or "anthropic".
Protocol Protocol `yaml:"protocol"`
// AuthMode is one of "anthropic_api", "oauth",
// "third_party_anthropic_compat".
AuthMode string `yaml:"auth_mode"`
// BaseURLTemplate is the openai-protocol base URL (empty = SDK/CLI
// default).
BaseURLTemplate string `yaml:"base_url_template"`
// BaseURLAnthropic is the anthropic-protocol base URL where the
// provider exposes one (empty otherwise).
BaseURLAnthropic string `yaml:"base_url_anthropic"`
// AuthEnv is the list of env var NAMES accepted (never secret
// values); any one being set satisfies auth.
AuthEnv []string `yaml:"auth_env"`
// AuthTokenEnv is the env var the adapter projects the vendor key
// into (defaults to ANTHROPIC_AUTH_TOKEN when empty).
AuthTokenEnv string `yaml:"auth_token_env"`
// ModelPrefixMatch is the RE2 regex that unifies the proxy's
// inferLLMProvider prefixes, the canvas BARE_VENDOR_PATTERNS, and
// the adapter model_prefixes.
ModelPrefixMatch string `yaml:"model_prefix_match"`
// ModelAliases are canvas shortcut ids (e.g. sonnet/opus/haiku).
ModelAliases []string `yaml:"model_aliases"`
// Deprecated greys the provider out in the canvas (RFC §8.2)
// without breaking saved workspace configs. Optional; default false.
Deprecated bool `yaml:"deprecated"`
// UpstreamVendor is the proxy's upstream-vendor key for this entry — the
// 4-name vocabulary {openai, moonshot, anthropic, minimax} the proxy's
// resolveLLMProviderTarget switch dispatches on to pick the upstream base
// URL + key (internal#718 P1, CONVERGED). It is set ONLY on the entries the
// proxy routes to an upstream vendor; empty for every other catalog entry.
//
// It is a single PROPERTY of the entry, not a parallel routing block: the
// upstream-vendor IDENTITY of a provider (e.g. "anthropic-api"'s upstream is
// the "anthropic" vendor) is a fact about that one entry. ResolveUpstream
// reads it to map a model id's NAMESPACE token to the backing provider,
// whose base_url_* / auth_env (already on this same entry) are the SINGLE
// source for the upstream target. The token may differ from Name (the entry
// "anthropic-api" has UpstreamVendor "anthropic"); for moonshot/openai/
// minimax the entry name and the upstream vendor coincide.
UpstreamVendor string `yaml:"upstream_vendor"`
// re is the compiled ModelPrefixMatch. Compiled at Load (so a bad
// regex fails the whole manifest, per RFC §8.5) and reused by
// MatchesModel. Nil only for a Provider constructed directly rather
// than produced by Load, in which case MatchesModel compiles on demand.
re *regexp.Regexp
}
// RuntimeProviderRef is one provider a runtime natively supports, plus the
// exact model ids that runtime exposes for it. RFC #340 (CTO correction
// 2026-05-26): the manifest is constrained to each runtime's NATIVE support
// matrix, NOT the 24-provider superset. A provider absent from every
// runtime's native set is over-offer drift the canvas must not surface and
// the proxy must not route (matches cp#334 "use native endpoint, don't
// translate").
type RuntimeProviderRef struct {
// Name references a Provider.Name. Load fails closed if it does not
// resolve, so a typo can never silently drop a model from a runtime.
Name string `yaml:"name"`
// Models is the exact set of model ids this runtime exposes for the
// referenced provider (extracted verbatim from the runtime template's
// config.yaml runtime_config.models block). Empty is a manifest error:
// a native provider with zero models offers nothing.
Models []string `yaml:"models"`
}
// RuntimeNativeSet is the native provider+model matrix for a single runtime.
type RuntimeNativeSet struct {
// Providers is the runtime's native provider set (each with its exact
// model ids). Exactly the set the canvas may offer and the proxy may
// route for this runtime — no more, no fewer.
Providers []RuntimeProviderRef `yaml:"providers"`
}
// Manifest is the parsed providers.yaml: the provider catalog plus the
// per-runtime native constraint layer. Returned by LoadManifest; Load
// remains for callers that only need the flat provider slice.
type Manifest struct {
// Providers is the full provider catalog (protocol, base_url, auth).
Providers []Provider
// Runtimes maps a runtime name (claude-code, hermes, codex, openclaw)
// to its native provider+model set. The SSOT for "which providers and
// models does runtime R natively support".
Runtimes map[string]RuntimeNativeSet
}
type manifest struct {
SchemaVersion int `yaml:"schema_version"`
Providers []Provider `yaml:"providers"`
Runtimes map[string]RuntimeNativeSet `yaml:"runtimes"`
}
// Load parses the embedded providers.yaml and returns the manifest's
// provider slice. It validates the schema version, that every entry has
// the required fields populated, and that every model_prefix_match is a
// compilable RE2 regex. Errors are returned (never panic) so callers
// decide their own fallback (the proxy keeps a legacy switch; see RFC
// §6). Load does not touch the network.
//
// Load is the flat-slice accessor retained for PR-1 callers that only need
// the provider catalog. Callers needing the per-runtime native constraint
// layer use LoadManifest.
func Load() ([]Provider, error) {
m, err := LoadManifest()
if err != nil {
return nil, err
}
return m.Providers, nil
}
// LoadManifest parses the embedded providers.yaml into a Manifest: the
// provider catalog plus the per-runtime native support matrix (RFC #340).
// It performs all of Load's validation AND validates the runtimes block:
// every provider name a runtime references must resolve to a real provider
// entry, and every referenced provider must carry at least one model id.
// Fails closed (never panics, never touches the network) so a typo'd
// provider ref or an empty native set is a load error, not a silent
// over/under-offer.
func LoadManifest() (*Manifest, error) {
return parseManifest(embeddedYAML)
}
// parseManifest is the byte-level seam LoadManifest delegates to. Split out
// so the validation branches (bad schema version, unknown provider ref,
// empty native set, duplicate ref, model-less ref) are unit-testable
// against crafted YAML without mutating the embedded baseline.
func parseManifest(raw []byte) (*Manifest, error) {
var m manifest
if err := yaml.Unmarshal(raw, &m); err != nil {
return nil, fmt.Errorf("providers: parse manifest: %w", err)
}
if m.SchemaVersion != schemaVersion {
return nil, fmt.Errorf("providers: manifest schema_version %d, loader expects %d", m.SchemaVersion, schemaVersion)
}
if len(m.Providers) == 0 {
return nil, fmt.Errorf("providers: manifest has no providers")
}
seen := make(map[string]struct{}, len(m.Providers))
out := make([]Provider, 0, len(m.Providers))
for i := range m.Providers {
p := m.Providers[i]
if err := p.validate(); err != nil {
return nil, fmt.Errorf("providers: entry %d (%q): %w", i, p.Name, err)
}
if _, dup := seen[p.Name]; dup {
return nil, fmt.Errorf("providers: duplicate provider name %q", p.Name)
}
seen[p.Name] = struct{}{}
re, err := regexp.Compile(p.ModelPrefixMatch)
if err != nil {
return nil, fmt.Errorf("providers: entry %q model_prefix_match %q: %w", p.Name, p.ModelPrefixMatch, err)
}
p.re = re
out = append(out, p)
}
// upstream_vendor validation (internal#718 P1, CONVERGED). It is optional
// (set only on the entries the proxy routes to an upstream), but it must be
// UNIQUE across the catalog: ResolveUpstream maps a model id's namespace
// token to the ONE entry whose UpstreamVendor equals it, so two entries
// claiming the same vendor would make the namespace token ambiguous (a
// non-deterministic upstream). Fail closed so a typo can never produce two
// entries owning the same upstream vendor.
vendorOwner := make(map[string]string, len(out))
for i := range out {
v := out[i].UpstreamVendor
if v == "" {
continue
}
if prev, dup := vendorOwner[v]; dup {
return nil, fmt.Errorf("providers: entries %q and %q both declare upstream_vendor %q — it must be unique (the namespace token resolves to exactly one entry)", prev, out[i].Name, v)
}
vendorOwner[v] = out[i].Name
}
if len(m.Runtimes) == 0 {
return nil, fmt.Errorf("providers: manifest declares no runtimes")
}
for rt, native := range m.Runtimes {
if len(native.Providers) == 0 {
return nil, fmt.Errorf("providers: runtime %q has an empty native provider set", rt)
}
refSeen := make(map[string]struct{}, len(native.Providers))
for _, ref := range native.Providers {
if _, ok := seen[ref.Name]; !ok {
return nil, fmt.Errorf("providers: runtime %q references unknown provider %q", rt, ref.Name)
}
if _, dup := refSeen[ref.Name]; dup {
return nil, fmt.Errorf("providers: runtime %q references provider %q twice", rt, ref.Name)
}
refSeen[ref.Name] = struct{}{}
if len(ref.Models) == 0 {
return nil, fmt.Errorf("providers: runtime %q provider %q has no model ids", rt, ref.Name)
}
}
}
return &Manifest{Providers: out, Runtimes: m.Runtimes}, nil
}
// ProvidersForRuntime returns the providers runtime rt natively supports,
// in the manifest's declared order. An unknown runtime returns a non-nil
// error and a nil slice — it never falls through to "all providers", so a
// caller that fat-fingers a runtime name fails loud rather than offering
// the whole catalog.
func (m *Manifest) ProvidersForRuntime(rt string) ([]Provider, error) {
native, ok := m.Runtimes[rt]
if !ok {
return nil, fmt.Errorf("providers: unknown runtime %q", rt)
}
byName := make(map[string]Provider, len(m.Providers))
for _, p := range m.Providers {
byName[p.Name] = p
}
out := make([]Provider, 0, len(native.Providers))
for _, ref := range native.Providers {
// Resolution is guaranteed by LoadManifest's validation, but guard
// anyway so a hand-built Manifest with a dangling ref is skipped
// rather than yielding a zero-value Provider.
if p, ok := byName[ref.Name]; ok {
out = append(out, p)
}
}
return out, nil
}
// ModelsForRuntime returns the exact model ids runtime rt natively exposes,
// flattened across all its native providers, in manifest-declared order.
// An unknown runtime returns a non-nil error and a nil slice (never the
// whole catalog). This is the SSOT the canvas dropdown (PR-4) and the proxy
// router (PR-3) both consume so they can never offer/route a model the
// runtime can't natively run.
func (m *Manifest) ModelsForRuntime(rt string) ([]string, error) {
native, ok := m.Runtimes[rt]
if !ok {
return nil, fmt.Errorf("providers: unknown runtime %q", rt)
}
var out []string
for _, ref := range native.Providers {
out = append(out, ref.Models...)
}
return out, nil
}
// validate checks the required-field invariants for a single entry.
func (p *Provider) validate() error {
if p.Name == "" {
return fmt.Errorf("name is required")
}
switch p.Protocol {
case ProtocolOpenAI, ProtocolAnthropic:
default:
return fmt.Errorf("protocol must be %q or %q, got %q", ProtocolOpenAI, ProtocolAnthropic, p.Protocol)
}
if p.AuthMode == "" {
return fmt.Errorf("auth_mode is required")
}
if len(p.AuthEnv) == 0 {
return fmt.Errorf("auth_env must be non-empty")
}
if p.DisplayName == "" {
return fmt.Errorf("display_name is required")
}
if p.ModelPrefixMatch == "" {
return fmt.Errorf("model_prefix_match is required")
}
return nil
}
// MatchesModel reports whether the given model slug is owned by this
// provider per its ModelPrefixMatch regex. A Provider produced by Load
// uses its precompiled regex. A Provider constructed directly (not via
// Load) has no precompiled regex and compiles the pattern on every call;
// if the pattern is invalid or empty it never matches.
func (p Provider) MatchesModel(slug string) bool {
re := p.re
if re == nil {
if p.ModelPrefixMatch == "" {
return false
}
compiled, err := regexp.Compile(p.ModelPrefixMatch)
if err != nil {
return false
}
re = compiled
}
return re.MatchString(slug)
}
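// Sketch: how the upstream_vendor uniqueness rule in parseManifest makes a
// namespace-token lookup unambiguous. ResolveUpstream itself is not shown in
// this diff, so indexByVendor and resolve below are hypothetical helpers that
// only illustrate the idea, assuming ids of the form vendor/model.

```go
package main

import (
	"fmt"
	"strings"
)

// entry is a hypothetical, trimmed-down stand-in for Provider.
type entry struct {
	Name           string
	UpstreamVendor string
}

// indexByVendor builds a vendor -> entry-name index and rejects duplicates,
// mirroring the fail-closed uniqueness check in parseManifest.
func indexByVendor(entries []entry) (map[string]string, error) {
	owner := make(map[string]string, len(entries))
	for _, e := range entries {
		if e.UpstreamVendor == "" {
			continue // entries the proxy does not route upstream
		}
		if prev, dup := owner[e.UpstreamVendor]; dup {
			return nil, fmt.Errorf("entries %q and %q both declare upstream_vendor %q", prev, e.Name, e.UpstreamVendor)
		}
		owner[e.UpstreamVendor] = e.Name
	}
	return owner, nil
}

// resolve splits a "vendor/model" id at the first slash and looks the vendor
// token up in the index. Bare ids (no slash) are out of scope for this sketch.
func resolve(owner map[string]string, modelID string) (string, bool) {
	vendor, _, found := strings.Cut(modelID, "/")
	if !found {
		return "", false
	}
	name, ok := owner[vendor]
	return name, ok
}

func main() {
	owner, err := indexByVendor([]entry{
		{Name: "anthropic-api", UpstreamVendor: "anthropic"},
		{Name: "anthropic-oauth"},
		{Name: "openai", UpstreamVendor: "openai"},
	})
	if err != nil {
		fmt.Println("load error:", err)
		return
	}
	name, ok := resolve(owner, "anthropic/claude-opus-4-7")
	fmt.Println(name, ok)
}
```

// With uniqueness enforced at load time, the lookup is a plain map read and
// cannot depend on catalog ordering.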
@@ -0,0 +1,732 @@
# Canonical providers manifest — single source of truth (SSOT) baseline.
#
# RFC: molecule-ai/molecule-controlplane#340 "Canonical Providers Manifest".
# This file is PR-1: the git-tracked baseline only. NOTHING imports the
# loader yet — no consumer is wired (proxy switch, canvas dropdown, and
# adapter registry are migrated in later PRs). Reverting PR-1 = delete
# this file + providers.go + providers_test.go. Zero runtime behavior
# change.
#
# It transcribes the UNION of the four places that independently define
# "which LLM providers exist" today, so later PRs can converge them:
# 1. Proxy — internal/handlers/llm_proxy.go
# resolveLLMProviderTargetForProtocol (4-arm switch:
# openai/moonshot/anthropic/minimax) + inferLLMProvider
# (prefix table: minimax / kimi->moonshot / claude->anthropic
# / default->openai).
# 2. Canvas — molecule-core/canvas/src/components/ProviderModelSelector.tsx
# VENDOR_LABELS (28 rows) + BARE_VENDOR_PATTERNS.
# 3. Adapter — molecule-ai-workspace-template-claude-code/config.yaml
# `providers:` block (8 entries) + adapter.py _BUILTIN_PROVIDERS.
# The same block is copy-pasted into the seo-agent template.
# 4. DB — migrations 037_llm_usage_billing + 039_minimax_llm_price_catalog
# seed llm_price_catalog with providers
# openai / anthropic / moonshot / minimax.
#
# Schema (RFC §2 superset; each consumer reads the subset it needs):
# name stable key (intended == llm_price_catalog.provider)
# display_name canvas dropdown label
# vendor_logo canvas asset key
# protocol openai | anthropic (proxy wire format)
# auth_mode anthropic_api | oauth | third_party_anthropic_compat
# base_url_template base URL for the openai-protocol surface (null = CLI/SDK default)
# base_url_anthropic base URL for the anthropic-protocol surface (where applicable)
# auth_env env var names accepted (NAMES ONLY — never secrets); any one satisfies auth
# auth_token_env env var the adapter projects the vendor key INTO (default ANTHROPIC_AUTH_TOKEN)
# model_prefix_match RE2 regex unifying proxy inferLLMProvider prefixes +
# canvas BARE_VENDOR_PATTERNS + adapter model_prefixes
# model_aliases canvas shortcut ids (sonnet/opus/haiku, etc.)
# deprecated optional bool (RFC §8.2; default false)
# upstream_vendor OPTIONAL (internal#718 P1, CONVERGED 2026-05-27). The
# proxy's upstream-vendor key for this entry — the 4-name
# vocabulary {openai, moonshot, anthropic, minimax} the
# proxy's resolveLLMProviderTarget switch dispatches on to
# pick the upstream base URL + key. Present ONLY on the
# entries the proxy routes to an upstream; absent everywhere
# else. It is a single PROPERTY of the entry (like protocol
# or base_url_template), NOT a parallel routing block: the
# upstream-vendor identity of "anthropic-api" is the
# "anthropic" vendor; for moonshot/openai/minimax the entry
# name and the vendor coincide. Manifest.ResolveUpstream is
# the ONE resolution over these entries — it maps a platform
# model id's NAMESPACE token (every live platform id is
# `vendor/model`) to the entry whose upstream_vendor equals
# it, then reads that entry's base_url_* / auth_env (the
# SINGLE source) for the upstream target. Bare ids are
# vestigial at the proxy (no live bare traffic) and resolve
# via the proxy's retained legacy fallback, not here.
# Must be UNIQUE across the catalog (load fails closed
# otherwise — the namespace token must resolve to one entry).
#
# DRIFT NOTE on `name` vs DB `provider`: the RFC suggests name == the
# llm_price_catalog.provider column. The DB actually seeds the row
# `anthropic` (not `anthropic-api`), and has no rows for the OAuth /
# platform / third-party providers. PR-1 keeps the RFC's `anthropic-api`
# key and records the mismatch here; reconciling the join key is a
# later-PR / migration concern, not a PR-1 routing change.
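The `upstream_vendor` resolution described in the comments above can be sketched in Go roughly as follows. This is an illustration only, not the actual `Manifest.ResolveUpstream`: the `entry` type, the `resolveUpstream` function, the `errBareID` sentinel, and the sample data are all hypothetical. The sketch assumes the namespace token is everything before the first `/`, handles only the slash form, and leaves bare ids unresolved so that a caller could hand them to a legacy fallback.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// entry is a hypothetical, trimmed-down view of one providers: entry.
type entry struct {
	Name           string
	UpstreamVendor string // empty when the proxy does not route to this entry
	BaseURL        string
}

// errBareID marks an id with no vendor/ namespace token. The caller decides
// what to do with it (the manifest comments describe a legacy fallback).
var errBareID = errors.New("bare model id: no namespace token")

// resolveUpstream maps a `vendor/model` id to the entry whose UpstreamVendor
// equals the namespace token, and returns the remaining model part.
func resolveUpstream(entries []entry, modelID string) (entry, string, error) {
	vendor, model, ok := strings.Cut(modelID, "/")
	if !ok || vendor == "" || model == "" {
		return entry{}, "", errBareID
	}
	for _, e := range entries {
		if e.UpstreamVendor != "" && e.UpstreamVendor == vendor {
			return e, model, nil
		}
	}
	return entry{}, "", fmt.Errorf("no entry with upstream_vendor %q", vendor)
}

func main() {
	entries := []entry{
		{Name: "anthropic-api", UpstreamVendor: "anthropic", BaseURL: "https://api.anthropic.com/v1"},
		{Name: "anthropic-oauth"}, // no upstream_vendor: never routed by the proxy
		{Name: "moonshot", UpstreamVendor: "moonshot", BaseURL: "https://api.moonshot.ai/v1"},
	}
	for _, id := range []string{"anthropic/claude-opus-4-7", "moonshot/kimi-k2.6", "kimi-k2.6"} {
		e, model, err := resolveUpstream(entries, id)
		if err != nil {
			fmt.Printf("%s: %v\n", id, err)
			continue
		}
		fmt.Printf("%s -> entry %s, model %s, base %s\n", id, e.Name, model, e.BaseURL)
	}
}
```

Keying the lookup on `UpstreamVendor` rather than `Name` is what lets the `anthropic/` token land on the `anthropic-api` entry, and it is also why the value has to be unique across the catalog.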
schema_version: 1
providers:
# ===========================================================================
# Anthropic — native. proxy + canvas + adapter + DB all know it.
# ===========================================================================
- name: anthropic-api
display_name: "Anthropic API"
vendor_logo: "anthropic"
protocol: anthropic
auth_mode: anthropic_api
base_url_template: "https://api.anthropic.com/v1"
base_url_anthropic: "https://api.anthropic.com/v1"
auth_env: [ANTHROPIC_API_KEY, ANTHROPIC_AUTH_TOKEN]
auth_token_env: ANTHROPIC_API_KEY
# Proxy inferLLMProvider matches HasPrefix "claude"; canvas matches /^claude-/i.
model_prefix_match: "^claude"
model_aliases: []
# internal#718 P1 (CONVERGED): the proxy's upstream-vendor key. ResolveUpstream
# maps the `anthropic/` namespace token to THIS entry, then dials this entry's
# base_url_anthropic / base_url_template + auth (the SINGLE source). The vendor
# key is "anthropic" (NOT the registry provider name "anthropic-api"). The
# anthropic-oauth entry carries NO upstream_vendor — OAuth never traverses the
# proxy (the CLI talks to Anthropic directly). Bare `claude*` ids are vestigial
# at the proxy (no live bare traffic) and resolve via the legacy fallback.
upstream_vendor: anthropic
# Claude Code subscription via OAuth. Adapter + canvas know it; proxy
# never routes OAuth (the CLI talks to Anthropic directly). No base URL.
- name: anthropic-oauth
display_name: "Claude Code subscription"
vendor_logo: "anthropic"
protocol: anthropic
auth_mode: oauth
base_url_template: null
base_url_anthropic: null
auth_env: [CLAUDE_CODE_OAUTH_TOKEN]
auth_token_env: CLAUDE_CODE_OAUTH_TOKEN
# Matched by exact alias, not prefix — the bare ids sonnet/opus/haiku
# only count as OAuth when CLAUDE_CODE_OAUTH_TOKEN is the auth env
# (canvas gates on env; the manifest expresses the alias set here).
model_prefix_match: "^(sonnet|opus|haiku)$"
model_aliases: [sonnet, opus, haiku]
# ===========================================================================
# OpenAI — proxy default arm + DB catalog + canvas. NOT in the adapter
# template (claude-code template is Anthropic-protocol only).
# ===========================================================================
- name: openai
display_name: "OpenAI"
vendor_logo: "openai"
protocol: openai
auth_mode: anthropic_api # OpenAI is openai-protocol; auth is a bearer API key.
base_url_template: "https://api.openai.com/v1"
base_url_anthropic: null # OpenAI exposes only the OpenAI protocol surface.
auth_env: [OPENAI_API_KEY]
auth_token_env: OPENAI_API_KEY
# Proxy treats openai as the DEFAULT (catch-all) arm of inferLLMProvider;
# there is no explicit prefix today. Canvas matches /^gpt-/i. Encode the
# canvas prefix so the explicit slugs route; the proxy's catch-all
# behavior is a routing decision for PR-3, not the manifest.
model_prefix_match: "^gpt-"
model_aliases: []
# internal#718 P1 (CONVERGED): the proxy's upstream-vendor key. ResolveUpstream
# maps the `openai/` namespace token to THIS entry. openai is ALSO the proxy's
# historical catch-all (the switch's `default:` arm) for bare/unknown ids —
# but the catch-all is a VESTIGIAL bare-id behavior (no live bare traffic), so
# it lives in the retained legacy fallback (inferLLMProviderLegacy), NOT as a
# registry data flag. Live `openai/<m>` ids resolve here by namespace.
upstream_vendor: openai
# ===========================================================================
# Moonshot (Kimi) — proxy arm + DB catalog + canvas label "moonshot".
# Distinct from the adapter's `kimi-coding` gateway (different host + auth
# header); both are retained — see kimi-coding below.
# ===========================================================================
- name: moonshot
display_name: "Moonshot (Kimi)"
vendor_logo: "moonshot"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: "https://api.moonshot.ai/v1"
base_url_anthropic: "https://api.moonshot.ai/anthropic/v1"
auth_env: [MOONSHOT_API_KEY, KIMI_API_KEY]
auth_token_env: ANTHROPIC_API_KEY
# internal#718 P0 (CTO 2026-05-27, EMPIRICALLY VERIFIED): the moonshot
# endpoint (api.moonshot.ai) and the kimi-coding gateway
# (api.kimi.com/coding) serve DIFFERENT models on DIFFERENT hosts —
# moonshot serves the moonshot-namespaced ids (the proxy's platform path
# resolves `moonshot/kimi-k2.6` here and 404s `kimi-for-coding`), while
# the bare kimi-* ids are served by the separate `kimi-coding` gateway
# below (which 404s on api.moonshot.ai). They are NOT a single owner.
# `moonshot` therefore owns ONLY the moonshot-prefixed ids:
# * "moonshot/..." — the proxy/platform-namespaced form (claude-code +
# hermes + openclaw platform refs route here),
# * "moonshot:..." — openclaw's colon-namespaced BYOK form,
# * "moonshot-..." — a bare moonshot-v1* model id.
# It deliberately does NOT claim bare kimi-* (those are kimi-coding's, per
# the corrected serving split). RE2 has no negative lookahead; the prefix
# is positively scoped to the moonshot namespace so the two regexes are
# disjoint and DeriveProvider resolves each bare/namespaced id to exactly
# one owner. This removes the false kimi-* overlap RFC#340/PR-1 flagged.
model_prefix_match: "^moonshot[:/-]"
model_aliases: []
# internal#718 P1 (CONVERGED): the proxy's upstream-vendor key. ResolveUpstream
# maps the `moonshot/` (slash) + `moonshot:` (openclaw colon) namespace tokens
# to THIS entry — jrs SEO's LIVE `moonshot/kimi-k2.6` + sibling `moonshot/...`
# ids dial this entry's base_url (api.moonshot.ai). The vendor key coincides
# with the entry name here.
# NOTE on bare kimi-* (the convergence's key clarification): a BARE `kimi*`
# id is NOT routed by this registry resolution. DeriveProvider (registry
# semantics, P0) resolves bare kimi-* to the `kimi-coding` gateway
# (api.kimi.com/coding) — which is NOT a valid proxy upstream — so routing a
# bare kimi id through the shared registry matcher would MISROUTE. Bare ids
# are vestigial at the proxy (zero live bare traffic; every platform id is
# namespaced), so the converged path does not resolve them at all; a bare
# `kimi*` falls through to the proxy's retained legacy switch, which routes
# it to moonshot exactly as before (byte-identical). No moonshot bare-prefix
# data block is recreated.
upstream_vendor: moonshot
# ===========================================================================
# MiniMax — proxy arm + DB catalog (7 models) + adapter + canvas.
# ===========================================================================
- name: minimax
display_name: "MiniMax"
vendor_logo: "minimax"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: "https://api.minimax.io/v1"
base_url_anthropic: "https://api.minimax.io/anthropic/v1"
# Adapter template uses api.minimax.io/anthropic (no /v1); proxy uses
# /anthropic/v1. Manifest follows the proxy's value (the routing layer);
# the adapter base_url is reconciled in PR-5.
auth_env: [MINIMAX_API_KEY, ANTHROPIC_AUTH_TOKEN, ANTHROPIC_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# Proxy: HasPrefix "minimax" on the lowercased id. Catalog ids are
# mixed-case "MiniMax-M2.7" — every catalog/canvas id starts "MiniMax-M".
# Anchored on "-m" (not bare "-") so it does NOT also claim the
# `minimax-cn/` slash-prefixed China sibling below (RE2 has no negative
# lookahead; the more-specific China entry owns its slash-prefix).
model_prefix_match: "(?i)^minimax-m"
model_aliases: []
# internal#718 P1 (CONVERGED): the proxy's upstream-vendor key. ResolveUpstream
# maps the `minimax/` namespace token to THIS entry — claude-code's LIVE
# `minimax/MiniMax-M2.7(-highspeed)` platform ids dial this entry's base_url
# (api.minimax.io). The `minimax-cn` China sibling carries NO upstream_vendor
# (the proxy has no arm for it; a bare minimax-cn id is vestigial and falls to
# the legacy fallback, unchanged). Bare `minimax*` ids are vestigial at the
# proxy and resolve via the legacy fallback (which keeps the broader
# HasPrefix "minimax" behavior verbatim), not here.
upstream_vendor: minimax
# ===========================================================================
# Platform — Molecule-managed LLM proxy. Adapter + canvas know it. It is
# the PROXY ITSELF as seen from a workspace, so the manifest entry is the
# client-facing endpoint, not an upstream vendor. proxy switch has no
# "platform" arm (it routes the underlying vendor model instead).
# ===========================================================================
- name: platform
display_name: "Platform"
vendor_logo: "molecule"
protocol: anthropic
auth_mode: third_party_anthropic_compat
# Dual-surface: the platform proxy exposes BOTH the OpenAI-compat
# (/openai/v1/chat/completions) and Anthropic-compat (/anthropic/v1/messages)
# wire formats. Anthropic-protocol runtimes (claude-code) use
# base_url_anthropic; OpenAI-protocol runtimes (hermes/codex/openclaw) use
# base_url_template. Previously both pointed at the anthropic surface — a
# PR-1 simplification when only claude-code referenced platform.
base_url_template: "https://api.moleculesai.app/api/v1/internal/llm/openai/v1"
base_url_anthropic: "https://api.moleculesai.app/api/v1/internal/llm/anthropic/v1"
auth_env: [ANTHROPIC_API_KEY, MOLECULE_LLM_USAGE_TOKEN]
auth_token_env: ANTHROPIC_API_KEY
# Adapter routes kimi- / moonshot/ through platform by default. No bare
# vendor prefix of its own; it multiplexes other vendors' slugs. Match
# the explicit "platform/" slash-prefix only so it never steals another
# vendor's bare slug.
model_prefix_match: "^platform/"
model_aliases: []
# ===========================================================================
# Xiaomi MiMo — adapter + canvas (two canvas keys: "xiaomi-mimo" AND
# "xiaomi", both labelled "Xiaomi MiMo"). proxy has no arm; DB has no rows.
# ===========================================================================
- name: xiaomi-mimo
display_name: "Xiaomi MiMo"
vendor_logo: "xiaomi"
protocol: anthropic
auth_mode: third_party_anthropic_compat
base_url_template: "https://api.xiaomimimo.com/anthropic"
base_url_anthropic: "https://api.xiaomimimo.com/anthropic"
auth_env: [ANTHROPIC_AUTH_TOKEN, ANTHROPIC_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# Adapter prefix "mimo-"; canvas /^mimo-/i. proxy routing TBD (PR-3).
# NOTE: canvas has a duplicate "xiaomi" VENDOR_LABELS key aliasing the
# same vendor — collapsed into this one entry.
model_prefix_match: "^mimo-"
model_aliases: []
# ===========================================================================
# Z.ai (GLM) — adapter + canvas. proxy has no arm; DB has no rows.
# ===========================================================================
- name: zai
display_name: "Z.ai (GLM)"
vendor_logo: "zai"
protocol: anthropic
auth_mode: third_party_anthropic_compat
base_url_template: "https://api.z.ai/api/anthropic"
base_url_anthropic: "https://api.z.ai/api/anthropic"
auth_env: [GLM_API_KEY, ANTHROPIC_AUTH_TOKEN, ANTHROPIC_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# Adapter prefix "glm-" (lowercased match catches GLM-4.6); canvas /^GLM-/i.
# adapter + canvas only today; proxy routing TBD (PR-3).
model_prefix_match: "(?i)^glm-"
model_aliases: []
# ===========================================================================
# Kimi For Coding — adapter ("kimi-coding") + canvas
# ("kimi-coding"="Moonshot Kimi (coding-tuned)"). Distinct host
# (api.kimi.com/coding/) + x-api-key auth from the `moonshot` entry above.
# DB seeds moonshot/kimi-for-coding as alias_for kimi-k2.6.
# ===========================================================================
- name: kimi-coding
display_name: "Moonshot Kimi (coding-tuned)"
vendor_logo: "moonshot"
protocol: anthropic
auth_mode: third_party_anthropic_compat
base_url_template: "https://api.kimi.com/coding/"
base_url_anthropic: "https://api.kimi.com/coding/"
auth_env: [KIMI_API_KEY, ANTHROPIC_API_KEY, ANTHROPIC_AUTH_TOKEN]
# x-api-key header (NOT bearer) per kimi.com's Claude Code integration doc.
auth_token_env: ANTHROPIC_API_KEY
# internal#718 P0 (CTO 2026-05-27, EMPIRICALLY VERIFIED): the
# api.kimi.com/coding gateway is the owner of the BARE kimi-* ids. Per
# kimi.com's official Claude Code integration doc + the claude-code
# template's `kimi-coding` provider (model_prefixes: [kimi-]), this
# gateway authenticates with KIMI_API_KEY (sk-kimi-*) on the x-api-key
# header and "routes to the served K2.6 model regardless of the model
# name on the wire" — so every bare kimi-* id (kimi-for-coding,
# kimi-k2.6, kimi-k2.5, kimi-k2, kimi-latest, ...) is served HERE, while
# api.moonshot.ai 404s these. This OWNS bare "kimi-"; the moonshot-
# namespaced ids (moonshot/, moonshot:, moonshot-) belong to `moonshot`
# above. The two regexes are now disjoint (no negative lookahead needed),
# removing the false kimi-* overlap that RFC#340/PR-1 deferred — each id
# resolves to exactly one owner. Registry-data-only: NO production reader
# consumes model_prefix_match yet (the proxy keeps its own hardcoded
# inferLLMProvider), so this cannot change live routing.
model_prefix_match: "^kimi-"
model_aliases: []
# ===========================================================================
# DeepSeek — adapter + canvas. proxy has no arm; DB has no rows.
# ===========================================================================
- name: deepseek
display_name: "DeepSeek"
vendor_logo: "deepseek"
protocol: anthropic
auth_mode: third_party_anthropic_compat
base_url_template: "https://api.deepseek.com/anthropic"
base_url_anthropic: "https://api.deepseek.com/anthropic"
auth_env: [DEEPSEEK_API_KEY, ANTHROPIC_AUTH_TOKEN, ANTHROPIC_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# Adapter prefix "deepseek-"; canvas /^deepseek-/i. adapter+canvas only;
# proxy routing TBD (PR-3).
model_prefix_match: "^deepseek-"
model_aliases: []
# ===========================================================================
# CANVAS-ONLY vendors — present in ProviderModelSelector VENDOR_LABELS but
# NOT routed by the proxy, NOT in the adapter template, NOT in the DB.
# This is exactly the "canvas offered a provider the proxy can't route"
# drift the RFC targets. Transcribed here so PR-3/PR-4 converge them;
# base_url/auth are best-effort placeholders pending real routing in PR-3.
# Each is marked `proxy routing TBD`. model_prefix_match is the canvas
# heuristic (slash-prefix vendor key) where one exists, else a slash-prefix
# on the vendor key itself.
# ===========================================================================
- name: google
display_name: "Google Gemini"
vendor_logo: "google"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [GEMINI_API_KEY, GOOGLE_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. canvas /^gemini-/i.
# canvas also has a duplicate "gemini" label key aliasing the same vendor.
model_prefix_match: "^gemini-"
model_aliases: []
- name: alibaba
display_name: "Alibaba Qwen (DashScope)"
vendor_logo: "alibaba"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [DASHSCOPE_API_KEY, ALIBABA_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. canvas /^qwen-/i.
model_prefix_match: "^qwen-"
model_aliases: []
- name: nousresearch
display_name: "Nous Research (Hermes)"
vendor_logo: "nousresearch"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [NOUSRESEARCH_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. Slash-prefix id
# (e.g. nousresearch/hermes-4-70b).
model_prefix_match: "^nousresearch/"
model_aliases: []
- name: openrouter
display_name: "OpenRouter (any model)"
vendor_logo: "openrouter"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: "https://openrouter.ai/api/v1"
base_url_anthropic: null
auth_env: [OPENROUTER_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. Wildcard: openrouter/<model>.
model_prefix_match: "^openrouter/"
model_aliases: []
- name: huggingface
display_name: "Hugging Face Inference"
vendor_logo: "huggingface"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [HUGGINGFACE_API_KEY, HF_TOKEN]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. Wildcard: huggingface/<model>.
model_prefix_match: "^huggingface/"
model_aliases: []
- name: ai-gateway
display_name: "Vercel AI Gateway"
vendor_logo: "ai-gateway"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [AI_GATEWAY_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD.
model_prefix_match: "^ai-gateway/"
model_aliases: []
- name: opencode-zen
display_name: "OpenCode Zen"
vendor_logo: "opencode-zen"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [OPENCODE_ZEN_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD.
model_prefix_match: "^opencode-zen/"
model_aliases: []
- name: opencode-go
display_name: "OpenCode Go"
vendor_logo: "opencode-go"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [OPENCODE_GO_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD.
model_prefix_match: "^opencode-go/"
model_aliases: []
- name: kilocode
display_name: "Kilo Code"
vendor_logo: "kilocode"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [KILOCODE_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD.
model_prefix_match: "^kilocode/"
model_aliases: []
- name: minimax-cn
display_name: "MiniMax China"
vendor_logo: "minimax"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: "https://api.minimaxi.com/v1"
base_url_anthropic: "https://api.minimaxi.com/anthropic"
auth_env: [MINIMAX_API_KEY, ANTHROPIC_AUTH_TOKEN]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. China endpoint sibling of `minimax`
# (api.minimaxi.com). Matched only by the explicit slash-prefix so it does
# NOT collide with `minimax`'s (?i)^minimax-m in the overlap guard.
model_prefix_match: "^minimax-cn/"
model_aliases: []
- name: ollama-cloud
display_name: "Ollama Cloud"
vendor_logo: "ollama"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [OLLAMA_CLOUD_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD.
model_prefix_match: "^ollama-cloud/"
model_aliases: []
- name: ollama
display_name: "Ollama (self-hosted)"
vendor_logo: "ollama"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: "http://localhost:11434/v1"
base_url_anthropic: null
auth_env: [OLLAMA_HOST]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. Self-hosted; no key (host only).
model_prefix_match: "^ollama/"
model_aliases: []
- name: nvidia
display_name: "NVIDIA NIM"
vendor_logo: "nvidia"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: "https://integrate.api.nvidia.com/v1"
base_url_anthropic: null
auth_env: [NVIDIA_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD.
model_prefix_match: "^nvidia/"
model_aliases: []
- name: arcee
display_name: "Arcee"
vendor_logo: "arcee"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [ARCEE_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD.
model_prefix_match: "^arcee/"
model_aliases: []
- name: custom
display_name: "Custom OpenAI-compat endpoint"
vendor_logo: "custom"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null # operator-supplied via workspace runtime config
base_url_anthropic: null
auth_env: [CUSTOM_API_KEY, OPENAI_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. Wildcard free-text: custom/<model>.
model_prefix_match: "^custom/"
model_aliases: []
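The disjointness claims made in the comments above (moonshot vs kimi-coding, minimax vs minimax-cn) can be checked mechanically with Go's `regexp` package, which uses RE2 syntax. The sketch below is a hedged illustration: the `patterns` map and the `owners` helper are hypothetical names, and only four `model_prefix_match` values are copied from the entries above, so it is not the repository's overlap guard.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// patterns copies four model_prefix_match values from the manifest entries.
var patterns = map[string]*regexp.Regexp{
	"moonshot":    regexp.MustCompile(`^moonshot[:/-]`),
	"kimi-coding": regexp.MustCompile(`^kimi-`),
	"minimax":     regexp.MustCompile(`(?i)^minimax-m`),
	"minimax-cn":  regexp.MustCompile(`^minimax-cn/`),
}

// owners returns the sorted names of every pattern that matches slug.
// A well-formed catalog yields exactly one owner per known slug.
func owners(slug string) []string {
	var out []string
	for name, re := range patterns {
		if re.MatchString(slug) {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	slugs := []string{
		"kimi-k2.6", "moonshot/kimi-k2.6", "moonshot-v1-128k",
		"MiniMax-M2.7", "minimax-cn/m2",
	}
	for _, slug := range slugs {
		fmt.Printf("%-20s %v\n", slug, owners(slug))
	}
}
```

Because RE2 has no negative lookahead, the only way to keep two patterns disjoint is to scope each one positively, which is what anchoring `minimax` on `-m` achieves against the `minimax-cn/` sibling.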
# =============================================================================
# RUNTIME NATIVE SUPPORT MATRIX (RFC #340 — CTO correction 2026-05-26)
# =============================================================================
# The `providers:` list above is the full catalog (the union of proxy /
# canvas / adapter / DB). It is NOT the support matrix. We do NOT support
# every model on every provider.
#
# This `runtimes:` block is the SSOT for "which providers + models does
# runtime R NATIVELY support". It constrains the catalog to each runtime's
# native support matrix (a subset of the catalog, not a superset). Canvas (PR-4) offers
# ONLY a runtime's native models; the proxy (PR-3) routes ONLY native models
# with NO protocol translation (matches the cp#334 "use the native endpoint,
# don't translate" fix). A catalog provider that appears in NO runtime's
# native set is over-offer drift: it stays in `providers:` only if another
# runtime legitimately uses it, otherwise it is the drift this RFC prunes.
#
# AUTHORITATIVE MATRIX (provider level), encoded EXACTLY below:
# claude-code -> anthropic (oauth + api), kimi (kimi-coding), minimax
# hermes -> kimi (kimi-coding)
# codex -> openai
# openclaw -> kimi (kimi-coding)
#
# Each runtime entry lists native provider NAMES (referencing `providers:`
# above; Load fails closed on an unknown ref) plus the EXACT model ids that
# runtime exposes for that provider. Model ids are transcribed verbatim from
# each runtime template's config.yaml `runtime_config.models` block
# (git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-<rt>),
# pruned to the native matrix above.
#
# DRIFT PRUNED (templates declare these, they are NOT in the native matrix,
# so they are deliberately absent from the runtimes block below — flagged in
# the RFC, carried in `providers:` only where another runtime needs them):
# * claude-code template also declares: xiaomi-mimo (mimo-*), zai (GLM-*),
# deepseek (deepseek-*). Outside {anthropic, kimi, minimax} -> pruned.
# * codex template also declares: minimax-token-plan (codex-minimax-*).
# Outside {openai} -> pruned. (Template itself notes the MiniMax
# token-plan leg 404s on /v1/responses — a vendor gap, reinforcing the
# prune.)
# * openclaw template also declares: minimax (the default!), openai (gpt-*),
# groq, openrouter. Outside {kimi} -> pruned. NOTE: openclaw's *default*
# model is minimax:MiniMax-M2.7, NOT kimi — the CTO matrix narrows
# openclaw to its native Kimi path (moonshot: prefix + KIMI_API_KEY ->
# api.kimi.com/coding gateway). See RFC #340 update for the rationale.
# * hermes template declares ~30 providers (nous, openrouter, anthropic,
# gemini, deepseek, zai, minimax, alibaba, xiaomi, arcee, nvidia, ...).
# The CTO matrix narrows hermes to {kimi} only -> all others pruned from
# the native set.
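The two fail-closed load rules mentioned in this manifest (an unknown provider ref in a runtime entry, and a non-unique `upstream_vendor`) could be enforced along these lines. This is a sketch under stated assumptions: the `provider` struct, the `validate` function, and the shape of `runtimeRefs` are hypothetical and are not taken from the repository's `Load` implementation.

```go
package main

import "fmt"

// provider is a hypothetical two-field view of a providers: entry.
type provider struct {
	Name           string
	UpstreamVendor string // optional; empty means "not a proxy upstream"
}

// validate returns an error on the first violated invariant, so a caller
// that aborts on error fails closed.
func validate(providers []provider, runtimeRefs map[string][]string) error {
	names := make(map[string]bool, len(providers))
	vendors := make(map[string]string) // upstream_vendor -> owning entry name
	for _, p := range providers {
		if names[p.Name] {
			return fmt.Errorf("duplicate provider name %q", p.Name)
		}
		names[p.Name] = true
		if p.UpstreamVendor == "" {
			continue
		}
		if prev, dup := vendors[p.UpstreamVendor]; dup {
			return fmt.Errorf("upstream_vendor %q claimed by both %q and %q",
				p.UpstreamVendor, prev, p.Name)
		}
		vendors[p.UpstreamVendor] = p.Name
	}
	// Every provider name a runtime references must exist in the catalog.
	for rt, refs := range runtimeRefs {
		for _, ref := range refs {
			if !names[ref] {
				return fmt.Errorf("runtime %q references unknown provider %q", rt, ref)
			}
		}
	}
	return nil
}

func main() {
	providers := []provider{
		{Name: "anthropic-api", UpstreamVendor: "anthropic"},
		{Name: "kimi-coding"},
	}
	refs := map[string][]string{"hermes": {"kimi-coding"}, "codex": {"openai"}}
	// "openai" is deliberately missing from the sample catalog above.
	fmt.Println(validate(providers, refs))
}
```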
runtimes:
# claude-code: native Anthropic-API / Claude-Code endpoints. Anthropic is
# split across two manifest providers (oauth + api) because the runtime
# exposes both auth paths natively; both count as "anthropic".
#
# internal#718 P4 PR-1 (2026-05-27): the colon-namespaced BYOK form
# `vendor:model` is the legacy spelling for explicit BYOK selection that
# predates the slash-namespaced platform form `vendor/model`. Both forms
# are LIVE across the workspace-create corpus (~44 test files +
# canvas/ConfigTab default + the openclaw template's native list).
# PRECEDENT: the openclaw runtime below already lists colon-form ids
# (`moonshot:kimi-k2.6`) as the BYOK kimi-coding native set — the
# adapter understands the colon form. P4 PR-1 extends the same precedent
# to claude-code so `DeriveProvider` / `Manifest.ModelsForRuntime`
# recognize every legitimate BYOK model in the corpus. The
# canonical slash form (`anthropic/claude-opus-4-7`) is the
# platform-managed routing form (proxy upstream lookup); the colon
# form is the legacy BYOK selection form. Both are first-class
# registry entries on the runtime's native provider set.
claude-code:
providers:
- name: anthropic-oauth
# Colon-form aliases (`anthropic:sonnet`, ...) are the legacy BYOK
# spelling for the OAuth alias path that the live corpus carries.
# Per the same P4 PR-1 precedent (see colon-form comment above),
# these are first-class registry entries — DeriveProvider resolves
# them to anthropic-oauth deterministically.
models:
- sonnet
- opus
- haiku
- anthropic:sonnet
- anthropic:opus
- anthropic:haiku
- name: anthropic-api
# BYOK versioned API ids — bare form is the canonical id the
# Anthropic SDK accepts on the wire; colon form is the legacy
# BYOK selection spelling used across the create/test corpus
# (internal#718 P4 PR-1). Both forms route to anthropic-api.
models:
- claude-sonnet-4-6
- claude-opus-4-7
- claude-haiku-4-5
- claude-sonnet-4-5
- anthropic:claude-sonnet-4-6
- anthropic:claude-opus-4-7
- anthropic:claude-haiku-4-5
- anthropic:claude-sonnet-4-5
- name: kimi-coding
# BYOK kimi-coding gateway ids — bare form is the canonical id
# the gateway routes; the colon form `moonshot:kimi-k2.*` is the
# legacy BYOK selection form (already in use on the openclaw
# native set below). claude-code's adapter accepts both
# (internal#718 P4 PR-1).
models:
- kimi-for-coding
- kimi-k2.5
- kimi-k2
- moonshot:kimi-k2.6
- moonshot:kimi-k2.5
- name: minimax
# BYOK MiniMax ids — bare form is the canonical id; colon form is
# the legacy BYOK selection spelling carried in the create corpus
# and the openclaw template (internal#718 P4 PR-1).
models:
- MiniMax-M2
- MiniMax-M2.7
- MiniMax-M2.7-highspeed
- minimax:MiniMax-M2
- minimax:MiniMax-M2.7
- minimax:MiniMax-M2.7-highspeed
# Platform-managed (no tenant key; Molecule owns billing). The
# vendor/model-namespaced ids the proxy resolves to the upstream vendor.
# Canonical for the template's `provider: platform` model entries — the
# drift gate (molecule-ci validate-workspace-template) enforces the
# template can offer no platform model absent from this set.
- name: platform
models:
- anthropic/claude-opus-4-7
- anthropic/claude-sonnet-4-6
- moonshot/kimi-k2.6
- moonshot/kimi-k2.5
- minimax/MiniMax-M2.7
- minimax/MiniMax-M2.7-highspeed
# hermes: native Kimi only (kimi-coding gateway). hermes-agent owns its own
# broad provider matrix, but the CTO native matrix for the Molecule
# platform constrains it to kimi.
hermes:
providers:
- name: kimi-coding
models: [kimi-coding/kimi-k2]
# Platform-managed Kimi (hermes's native platform family). Routed via
# the proxy OpenAI-compat surface; see the template's
# scripts/derive-platform-llm.sh.
- name: platform
models:
- moonshot/kimi-k2.6
- moonshot/kimi-k2.5
# codex: OpenAI — BYOK (subscription + API key, both map to the `openai`
# manifest provider) + platform-managed (the `platform` ref below, served
# via the proxy Responses surface).
codex:
providers:
- name: openai
models:
- gpt-5.5
- gpt-5.4
- gpt-5.4-mini
- gpt-5.3-codex
- gpt-5.3-codex-spark
- gpt-5.2
# Platform-managed OpenAI. NOW servable: the proxy exposes the OpenAI
# Responses surface (/internal/llm/openai/v1/responses) that the Codex
# CLI (0.130+, Responses-API-only) requires. The codex template adapter
# routes these via that surface (provider_config.py platform provider,
# auth_mode=openai_compat_responses, wire_api=responses). Default
# mirrors the deploy's MOLECULE_LLM_DEFAULT_MODEL (openai/gpt-5.4-mini).
- name: platform
models:
- openai/gpt-5.4
- openai/gpt-5.4-mini
# openclaw: native Kimi only. openclaw's moonshot: model prefix + a
# KIMI_API_KEY (sk-kimi-*) routes to api.kimi.com/coding (kimi-for-coding),
# which is the native Kimi path. Default minimax / openai / groq / openrouter
# legs are pruned per the CTO matrix.
openclaw:
providers:
- name: kimi-coding
models:
- moonshot:kimi-k2.6
- moonshot:kimi-k2.5
# Platform-managed Kimi. Note the slash form (moonshot/...) here vs the
# BYOK colon form (moonshot:...) above — openclaw's adapter uses colon
# ids natively; the platform path normalizes to the proxy's slash form.
- name: platform
models:
- moonshot/kimi-k2.6
- moonshot/kimi-k2.5
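To make the native-matrix lookup concrete, the sketch below holds two runtimes from the block above in memory and answers which provider a runtime lists a given model id under. It is an assumption-laden illustration: `nativeModels` and `providerFor` are hypothetical names, matching is by exact string, and this is not the `Manifest.ModelsForRuntime` implementation.

```go
package main

import "fmt"

// nativeModels is a hypothetical in-memory form of part of the runtimes:
// block (runtime -> provider -> model ids), copied from the manifest.
var nativeModels = map[string]map[string][]string{
	"hermes": {
		"kimi-coding": {"kimi-coding/kimi-k2"},
		"platform":    {"moonshot/kimi-k2.6", "moonshot/kimi-k2.5"},
	},
	"openclaw": {
		"kimi-coding": {"moonshot:kimi-k2.6", "moonshot:kimi-k2.5"},
		"platform":    {"moonshot/kimi-k2.6", "moonshot/kimi-k2.5"},
	},
}

// providerFor returns the provider under which runtime lists modelID.
// ok is false when the model is not in that runtime's native set.
func providerFor(runtime, modelID string) (provider string, ok bool) {
	for name, models := range nativeModels[runtime] {
		for _, m := range models {
			if m == modelID {
				return name, true
			}
		}
	}
	return "", false
}

func main() {
	queries := [][2]string{
		{"openclaw", "moonshot:kimi-k2.6"}, // BYOK colon form
		{"openclaw", "moonshot/kimi-k2.6"}, // platform slash form
		{"hermes", "gpt-5.4"},              // not native to hermes
	}
	for _, q := range queries {
		p, ok := providerFor(q[0], q[1])
		fmt.Printf("%s %s -> provider=%q native=%v\n", q[0], q[1], p, ok)
	}
}
```

The exact-match lookup is what keeps the colon form and the slash form of the same Kimi model distinct: in the openclaw entry they sit under different providers.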
@@ -0,0 +1,207 @@
package providers
import (
"testing"
)
// TestLoadParses asserts the embedded manifest parses and is non-empty.
func TestLoadParses(t *testing.T) {
ps, err := Load()
if err != nil {
t.Fatalf("Load() error = %v", err)
}
if len(ps) == 0 {
t.Fatal("Load() returned an empty provider slice")
}
}
// TestRequiredFieldsPopulated asserts every entry has the fields the
// validate invariant requires (name, protocol, auth_mode, auth_env,
// display_name, model_prefix_match), and that protocol is one of the
// two legal wire formats.
func TestRequiredFieldsPopulated(t *testing.T) {
ps, err := Load()
if err != nil {
t.Fatalf("Load() error = %v", err)
}
for _, p := range ps {
if p.Name == "" {
t.Errorf("provider with display_name %q has empty name", p.DisplayName)
}
if p.DisplayName == "" {
t.Errorf("provider %q has empty display_name", p.Name)
}
if p.AuthMode == "" {
t.Errorf("provider %q has empty auth_mode", p.Name)
}
if len(p.AuthEnv) == 0 {
t.Errorf("provider %q has empty auth_env", p.Name)
}
if p.ModelPrefixMatch == "" {
t.Errorf("provider %q has empty model_prefix_match", p.Name)
}
switch p.Protocol {
case ProtocolOpenAI, ProtocolAnthropic:
default:
t.Errorf("provider %q has invalid protocol %q", p.Name, p.Protocol)
}
}
}
// TestUniqueNames asserts provider names are unique (Load enforces this;
// this test guards the manifest data itself).
func TestUniqueNames(t *testing.T) {
ps, err := Load()
if err != nil {
t.Fatalf("Load() error = %v", err)
}
seen := make(map[string]bool, len(ps))
for _, p := range ps {
if seen[p.Name] {
t.Errorf("duplicate provider name %q", p.Name)
}
seen[p.Name] = true
}
}
// providerByName is a test helper.
func providerByName(t *testing.T, ps []Provider, name string) Provider {
t.Helper()
for _, p := range ps {
if p.Name == name {
return p
}
}
t.Fatalf("provider %q not found in manifest", name)
return Provider{}
}
// TestMatchesModel maps representative slugs from each source (proxy
// prefixes, canvas BARE_VENDOR_PATTERNS, adapter model_prefixes, DB
// catalog ids) to the provider that should own them.
func TestMatchesModel(t *testing.T) {
ps, err := Load()
if err != nil {
t.Fatalf("Load() error = %v", err)
}
cases := []struct {
slug string
expect string // provider name that must match
}{
// Moonshot vs Kimi-coding — corrected serving split (internal#718
// P0, CTO 2026-05-27, empirically verified): the BYOK api.kimi.com/
// coding gateway owns the BARE kimi-* ids; the moonshot endpoint owns
// the moonshot-namespaced/prefixed ids. Bare kimi-k2.6 / kimi-k2.5 /
// kimi-for-coding therefore belong to kimi-coding; only the explicit
// moonshot/ (proxy/platform), moonshot: (openclaw colon form), and
// moonshot- (bare moonshot model) prefixes belong to moonshot.
{"kimi-k2.6", "kimi-coding"},
{"kimi-k2.5", "kimi-coding"},
{"kimi-latest", "kimi-coding"},
{"moonshot/kimi-k2.6", "moonshot"},
{"moonshot-v1-128k", "moonshot"},
// Anthropic — proxy "claude"->anthropic + DB claude-* + canvas /^claude-/.
{"claude-sonnet-4-6", "anthropic-api"},
{"claude-opus-4-7", "anthropic-api"},
{"claude-haiku-4-5-20251001", "anthropic-api"},
// Anthropic OAuth aliases.
{"sonnet", "anthropic-oauth"},
{"opus", "anthropic-oauth"},
{"haiku", "anthropic-oauth"},
// MiniMax — DB MiniMax-M2.7 (mixed case) + canvas /^MiniMax-/.
{"MiniMax-M2.7", "minimax"},
{"MiniMax-M2", "minimax"},
{"minimax-m2.5", "minimax"},
// OpenAI — DB gpt-5.x + canvas /^gpt-/.
{"gpt-5.5", "openai"},
{"gpt-5.4-mini", "openai"},
// Xiaomi MiMo — adapter mimo- + canvas /^mimo-/.
{"mimo-v2.5-pro", "xiaomi-mimo"},
// Z.ai GLM — adapter glm- + canvas /^GLM-/ (mixed case).
{"GLM-4.6", "zai"},
{"glm-4.5", "zai"},
// DeepSeek.
{"deepseek-v4-pro", "deepseek"},
// Kimi coding-tuned gateway (distinct from moonshot).
{"kimi-for-coding", "kimi-coding"},
// Canvas-only slash-prefixed vendors.
{"openrouter/anthropic/claude-3.5", "openrouter"},
{"huggingface/mistralai/Mistral-7B", "huggingface"},
{"custom/my-local-model", "custom"},
{"gemini-2.5-pro", "google"},
{"qwen-3-max", "alibaba"},
{"nousresearch/hermes-4-70b", "nousresearch"},
}
for _, tc := range cases {
p := providerByName(t, ps, tc.expect)
if !p.MatchesModel(tc.slug) {
t.Errorf("slug %q: expected provider %q to match, but it did not (regex %q)", tc.slug, tc.expect, p.ModelPrefixMatch)
}
}
}
// TestNoAmbiguousModelMatch is the RFC §8.5 overlap guard: no two
// providers may claim the same representative slug. A bad regex that
// over-broadly matches another vendor's ids breaks routing across three
// runtimes, so we catch overlap at PR-1 load time.
func TestNoAmbiguousModelMatch(t *testing.T) {
ps, err := Load()
if err != nil {
t.Fatalf("Load() error = %v", err)
}
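// Representative slug corpus spanning every source. Each slug must be
// claimed by at most one provider; a slug that no provider claims passes
// this guard (TestMatchesModel covers the must-match direction).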
corpus := []string{
"kimi-k2.6", "kimi-k2.5", "moonshot-v1-128k", "moonshot/kimi-k2.6",
"claude-sonnet-4-6", "claude-opus-4-7", "claude-haiku-4-5-20251001",
"sonnet", "opus", "haiku",
"MiniMax-M2.7", "MiniMax-M2", "minimax-m2.5", "MiniMax-M2.7-highspeed",
"gpt-5.5", "gpt-5.4", "gpt-5.4-mini",
"mimo-v2.5-pro", "mimo-v2-flash",
"GLM-4.6", "glm-4.5",
"deepseek-v4-pro", "deepseek-v4-flash",
"kimi-for-coding",
"openrouter/x", "huggingface/y", "custom/z",
"gemini-2.5-pro", "qwen-3-max", "nousresearch/hermes-4-70b",
"ai-gateway/m", "opencode-zen/m", "opencode-go/m", "kilocode/m",
"minimax-cn/m2", "ollama-cloud/m", "ollama/llama4", "nvidia/m", "arcee/m",
"platform/anything",
}
for _, slug := range corpus {
var matched []string
for _, p := range ps {
if p.MatchesModel(slug) {
matched = append(matched, p.Name)
}
}
if len(matched) > 1 {
t.Errorf("slug %q ambiguously matched %d providers: %v", slug, len(matched), matched)
}
}
}
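The overlap guard above can be illustrated in isolation. The following is a minimal sketch, not the package's code: `claimants` and `demoPatterns` are hypothetical names, and the three regexes are made up for the example (one is deliberately unanchored to show how an over-broad pattern surfaces as a second claimant).

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// demoPatterns maps a hypothetical provider name to its prefix regex.
// "too-broad" is deliberately unanchored so that it collides with others.
var demoPatterns = map[string]*regexp.Regexp{
	"kimi-coding": regexp.MustCompile(`^kimi-`),
	"moonshot":    regexp.MustCompile(`^moonshot[-/]`),
	"too-broad":   regexp.MustCompile(`kimi`),
}

// claimants returns the sorted names of every pattern that matches slug.
// More than one name means the slug is ambiguously routed.
func claimants(patterns map[string]*regexp.Regexp, slug string) []string {
	var names []string
	for name, re := range patterns {
		if re.MatchString(slug) {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	for _, slug := range []string{"kimi-k2.6", "moonshot/kimi-k2.6", "gpt-5.5"} {
		got := claimants(demoPatterns, slug)
		fmt.Printf("%s -> %v (ambiguous: %t)\n", slug, got, len(got) > 1)
	}
}
```

Returning the full claimant list, rather than a boolean, keeps the failure message actionable: it names every provider whose regex needs tightening.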
// TestMatchesModelZeroValue exercises the lazy on-demand compile path of
// a Provider not produced by Load.
func TestMatchesModelZeroValue(t *testing.T) {
p := Provider{ModelPrefixMatch: "^claude-"}
if !p.MatchesModel("claude-opus-4-7") {
t.Error("zero-value Provider should match claude-opus-4-7")
}
if p.MatchesModel("gpt-5.5") {
t.Error("zero-value Provider should not match gpt-5.5")
}
bad := Provider{ModelPrefixMatch: "([unterminated"}
if bad.MatchesModel("anything") {
t.Error("Provider with an invalid regex must never match")
}
empty := Provider{}
if empty.MatchesModel("anything") {
t.Error("Provider with an empty regex must never match")
}
}
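For readers without the implementation at hand, here is one way the on-demand compile path exercised by TestMatchesModelZeroValue could look. This is a sketch under assumptions: the `Provider` type below is a stand-in that models only the `ModelPrefixMatch` field, and the real method may cache the compiled regex instead of compiling on every call.

```go
package main

import (
	"fmt"
	"regexp"
)

// Provider is a hypothetical stand-in for the manifest entry; only the
// field exercised by the zero-value test is modelled here.
type Provider struct {
	ModelPrefixMatch string
}

// MatchesModel compiles the pattern on demand. An empty or uncompilable
// pattern never matches, mirroring the assertions in the test above.
func (p Provider) MatchesModel(slug string) bool {
	if p.ModelPrefixMatch == "" {
		return false
	}
	re, err := regexp.Compile(p.ModelPrefixMatch)
	if err != nil {
		return false // fail closed: a bad regex claims nothing
	}
	return re.MatchString(slug)
}

func main() {
	good := Provider{ModelPrefixMatch: "^claude-"}
	bad := Provider{ModelPrefixMatch: "([unterminated"}
	fmt.Println(good.MatchesModel("claude-opus-4-7"))
	fmt.Println(good.MatchesModel("gpt-5.5"))
	fmt.Println(bad.MatchesModel("anything"))
	fmt.Println(Provider{}.MatchesModel("anything"))
}
```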
@@ -0,0 +1,429 @@
package providers
import (
"sort"
"strings"
"testing"
)
// runtimeNativeProviders is the authoritative per-runtime native provider
// matrix from RFC #340 (CTO correction 2026-05-26): the manifest is
// constrained to what each runtime NATIVELY supports, not a 24-provider
// superset. Provider-level expectations; the model-id-level assertions
// live in TestModelsForRuntime_ExactModelIDs.
//
// Each runtime also natively supports the `platform` provider (Molecule
// platform-managed LLM: no tenant key, platform owns billing) for the subset
// of its native vendors the proxy can serve — kimi for hermes/openclaw,
// openai for codex, anthropic+kimi+minimax for claude-code.
//
// claude-code -> anthropic (oauth+api), kimi (kimi-coding), minimax, platform
// hermes -> kimi (kimi-coding), platform
// codex -> openai, platform
// openclaw -> kimi (kimi-coding), platform
var runtimeNativeProviders = map[string][]string{
"claude-code": {"anthropic-api", "anthropic-oauth", "kimi-coding", "minimax", "platform"},
"hermes": {"kimi-coding", "platform"},
"codex": {"openai", "platform"}, // platform openai via the proxy Responses surface
"openclaw": {"kimi-coding", "platform"},
}
func sortedCopy(in []string) []string {
out := append([]string(nil), in...)
sort.Strings(out)
return out
}
// TestProvidersForRuntime_ExactNativeSet asserts ProvidersForRuntime
// returns EXACTLY the native provider set for each runtime — no more
// (over-offer drift), no fewer (under-route). Exact set equality, not
// substring/superset, per feedback_assert_exact_not_substring.
func TestProvidersForRuntime_ExactNativeSet(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
for rt, want := range runtimeNativeProviders {
got, err := m.ProvidersForRuntime(rt)
if err != nil {
t.Fatalf("ProvidersForRuntime(%q) error = %v", rt, err)
}
var gotNames []string
for _, p := range got {
gotNames = append(gotNames, p.Name)
}
gotNames = sortedCopy(gotNames)
wantSorted := sortedCopy(want)
if len(gotNames) != len(wantSorted) {
t.Fatalf("ProvidersForRuntime(%q) = %v, want exactly %v", rt, gotNames, wantSorted)
}
for i := range wantSorted {
if gotNames[i] != wantSorted[i] {
t.Fatalf("ProvidersForRuntime(%q) = %v, want exactly %v", rt, gotNames, wantSorted)
}
}
}
}
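The sort-then-compare loop above is written by hand; on Go 1.21 or newer the same order-insensitive exact comparison can be expressed with the standard `slices` package. The helper name `sameSet` is hypothetical, and the sketch assumes the module's Go version permits `slices`. Note that it compares multisets, so a duplicated name on one side also fails.

```go
package main

import (
	"fmt"
	"slices"
)

// sameSet reports whether got and want hold exactly the same strings,
// ignoring order. Inputs are cloned so the callers' slices keep their order.
func sameSet(got, want []string) bool {
	g := slices.Clone(got)
	w := slices.Clone(want)
	slices.Sort(g)
	slices.Sort(w)
	return slices.Equal(g, w)
}

func main() {
	want := []string{"kimi-coding", "platform"}
	fmt.Println(sameSet([]string{"platform", "kimi-coding"}, want))           // same members
	fmt.Println(sameSet([]string{"platform"}, want))                          // under-route
	fmt.Println(sameSet([]string{"platform", "kimi-coding", "openai"}, want)) // over-offer
}
```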
// TestModelsForRuntime_ExactModelIDs is the brief's central assertion:
// ModelsForRuntime returns EXACTLY the native model-id set for each
// runtime. Encodes the model IDs extracted from each template config.yaml.
func TestModelsForRuntime_ExactModelIDs(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
cases := map[string][]string{
// claude-code: anthropic (oauth aliases + versioned API ids +
// platform-namespaced) + kimi (kimi-coding gateway + platform) +
// minimax (BYOK + platform-namespaced). internal#718 P4 PR-1 added
// the legacy colon-namespaced BYOK spelling (`vendor:model`) as
// first-class registry entries — the live workspace-create corpus
// uses both bare and colon forms across ~44 test files +
// canvas/ConfigTab default + the openclaw template (precedent).
"claude-code": {
// anthropic OAuth aliases (bare + legacy colon-namespaced)
"sonnet", "opus", "haiku",
"anthropic:sonnet", "anthropic:opus", "anthropic:haiku",
// anthropic API versioned (bare + legacy colon-namespaced BYOK)
"claude-sonnet-4-6", "claude-opus-4-7", "claude-haiku-4-5", "claude-sonnet-4-5",
"anthropic:claude-sonnet-4-6", "anthropic:claude-opus-4-7",
"anthropic:claude-haiku-4-5", "anthropic:claude-sonnet-4-5",
// anthropic via platform proxy (namespaced)
"anthropic/claude-opus-4-7", "anthropic/claude-sonnet-4-6",
// kimi (kimi-coding gateway, bare + legacy colon-namespaced BYOK)
"kimi-for-coding", "kimi-k2.5", "kimi-k2",
"moonshot:kimi-k2.6", "moonshot:kimi-k2.5",
// kimi via platform proxy
"moonshot/kimi-k2.6", "moonshot/kimi-k2.5",
// minimax BYOK (bare + legacy colon-namespaced)
"MiniMax-M2", "MiniMax-M2.7", "MiniMax-M2.7-highspeed",
"minimax:MiniMax-M2", "minimax:MiniMax-M2.7", "minimax:MiniMax-M2.7-highspeed",
// minimax via platform proxy
"minimax/MiniMax-M2.7", "minimax/MiniMax-M2.7-highspeed",
},
// hermes: kimi (BYOK gateway) + platform-managed kimi.
"hermes": {
"kimi-coding/kimi-k2",
"moonshot/kimi-k2.6", "moonshot/kimi-k2.5",
},
// codex: openai BYOK + platform-managed openai (served via the proxy
// Responses surface; codex CLI 0.130+ is Responses-API-only).
"codex": {
"gpt-5.5", "gpt-5.4", "gpt-5.4-mini",
"gpt-5.3-codex", "gpt-5.3-codex-spark", "gpt-5.2",
"openai/gpt-5.4", "openai/gpt-5.4-mini",
},
// openclaw: kimi BYOK (moonshot: prefix -> KIMI_API_KEY ->
// api.kimi.com/coding gateway) + platform-managed kimi (moonshot/).
"openclaw": {
"moonshot:kimi-k2.6", "moonshot:kimi-k2.5",
"moonshot/kimi-k2.6", "moonshot/kimi-k2.5",
},
}
for rt, want := range cases {
got, err := m.ModelsForRuntime(rt)
if err != nil {
t.Fatalf("ModelsForRuntime(%q) error = %v", rt, err)
}
gotSorted := sortedCopy(got)
wantSorted := sortedCopy(want)
if len(gotSorted) != len(wantSorted) {
t.Fatalf("ModelsForRuntime(%q) returned %d ids %v, want %d %v",
rt, len(gotSorted), gotSorted, len(wantSorted), wantSorted)
}
for i := range wantSorted {
if gotSorted[i] != wantSorted[i] {
t.Fatalf("ModelsForRuntime(%q) = %v, want exactly %v", rt, gotSorted, wantSorted)
}
}
}
}
// TestModelsForRuntime_UnknownRuntime: an unknown runtime returns an error
// (and an empty slice). Fail-direction proof — a runtime not in the matrix
// must not silently return the whole catalog.
func TestModelsForRuntime_UnknownRuntime(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
got, err := m.ModelsForRuntime("does-not-exist")
if err == nil {
t.Errorf("ModelsForRuntime(unknown) expected error, got nil (returned %v)", got)
}
if len(got) != 0 {
t.Errorf("ModelsForRuntime(unknown) expected empty slice, got %v", got)
}
}
// TestProvidersForRuntime_UnknownRuntime: same fail-closed contract for the
// provider-level accessor.
func TestProvidersForRuntime_UnknownRuntime(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
got, err := m.ProvidersForRuntime("does-not-exist")
if err == nil {
t.Errorf("ProvidersForRuntime(unknown) expected error, got nil (returned %v)", got)
}
if len(got) != 0 {
t.Errorf("ProvidersForRuntime(unknown) expected empty slice, got %v", got)
}
}
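The fail-closed contract asserted by the two UnknownRuntime tests can be sketched as follows. The type `nativeModels` and the variable `demoMatrix` are hypothetical; the only point is that an unknown key yields an error and a nil slice rather than falling back to the whole catalog.

```go
package main

import "fmt"

// nativeModels is a hypothetical runtime -> model-id matrix.
type nativeModels map[string][]string

// demoMatrix holds a tiny illustrative subset of ids.
var demoMatrix = nativeModels{
	"codex": {"gpt-5.5", "gpt-5.4"},
}

// ModelsForRuntime returns a copy of the runtime's ids, or an error and a
// nil slice when the runtime is not in the matrix.
func (m nativeModels) ModelsForRuntime(rt string) ([]string, error) {
	ids, ok := m[rt]
	if !ok {
		return nil, fmt.Errorf("unknown runtime %q", rt)
	}
	return append([]string(nil), ids...), nil
}

func main() {
	ids, err := demoMatrix.ModelsForRuntime("codex")
	fmt.Println(ids, err)
	ids, err = demoMatrix.ModelsForRuntime("does-not-exist")
	fmt.Println(len(ids), err)
}
```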
// TestNonNativeModelAbsentFromEveryRuntime is the drift-prune proof: a model
// that no runtime natively supports must NOT be returned by ModelsForRuntime
// for ANY runtime. These ids are template-declared drift the RFC prunes:
// - gemini-2.5-pro (canvas/hermes-only, no native CTO matrix entry)
// - GLM-4.6 (zai; claude-code template declares it but it's outside the
// anthropic/kimi/minimax native set)
// - deepseek-v4-pro (claude-code template declares it; outside native set)
// - mimo-v2.5-pro (xiaomi; claude-code template declares it; outside set)
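// - openai:gpt-4o (openclaw template declares it; outside the kimi-only set)
// - qwen3-max, nousresearch/hermes-4-70b (listed in driftModels below; no
// runtime's native set includes them)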
func TestNonNativeModelAbsentFromEveryRuntime(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
driftModels := []string{
"gemini-2.5-pro",
"GLM-4.6",
"deepseek-v4-pro",
"mimo-v2.5-pro",
"openai:gpt-4o",
"qwen3-max",
"nousresearch/hermes-4-70b",
}
for rt := range runtimeNativeProviders {
got, err := m.ModelsForRuntime(rt)
if err != nil {
t.Fatalf("ModelsForRuntime(%q) error = %v", rt, err)
}
present := make(map[string]bool, len(got))
for _, id := range got {
present[id] = true
}
for _, drift := range driftModels {
if present[drift] {
t.Errorf("runtime %q must NOT offer non-native drift model %q, but it did", rt, drift)
}
}
}
}
// minimalValidManifest is a tiny well-formed manifest used as the base for
// the fail-direction tests below. Each negative test mutates one field and
// asserts parseManifest rejects it — proving the load-time guards are
// load-bearing, not vacuously satisfied by the embedded baseline.
const minimalValidManifest = `
schema_version: 1
providers:
- name: openai
display_name: "OpenAI"
protocol: openai
auth_mode: anthropic_api
auth_env: [OPENAI_API_KEY]
model_prefix_match: "^gpt-"
runtimes:
codex:
providers:
- name: openai
models: [gpt-5.5]
`
// TestParseManifest_ValidBaseline proves the minimal manifest parses, so the
// negative tests below isolate exactly the field they each mutate.
func TestParseManifest_ValidBaseline(t *testing.T) {
m, err := parseManifest([]byte(minimalValidManifest))
if err != nil {
t.Fatalf("parseManifest(valid) error = %v", err)
}
models, err := m.ModelsForRuntime("codex")
if err != nil || len(models) != 1 || models[0] != "gpt-5.5" {
t.Fatalf("ModelsForRuntime(codex) = %v, err = %v; want [gpt-5.5]", models, err)
}
}
// TestParseManifest_FailDirection is the load-bearing-guard proof: each case
// breaks the manifest in one way and asserts the matching error fires. If a
// future edit removes a guard, the corresponding case flips red.
func TestParseManifest_FailDirection(t *testing.T) {
cases := []struct {
name string
yaml string
wantErr string
}{
{
name: "unknown provider ref",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
runtimes:
codex:
providers:
- {name: typo-provider, models: [gpt-5.5]}
`,
wantErr: "unknown provider",
},
{
name: "empty native set",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
runtimes:
codex:
providers: []
`,
wantErr: "empty native provider set",
},
{
name: "provider ref with no models",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
runtimes:
codex:
providers:
- {name: openai, models: []}
`,
wantErr: "no model ids",
},
{
name: "duplicate provider ref",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
runtimes:
codex:
providers:
- {name: openai, models: [gpt-5.5]}
- {name: openai, models: [gpt-5.4]}
`,
wantErr: "twice",
},
{
name: "no runtimes block",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
`,
wantErr: "no runtimes",
},
{
name: "wrong schema version",
yaml: `
schema_version: 99
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
runtimes:
codex:
providers:
- {name: openai, models: [gpt-5.5]}
`,
wantErr: "schema_version",
},
{
name: "malformed yaml",
yaml: "schema_version: 1\nproviders: [oops: not-a-list",
wantErr: "parse manifest",
},
{
name: "no providers",
yaml: `
schema_version: 1
providers: []
runtimes:
codex:
providers:
- {name: openai, models: [gpt-5.5]}
`,
wantErr: "no providers",
},
{
name: "duplicate provider name",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
- {name: openai, display_name: "OpenAI dup", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
runtimes:
codex:
providers:
- {name: openai, models: [gpt-5.5]}
`,
wantErr: "duplicate provider name",
},
{
name: "uncompilable model_prefix_match",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "([unterminated"}
runtimes:
codex:
providers:
- {name: openai, models: [gpt-5.5]}
`,
wantErr: "model_prefix_match",
},
{
name: "missing required field (protocol)",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
runtimes:
codex:
providers:
- {name: openai, models: [gpt-5.5]}
`,
wantErr: "protocol must be",
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
_, err := parseManifest([]byte(tc.yaml))
if err == nil {
t.Fatalf("parseManifest(%s) expected error containing %q, got nil", tc.name, tc.wantErr)
}
if !strings.Contains(err.Error(), tc.wantErr) {
t.Fatalf("parseManifest(%s) error = %q, want substring %q", tc.name, err.Error(), tc.wantErr)
}
})
}
}
// TestRuntimes_AllProviderRefsResolve guards manifest integrity: every
// provider name referenced in a runtime's native set must resolve to a real
// provider entry. A typo'd provider ref must fail Load, not silently drop a
// model. (Load-time validation; this asserts the loaded manifest is clean.)
func TestRuntimes_AllProviderRefsResolve(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
known := make(map[string]bool, len(m.Providers))
for _, p := range m.Providers {
known[p.Name] = true
}
if len(m.Runtimes) == 0 {
t.Fatal("manifest declares no runtimes")
}
for rt, native := range m.Runtimes {
for _, ref := range native.Providers {
if !known[ref.Name] {
t.Errorf("runtime %q references unknown provider %q", rt, ref.Name)
}
}
}
}
@@ -0,0 +1,45 @@
package providers
import (
"crypto/sha256"
"encoding/hex"
"testing"
)
// sync_canonical_test.go — hermetic half of the canonical↔synced-copy drift
// gate (internal#718 P2-A).
//
// molecule-core's providers.yaml is a SYNCED COPY of the canonical SSOT in
// molecule-controlplane internal/providers/providers.yaml. The live cross-repo
// byte-compare lives in the sync-providers-yaml CI workflow (it fetches the
// canonical from CP and diffs). This test is the HERMETIC backstop: it pins the
// sha256 of the embedded synced copy to the value the canonical produced at sync
// time, so a HAND-EDIT of core's copy (or a partial sync) flips red locally and
// in `go test ./...` even when CI cannot reach controlplane.
//
// When the canonical legitimately changes, the sync procedure is:
// 1. Copy controlplane internal/providers/providers.yaml verbatim over this
// copy.
// 2. `go generate ./...` to regenerate the artifact (verify-providers-gen).
// 3. Update canonicalProvidersYAMLSHA256 below to the new sha (the failure
// message prints the observed sha to paste in).
// The deliberate constant bump is the human checkpoint that a registry change
// was consciously re-synced into core, not silently forked.
// canonicalProvidersYAMLSHA256 is the sha256 of the canonical providers.yaml as
// synced from molecule-controlplane. Bumped deliberately on each re-sync (see
// file doc). Cross-checked live by the sync-providers-yaml CI workflow.
const canonicalProvidersYAMLSHA256 = "73e8003062edaa4ce75bfb324be615b6e2b380f07487e3af4dc16cb644dc12bc"
func TestSyncedYAMLMatchesCanonicalSHA(t *testing.T) {
sum := sha256.Sum256(embeddedYAML)
got := hex.EncodeToString(sum[:])
if got != canonicalProvidersYAMLSHA256 {
t.Fatalf("embedded providers.yaml sha256 = %s, pinned canonical = %s\n"+
"If you intentionally re-synced the canonical from molecule-controlplane, "+
"update canonicalProvidersYAMLSHA256 to %s and regenerate (`go generate ./...`).\n"+
"If you did NOT mean to edit core's copy, revert it — the canonical SSOT is "+
"molecule-controlplane internal/providers/providers.yaml, not this synced copy.",
got, canonicalProvidersYAMLSHA256, got)
}
}
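The pinning technique used by TestSyncedYAMLMatchesCanonicalSHA is easy to reproduce outside the test. The sketch below uses hypothetical helper names (`hexSHA256`, `checkPin`) and sample bytes in place of the embedded YAML.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hexSHA256 returns the lowercase hex sha256 digest of b.
func hexSHA256(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// checkPin returns an error that carries the observed digest when content
// does not hash to the pinned value, so the new pin can be pasted in.
func checkPin(content []byte, pinned string) error {
	if got := hexSHA256(content); got != pinned {
		return fmt.Errorf("sha256 = %s, pinned = %s", got, pinned)
	}
	return nil
}

func main() {
	content := []byte("schema_version: 1\n") // sample bytes, not the real file
	pin := hexSHA256(content)                // what a re-sync would record
	fmt.Println(checkPin(content, pin))
	fmt.Println(checkPin(append(content, '#'), pin) != nil) // a hand-edit trips the pin
}
```

Including the observed digest in the error is the design choice that makes the deliberate constant bump a copy-and-paste step rather than a separate hashing step.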
@@ -20,6 +20,7 @@ const defaultRegistryPrefix = "ghcr.io/molecule-ai"
var knownRuntimes = []string{
"claude-code",
"codex",
"google-adk",
"hermes",
"openclaw",
}
@@ -53,8 +53,8 @@ func TestRuntimeImage_AllKnownRuntimes(t *testing.T) {
}
}
// Pin the count so adding a runtime requires explicit test acknowledgement.
if len(knownRuntimes) != 4 {
t.Errorf("knownRuntimes length = %d, want 4 (claude-code, codex, hermes, openclaw)", len(knownRuntimes))
if len(knownRuntimes) != 5 {
t.Errorf("knownRuntimes length = %d, want 5 (claude-code, codex, google-adk, hermes, openclaw)", len(knownRuntimes))
}
}
@@ -444,8 +444,15 @@ func Setup(hub *ws.Hub, broadcaster *events.Broadcaster, prov *provisioner.Provi
wsAuth.DELETE("/secrets/:key", sech.Delete)
wsAuth.GET("/model", sech.GetModel)
wsAuth.PUT("/model", sech.SetModel)
wsAuth.GET("/provider", sech.GetProvider)
wsAuth.PUT("/provider", sech.SetProvider)
// internal#718 P4 closure: /provider endpoint is retired —
// the LLM_PROVIDER workspace_secret no longer exists and the
// provider is derived from (runtime, model) via the registry
// at every decision point. handlers.ProviderEndpointGone returns 410
// with a structured body so older canvases that still call
// PUT /provider on Save surface a loud failure rather than
// silently writing into a vanished row.
wsAuth.GET("/provider", handlers.ProviderEndpointGone)
wsAuth.PUT("/provider", handlers.ProviderEndpointGone)
// Token usage metrics — cost transparency (#593).
// WorkspaceAuth middleware (on wsAuth) binds the bearer to :id.
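The diff does not show `handlers.ProviderEndpointGone` itself. As a rough illustration of a retired route that answers 410 with a structured body, here is a standard-library sketch; the handler name, the JSON keys, and the messages are all assumptions, and the real handler is wired into the project's router rather than `net/http`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// providerEndpointGone is a hypothetical stdlib analogue of a retired
// route: every method gets 410 Gone plus a machine-readable explanation.
func providerEndpointGone(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusGone)
	_ = json.NewEncoder(w).Encode(map[string]string{
		"error": "provider_endpoint_retired", // assumed key and value
		"hint":  "provider is derived from (runtime, model); set the model instead",
	})
}

// statusFor drives the handler in-process and returns the status code.
func statusFor(method string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(method, "/provider", nil)
	providerEndpointGone(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(http.MethodGet), statusFor(http.MethodPut))
}
```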
@@ -0,0 +1,9 @@
-- Reverse of 20260528000000: a no-op.
--
-- The LLM_PROVIDER rows were retired with no remaining consumer.
-- Rolling back the migration cannot reconstitute the rows (they were
-- deleted, not soft-deleted) AND there is no live code path that
-- writes them anymore — SetProvider / setProviderSecret / Create's
-- write are all removed. A genuine revert needs an application-code
-- revert, not just a migration.
SELECT 1;
@@ -0,0 +1,22 @@
-- internal#718 P4 closure — drop any straggler LLM_PROVIDER rows.
--
-- The LLM_PROVIDER workspace_secret is retired. The provider is now DERIVED
-- at every decision point from (runtime, model) via the registry
-- (internal/providers.Manifest.DeriveProvider). No consumer reads the row
-- anymore:
--
-- - core handlers GetProvider / SetProvider — removed (route returns 410)
-- - core handlers WorkspaceHandler.Create setProviderSecret write — removed
-- - core handlers deriveProviderFromModelSlug — removed
-- - core loadWorkspaceSecrets — still hydrates the env map (a defence-
-- in-depth filter in handlers/workspace_provision.go drops the key
-- before envVars is passed to the CP provisioner, so existing rows
-- are inert until this migration removes them on the next
-- deploy)
-- - CP provisioner resolveModelAndProvider — replaced with a registry
-- derivation; env["LLM_PROVIDER"] is no longer read
--
-- This migration removes any straggler rows so the table is in the same
-- state as a freshly-provisioned tenant. Idempotent: a fresh tenant
-- with zero LLM_PROVIDER rows produces a 0-row delete.
DELETE FROM workspace_secrets WHERE key = 'LLM_PROVIDER';
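The migration comment says the provider is now derived from (runtime, model) through the registry. The actual `DeriveProvider` is not part of this diff, so the following is only a sketch of the idea under assumed types: `providerRef`, `demoRuntimes`, and `deriveProvider` are hypothetical, and the table holds a small illustrative subset of ids.

```go
package main

import "fmt"

// providerRef is a hypothetical per-runtime entry: a provider name plus the
// model ids that provider serves for the runtime.
type providerRef struct {
	Name   string
	Models []string
}

// demoRuntimes is an illustrative subset, not the real manifest.
var demoRuntimes = map[string][]providerRef{
	"codex": {
		{Name: "openai", Models: []string{"gpt-5.5", "gpt-5.4"}},
		{Name: "platform", Models: []string{"openai/gpt-5.4"}},
	},
}

// deriveProvider finds the provider that lists model for runtime. Unknown
// runtimes and non-native models both return an error instead of a guess.
func deriveProvider(runtimes map[string][]providerRef, runtime, model string) (string, error) {
	refs, ok := runtimes[runtime]
	if !ok {
		return "", fmt.Errorf("unknown runtime %q", runtime)
	}
	for _, ref := range refs {
		for _, id := range ref.Models {
			if id == model {
				return ref.Name, nil
			}
		}
	}
	return "", fmt.Errorf("model %q is not native to runtime %q", model, runtime)
}

func main() {
	fmt.Println(deriveProvider(demoRuntimes, "codex", "gpt-5.5"))
	fmt.Println(deriveProvider(demoRuntimes, "codex", "openai/gpt-5.4"))
	fmt.Println(deriveProvider(demoRuntimes, "codex", "gemini-2.5-pro"))
}
```

Because the lookup needs nothing but the runtime and the model, a stored LLM_PROVIDER value has no remaining role, which is why the straggler rows can be deleted outright.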