Compare commits

..

17 Commits

Author SHA1 Message Date
Molecule AI Dev Engineer A (Kimi) 18ad66051c defensive: nil-map init, marshal-error returns, NullTime guards, a2a timeout comment
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 13s
CI / Detect changes (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 40s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m18s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 46s
gate-check-v3 / gate-check (pull_request) Successful in 12s
qa-review / approved (pull_request) Failing after 10s
security-review / approved (pull_request) Failing after 8s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 6s
sop-tier-check / tier-check (pull_request) Successful in 5s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 4m25s
audit-force-merge / audit (pull_request) Waiting to run
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 18s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m45s
Harness Replays / Harness Replays (pull_request) Successful in 9s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m8s
CI / Platform (Go) (pull_request) Successful in 6m6s
CI / Canvas Deploy Reminder (pull_request) Has been cancelled
CI / all-required (pull_request) Successful in 13m21s
Residue from mol#1933 slim-down — only changes not already in main:

- channels/manager.go, handlers/channels.go: initialise nil maps/slices
  after json.Unmarshal errors so downstream code doesn't panic on nil
  dereference.
- handlers/channels.go: guard sql.NullTime .Time access with .Valid check
  (matches existing lastMsg.Valid pattern).
- handlers/delegation.go, handlers/restart_signals.go: add missing return
  after json.Marshal error logs so we don't continue with zero/invalid
  payload.
- handlers/a2a_proxy.go: document why a2aClient has no Client.Timeout
  (per-request ctx deadlines govern lifetime; fixed timeout would break
  legitimate slow cold-start flows).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 02:35:38 +00:00
hongming 1e783ff6a2 Merge pull request 'provider-SSOT P2-B -> main: billing derives from provider (re-target #1971)' (#1972) from feat/internal-718-p2a-registry-codegen-distribution into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Python Lint & Test (push) Successful in 11s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
CI / Detect changes (push) Successful in 14s
E2E API Smoke Test / detect-changes (push) Successful in 13s
E2E Chat / detect-changes (push) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 40s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 1m13s
Handlers Postgres Integration / detect-changes (push) Successful in 6s
Harness Replays / detect-changes (push) Successful in 2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
verify-providers-gen / Regenerate providers artifact and fail on drift (push) Successful in 39s
publish-workspace-server-image / build-and-push (push) Successful in 4m33s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 4m34s
CI / Canvas (Next.js) (push) Successful in 2s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m20s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Successful in 5m57s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m47s
CI / Platform (Go) (push) Successful in 5m19s
Harness Replays / Harness Replays (push) Successful in 15s
CI / Canvas Deploy Reminder (push) Successful in 2s
CI / all-required (push) Successful in 8m8s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m4s
E2E Chat / E2E Chat (push) Successful in 3m49s
publish-workspace-server-image / Production auto-deploy (push) Successful in 5m43s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 11s
ci-required-drift / drift (push) Successful in 1m15s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 16s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 12s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m14s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m24s
2026-05-28 02:09:09 +00:00
hongming 924dfa5598 test(workspace-server): remove unused wantWhy field in model_registry_validation_test (golangci-lint unused) — internal#718 P2-B
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 6s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 7s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 38s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 50s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 40s
gate-check-v3 / gate-check (pull_request) Successful in 3s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m24s
E2E Chat / E2E Chat (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m33s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 6m37s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m48s
CI / Platform (Go) (pull_request) Successful in 5m9s
CI / all-required (pull_request) Successful in 9m1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 7s
2026-05-28 01:39:27 +00:00
hongming 3ab690c273 Merge pull request 'P2-B internal#718: billing/credential derives from provider + only-registered validation (BEHAVIOR-AFFECTING; supersedes #1966)' (#1971) from feat/internal-718-p2b-billing-derives-from-provider into feat/internal-718-p2a-registry-codegen-distribution
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 9s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 13s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 31s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 4s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 50s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 1m1s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m10s
gate-check-v3 / gate-check (pull_request) Successful in 9s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Failing after 2m5s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
CI / all-required (pull_request) Failing after 4m41s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m51s
Harness Replays / Harness Replays (pull_request) Successful in 3s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 4m50s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m12s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m32s
2026-05-28 01:22:20 +00:00
hongming 866a71777f Merge pull request 'P2-A internal#718: bring provider registry to molecule-core via codegen + verify-CI (NO behavior change)' (#1970) from feat/internal-718-p2a-registry-codegen-distribution into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 13s
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Python Lint & Test (push) Successful in 6s
CI / Detect changes (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 10s
E2E Chat / detect-changes (push) Successful in 18s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 18s
Harness Replays / detect-changes (push) Successful in 16s
Handlers Postgres Integration / detect-changes (push) Successful in 16s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 12s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 11s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 15s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 14s
sync-providers-yaml / Compare synced providers.yaml against controlplane canonical (push) Successful in 7s
verify-providers-gen / Regenerate providers artifact and fail on drift (push) Successful in 32s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Failing after 1m15s
CI / Canvas (Next.js) (push) Successful in 12s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m28s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 13s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m12s
Harness Replays / Harness Replays (push) Successful in 6s
CI / Canvas Deploy Reminder (push) Successful in 3s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 16s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m41s
publish-workspace-server-image / build-and-push (push) Successful in 5m44s
E2E Chat / E2E Chat (push) Successful in 4m36s
ci-required-drift / drift (push) Successful in 1m6s
CI / Platform (Go) (push) Successful in 6m32s
CI / all-required (push) Successful in 8m48s
publish-workspace-server-image / Production auto-deploy (push) Successful in 5m12s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 6s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 7s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m5s
main-red-watchdog / watchdog (push) Successful in 2m2s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m22s
gate-check-v3 / gate-check (push) Successful in 27s
2026-05-28 01:10:25 +00:00
hongming-personal 11b0646b37 fix(ci): sync-providers-yaml gate fetch canonical via /raw not /contents
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 1m3s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m18s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m21s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m17s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
sync-providers-yaml / Compare synced providers.yaml against controlplane canonical (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Failing after 10s
qa-review / approved (pull_request) Failing after 9s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 13s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / all-items-acked (pull_request) Successful in 14s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 31s
sop-tier-check / tier-check (pull_request) Successful in 7s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m22s
CI / Canvas (Next.js) (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
E2E Chat / E2E Chat (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 14s
Harness Replays / Harness Replays (pull_request) Successful in 11s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m20s
CI / Platform (Go) (pull_request) Successful in 5m10s
CI / all-required (pull_request) Successful in 8m18s
audit-force-merge / audit (pull_request) Successful in 7s
The cross-repo drift gate fetched controlplane providers.yaml from the
Gitea /contents endpoint with Accept: application/vnd.gitea.raw. On this
Gitea (1.22.6) that header is NOT honored on /contents -- it returns the
JSON+base64 envelope ({"name":"providers.yaml","content":"<base64>"...},
~45.6 KB), not raw bytes. So diff -u compared JSON-vs-YAML and exited 1
(RED) on every run even when byte-identical, making the gate inert
(detected neither sync nor real drift).

Switch the fetch to the /raw endpoint, which returns the file bytes
directly (33319 B, sha256 48a66921...), byte-identical to core's synced
copy. diff now exits 0 on the in-sync state and goes RED on real drift.
Authorization: token header kept; soft-fail backstop and the hermetic
sha-pin in sync_canonical_test.go are untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 00:55:08 +00:00
core-devops 3165b98cc8 fix(workspace-server): P2-B internal#718 — billing/credential decision DERIVES the provider; supersede #1966 stored-read; retire org rung; only-registered validation (BEHAVIOR-AFFECTING)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
gate-check-v3 / gate-check (pull_request) Successful in 11s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 43s
qa-review / approved (pull_request) Successful in 10s
security-review / approved (pull_request) Successful in 8s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
audit-force-merge / audit (pull_request) Successful in 4s
Re-points the platform-vs-BYOK billing/credential decision to DERIVE the provider
from (runtime, model) via the registry SSOT, per the CTO directive (internal#718
comment, 2026-05-27): "the billing read must DERIVE the provider, not read a
stored LLM_PROVIDER", "remove LLM_PROVIDER entirely as a billing source", "retire
organizations.llm_billing_mode as a billing source".

## BEHAVIOR DELTA (this PR changes behavior — tested explicitly)
- platform-derived (or unset → platform default) → platform_managed → platform
  creds. UNCHANGED.
- non-platform-derived → byok → the already-merged #1963 strips platform
  scope:global LLM creds + FAIL-CLOSES if the workspace has no own cred. THIS IS
  THE INTENDED FIX (the Reno billing-leak class: Reno Stars SEO 352e3c2b /
  Marketing 6b66de8d ran on the platform's Anthropic credits because the never-
  written org rung always resolved platform_managed).
- unset model → platform default (CTO-confirmed).

## What changed
- `ResolveLLMBillingModeDerived(ctx, ws, runtime, model, authEnv)` — the new SSOT
  resolver: explicit `workspaces.llm_billing_mode` override (precedence 1, the
  only stored billing signal that survives — operator escape hatch) → else
  DeriveProvider + IsPlatform → else default-closed platform_managed.
- `ResolveLLMBillingMode(ctx, ws, orgMode)` legacy signature retained for callers
  without (runtime, model) (admin route, secrets remote-pull): reads the stored
  runtime + MODEL + auth-env names from DB and delegates to the derived resolver.
  `orgMode` is RETIRED/ignored; the org rung is gone.
- `applyPlatformManagedLLMEnv` calls the derived resolver directly (it has
  runtime + model + the workspace env) — no stored LLM_PROVIDER read. Feeds
  #1963's strip + fail-closed the correct DERIVED signal.
- SUPERSEDES core#1966: that PR made the billing read consult a stored
  LLM_PROVIDER first; this reworks the decision onto derive-from-provider. #1966
  should be closed in favor of this.
- Removed the now-dead org-default normalization (normalizeOrgDefault).
- ONLY-REGISTERED validation at create (model_registry_validation.go +
  WorkspaceHandler.Create): a (runtime, model) not in the registry's
  ModelsForRuntime for a REGISTRY-known runtime is flagged
  (X-Molecule-Model-Unregistered header + warning log). P2 = WARN mode (NOT hard
  422) because the legacy colon-namespaced model vocabulary ("anthropic:claude-
  opus-4-7") is still live across the create/import/template corpus and is not
  yet reconciled into the registry — hard-reject is a one-line flip gated on
  P3/P4 vocabulary convergence. Fails OPEN for non-registry runtimes
  (langgraph/external/kimi/mock/federated) so those flows are unchanged.

## Tests (TDD; behavior delta explicit)
- llm_billing_mode_derived_test.go — platform/non-platform/unset/override/
  unregistered/auth-env-disambiguation table + DB-error default-closed + empty-id.
- workspace_provision_shared_test.go — DERIVED platform→unchanged,
  non-platform→byok+strip+fail-closed (the FIX), unset→platform default, through
  the real applyPlatformManagedLLMEnv path. Existing #1963 override-byok strip +
  fail-closed tests unchanged (still pass).
- model_registry_validation_test.go + workspace_test.go — only-registered warn +
  registered-no-warn + non-registry-fail-open.
- Reworked the legacy resolver/admin/secrets tests off the retired org rung.

## Build/CI
go build ./... (+ -tags=integration) green; full `go test ./...` (43 pkgs) green
incl. -race on handlers; vet clean; changed files gofmt-clean. cp#362 anthropic
passthrough untouched (CP repo); merged #1963 strip+fail-closed reused unchanged.

internal#718 P2-B. BEHAVIOR-AFFECTING. Supersedes #1966. Not merged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:39:26 -07:00
core-devops 71c68e44f2 feat(providers): P2-A internal#718 — bring the provider registry to molecule-core via codegen + verify-CI (additive, zero behavior change)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 7s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 6s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 10s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m10s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m21s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
sync-providers-yaml / Compare synced providers.yaml against controlplane canonical (pull_request) Failing after 5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m36s
gate-check-v3 / gate-check (pull_request) Successful in 12s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 38s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 7s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 7s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 14s
Harness Replays / Harness Replays (pull_request) Successful in 7s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m34s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m34s
CI / Platform (Go) (pull_request) Successful in 5m44s
CI / all-required (pull_request) Successful in 8m39s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, l
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Distributes the provider-registry SSOT into molecule-core per the CTO-decided
shape (internal#718 comment, 2026-05-27): "Distribution = SDK via codegen +
verify-CI", multi-repo branch "codegen-checked-into-each-repo + verify-CI".

molecule-core has no Go module dependency on molecule-controlplane, so this
lands a SYNCED COPY of the canonical providers.yaml plus the loader,
DeriveProvider/IsPlatform/ResolveUpstream, the generated Go projection
(cmd/gen-providers), and the drift gates — a byte-faithful mirror of the
controlplane P0/P1 machinery. Canonical SSOT stays in controlplane
internal/providers/providers.yaml.

ZERO behavior change (additive, like P0): NO production code path imports the
new package yet. P2-B wires the billing/credential decision onto the loader.

What lands:
- internal/providers/{providers.go,derive_provider.go,providers.yaml} — mirror
  of the controlplane loader + canonical YAML (synced copy).
- internal/providers/gen/registry_gen.go — generated projection; fingerprint
  faffcbe59bb9f38c matches controlplane.
- cmd/gen-providers — the generator (go generate + -check drift mode).
- .gitea/workflows/verify-providers-gen.yml — artifact ↔ synced-copy drift gate
  (mirror of the controlplane workflow; standalone, not in branch protection
  yet — same soak-then-promote posture).
- .gitea/workflows/sync-providers-yaml.yml — NEW cross-repo gate: fetches the
  controlplane canonical providers.yaml and byte-compares against core's synced
  copy (RED on canonical drift). Read-only AUTO_SYNC_TOKEN; degrades to a
  warning if the token is absent.
- internal/providers/sync_canonical_test.go — hermetic sha pin of the synced
  copy (the always-on backstop; catches a hand-edit even with no network).
- internal/providers/gen_import_boundary_test.go — arch-lint-equivalent AST gate
  (core has no go-arch-lint): no production package may import the raw gen
  projection. Proven load-bearing.

Build/test: go build ./... (+ -tags=integration) green; providers/gen/
gen-providers suites pass (incl. -race); gen -check in sync; gofmt + vet clean.

internal#718 P2-A. NO behavior change. Not merged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:10:12 -07:00
hongming 7cfec2d61f Merge pull request 'fix(workspace-server): provider-aware gate on platform scope:global LLM creds (internal#711)' (#1963) from fix/byok-global-llm-cred-leak-internal-711 into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
Block internal-flavored paths / Block forbidden paths (push) Successful in 11s
CI / Python Lint & Test (push) Successful in 5s
CI / Detect changes (push) Successful in 14s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Chat / detect-changes (push) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
Harness Replays / detect-changes (push) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 7s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 34s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
CI / Canvas (Next.js) (push) Successful in 3s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m49s
publish-workspace-server-image / build-and-push (push) Successful in 3m13s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 13s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m50s
Harness Replays / Harness Replays (push) Successful in 4s
CI / Canvas Deploy Reminder (push) Successful in 2s
E2E Chat / E2E Chat (push) Successful in 3m54s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 5m55s
CI / Platform (Go) (push) Successful in 5m54s
CI / all-required (push) Successful in 7m12s
publish-workspace-server-image / Production auto-deploy (push) Successful in 6m21s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 9s
ci-required-drift / drift (push) Successful in 1m14s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 6s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 12s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 8s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m44s
main-red-watchdog / watchdog (push) Successful in 27s
gate-check-v3 / gate-check (push) Successful in 23s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m57s
2026-05-27 20:18:18 +00:00
agent-platform-engineer 585b3d6ed0 fix(workspace-server): provider-aware gate on platform scope:global LLM creds (internal#711)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 14s
E2E API Smoke Test / detect-changes (pull_request) Successful in 20s
E2E Chat / detect-changes (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 17s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 15s
Harness Replays / detect-changes (pull_request) Successful in 13s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 9s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 18s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 57s
gate-check-v3 / gate-check (pull_request) Successful in 7s
qa-review / approved (pull_request) Failing after 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 7s
sop-checklist / all-items-acked (pull_request) Successful in 7s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Successful in 19s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m39s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 7m40s
Harness Replays / Harness Replays (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m7s
CI / Platform (Go) (pull_request) Successful in 5m52s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 15m9s
audit-force-merge / audit (pull_request) Successful in 9s
A workspace whose resolved LLM billing mode is NOT platform_managed
(byok / subscription) was still being injected with the platform's
scope:global CLAUDE_CODE_OAUTH_TOKEN and ran on the platform's Anthropic
credits. Confirmed live 2026-05-27 on the Reno Stars tenant: the SEO
(352e3c2b-...) and Marketing (6b66de8d-...) claude-code agents had no
workspace-scoped LLM credential, yet ran MODEL=opus directly on
api.anthropic.com using the platform's global OAuth token.

Root cause: loadWorkspaceSecrets merges ALL global_secrets into every
workspace's env provenance-blind. applyPlatformManagedLLMEnv's
non-platform (byok/disabled) path then early-returned WITHOUT stripping
those inherited platform globals — so a workspace with no LLM credential
of its own kept the platform's scope:global CLAUDE_CODE_OAUTH_TOKEN.
The same leak existed on the remote-pull path (GET
/workspaces/:id/secrets/values), which also merged globals unconditionally.

Fix (provider-aware, both injection vectors):
- applyPlatformManagedLLMEnv now takes the global-provenance key set and,
  on the non-platform path, strips every platform-managed LLM bypass key
  (CLAUDE_CODE_OAUTH_TOKEN + the rest) that originated from global_secrets.
  A workspace's OWN LLM cred (a workspace_secrets row — provenance flag
  dropped by loadWorkspaceSecrets) is NOT in the global set and survives.
- secrets.Values applies the same provenance-aware gate before returning
  the merged bundle to a remote agent.
- Fail closed: a byok workspace left with no usable LLM credential aborts
  provision with code MISSING_BYOK_CREDENTIAL instead of starting on the
  (now-stripped) platform creds. Scoped to byok; disabled mode strips but
  still boots (no-LLM workspaces are legitimate).
- platform_managed path is unchanged (it still receives + force-routes the
  platform creds via the CP proxy), and the LLM-proxy anthropic path is
  untouched.

Tests (all green; go build/test ./... + -tags=integration build pass):
- ByokStripsGlobalOriginOAuthToken — platform global token stripped, no cred.
- ByokKeepsWorkspaceOwnOAuthEvenWithGlobal — workspace's own token survives.
- DisabledStripsGlobalButReportsNoCred — disabled strips but does not abort.
- PlatformManagedStillReceivesGlobalCreds — no regression on platform path.
- PrepareProvisionContext_ByokWithOnlyGlobalOAuthFailsClosed — e2e abort.
- SecretsValues_ByokStripsGlobalLLMCred — remote-pull path gated.

Note: open PR #1930 (refactor/drop-org-tier-llm-billing-mode, internal#691
follow-up) changes ResolveLLMBillingMode's signature in the same files.
This change is built on current main and is orthogonal in intent; whichever
merges second needs a mechanical 1-line resolver-call adjustment (drop the
orgMode arg). #1930 does NOT fix this leak.

Refs internal#711

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 12:55:58 -07:00
hongming 9deb8e9ea6 Merge pull request 'fix(security): scope peer discovery + a2a routing to caller org (#1953)' (#1954) from fix/1953-scope-peer-discovery-a2a-to-org into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 17s
Block internal-flavored paths / Block forbidden paths (push) Successful in 9s
CI / Python Lint & Test (push) Successful in 9s
CI / Detect changes (push) Successful in 17s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Chat / detect-changes (push) Successful in 15s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
publish-workspace-server-image / build-and-push (push) Successful in 3m23s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 54s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 38s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Harness Replays / detect-changes (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
CI / Canvas (Next.js) (push) Successful in 5s
CI / Shellcheck (E2E scripts) (push) Successful in 20s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m31s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 5m28s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 3s
CI / Platform (Go) (push) Successful in 4m55s
Harness Replays / Harness Replays (push) Successful in 9s
CI / all-required (push) Successful in 9m15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Successful in 6m44s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m48s
E2E Chat / E2E Chat (push) Successful in 4m13s
publish-workspace-server-image / Production auto-deploy (push) Successful in 8m4s
CI / Canvas Deploy Reminder (push) Successful in 1s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 10s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
main-red-watchdog / watchdog (push) Successful in 44s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 6m45s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m46s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 9s
gate-check-v3 / gate-check (push) Successful in 41s
ci-required-drift / drift (push) Successful in 1m3s
2026-05-27 17:51:46 +00:00
core-be 69391595f3 fix(e2e): delete child before parent in test_api delete/round-trip (#1953)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
CI / Detect changes (pull_request) Successful in 13s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 27s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 6s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 47s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
review-check-tests / review-check.sh regression tests (pull_request) Successful in 9s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 7s
qa-review / approved (pull_request) Failing after 7s
security-review / approved (pull_request) Failing after 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m5s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 20s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m12s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m3s
Harness Replays / Harness Replays (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m4s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 6m38s
CI / Platform (Go) (pull_request) Successful in 5m56s
CI / all-required (pull_request) Successful in 8m1s
audit-force-merge / audit (pull_request) Successful in 10s
The #1953 fixture re-seed made Summarizer a CHILD of Echo (same-org) so
the peer-discovery assertions exercise legit same-org enumeration. But
Test 21 still deleted the PARENT (Echo) first and asserted the other
workspace survives (count=1). CascadeDelete walks the recursive parent_id
CTE, so deleting Echo also removed its child Summarizer -> "List after
delete" saw 0, and Test 22 then hit 410 Gone deleting an already-removed
Summarizer ("got: {error: workspace removed}").

Fix: capture Summarizer's bundle, delete the CHILD (Summarizer) first
(child delete does not cascade upward so Echo survives -> count=1), then
delete the parent Echo in the round-trip block and re-import the captured
bundle. Cross-tenant isolation and the same-org parent/child relationship
are unchanged; only the delete ordering is corrected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:42:44 +00:00
hongming 46606801c6 Merge pull request 'fix(ci): add explicit utf-8 encoding to Python open() calls' (#1920) from fix/python-open-encoding into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 11s
publish-workspace-server-image / build-and-push (push) Successful in 6m8s
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 8s
CI / Python Lint & Test (push) Successful in 4s
E2E Chat / detect-changes (push) Successful in 9s
E2E API Smoke Test / detect-changes (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
CI / all-required (push) Successful in 8m28s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 6s
review-check-tests / review-check.sh regression tests (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m5s
publish-workspace-server-image / Production auto-deploy (push) Successful in 4m52s
main-red-watchdog / watchdog (push) Successful in 57s
CI / Platform (Go) (push) Successful in 6s
CI / Canvas (Next.js) (push) Successful in 5s
CI / Shellcheck (E2E scripts) (push) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 6s
E2E Chat / E2E Chat (push) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6s
gate-check-v3 / gate-check (push) Successful in 1m12s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 15s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m2s
ci-required-drift / drift (push) Successful in 1m10s
CI / Canvas Deploy Reminder (push) Successful in 5s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 6s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m31s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 7m11s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 5s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 17s
2026-05-27 17:01:54 +00:00
hongming cd671e1263 Merge pull request 'fix(memory): upsert namespace before v2 commit' (#1925) from fix/memory-v2-upsert-namespace-20260526 into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Blocked by required conditions
CI / Canvas (Next.js) (push) Blocked by required conditions
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Waiting to run
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Waiting to run
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
publish-workspace-server-image / build-and-push (push) Successful in 3m2s
Harness Replays / detect-changes (push) Successful in 3s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 10s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 20s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m52s
Harness Replays / Harness Replays (push) Successful in 4s
CI / all-required (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
publish-workspace-server-image / Production auto-deploy (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
2026-05-27 16:43:49 +00:00
core-be 51f74e9d8a fix(security): correct org-root CTE so same-org a2a routing works (#1953)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 34s
CI / Python Lint & Test (pull_request) Successful in 20s
CI / Detect changes (pull_request) Successful in 20s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E API Smoke Test / detect-changes (pull_request) Successful in 28s
E2E Chat / detect-changes (pull_request) Successful in 28s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 43s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 8s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 1m2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m30s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 18s
gate-check-v3 / gate-check (pull_request) Successful in 8s
qa-review / approved (pull_request) Successful in 12s
security-review / approved (pull_request) Failing after 10s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 9s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 7m21s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 11s
E2E Chat / E2E Chat (pull_request) Successful in 25s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 12s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m36s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m37s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 5m46s
CI / all-required (pull_request) Successful in 28m17s
The #1953 sameOrg() guard over-blocked legitimate SAME-ORG a2a routing:
orgRootSubtreeCTE carried `id AS root_id` from the recursive SEED, so a
non-root workspace resolved to ITSELF instead of its topmost ancestor.
sameOrg(child, root) therefore compared child-id vs root-id, reported the
pair as DIFFERENT orgs, and 403'd a legitimate same-org delegation. The
cross-org case was unaffected (two distinct roots already resolve to
different ids), so isolation stayed closed — but real same-org delegation
broke. Caught only by the real-Postgres integration suite: the sqlmock
unit tests hand-feed sameOrg() a root_id row and so structurally cannot
exercise the CTE.

Fix: select the parentless chain row's own `id` (aliased root_id) instead
of the seed-carried value. A node that already IS an org root has a
one-row chain and still resolves to itself.

Why the two required checks were red:

- handlers-postgres-integration (real CTE): the executeDelegation
  success-path fixtures seeded source AND target both parent_id=NULL —
  two DISTINCT org roots, i.e. a CROSS-tenant pair that only ever
  "communicated" via the OLD leaky root-sibling behavior #1953 closes.
  Re-seeded target as a CHILD of source (same org). With the same-org
  fixture, the CTE bug surfaced and is now fixed; all 5 ExecuteDelegation
  tests pass (success + failure paths). Added
  TestIntegration_SameOrg_RealCTE_ResolvesAncestorChain as the real-SQL
  regression gate for root→child→grandchild resolution + cross-org denial.

- e2e-api (test_api.sh): created Echo + Summarizer both as org roots and
  asserted they appear in each other's /registry/:id/peers — that
  enumeration WAS the cross-tenant leak (org root seeing another org
  root). Re-created Summarizer as a child of Echo so the peer assertions
  exercise legitimate same-org parent/child enumeration.

Cross-tenant isolation remains closed (all cross-org negative tests pass);
same-org peers + a2a now work. go build ./... + go test ./internal/handlers/...
green; integration suite green.

Refs #1953

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 09:41:44 -07:00
core-be 6211d27bc7 fix(security): scope peer discovery + a2a routing to caller org (#1953)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 13s
Harness Replays / detect-changes (pull_request) Successful in 5s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 40s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 10s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 56s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Successful in 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
security-review / approved (pull_request) Failing after 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 8m6s
CI / Canvas (Next.js) (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
E2E Chat / E2E Chat (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m36s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 2m6s
Harness Replays / Harness Replays (pull_request) Successful in 1s
CI / Platform (Go) (pull_request) Successful in 5m39s
CI / all-required (pull_request) Successful in 6m4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Three workspace-server paths computed an "org-root sibling set" as
`WHERE parent_id IS NULL`, which matches EVERY tenant's org root (the
workspaces table has no org_id column) → cross-tenant data exposure:

  1. GET /registry/:id/peers (discovery.Peers) — returned peer
     id/name/role/url/agent_card across ALL tenants when the caller
     was itself an org root.
  2. MCP toolListPeers (mcp_tools.go) — same cross-tenant peer
     enumeration via the MCP bridge.
  3. a2a routing (a2a_proxy.proxyA2ARequest → resolveAgentURL) —
     CanCommunicate's "root-level siblings, both no parent" rule treats
     every tenant's org root as a sibling, and resolveAgentURL accepts
     ANY workspace id with no org check, so an org root could resolve
     and route a2a to another tenant's org root.

Fix — reuse the OFFSEC-015 broadcast scoping (commit 5a05302c,
workspace_broadcast.go): the org is the parent_id-chain subtree from a
single org root. New org_scope.go centralises that recursive CTE
(orgRootID / sameOrg) so all paths derive "the caller's org" the same way:

  - discovery.Peers + toolListPeers: drop the `parent_id IS NULL`
    sibling branch entirely. An org root has no siblings inside its own
    org; its peers are its children (still enumerated). Only the
    parent_id-bound sibling branch remains, already scoped to one tenant.
  - a2a proxyA2ARequest: after CanCommunicate, add a sameOrg() guard that
    rejects (403) before resolveAgentURL when caller and target resolve
    to different org roots. Fail-closed: a DB error denies routing.

No org_id column is added — that is a separate architecture decision
pending CTO. This uses the existing parent_id-chain scoping.

Tests (cross_tenant_isolation_test.go): per-path cross-tenant regression
— a DIFFERENT-org workspace must NOT appear in /registry peers, must NOT
appear in toolListPeers, and a2a MUST reject resolving/routing to a
workspace outside the caller's org; plus same-org positive tests. The
three negative tests were verified to FAIL against the pre-fix code.
Existing peer/a2a/delegation tests updated to the org-scoped behavior.

Follow-up for CTO: registry.CanCommunicate still treats any two org
roots as siblings, so discovery.Discover and CheckAccess share the same
root-sibling weakness. Scoping CanCommunicate itself (registry package)
would close that class fully; flagged separately as it is outside the
three #1953 paths.

Refs #1953

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 08:45:27 -07:00
claude-ceo-assistant 42b16b33fb fix(memory): upsert namespace before v2 commit
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 10s
CI / Detect changes (pull_request) Successful in 13s
CI / Python Lint & Test (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 14s
E2E Chat / detect-changes (pull_request) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
CI / Canvas (Next.js) (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
E2E Chat / E2E Chat (pull_request) Successful in 10s
Harness Replays / Harness Replays (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m58s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m45s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 6m7s
CI / all-required (pull_request) Successful in 13m6s
security-review / approved (pull_request) Refired via /security-recheck by unknown
qa-review / approved (pull_request) Refired via /qa-recheck by unknown
gate-check-v3 / gate-check (pull_request) Successful in 31s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 22s
sop-tier-check / tier-check (pull_request) Successful in 14s
audit-force-merge / audit (pull_request) Successful in 7s
2026-05-26 12:38:50 -07:00
47 changed files with 5908 additions and 328 deletions
+99
View File
@@ -0,0 +1,99 @@
name: sync-providers-yaml
# Cross-repo canonical↔synced-copy drift gate (internal#718 P2-A, CTO
# 2026-05-27 "Distribution = SDK via codegen + verify-CI", multi-repo branch:
# "codegen-checked-into-each-repo + verify-CI").
#
# The canonical provider-registry SSOT is molecule-controlplane
# internal/providers/providers.yaml. molecule-core has NO Go module dependency
# on controlplane, so instead of importing it we carry a SYNCED COPY at
# workspace-server/internal/providers/providers.yaml and gate it.
#
# This workflow fetches the canonical providers.yaml from controlplane (via the
# Gitea raw endpoint, read-only) and byte-compares it against core's synced
# copy. RED if they differ — meaning the canonical moved and core's copy must be
# re-synced (copy verbatim + `go generate ./...` + bump
# canonicalProvidersYAMLSHA256 in sync_canonical_test.go).
#
# Pairs with:
# * sync_canonical_test.go — hermetic sha pin (catches a hand-edit of core's
# copy even with no network); runs in the normal `go test ./...`.
# * verify-providers-gen.yml — artifact ↔ synced-copy drift.
#
# ENFORCEMENT GATING: standalone workflow, NOT a job in ci.yml and NOT in
# branch protection (same soak-then-promote posture as verify-providers-gen).
# It is intentionally absent from ci.yml's job set so the ci-required-drift
# sentinel does not fire on it.
#
# AUTH: uses AUTO_SYNC_TOKEN (the existing cross-repo read token used to sync
# template/provider content from sibling repos). If the secret is absent the
# job emits a clear ::warning:: and exits 0 — the hermetic sha pin in
# sync_canonical_test.go is the always-on backstop, so a missing cross-repo
# token degrades to "hand-edit still caught, live canonical drift not caught"
# rather than a hard red that blocks unrelated PRs.
on:
pull_request:
types: [opened, synchronize, reopened]
paths:
- 'workspace-server/internal/providers/providers.yaml'
- '.gitea/workflows/sync-providers-yaml.yml'
push:
branches: [main, staging]
paths:
- 'workspace-server/internal/providers/providers.yaml'
- '.gitea/workflows/sync-providers-yaml.yml'
schedule:
# Daily at :23 — catch a canonical change in controlplane that landed
# without a paired core re-sync PR (off-zero to spread cron load).
- cron: '23 4 * * *'
workflow_dispatch:
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
permissions:
contents: read
concurrency:
group: sync-providers-yaml-${{ github.ref }}
cancel-in-progress: true
jobs:
compare:
name: Compare synced providers.yaml against controlplane canonical
runs-on: ubuntu-latest
timeout-minutes: 6
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Fetch canonical providers.yaml from controlplane and byte-compare
env:
AUTO_SYNC_TOKEN: ${{ secrets.AUTO_SYNC_TOKEN }}
API_ROOT: ${{ github.server_url }}/api/v1
run: |
set -euo pipefail
if [ -z "${AUTO_SYNC_TOKEN:-}" ]; then
echo "::warning::AUTO_SYNC_TOKEN secret missing — skipping the live cross-repo compare."
echo "The hermetic sha pin (sync_canonical_test.go) still gates hand-edits of core's copy."
echo "Provision AUTO_SYNC_TOKEN (read scope on molecule-controlplane) to enable live canonical-drift detection."
exit 0
fi
CANON_URL="${API_ROOT}/repos/molecule-ai/molecule-controlplane/raw/internal/providers/providers.yaml?ref=main"
# Use the /raw endpoint: it returns the file bytes directly. (The
# /contents endpoint ignores Accept: application/vnd.gitea.raw on
# Gitea 1.22.6 and returns the JSON+base64 envelope, which made this
# diff a permanent false RED.)
curl -fsS \
-H "Authorization: token ${AUTO_SYNC_TOKEN}" \
"${CANON_URL}" -o /tmp/canonical-providers.yaml
LOCAL=workspace-server/internal/providers/providers.yaml
if diff -u /tmp/canonical-providers.yaml "$LOCAL"; then
echo "OK — core's synced providers.yaml is byte-identical to the controlplane canonical."
else
echo "::error::core's synced providers.yaml DRIFTED from the controlplane canonical (SSOT)."
echo "Re-sync: copy controlplane internal/providers/providers.yaml verbatim over"
echo " $LOCAL, run 'go generate ./...' in workspace-server/, and bump"
echo " canonicalProvidersYAMLSHA256 in internal/providers/sync_canonical_test.go."
exit 1
fi
+89
View File
@@ -0,0 +1,89 @@
name: verify-providers-gen
# Provider-registry SSOT enforcement gate — molecule-core side (internal#718
# P2-A, CTO 2026-05-27 "Distribution = SDK via codegen + verify-CI").
#
# The canonical schema SSOT is molecule-controlplane
# internal/providers/providers.yaml. molecule-core carries a SYNCED COPY at
# workspace-server/internal/providers/providers.yaml (kept in sync by the
# companion sync-providers-yaml.yml gate), and cmd/gen-providers emits the
# checked-in Go projection workspace-server/internal/providers/gen/registry_gen.go.
#
# This workflow regenerates the artifact into the working tree and fails RED if
# it differs from what is committed — catching BOTH:
# * a providers.yaml (synced-copy) change that wasn't followed by `go generate ./...`, and
# * a hand-edit of the generated artifact (it carries a DO NOT EDIT header).
#
# It is the molecule-core mirror of molecule-controlplane's verify-providers-gen
# workflow. Together with sync-providers-yaml (canonical↔synced-copy drift) it
# closes the codegen-checked-into-each-repo + verify-CI loop the RFC mandates.
#
# ENFORCEMENT GATING (deliberate, per dev-SOP "implementation gating"):
# this is a STANDALONE workflow, NOT a job inside ci.yml, and is NOT yet in any
# branch-protection status_check_contexts. Rationale (identical to the CP P0
# rollout):
# * It runs + reports RED on every PR/push immediately (visible signal).
# * It is intentionally absent from ci.yml's job set so the ci-required-drift
# sentinel (jobs ↔ branch-protection ↔ audit-env) does NOT fire on it, and
# from branch protection (turning it into a hard merge gate has blast radius
# — operator GO required, same pattern as sop-tier-check / verify-providers-gen
# on controlplane). Promote it into branch protection in a follow-up once
# P2 has soaked.
# Until then it behaves like secret-scan / block-internal-paths: a standalone
# advisory-to-hard gate the author is expected to keep green.
on:
pull_request:
types: [opened, synchronize, reopened]
push:
branches: [main, staging]
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
permissions:
contents: read
concurrency:
group: verify-providers-gen-${{ github.ref }}
cancel-in-progress: true
jobs:
verify:
name: Regenerate providers artifact and fail on drift
runs-on: ubuntu-latest
timeout-minutes: 8
defaults:
run:
working-directory: workspace-server
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-go@4a3601121dd01d1626a1e23e37211e3254c1c06c # v6.4.0
with:
go-version: 'stable'
cache: true
cache-dependency-path: workspace-server/go.sum
- name: Verify generated artifact is in sync with providers.yaml
run: |
set -euo pipefail
# -check regenerates in memory and byte-compares against the
# checked-in artifact; exit 1 (RED) on any drift. This is the
# single source of the gate's verdict — the same code path
# `go test ./cmd/gen-providers` exercises.
go run ./cmd/gen-providers -check
- name: Belt-and-braces — regenerate in place and assert clean tree
run: |
set -euo pipefail
# Independent confirmation that does not trust the -check path:
# actually write the artifact and assert git sees no change. If
# this and the step above ever disagree, the gate is suspect.
go generate ./...
if ! git diff --quiet -- internal/providers/gen/registry_gen.go; then
echo "::error::workspace-server/internal/providers/gen/registry_gen.go drifted from providers.yaml."
echo "Run 'go generate ./...' (or 'go run ./cmd/gen-providers') in workspace-server/ and commit the result."
git --no-pager diff -- internal/providers/gen/registry_gen.go | head -80
exit 1
fi
echo "OK — generated providers artifact is in sync with the schema SSOT."
+43 -25
View File
@@ -73,7 +73,15 @@ else
fi
# Test 4: Create workspace B (needs bearer — tokens now exist in DB)
R=$(acurl -X POST "$BASE/workspaces" -H "Content-Type: application/json" -d '{"name":"Summarizer Agent","tier":1,"runtime":"external","external":true}')
# #1953 cross-tenant isolation: Summarizer is created as a CHILD of Echo so the
# two live in the SAME org (Echo is the org root; Summarizer hangs off it via
# parent_id). The peer-discovery tests below assert same-org peer enumeration
# (Echo sees its child, the child sees its parent). Previously both were created
# parent_id=NULL — two DISTINCT org roots — and "peers" only listed each other
# via the `WHERE parent_id IS NULL` branch that returned every tenant's org root.
# That branch WAS the cross-tenant leak (#1953) and is now removed, so two org
# roots no longer see each other; the assertions must run inside one org.
R=$(acurl -X POST "$BASE/workspaces" -H "Content-Type: application/json" -d "{\"name\":\"Summarizer Agent\",\"tier\":1,\"runtime\":\"external\",\"external\":true,\"parent_id\":\"$ECHO_ID\"}")
check "POST /workspaces (create summarizer)" '"status":"awaiting_agent"' "$R"
SUM_ID=$(echo "$R" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")
@@ -133,21 +141,23 @@ check "Heartbeat updated uptime" '"uptime_seconds":120' "$R"
R=$(curl -s "$BASE/registry/discover/$ECHO_ID")
check "GET /registry/discover/:id (missing caller rejected)" 'X-Workspace-ID header is required' "$R"
# Test 12: Discover (from sibling — allowed)
# Test 12: Discover (from same-org child — allowed)
R=$(curl -s "$BASE/registry/discover/$ECHO_ID" -H "X-Workspace-ID: $SUM_ID" -H "Authorization: Bearer $SUM_TOKEN")
check "GET /registry/discover/:id (sibling)" '"url"' "$R"
check "GET /registry/discover/:id (same-org)" '"url"' "$R"
# Test 13: Peers (root siblings see each other)
# Test 13: Peers — same-org parent/child see each other (#1953). Echo is the org
# root and lists its child Summarizer; Summarizer lists its parent Echo. A
# cross-org workspace would NOT appear here (see cross_tenant_isolation_test.go).
R=$(curl -s "$BASE/registry/$ECHO_ID/peers" -H "Authorization: Bearer $ECHO_TOKEN")
check "GET /registry/:id/peers (has summarizer)" '"Summarizer' "$R"
R=$(curl -s "$BASE/registry/$SUM_ID/peers" -H "Authorization: Bearer $SUM_TOKEN")
check "GET /registry/:id/peers (has echo)" '"Echo Agent"' "$R"
# Test 14: Check access (root siblings)
# Test 14: Check access (same-org parent↔child — allowed)
R=$(curl -s -X POST "$BASE/registry/check-access" -H "Content-Type: application/json" \
-d "{\"caller_id\":\"$ECHO_ID\",\"target_id\":\"$SUM_ID\"}")
check "POST /registry/check-access (siblings allowed)" '"allowed":true' "$R"
check "POST /registry/check-access (same-org allowed)" '"allowed":true' "$R"
# Test 15: PATCH workspace (update position)
R=$(acurl -X PATCH "$BASE/workspaces/$ECHO_ID" -H "Content-Type: application/json" -d '{"x":100,"y":200}')
@@ -289,32 +299,40 @@ R=$(curl -s "$BASE/workspaces" -H "Authorization: Bearer $ECHO_TOKEN")
check "current_task in list response" '"current_task"' "$R"
# Test 21: Delete
R=$(acurl -X DELETE "$BASE/workspaces/$ECHO_ID?confirm=true" \
-H "Authorization: Bearer $ECHO_TOKEN" \
-H "X-Confirm-Name: Echo Agent v2")
check "DELETE /workspaces/:id" '"status":"removed"' "$R"
R=$(curl -s "$BASE/workspaces" -H "Authorization: Bearer $SUM_TOKEN")
COUNT=$(echo "$R" | python3 -c "import sys,json; print(len(json.load(sys.stdin)))")
check "List after delete (count=1)" "1" "$COUNT"
# Test 22: Bundle round-trip — export → delete → import → verify same config
echo ""
echo "--- Bundle Round-Trip Test ---"
# Export the summarizer workspace (#165 / PR #167 — admin-gated)
# #1953: Summarizer is now a CHILD of Echo (same-org, for the peer-discovery
# tests above). DELETE on the *parent* (Echo) cascade-removes its descendants
# (CascadeDelete walks the recursive `parent_id` CTE), so deleting Echo first
# would also remove Summarizer and the "one survives" assertion would see 0.
# Delete the CHILD (Summarizer) here instead: a child delete does NOT cascade
# upward, so the parent Echo survives and count=1 holds. The bundle round-trip
# below needs Summarizer's exported config, so capture it BEFORE this delete.
BUNDLE=$(curl -s "$BASE/bundles/export/$SUM_ID" -H "Authorization: Bearer $SUM_TOKEN")
check "GET /bundles/export/:id" '"name":"Summarizer Agent"' "$BUNDLE"
# Capture original config for comparison
ORIG_NAME=$(echo "$BUNDLE" | python3 -c "import sys,json; print(json.load(sys.stdin)['name'])")
ORIG_TIER=$(echo "$BUNDLE" | python3 -c "import sys,json; print(json.load(sys.stdin)['tier'])")
# Delete the workspace — use SUM_TOKEN (per-workspace) for WorkspaceAuth
# and ADMIN_TOKEN for the AdminAuth layer.
R=$(curl -s -X DELETE "$BASE/workspaces/$SUM_ID?confirm=true" \
R=$(acurl -X DELETE "$BASE/workspaces/$SUM_ID?confirm=true" \
-H "Authorization: Bearer $SUM_TOKEN" \
-H "X-Confirm-Name: Summarizer Agent")
check "DELETE /workspaces/:id" '"status":"removed"' "$R"
# Parent Echo must survive a child delete — list as Echo and expect count=1.
R=$(curl -s "$BASE/workspaces" -H "Authorization: Bearer $ECHO_TOKEN")
COUNT=$(echo "$R" | python3 -c "import sys,json; print(len(json.load(sys.stdin)))")
check "List after delete (count=1)" "1" "$COUNT"
# Test 22: Bundle round-trip — export → delete → import → verify same config.
# Summarizer's bundle was captured above; now delete the parent Echo (the only
# remaining workspace) so the import lands in a clean org, then re-import the
# Summarizer bundle.
echo ""
echo "--- Bundle Round-Trip Test ---"
# Delete the remaining parent Echo — use ECHO_TOKEN (per-workspace) for
# WorkspaceAuth and ADMIN_TOKEN for the AdminAuth layer.
R=$(acurl -X DELETE "$BASE/workspaces/$ECHO_ID?confirm=true" \
-H "Authorization: Bearer $ECHO_TOKEN" \
-H "X-Confirm-Name: Echo Agent v2")
check "Delete before re-import" '"status":"removed"' "$R"
# After deleting both workspaces, all per-workspace tokens are revoked.
+271
View File
@@ -0,0 +1,271 @@
// Command gen-providers is the codegen half of the provider-registry SSOT
// machinery on the molecule-core side (internal#718 P2-A, CTO 2026-05-27
// "Distribution = SDK via codegen + verify-CI"). It is the byte-for-byte mirror
// of molecule-controlplane's cmd/gen-providers (the canonical generator). It
// reads core's SYNCED COPY of the schema — internal/providers/providers.yaml
// (via the providers loader, so it shares the SAME parse + validation as the
// runtime) — and emits a checked-in Go artifact:
//
// internal/providers/gen/registry_gen.go
//
// The artifact is a deterministic projection of the merged registry: the
// provider catalog + per-runtime native sets as Go literals, plus the schema
// version and a content fingerprint. It is core's leaf of the multi-language SDK
// layer the RFC calls for (Go(CP+core)/TS(canvas)/Python(adapters)).
//
// CONTRACT for P2-A (zero behavior change): the generated artifact is
// checked-in + drift-gated ONLY. NO production code path imports
// internal/providers/gen — the gen-import-boundary test pins that. P2-B wires
// the billing/credential decision onto the LOADER (DeriveProvider/IsPlatform),
// not the raw gen literals. The generator is the build-time half;
// verify-providers-gen.yml is the CI half that regenerates and fails RED on any
// diff (drift or hand-edit); sync-providers-yaml.yml gates the synced copy
// against the controlplane canonical.
//
// Usage:
//
// go run ./cmd/gen-providers # write the artifact in place
// go run ./cmd/gen-providers -check # exit non-zero if the on-disk
// # artifact differs from a fresh gen
// # (the CI drift gate)
// go run ./cmd/gen-providers -o PATH # write to a specific path
//
//go:generate go run ../gen-providers -o ../../internal/providers/gen/registry_gen.go
package main
import (
"bytes"
"crypto/sha256"
"encoding/hex"
"flag"
"fmt"
"go/format"
"os"
"sort"
"strconv"
"text/template"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/providers"
)
// defaultOutPath is the checked-in artifact location, relative to the repo
// root (the directory `go run ./cmd/gen-providers` is invoked from).
const defaultOutPath = "internal/providers/gen/registry_gen.go"
func main() {
var (
outPath string
check bool
)
flag.StringVar(&outPath, "o", defaultOutPath, "output path for the generated artifact")
flag.BoolVar(&check, "check", false, "verify the on-disk artifact matches a fresh generation; exit 1 on drift")
flag.Parse()
generated, err := render()
if err != nil {
fmt.Fprintf(os.Stderr, "gen-providers: %v\n", err)
os.Exit(1)
}
if check {
existing, err := os.ReadFile(outPath)
if err != nil {
fmt.Fprintf(os.Stderr, "gen-providers -check: cannot read %s: %v\n", outPath, err)
fmt.Fprintln(os.Stderr, "Run `go generate ./...` (or `go run ./cmd/gen-providers`) and commit the result.")
os.Exit(1)
}
if !bytes.Equal(existing, generated) {
fmt.Fprintf(os.Stderr, "gen-providers -check: DRIFT — %s is out of sync with providers.yaml.\n", outPath)
fmt.Fprintln(os.Stderr, "The generated artifact was hand-edited or providers.yaml changed without regen.")
fmt.Fprintln(os.Stderr, "Fix: run `go generate ./...` (or `go run ./cmd/gen-providers`) and commit.")
os.Exit(1)
}
fmt.Println("gen-providers -check: OK — artifact in sync with providers.yaml")
return
}
if err := os.WriteFile(outPath, generated, 0o644); err != nil {
fmt.Fprintf(os.Stderr, "gen-providers: write %s: %v\n", outPath, err)
os.Exit(1)
}
fmt.Printf("gen-providers: wrote %s\n", outPath)
}
// render loads the manifest and produces the gofmt'd artifact bytes.
func render() ([]byte, error) {
m, err := providers.LoadManifest()
if err != nil {
return nil, fmt.Errorf("load manifest: %w", err)
}
// Deterministic ordering: providers in catalog order is already stable
// (slice). Runtimes is a map — sort its keys so the artifact is
// reproducible regardless of Go map iteration order.
runtimeNames := make([]string, 0, len(m.Runtimes))
for rt := range m.Runtimes {
runtimeNames = append(runtimeNames, rt)
}
sort.Strings(runtimeNames)
type genProvider struct {
Name string
DisplayName string
Protocol string
AuthMode string
AuthEnv []string
ModelPrefixMatch string
IsPlatform bool
// UpstreamVendor is the proxy's upstream-vendor key for this entry
// (internal#718 P1, CONVERGED) — empty for entries the proxy does not
// route to an upstream. A plain scalar (no pointer), so both the rendered
// literal and the fingerprint stay deterministic.
UpstreamVendor string
}
type genRef struct {
Name string
Models []string
}
type genRuntime struct {
Name string
Providers []genRef
}
data := struct {
SchemaVersion int
Fingerprint string
Providers []genProvider
Runtimes []genRuntime
}{
SchemaVersion: providers.SchemaVersion(),
}
for _, p := range m.Providers {
gp := genProvider{
Name: p.Name,
DisplayName: p.DisplayName,
Protocol: string(p.Protocol),
AuthMode: p.AuthMode,
AuthEnv: p.AuthEnv,
ModelPrefixMatch: p.ModelPrefixMatch,
IsPlatform: p.IsPlatform(),
UpstreamVendor: p.UpstreamVendor,
}
data.Providers = append(data.Providers, gp)
}
for _, rt := range runtimeNames {
native := m.Runtimes[rt]
gr := genRuntime{Name: rt}
for _, ref := range native.Providers {
gr.Providers = append(gr.Providers, genRef{Name: ref.Name, Models: ref.Models})
}
data.Runtimes = append(data.Runtimes, gr)
}
// Fingerprint pins the artifact to the data it was generated from. It is
// derived from the structured projection (schema version + providers +
// runtimes), NOT the raw YAML bytes, so a comment-only YAML edit does not
// churn the artifact while any data change does.
data.Fingerprint = fingerprint(data.SchemaVersion, data.Providers, data.Runtimes)
var buf bytes.Buffer
if err := artifactTmpl.Execute(&buf, data); err != nil {
return nil, fmt.Errorf("execute template: %w", err)
}
formatted, err := format.Source(buf.Bytes())
if err != nil {
return nil, fmt.Errorf("gofmt generated source: %w\n----\n%s", err, buf.String())
}
return formatted, nil
}
// fingerprint is a stable content hash of the structured projection. Any
// fields below this function references must be kept in sync with the
// template's emitted data so the hash and the literals never diverge.
func fingerprint(schema int, provs any, runtimes any) string {
h := sha256.New()
fmt.Fprintf(h, "schema=%d\n", schema)
fmt.Fprintf(h, "%#v\n%#v\n", provs, runtimes)
return hex.EncodeToString(h.Sum(nil))[:16]
}
func quote(s string) string { return strconv.Quote(s) }
func quoteSlice(ss []string) string {
var b bytes.Buffer
b.WriteString("[]string{")
for i, s := range ss {
if i > 0 {
b.WriteString(", ")
}
b.WriteString(strconv.Quote(s))
}
b.WriteString("}")
return b.String()
}
var artifactTmpl = template.Must(template.New("artifact").Funcs(template.FuncMap{
"quote": quote,
"quoteSlice": quoteSlice,
}).Parse(`// Code generated by cmd/gen-providers; DO NOT EDIT.
//
// Source of truth: internal/providers/providers.yaml (schema_version {{.SchemaVersion}}).
// Regenerate with: go generate ./... (or: go run ./cmd/gen-providers)
// The verify-providers-gen CI workflow fails RED if this file drifts from
// providers.yaml or is hand-edited. internal#718 P0 — checked-in + drift-
// gated ONLY; no production path imports this package yet (that is P1+).
package gen
// SchemaVersion is the providers.yaml schema this artifact was generated
// against. It is the semver'd contract version (the MAJOR component for the
// public extension contract; see internal/providers/README.md).
const SchemaVersion = {{.SchemaVersion}}
// Fingerprint is a stable content hash of the generated projection (schema
// version + provider catalog + runtime native sets). It changes iff the
// registry DATA changes (comment-only YAML edits do not churn it).
const Fingerprint = {{quote .Fingerprint}}
// GenProvider is the generated projection of one provider catalog entry —
// the subset a downstream consumer needs to derive + display a provider.
type GenProvider struct {
Name string
DisplayName string
Protocol string
AuthMode string
AuthEnv []string
ModelPrefixMatch string
// IsPlatform marks the closed, core-only platform-managed provider.
IsPlatform bool
// UpstreamVendor is the proxy's upstream-vendor key for this entry
// (internal#718 P1, CONVERGED); empty for providers the proxy does not
// route to an upstream vendor. ResolveUpstream maps a model id's namespace
// token to the entry whose UpstreamVendor equals it.
UpstreamVendor string
}
// GenRuntimeRef is one native provider a runtime supports + its exact models.
type GenRuntimeRef struct {
Name string
Models []string
}
// Providers is the full provider catalog, in providers.yaml declaration order.
var Providers = []GenProvider{
{{- range .Providers}}
{Name: {{quote .Name}}, DisplayName: {{quote .DisplayName}}, Protocol: {{quote .Protocol}}, AuthMode: {{quote .AuthMode}}, AuthEnv: {{quoteSlice .AuthEnv}}, ModelPrefixMatch: {{quote .ModelPrefixMatch}}, IsPlatform: {{.IsPlatform}}{{if .UpstreamVendor}}, UpstreamVendor: {{quote .UpstreamVendor}}{{end}}},
{{- end}}
}
// Runtimes maps each runtime to its native provider+model set, runtime names
// sorted for a deterministic artifact.
var Runtimes = map[string][]GenRuntimeRef{
{{- range .Runtimes}}
{{quote .Name}}: {
{{- range .Providers}}
{Name: {{quote .Name}}, Models: {{quoteSlice .Models}}},
{{- end}}
},
{{- end}}
}
`))
@@ -0,0 +1,121 @@
package main
import (
"bytes"
"os"
"path/filepath"
"testing"
)
// repoRoot walks up from the test's working dir (cmd/gen-providers) to the
// module root so the test can locate the checked-in artifact regardless of
// where `go test` is invoked from.
func repoRoot(t *testing.T) string {
t.Helper()
dir, err := os.Getwd()
if err != nil {
t.Fatalf("getwd: %v", err)
}
for i := 0; i < 6; i++ {
if _, err := os.Stat(filepath.Join(dir, "go.mod")); err == nil {
return dir
}
dir = filepath.Dir(dir)
}
t.Fatal("could not locate repo root (go.mod) from cmd/gen-providers")
return ""
}
// TestArtifactInSync is the drift gate's Go-test counterpart: the checked-in
// internal/providers/gen/registry_gen.go MUST byte-equal a fresh render. If a
// future edit changes providers.yaml without regenerating, OR hand-edits the
// artifact, this flips red — the same signal the verify-providers-gen CI
// workflow emits, but caught locally by `go test ./...` too.
func TestArtifactInSync(t *testing.T) {
generated, err := render()
if err != nil {
t.Fatalf("render() error = %v", err)
}
artifactPath := filepath.Join(repoRoot(t), defaultOutPath)
onDisk, err := os.ReadFile(artifactPath)
if err != nil {
t.Fatalf("read checked-in artifact %s: %v (run `go generate ./...` and commit)", artifactPath, err)
}
if !bytes.Equal(onDisk, generated) {
t.Fatalf("DRIFT: %s is out of sync with providers.yaml.\n"+
"Run `go generate ./...` (or `go run ./cmd/gen-providers`) and commit the result.", defaultOutPath)
}
}
// TestDriftGateCatchesMutation is the load-bearing-gate proof (per the SOP
// fail-direction discipline). The original P0 version was TAUTOLOGICAL
// (internal#718 P1 review carry-over): it appended bytes to an in-memory copy
// and asserted the copy differed from the original — true by construction,
// touching neither the on-disk artifact nor the actual in-sync comparison the
// gate runs. This version exercises the REAL gate: it writes a MUTATED artifact
// to disk and re-runs the SAME comparison TestArtifactInSync / `-check` perform
// (`render()` bytes vs the on-disk file), asserting it now reports drift — then
// restores the original. So the test would fail if the gate were vacuous (e.g.
// if the comparison ignored content), not merely if append changes bytes.
func TestDriftGateCatchesMutation(t *testing.T) {
generated, err := render()
if err != nil {
t.Fatalf("render() error = %v", err)
}
artifactPath := filepath.Join(repoRoot(t), defaultOutPath)
original, err := os.ReadFile(artifactPath)
if err != nil {
t.Fatalf("read checked-in artifact %s: %v", artifactPath, err)
}
// Precondition: the tree is in sync (so the mutation is what flips the gate,
// not pre-existing drift).
if !bytes.Equal(original, generated) {
t.Fatalf("precondition failed: %s already drifted from render() — run `go generate ./...`", defaultOutPath)
}
// Restore the pristine artifact no matter how the test exits.
t.Cleanup(func() {
if err := os.WriteFile(artifactPath, original, 0o644); err != nil {
t.Fatalf("CRITICAL: failed to restore %s after mutation: %v", artifactPath, err)
}
})
// Mutate the ON-DISK artifact (simulating a hand-edit / a providers.yaml
// change that wasn't regenerated).
mutated := append(append([]byte(nil), original...), []byte("\n// injected drift\n")...)
if err := os.WriteFile(artifactPath, mutated, 0o644); err != nil {
t.Fatalf("write mutated artifact: %v", err)
}
// Re-run the EXACT in-sync comparison the gate uses: fresh render vs the
// (now mutated) on-disk file. It MUST report drift.
onDiskAfter, err := os.ReadFile(artifactPath)
if err != nil {
t.Fatalf("re-read mutated artifact: %v", err)
}
freshRender, err := render()
if err != nil {
t.Fatalf("render() after mutation error = %v", err)
}
if bytes.Equal(onDiskAfter, freshRender) {
t.Fatal("drift gate did NOT detect a mutated on-disk artifact — gate is not load-bearing")
}
}
// TestRenderDeterministic proves regeneration is idempotent: two renders of
// the same manifest produce byte-identical output (sorted runtime keys, stable
// catalog order). A non-deterministic generator would make the drift gate
// flap on Go map iteration order.
func TestRenderDeterministic(t *testing.T) {
a, err := render()
if err != nil {
t.Fatalf("render() #1 error = %v", err)
}
b, err := render()
if err != nil {
t.Fatalf("render() #2 error = %v", err)
}
if !bytes.Equal(a, b) {
t.Fatal("render() is non-deterministic — two runs differ; the drift gate would flap")
}
}
@@ -532,6 +532,7 @@ func (m *Manager) FetchWorkspaceChannelContext(ctx context.Context, workspaceID
var config map[string]interface{}
if err := json.Unmarshal(configJSON, &config); err != nil {
log.Printf("ChannelManager: unmarshal config: %v", err)
config = map[string]interface{}{}
}
if err := DecryptSensitiveFields(config); err != nil {
return ""
@@ -126,6 +126,12 @@ const maxProxyResponseBody = 10 << 20
// gets `{"error":"workspace agent unreachable","restarting":true}` instead
// of Cloudflare's opaque 502 error page. Without these, dead workspaces hang
// long enough that CF gives up first and shows its own page.
//
// No Client.Timeout here — per-request context deadlines govern the full
// request lifetime (canvas = 5 min, agent-to-agent = 30 min). A fixed
// Client.Timeout would pre-empt legitimate slow cold-start flows (e.g.
// Claude Code first-token over OAuth can take 30-60s on boot). Transport-
// level timeouts (Dial, TLS, ResponseHeader) are sufficient safety nets.
var a2aClient = &http.Client{
Transport: &http.Transport{
DialContext: (&net.Dialer{
@@ -375,6 +381,30 @@ func (h *WorkspaceHandler) proxyA2ARequest(ctx context.Context, workspaceID stri
Response: gin.H{"error": "access denied: workspaces cannot communicate per hierarchy rules"},
}
}
// #1953 cross-tenant isolation. CanCommunicate alone does NOT enforce
// org boundaries: its "root-level siblings — both have no parent" rule
// treats every tenant's org root as a sibling, so a caller that is an
// org root could resolve and route a2a to another tenant's org root
// (and resolveAgentURL accepts ANY workspace id with no org check).
// Gate on the SAME parent_id-chain org scoping the OFFSEC-015 broadcast
// fix uses: reject before resolveAgentURL when caller and target are in
// different orgs. Fail-closed — a DB error denies cross-org routing.
ok, err := sameOrg(ctx, db.DB, callerID, workspaceID)
if err != nil {
log.Printf("ProxyA2A: org-scope check failed %s → %s: %v — denying", callerID, workspaceID, err)
return 0, nil, &proxyA2AError{
Status: http.StatusForbidden,
Response: gin.H{"error": "access denied: org isolation check failed"},
}
}
if !ok {
log.Printf("ProxyA2A: cross-org routing denied %s → %s (#1953)", callerID, workspaceID)
return 0, nil, &proxyA2AError{
Status: http.StatusForbidden,
Response: gin.H{"error": "access denied: target workspace is in a different org"},
}
}
}
// Budget enforcement: reject A2A calls when the workspace has exceeded its
@@ -437,6 +437,10 @@ func TestProxyA2A_CallerIDPropagated(t *testing.T) {
WithArgs("ws-target").
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow("ws-target", "ws-parent"))
// #1953 cross-tenant guard: same-org check after CanCommunicate. Both
// workspaces resolve to the same org root → routing allowed.
mockSameOrg(mock, "ws-caller", "ws-target", true)
expectBudgetCheck(mock, "ws-target")
// Expect activity log with source_id set
@@ -465,6 +469,24 @@ func TestProxyA2A_CallerIDPropagated(t *testing.T) {
}
}
// mockSameOrg sets up the two org-root recursive-CTE expectations that the
// #1953 cross-tenant guard in proxyA2ARequest runs after CanCommunicate passes.
// sameOrg=true returns the SAME root_id for both caller and target (same tenant);
// sameOrg=false returns different root_ids (cross-tenant → routing must be denied).
func mockSameOrg(mock sqlmock.Sqlmock, caller, target string, sameOrg bool) {
callerRoot := "org-root-shared"
targetRoot := "org-root-shared"
if !sameOrg {
targetRoot = "org-root-other-tenant"
}
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(callerRoot))
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(target).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(targetRoot))
}
// mockCanCommunicate sets up sqlmock expectations for CanCommunicate(caller, target).
// allowed=true sets up rows that satisfy the access policy (siblings under same parent).
// allowed=false sets up rows that don't (different parents).
@@ -659,6 +681,9 @@ func TestProxyA2A_CallerIDDerivedFromBearer(t *testing.T) {
WithArgs("ws-target").
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow("ws-target", "ws-parent"))
// 3b. #1953 cross-tenant guard — same org root → routing allowed.
mockSameOrg(mock, "ws-caller", "ws-target", true)
expectBudgetCheck(mock, "ws-target")
// 4. activity_logs INSERT — verify source_id arg is the derived ws-caller
+10 -2
View File
@@ -73,6 +73,7 @@ func (h *ChannelHandler) List(c *gin.Context) {
var config map[string]interface{}
if err := json.Unmarshal(configJSON, &config); err != nil {
log.Printf("Channels: unmarshal config for channel %s: %v", id, err)
config = map[string]interface{}{}
}
// #319: decrypt sensitive fields first so the mask operates on
// plaintext (first-4 / last-4 of the real token, not the ciphertext
@@ -94,6 +95,7 @@ func (h *ChannelHandler) List(c *gin.Context) {
var allowed []string
if err := json.Unmarshal(allowedJSON, &allowed); err != nil {
log.Printf("Channels: unmarshal allowed_users for channel %s: %v", id, err)
allowed = []string{}
}
entry := map[string]interface{}{
@@ -104,8 +106,12 @@ func (h *ChannelHandler) List(c *gin.Context) {
"enabled": enabled,
"allowed_users": allowed,
"message_count": msgCount,
"created_at": createdAt.Time,
"updated_at": updatedAt.Time,
}
if createdAt.Valid {
entry["created_at"] = createdAt.Time
}
if updatedAt.Valid {
entry["updated_at"] = updatedAt.Time
}
if lastMsg.Valid {
entry["last_message_at"] = lastMsg.Time
@@ -540,9 +546,11 @@ func (h *ChannelHandler) Webhook(c *gin.Context) {
}
if err := json.Unmarshal(configJSON, &row.Config); err != nil {
log.Printf("Channels: unmarshal config for webhook row %s: %v", row.ID, err)
row.Config = map[string]interface{}{}
}
if err := json.Unmarshal(allowedJSON, &row.AllowedUsers); err != nil {
log.Printf("Channels: unmarshal allowed_users for webhook row %s: %v", row.ID, err)
row.AllowedUsers = []string{}
}
if err := channels.DecryptSensitiveFields(row.Config); err != nil {
log.Printf("Channels: decrypt webhook row %s: %v", row.ID, err)
@@ -0,0 +1,427 @@
package handlers
// cross_tenant_isolation_test.go — #1953 regression tests.
//
// Three workspace-server paths historically derived an "org-root sibling set"
// as `WHERE parent_id IS NULL`, which matches EVERY tenant's org root (the
// workspaces table has no org_id column) → cross-tenant data exposure:
//
// 1. GET /registry/:id/peers (discovery.Peers)
// 2. MCP toolListPeers (mcp_tools.toolListPeers)
// 3. a2a routing (a2a_proxy.proxyA2ARequest → resolveAgentURL)
//
// These tests assert that a workspace in a DIFFERENT org is never returned as a
// peer and that a2a refuses to resolve/route to a workspace outside the caller's
// org, while same-org peers/targets still work. They reuse the SAME parent_id-
// chain org scoping the OFFSEC-015 broadcast fix introduced (org_scope.go).
import (
"bytes"
"context"
"database/sql"
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/db"
"github.com/DATA-DOG/go-sqlmock"
"github.com/gin-gonic/gin"
)
// dbHandleForTest returns the global sqlmock-backed *sql.DB that setupTestDB
// installs, for tests that need to hand a *sql.DB to a component (e.g.
// MCPHandler.database, sameOrg) rather than relying on the package-global.
func dbHandleForTest() *sql.DB { return db.DB }
// peerColsForIsolation matches queryPeerMaps' SELECT column set.
var peerColsForIsolation = []string{
"id", "name", "role", "tier", "status", "agent_card", "url", "parent_id", "active_tasks",
}
// -------------------------------------------------------------------------
// Path 1: GET /registry/:id/peers — discovery.Peers
// -------------------------------------------------------------------------
// TestPeers_CrossTenant_OrgRootNotLeaked is the core #1953 regression for the
// discovery path. The caller is an org root (parent_id IS NULL). Pre-fix the
// handler ran `SELECT ... WHERE w.parent_id IS NULL AND w.id != $1`, returning
// every OTHER tenant's org root as a "sibling" peer. Post-fix an org-root caller
// issues NO sibling query — its only peers are its own children. If the handler
// regressed and issued the cross-tenant sibling query, sqlmock would report an
// unexpected query (the expectation below is intentionally NOT registered) and
// the test fails.
func TestPeers_CrossTenant_OrgRootNotLeaked(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
handler := NewDiscoveryHandler()
// Behavioural leak test: register the OLD leaky `parent_id IS NULL` sibling
// query so that IF the handler still issues it, it returns another tenant's
// org root (org-b-root). The fix removes that query for an org-root caller,
// so org-b-root must never appear in the output. Unordered matching makes
// the leaky-sibling expectation optional — the fix simply never consumes it.
mock.MatchExpectationsInOrder(false)
caller := "org-a-root" // parent_id IS NULL — an org root for tenant A
// parent_id lookup → NULL (caller is an org root)
mock.ExpectQuery("SELECT parent_id FROM workspaces WHERE id =").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"parent_id"}).AddRow(nil))
// LEAKY sibling query (pre-fix). Returns a DIFFERENT tenant's org root.
// The fix must NOT issue this query; if it does, org-b-root leaks into the
// peer list and the output assertion below fails.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id IS NULL AND w.id != \\$1").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows(peerColsForIsolation).
AddRow("org-b-root", "Org B Root", "lead", 0, "online", []byte("null"), "http://b-root", nil, 0))
// Children query — caller's own org-A children only. Return one child.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.id != \\$2").
WithArgs(caller, caller).
WillReturnRows(sqlmock.NewRows(peerColsForIsolation).
AddRow("org-a-child", "Org A Child", "worker", 1, "online", []byte("null"), "http://a-child", caller, 0))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: caller}}
c.Request = httptest.NewRequest("GET", "/registry/"+caller+"/peers", nil)
handler.Peers(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var peers []map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &peers); err != nil {
t.Fatalf("failed to parse response: %v", err)
}
// The other-tenant org root must NEVER appear; only the same-org child.
for _, p := range peers {
if id, _ := p["id"].(string); id == "org-b-root" {
t.Fatalf("cross-tenant leak (#1953): org-b-root appeared in org-a-root's peer list: %v", peers)
}
}
if len(peers) != 1 {
t.Fatalf("expected exactly 1 peer (same-org child), got %d: %v", len(peers), peers)
}
// NOTE: ExpectationsWereMet is intentionally NOT asserted — the leaky
// sibling expectation is deliberately left unconsumed by the fixed path.
}
// TestPeers_SameOrg_SiblingsStillWork is the positive companion: a non-root
// child caller still sees its same-org siblings, children, and parent. This
// guards against the fix over-scoping and breaking legitimate intra-org
// discovery.
func TestPeers_SameOrg_SiblingsStillWork(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
handler := NewDiscoveryHandler()
caller := "org-a-child-1"
parent := "org-a-root"
mock.ExpectQuery("SELECT parent_id FROM workspaces WHERE id =").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"parent_id"}).AddRow(parent))
// Siblings — scoped to the shared parent (one tenant).
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.id != \\$2").
WithArgs(parent, caller).
WillReturnRows(sqlmock.NewRows(peerColsForIsolation).
AddRow("org-a-child-2", "Org A Sibling", "worker", 1, "online", []byte("null"), "http://a-sib", parent, 0))
// Children — none.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.id != \\$2 AND w.status").
WithArgs(caller, caller).
WillReturnRows(sqlmock.NewRows(peerColsForIsolation))
// Parent.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.id = \\$1 AND w.id != \\$2 AND w.status").
WithArgs(parent, caller).
WillReturnRows(sqlmock.NewRows(peerColsForIsolation).
AddRow(parent, "Org A Root", "lead", 0, "online", []byte("null"), "http://a-root", nil, 0))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: caller}}
c.Request = httptest.NewRequest("GET", "/registry/"+caller+"/peers", nil)
handler.Peers(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var peers []map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &peers); err != nil {
t.Fatalf("failed to parse response: %v", err)
}
// Sibling + parent = 2 same-org peers.
if len(peers) != 2 {
t.Fatalf("expected 2 same-org peers (sibling + parent), got %d: %v", len(peers), peers)
}
names := map[string]bool{}
for _, p := range peers {
names[fmt.Sprint(p["name"])] = true
}
if !names["Org A Sibling"] || !names["Org A Root"] {
t.Errorf("expected same-org sibling + parent in peer list, got %v", names)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// -------------------------------------------------------------------------
// Path 2: MCP toolListPeers — mcp_tools.toolListPeers
// -------------------------------------------------------------------------
// mcpPeerCols matches toolListPeers' SELECT column set.
var mcpPeerCols = []string{"id", "name", "role", "status", "tier"}
// TestToolListPeers_CrossTenant_OrgRootNotLeaked is the #1953 regression for
// the MCP path. Same shape as the discovery test: an org-root caller must NOT
// enumerate other tenants' org roots. The cross-tenant `parent_id IS NULL`
// sibling query is intentionally not registered, so if it runs sqlmock fails.
func TestToolListPeers_CrossTenant_OrgRootNotLeaked(t *testing.T) {
mock := setupTestDB(t)
mock.MatchExpectationsInOrder(false)
h := &MCPHandler{database: dbHandleForTest()}
caller := "org-a-root"
mock.ExpectQuery("SELECT parent_id FROM workspaces WHERE id =").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"parent_id"}).AddRow(nil))
// LEAKY sibling query (pre-fix). Returns another tenant's org root. The fix
// must NOT issue this for an org-root caller; if it does, org-b-root leaks
// into the output and the assertion below fails. Left optional via
// unordered matching, so the fixed path simply never consumes it.
mock.ExpectQuery("WHERE w.parent_id IS NULL AND w.id != \\$1").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows(mcpPeerCols).
AddRow("org-b-root", "Org B Root", "lead", "online", 0))
// Children — caller's own org-A children only.
mock.ExpectQuery("WHERE w.parent_id = \\$1 AND w.status").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows(mcpPeerCols).
AddRow("org-a-child", "Org A Child", "worker", "online", 1))
out, err := h.toolListPeers(context.Background(), caller)
if err != nil {
t.Fatalf("toolListPeers returned error: %v", err)
}
if strings.Contains(out, "org-b-root") || strings.Contains(out, "Org B Root") {
t.Fatalf("cross-tenant leak (#1953): another tenant's org root appeared in toolListPeers output:\n%s", out)
}
if !strings.Contains(out, "org-a-child") {
t.Errorf("same-org child missing from toolListPeers output:\n%s", out)
}
// ExpectationsWereMet intentionally NOT asserted — leaky sibling expectation
// is deliberately left unconsumed by the fixed path.
}
// TestToolListPeers_SameOrg_SiblingsStillWork — positive companion for the MCP
// path: a non-root child still enumerates its same-org siblings + children + parent.
func TestToolListPeers_SameOrg_SiblingsStillWork(t *testing.T) {
mock := setupTestDB(t)
h := &MCPHandler{database: dbHandleForTest()}
caller := "org-a-child-1"
parent := "org-a-root"
mock.ExpectQuery("SELECT parent_id FROM workspaces WHERE id =").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"parent_id"}).AddRow(parent))
// Siblings — scoped to shared parent.
mock.ExpectQuery("WHERE w.parent_id = \\$1 AND w.id != \\$2 AND w.status").
WithArgs(parent, caller).
WillReturnRows(sqlmock.NewRows(mcpPeerCols).
AddRow("org-a-child-2", "Org A Sibling", "worker", "online", 1))
// Children — none.
mock.ExpectQuery("WHERE w.parent_id = \\$1 AND w.status").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows(mcpPeerCols))
// Parent.
mock.ExpectQuery("WHERE w.id = \\$1 AND w.status").
WithArgs(parent).
WillReturnRows(sqlmock.NewRows(mcpPeerCols).
AddRow(parent, "Org A Root", "lead", "online", 0))
out, err := h.toolListPeers(context.Background(), caller)
if err != nil {
t.Fatalf("toolListPeers returned error: %v", err)
}
if !strings.Contains(out, "Org A Sibling") || !strings.Contains(out, "Org A Root") {
t.Errorf("expected same-org sibling + parent in toolListPeers output:\n%s", out)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// -------------------------------------------------------------------------
// Path 3: a2a routing — a2a_proxy.proxyA2ARequest / resolveAgentURL
// -------------------------------------------------------------------------
// TestProxyA2A_CrossTenant_RoutingDenied is the #1953 regression for a2a
// routing. Caller and target are both org roots (parent_id IS NULL) belonging
// to DIFFERENT tenants. Pre-fix, CanCommunicate's "root-level siblings" rule
// waved this through and resolveAgentURL routed to the foreign tenant. Post-fix
// the org-scope guard resolves each to a different org root and returns 403
// BEFORE resolveAgentURL/dispatch.
func TestProxyA2A_CrossTenant_RoutingDenied(t *testing.T) {
mock := setupTestDB(t)
mr := setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
caller := "org-a-root"
target := "org-b-root" // different tenant
// A URL exists for the target; the guard must deny BEFORE it is used.
mr.Set(fmt.Sprintf("ws:%s:url", target), "http://localhost:1")
// CanCommunicate: both root-level (parent_id NULL) → its weak "root-level
// siblings" rule ALLOWS this. The org guard must catch it afterward.
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id = ").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(caller, nil))
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id = ").
WithArgs(target).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(target, nil))
// #1953 org-scope guard: caller resolves to org-a-root, target to org-b-root
// → different orgs → 403. (Each org root resolves to itself.)
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(caller))
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(target).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(target))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: target}}
body := `{"method":"message/send","params":{"message":{"role":"user","parts":[{"text":"cross-tenant"}]}}}`
c.Request = httptest.NewRequest("POST", "/workspaces/"+target+"/a2a", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
c.Request.Header.Set("X-Workspace-ID", caller)
handler.ProxyA2A(c)
if w.Code != http.StatusForbidden {
t.Fatalf("expected 403 for cross-tenant a2a routing, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("body not JSON: %v", err)
}
if msg, _ := resp["error"].(string); !strings.Contains(msg, "different org") {
t.Errorf("expected cross-org denial message, got %v", resp["error"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestResolveAgentURL_CrossTenant_RejectedViaSameOrg is a direct unit test of
// the sameOrg primitive that gates resolveAgentURL: a target in a different org
// must be reported as NOT same-org, so the a2a guard rejects it before
// resolveAgentURL is ever called.
func TestResolveAgentURL_CrossTenant_RejectedViaSameOrg(t *testing.T) {
mock := setupTestDB(t)
caller := "org-a-root"
target := "org-b-root"
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(caller))
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(target).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(target))
ok, err := sameOrg(context.Background(), dbHandleForTest(), caller, target)
if err != nil {
t.Fatalf("sameOrg returned unexpected error: %v", err)
}
if ok {
t.Errorf("expected cross-tenant workspaces to be reported as DIFFERENT orgs, got sameOrg=true")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestProxyA2A_SameOrg_RoutingAllowed — positive companion for a2a: two
// same-org siblings route successfully (mirrors TestProxyA2A_CallerIDPropagated
// but named to document the #1953 same-org allow path).
func TestProxyA2A_SameOrg_RoutingAllowed(t *testing.T) {
mock := setupTestDB(t)
mr := setupTestRedis(t)
allowLoopbackForTest(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
waitForHandlerAsyncBeforeDBCleanup(t, handler)
caller := "org-a-child-1"
target := "org-a-child-2"
parent := "org-a-root"
agentServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
fmt.Fprint(w, `{"jsonrpc":"2.0","id":"1","result":{}}`)
}))
defer agentServer.Close()
mr.Set(fmt.Sprintf("ws:%s:url", target), agentServer.URL)
// CanCommunicate — siblings under shared parent.
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id = ").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(caller, parent))
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id = ").
WithArgs(target).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(target, parent))
// #1953 org guard — both resolve to the same org root → allowed.
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(parent))
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(target).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(parent))
expectBudgetCheck(mock, target)
mock.ExpectExec("INSERT INTO activity_logs").WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: target}}
body := `{"method":"message/send","params":{"message":{"role":"user","parts":[{"text":"same-org"}]}}}`
c.Request = httptest.NewRequest("POST", "/workspaces/"+target+"/a2a", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
c.Request.Header.Set("X-Workspace-ID", caller)
handler.ProxyA2A(c)
time.Sleep(50 * time.Millisecond) // allow the async logA2ASuccess INSERT to flush
if w.Code != http.StatusOK {
t.Fatalf("expected 200 for same-org a2a routing, got %d: %s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
@@ -186,6 +186,8 @@ func (h *DelegationHandler) Delegate(c *gin.Context) {
})
if marshalErr != nil {
log.Printf("Delegation %s: json.Marshal a2aBody failed: %v", delegationID, marshalErr)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to build A2A request"})
return
}
// Fire-and-forget: send A2A in a background goroutine.
@@ -140,7 +140,14 @@ func buildHTTPResponse(statusCode int, body string) []byte {
}
// setupIntegrationFixtures inserts the rows executeDelegation requires:
// - workspaces: source and target (siblings, parent_id=NULL so CanCommunicate=true)
// - workspaces: source (org root) + target as its CHILD, so both live in the
// SAME org. CanCommunicate=true (parent↔child) AND the #1953 sameOrg() guard
// in proxyA2ARequest passes (both resolve to the same org root). A real
// delegation happens INSIDE one org. (Previously both were parent_id=NULL —
// two DISTINCT org roots — which only "communicated" via CanCommunicate's
// root-sibling rule; #1953 added a sameOrg() guard that now denies routing
// between two org roots as cross-tenant, so the success-path tests below
// must use a same-org source/target pair.)
// - activity_logs: the 'delegate' row that updateDelegationStatus UPDATE will find
// - delegations: the ledger row that recordLedgerStatus will UPDATE
//
@@ -148,13 +155,14 @@ func buildHTTPResponse(statusCode int, body string) []byte {
func setupIntegrationFixtures(t *testing.T, conn *sql.DB) func() {
t.Helper()
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
sourceID := integrationTestSourceID // org root (parent_id NULL); target hangs off it
for _, ws := range []struct {
id string
name string
parentID *string
}{
{integrationTestSourceID, "test-source", nil},
{integrationTestTargetID, "test-target", nil},
{integrationTestTargetID, "test-target", &sourceID}, // child of source → same org
} {
if _, err := conn.ExecContext(ctx,
`INSERT INTO workspaces (id, name, parent_id) VALUES ($1::uuid, $2, $3) ON CONFLICT (id) DO NOTHING`,
@@ -510,6 +518,94 @@ func TestIntegration_ExecuteDelegation_RedisDown_FallsBackToDB(t *testing.T) {
}
}
// TestIntegration_SameOrg_RealCTE_ResolvesAncestorChain is the regression gate
// for the org_scope.go recursive-CTE bug (#1953 follow-up). The sqlmock unit
// tests feed sameOrg() a pre-computed root_id row, so they CANNOT catch a wrong
// CTE — they assume it already returns the right value. Only a real Postgres
// run exercises orgRootSubtreeCTE itself.
//
// The bug: the CTE carried `id AS root_id` from the recursive SEED, so a
// non-root workspace resolved to ITSELF instead of its topmost ancestor. That
// made sameOrg() return false for two genuinely same-org workspaces and 403 a
// legitimate same-org a2a route (over-block). This test seeds a real
// root → child → grandchild chain plus a separate org root, and asserts:
// - every node in the chain resolves to the SAME org root (root, child, grandchild)
// - two workspaces in the same chain are sameOrg (incl. grandchild ↔ root)
// - a workspace in a DIFFERENT chain is NOT sameOrg (cross-tenant stays closed)
func TestIntegration_SameOrg_RealCTE_ResolvesAncestorChain(t *testing.T) {
conn := integrationDB(t)
const (
rootA = "11111111-1111-1111-1111-111111111111"
childA = "22222222-2222-2222-2222-222222222222"
grandchildA = "33333333-3333-3333-3333-333333333333"
rootB = "44444444-4444-4444-4444-444444444444"
)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
t.Cleanup(func() {
c2, cancel2 := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel2()
// Delete leaf-first to respect the parent_id self-FK.
for _, id := range []string{grandchildA, childA, rootA, rootB} {
conn.ExecContext(c2, `DELETE FROM workspaces WHERE id = $1`, id)
}
})
// Insert parent-before-child to satisfy the self-referential FK.
seed := []struct {
id, name string
parent *string
}{
{rootA, "org-a-root", nil},
{childA, "org-a-child", strPtr(rootA)},
{grandchildA, "org-a-grandchild", strPtr(childA)},
{rootB, "org-b-root", nil},
}
for _, s := range seed {
if _, err := conn.ExecContext(ctx,
`INSERT INTO workspaces (id, name, parent_id) VALUES ($1::uuid, $2, $3) ON CONFLICT (id) DO NOTHING`,
s.id, s.name, s.parent); err != nil {
t.Fatalf("seed %s: %v", s.name, err)
}
}
// Every node in chain A must resolve to rootA via the REAL CTE.
for _, id := range []string{rootA, childA, grandchildA} {
got, err := orgRootID(ctx, conn, id)
if err != nil {
t.Fatalf("orgRootID(%s): %v", id, err)
}
if got != rootA {
t.Errorf("orgRootID(%s) = %q, want rootA %q (CTE must walk to topmost ancestor)", id, got, rootA)
}
}
// Same-org positives — including the grandchild↔root pair that the buggy
// CTE got wrong.
for _, pair := range [][2]string{{childA, grandchildA}, {rootA, grandchildA}, {rootA, childA}} {
ok, err := sameOrg(ctx, conn, pair[0], pair[1])
if err != nil {
t.Fatalf("sameOrg(%s,%s): %v", pair[0], pair[1], err)
}
if !ok {
t.Errorf("sameOrg(%s,%s) = false, want true (same org chain)", pair[0], pair[1])
}
}
// Cross-org negative — isolation must stay closed.
for _, pair := range [][2]string{{rootA, rootB}, {grandchildA, rootB}, {childA, rootB}} {
ok, err := sameOrg(ctx, conn, pair[0], pair[1])
if err != nil {
t.Fatalf("sameOrg(%s,%s): %v", pair[0], pair[1], err)
}
if ok {
t.Errorf("sameOrg(%s,%s) = true, want false (different orgs — cross-tenant must stay denied)", pair[0], pair[1])
}
}
}
// extractHostPort parses "http://127.0.0.1:PORT/" and returns "127.0.0.1:PORT".
func extractHostPort(rawURL string) string {
// Simple parse: strip "http://" prefix and trailing slash.
@@ -1059,13 +1059,25 @@ func expectExecuteDelegationBase(mock sqlmock.Sqlmock) {
WillReturnResult(sqlmock.NewResult(0, 1))
// CanCommunicate: getWorkspaceRef(source) + getWorkspaceRef(target).
// Both are root-level workspaces (parent_id=NULL) → root-level siblings → allowed.
// Source and target are siblings under one shared parent (one tenant) →
// CanCommunicate allowed. (#1953: they must NOT both be parent_id=NULL —
// two distinct org roots are now treated as DIFFERENT orgs and routing
// between them is denied. A real delegation happens inside one org.)
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id = ").
WithArgs(testDeliverySourceID).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testDeliverySourceID, nil))
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testDeliverySourceID, "ws-org-root-159"))
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id = ").
WithArgs(testDeliveryTargetID).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testDeliveryTargetID, nil))
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testDeliveryTargetID, "ws-org-root-159"))
// #1953 cross-tenant guard: same-org check after CanCommunicate. Both
// resolve to the same org root → routing allowed.
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(testDeliverySourceID).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow("ws-org-root-159"))
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(testDeliveryTargetID).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow("ws-org-root-159"))
// resolveAgentURL: test callers always set the URL in Redis (mr.Set ws:{id}:url),
// so resolveAgentURL gets a cache hit and never falls back to DB.
@@ -237,7 +237,17 @@ func (h *DiscoveryHandler) Peers(c *gin.Context) {
var peers []map[string]interface{}
// Siblings
// Siblings — workspaces sharing the caller's parent.
//
// #1953 cross-tenant isolation: the OLD code's else-branch handled the
// org-root caller (parent_id IS NULL) by returning EVERY workspace with
// parent_id IS NULL — i.e. every other tenant's org root, since the
// workspaces table has no org_id column. That leaked peer identities/URLs
// across tenants. An org root has no siblings inside its own org (each
// tenant is a distinct org root), so the org-root caller now gets an empty
// sibling set; its real peers are its children, returned below. Only the
// parent_id-bound branch enumerates siblings, and that is already scoped to
// one parent (one tenant).
if parentID.Valid {
siblings, _ := queryPeerMaps(`
SELECT w.id, w.name, COALESCE(w.role, ''), w.tier, w.status,
@@ -246,14 +256,6 @@ func (h *DiscoveryHandler) Peers(c *gin.Context) {
FROM workspaces w WHERE w.parent_id = $1 AND w.id != $2 AND w.status != 'removed'`,
parentID.String, workspaceID)
peers = append(peers, siblings...)
} else {
siblings, _ := queryPeerMaps(`
SELECT w.id, w.name, COALESCE(w.role, ''), w.tier, w.status,
COALESCE(w.agent_card, 'null'::jsonb), COALESCE(w.url, ''),
w.parent_id, w.active_tasks
FROM workspaces w WHERE w.parent_id IS NULL AND w.id != $1 AND w.status != 'removed'`,
workspaceID)
peers = append(peers, siblings...)
}
// Children — exclude self defensively. A child row whose parent_id
@@ -223,10 +223,10 @@ func TestPeers_RootWorkspace_NoPeers(t *testing.T) {
peerCols := []string{"id", "name", "role", "tier", "status", "agent_card", "url", "parent_id", "active_tasks"}
// Siblings (other root-level workspaces) — none
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id IS NULL AND w.id != \\$1").
WithArgs("ws-root-alone").
WillReturnRows(sqlmock.NewRows(peerCols))
// #1953: an org-root caller (parent_id IS NULL) now issues NO sibling
// query at all. The old `WHERE w.parent_id IS NULL` sibling read returned
// EVERY tenant's org root (cross-tenant leak); an org root has no siblings
// inside its own org, so the handler skips the sibling read entirely.
// Children — none. #383 added explicit `w.id != $2` self-filter.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.id != \\$2").
@@ -255,22 +255,20 @@ func TestExtended_SecretsListEmpty(t *testing.T) {
// ---------- TestSecretsSet (Extended) ----------
func TestExtended_SecretsSet(t *testing.T) {
// internal#691: the per-workspace strip gate now defaults to platform_managed
// on empty MOLECULE_LLM_BILLING_MODE (closed default). This test's intent is
// the happy path of persisting a vendor key, so put the org into byok which
// matches the pre-#691 implicit behavior of an unset env.
t.Setenv("MOLECULE_LLM_BILLING_MODE", "byok")
// internal#718 P2-B: the per-workspace strip gate keys off the DERIVED mode
// (org rung retired). This test's intent is the happy path of persisting a
// vendor key on a byok workspace; the realistic way a workspace is byok for
// a direct vendor-key write is an explicit operator override (the escape
// hatch the reject error itself points to: PUT /admin/.../llm-billing-mode).
// The override short-circuits the resolver to byok in a single read, so the
// bypass-list check is skipped and the write proceeds.
t.Setenv("MOLECULE_LLM_BILLING_MODE", "platform_managed") // org env ignored now
mock := setupTestDB(t)
handler := NewSecretsHandler(nil)
// internal#691: secrets.Set now consults ResolveLLMBillingMode before the
// strip gate. Mock returns no row → resolver falls through to the org
// default (byok, set via t.Setenv above) → bypass-list check is skipped
// and the write proceeds. This pattern is the test-side mirror of the
// real-prod fall-through behavior for a fresh workspace with no override.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs("22222222-2222-2222-2222-222222222222").
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}))
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
// Expect INSERT (encrypted value is dynamic, use AnyArg)
mock.ExpectExec("INSERT INTO workspace_secrets").
@@ -453,6 +451,14 @@ func TestExtended_DiscoverMissingHeader(t *testing.T) {
// ---------- TestPeers (Extended) ----------
// TestExtended_Peers verifies a root-level (org-root) workspace's peer view.
//
// #1953: previously a root-level caller issued `WHERE w.parent_id IS NULL`
// for siblings, which returned EVERY other tenant's org root as a "peer"
// (cross-tenant leak, since the workspaces table has no org_id column). After
// the fix an org root has no cross-tenant siblings; its only peers are its own
// children. This test asserts the child is returned and that NO sibling query
// is issued (no `parent_id IS NULL` read).
func TestExtended_Peers(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
@@ -463,17 +469,14 @@ func TestExtended_Peers(t *testing.T) {
WithArgs("ws-peer").
WillReturnRows(sqlmock.NewRows([]string{"parent_id"}).AddRow(nil))
// Expect root-level siblings query (parent IS NULL, excluding self)
mock.ExpectQuery("SELECT w.id, w.name").
WithArgs("ws-peer").
WillReturnRows(sqlmock.NewRows([]string{"id", "name", "role", "tier", "status", "agent_card", "url", "parent_id", "active_tasks"}).
AddRow("ws-sibling", "Sibling Agent", "worker", 1, "online", []byte("null"), "http://localhost:9001", nil, 0))
// NO root-level sibling query is issued for an org-root caller anymore.
// Expect children query (workspaces with parent_id = ws-peer, excluding self)
// Query now binds (parent_id, self_id) for the self-filter guard added in #383.
// Children query (workspaces with parent_id = ws-peer, excluding self).
// Query binds (parent_id, self_id) for the self-filter guard added in #383.
mock.ExpectQuery("SELECT w.id, w.name").
WithArgs("ws-peer", "ws-peer").
WillReturnRows(sqlmock.NewRows([]string{"id", "name", "role", "tier", "status", "agent_card", "url", "parent_id", "active_tasks"}))
WillReturnRows(sqlmock.NewRows([]string{"id", "name", "role", "tier", "status", "agent_card", "url", "parent_id", "active_tasks"}).
AddRow("ws-child", "Child Agent", "worker", 1, "online", []byte("null"), "http://localhost:9001", "ws-peer", 0))
// No parent query since workspace is root-level
@@ -493,10 +496,10 @@ func TestExtended_Peers(t *testing.T) {
t.Fatalf("failed to parse response: %v", err)
}
if len(resp) != 1 {
t.Fatalf("expected 1 peer, got %d", len(resp))
t.Fatalf("expected 1 peer (the child), got %d", len(resp))
}
if resp[0]["name"] != "Sibling Agent" {
t.Errorf("expected peer name 'Sibling Agent', got %v", resp[0]["name"])
if resp[0]["name"] != "Child Agent" {
t.Errorf("expected peer name 'Child Agent', got %v", resp[0]["name"])
}
if err := mock.ExpectationsWereMet(); err != nil {
@@ -43,10 +43,36 @@ import (
"database/sql"
"errors"
"fmt"
"log"
"sync"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/crypto"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/db"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/providers"
)
// providerManifest is the parsed provider registry, loaded once. The registry
// is embedded (go:embed, no network) and immutable for the process lifetime, so
// a single Load is safe to memoize. A load failure is cached too (registryErr):
// it can only happen on a malformed embedded YAML, which is a build-time defect
// the verify-providers-gen + sync gates already catch, so failing closed
// (treat as "cannot derive" → platform default) is correct and we don't retry.
var (
providerRegistryOnce sync.Once
providerRegistryManifest *providers.Manifest
providerRegistryErr error
)
func providerRegistry() (*providers.Manifest, error) {
providerRegistryOnce.Do(func() {
providerRegistryManifest, providerRegistryErr = providers.LoadManifest()
if providerRegistryErr != nil {
log.Printf("llm_billing_mode: FATAL — provider registry failed to load: %v (billing will default-closed to platform_managed)", providerRegistryErr)
}
})
return providerRegistryManifest, providerRegistryErr
}
// Constants mirror molecule-controlplane/internal/credits/llm_billing.go.
// Kept as string literals (not imports) because workspace-server has no
// build-time dependency on the CP module; the values are stable wire
@@ -67,6 +93,19 @@ const (
BillingModeSourceWorkspaceOverride BillingModeSource = "workspace_override"
BillingModeSourceOrgDefault BillingModeSource = "org_default"
BillingModeSourceConstantFallback BillingModeSource = "constant_fallback"
// BillingModeSourceDerivedProvider means the mode was DERIVED from the
// workspace's (runtime, model) via the provider registry — the SSOT
// (internal#718 P2-B). IsPlatform(derived) → platform_managed, else byok.
// This is the highest-precedence source after an explicit operator override
// and SUPERSEDES the prior stored-LLM_PROVIDER read (#1966).
BillingModeSourceDerivedProvider BillingModeSource = "derived_provider"
// BillingModeSourceDerivedDefault means the registry could not derive a
// provider for the (runtime, model) — no model, unknown runtime,
// unregistered/ambiguous model — so the mode defaulted closed to
// platform_managed (CTO-confirmed "unset → platform default"). Distinct from
// derived_provider so operators can see "we defaulted" vs "we derived
// platform".
BillingModeSourceDerivedDefault BillingModeSource = "derived_default"
)
// BillingModeResolution is the structured answer the admin GET route returns
@@ -74,11 +113,18 @@ const (
// shape, so the resolver test asserts both the mode AND the source per case
// (catches a bug where the right mode is returned via the wrong layer).
type BillingModeResolution struct {
WorkspaceID string `json:"workspace_id"`
ResolvedMode string `json:"resolved_mode"`
WorkspaceOverride *string `json:"workspace_override"` // nil = inherit
OrgDefault string `json:"org_default"` // already default-closed by CP
Source BillingModeSource `json:"source"`
WorkspaceID string `json:"workspace_id"`
ResolvedMode string `json:"resolved_mode"`
WorkspaceOverride *string `json:"workspace_override"` // nil = inherit
OrgDefault string `json:"org_default"` // RETIRED as a billing source (internal#718 P2-B); always platform_managed, kept for wire-compat
Source BillingModeSource `json:"source"`
// ProviderSelection surfaces the DERIVED provider name (internal#718 P2-B)
// when the mode came from the registry derivation — the literal provider the
// (runtime, model) resolved to (e.g. "platform", "kimi-coding", "openai"), or
// the raw model id when derivation failed. nil when an explicit operator
// override or the empty-id default decided. Lets the admin route answer "why
// is this workspace byok?" with the derived provider, not a stored value.
ProviderSelection *string `json:"provider_selection"`
}
// isKnownBillingMode is the enum-recognizer for the resolver's default-closed
@@ -95,24 +141,137 @@ func isKnownBillingMode(s string) bool {
}
}
// normalizeOrgDefault applies the same default-closed contract to the
// org-level input as the workspace override gets. The org_default arrives
// from tenant_config which already COALESCEs NULL → platform_managed at the
// CP SQL layer, but we DO NOT trust that contract here — if CP regresses or
// the tenant_config env wasn't populated (race on boot), we still default-
// close. Same principle: never honor a garbled value.
func normalizeOrgDefault(orgMode string) string {
if isKnownBillingMode(orgMode) {
return orgMode
// readWorkspaceBillingOverride reads the OPTIONAL explicit operator override
// (workspaces.llm_billing_mode). Returns:
//
// (mode, true, nil) — a recognized override is set → operator pinned the mode
// ("", false, nil) — NULL / garbled / row-missing → no explicit override
// ("", false, err) — DB error → caller defaults closed + propagates
//
// internal#718 P2-B retires the org rung; this column is the ONLY stored
// billing signal that survives, and ONLY as an explicit override on top of the
// derived provider (CTO 2026-05-27).
func readWorkspaceBillingOverride(ctx context.Context, workspaceID string) (string, bool, error) {
var wsOverride sql.NullString
err := db.DB.QueryRowContext(ctx,
`SELECT llm_billing_mode FROM workspaces WHERE id = $1`,
workspaceID,
).Scan(&wsOverride)
switch {
case errors.Is(err, sql.ErrNoRows):
return "", false, nil
case err != nil:
return "", false, fmt.Errorf("resolve workspace llm_billing_mode override for %s: %w", workspaceID, err)
}
return LLMBillingModePlatformManaged
if wsOverride.Valid && isKnownBillingMode(wsOverride.String) {
return wsOverride.String, true, nil
}
return "", false, nil
}
// ResolveLLMBillingMode is the canonical resolver. Every code path that
// previously gated on `os.Getenv("MOLECULE_LLM_BILLING_MODE") == "platform_managed"`
// must call this instead and gate on the returned mode. The architectural
// test (resolver_ast_test.go) asserts there is no remaining call site of
// the old shape outside the resolver-input wiring.
// ResolveLLMBillingModeDerived is the SSOT billing-mode resolver (internal#718
// P2-B). It DERIVES the provider from (runtime, model) via the provider
// registry and decides platform-vs-byok from IsPlatform(derived) — it does NOT
// read a stored LLM_PROVIDER (superseding #1966's stored-read approach) and
// does NOT read the org rung (retired, CTO 2026-05-27).
//
// Precedence (highest first):
//
// 1. EXPLICIT operator override (workspaces.llm_billing_mode, a recognized
// value). The only stored billing signal that survives — an escape hatch,
// not the primary signal.
// 2. DERIVE: providers.DeriveProvider(runtime, model, availableAuthEnv).
// - resolves to the closed `platform` provider → platform_managed
// - resolves to any other (BYOK/third-party) provider → byok ← THE FIX
// 3. DEFAULT-CLOSED: derive fails (no model, unknown runtime, unregistered or
// ambiguous model) → platform_managed (CTO "unset → platform default"). A
// derive failure NEVER silently flips a workspace to byok (which would
// strip the platform creds it may legitimately need).
//
// availableAuthEnv is the set of auth-env-var NAMES present for the workspace
// (never secret values) — the same disambiguation input DeriveProvider uses to
// split anthropic-oauth from anthropic-api. May be nil.
//
// A returned error never prevents a decision: ResolvedMode is always a valid
// enum value (default-closed). The error is informational (log + surface).
func ResolveLLMBillingModeDerived(ctx context.Context, workspaceID, runtime, model string, availableAuthEnv []string) (BillingModeResolution, error) {
res := BillingModeResolution{
WorkspaceID: workspaceID,
// OrgDefault is retired as a billing source (internal#718 P2-B). Kept on
// the struct for wire-compat (admin route / CP mirror) but always the
// closed constant — never consulted in the decision.
OrgDefault: LLMBillingModePlatformManaged,
}
// Pre-provision context (no workspace row yet): no override to read, default
// closed. (DeriveProvider could still run from the passed runtime/model, but
// the no-id path historically does no DB work and the strip gate only runs
// post-create, so keep it a pure default to preserve that contract.)
if workspaceID == "" {
res.ResolvedMode = LLMBillingModePlatformManaged
res.Source = BillingModeSourceDerivedDefault
return res, nil
}
// Precedence 1: explicit operator override.
if mode, ok, err := readWorkspaceBillingOverride(ctx, workspaceID); err != nil {
// DB error — default closed AND propagate (never flip on a transient error).
res.ResolvedMode = LLMBillingModePlatformManaged
res.Source = BillingModeSourceConstantFallback
return res, err
} else if ok {
m := mode
res.WorkspaceOverride = &m
res.ResolvedMode = mode
res.Source = BillingModeSourceWorkspaceOverride
return res, nil
}
// Precedence 2: DERIVE the provider from (runtime, model).
manifest, mErr := providerRegistry()
if mErr != nil || manifest == nil {
// Registry unavailable (malformed embedded YAML — a build-time defect the
// gates catch). Default closed.
res.ResolvedMode = LLMBillingModePlatformManaged
res.Source = BillingModeSourceDerivedDefault
return res, mErr
}
provider, dErr := manifest.DeriveProvider(runtime, model, availableAuthEnv)
if dErr != nil {
// No model / unknown runtime / unregistered / ambiguous → default closed.
// NOT an error to the caller: an unregistered model is a legitimate
// "we can't say it's BYOK, so bill the platform default" outcome, and the
// only-registered gate at the create/config API is where an unregistered
// model is rejected loudly. Here we just fail closed for safety.
res.ResolvedMode = LLMBillingModePlatformManaged
res.Source = BillingModeSourceDerivedDefault
sel := model
if sel != "" {
res.ProviderSelection = &sel
}
return res, nil
}
derivedName := provider.Name
res.ProviderSelection = &derivedName
res.Source = BillingModeSourceDerivedProvider
if provider.IsPlatform() {
res.ResolvedMode = LLMBillingModePlatformManaged
} else {
// A specific (non-platform) vendor was derived → bring-your-own-key.
res.ResolvedMode = LLMBillingModeBYOK
}
return res, nil
}
// ResolveLLMBillingMode is the legacy-signature resolver retained for callers
// that do not have (runtime, model) in hand (the admin GET/PUT route and the
// secrets remote-pull path). It reads the workspace's stored runtime + model +
// available auth env from the DB and delegates to the DERIVED resolver
// (internal#718 P2-B) — the orgMode parameter is RETIRED (the org rung is no
// longer a billing source) and is ignored; it stays in the signature only to
// avoid churning the two callers in this PR. The architectural test asserts no
// remaining code path gates on os.Getenv("MOLECULE_LLM_BILLING_MODE") for the
// strip decision (that env is no longer read into the decision at all).
//
// Returning an error does NOT prevent the caller from making a decision —
// the returned mode is always a valid enum value (default-closed to
@@ -120,75 +279,160 @@ func normalizeOrgDefault(orgMode string) string {
// branch. The error is informational: log it, surface it to operators, but
// the strip-gate decision is already safe.
func ResolveLLMBillingMode(ctx context.Context, workspaceID, orgMode string) (BillingModeResolution, error) {
res := BillingModeResolution{
WorkspaceID: workspaceID,
OrgDefault: normalizeOrgDefault(orgMode),
}
_ = orgMode // org rung retired (internal#718 P2-B); parameter ignored.
if workspaceID == "" {
// No workspace ID = pre-provision context (templating, validation).
// Resolve against the org default only, no DB read.
res.ResolvedMode = res.OrgDefault
res.Source = BillingModeSourceOrgDefault
if !isKnownBillingMode(orgMode) {
// Org default was garbled/NULL and we clamped to platform_managed.
// Mark the source as constant_fallback so the operator can see
// the clamp happened, not that the org "really" said platform_managed.
res.Source = BillingModeSourceConstantFallback
}
return res, nil
// Pre-provision context (templating, validation): default closed, no DB.
return ResolveLLMBillingModeDerived(ctx, "", "", "", nil)
}
var wsOverride sql.NullString
err := db.DB.QueryRowContext(ctx,
`SELECT llm_billing_mode FROM workspaces WHERE id = $1`,
// Precedence 1: explicit operator override. Read it FIRST so an overridden
// workspace short-circuits without the extra runtime/secrets reads (and so
// the query order is override → runtime → secrets, matching the derived
// resolver's own override-first precedence).
if mode, ok, err := readWorkspaceBillingOverride(ctx, workspaceID); err != nil {
return BillingModeResolution{
WorkspaceID: workspaceID,
OrgDefault: LLMBillingModePlatformManaged,
ResolvedMode: LLMBillingModePlatformManaged,
Source: BillingModeSourceConstantFallback,
}, err
} else if ok {
m := mode
return BillingModeResolution{
WorkspaceID: workspaceID,
OrgDefault: LLMBillingModePlatformManaged,
ResolvedMode: mode,
WorkspaceOverride: &m,
Source: BillingModeSourceWorkspaceOverride,
}, nil
}
// Precedence 2: DERIVE. Read the stored (runtime, model, available-auth-env)
// so the derived resolver can DeriveProvider for callers that don't carry
// them (admin route, secrets remote-pull). A read miss/error degrades
// gracefully: pass the empty/partial inputs through — DeriveProvider then
// errors and the derived resolver defaults closed to platform_managed.
//
// ResolveLLMBillingModeDerived re-reads the override (NULL again here) before
// deriving; that one extra cheap read keeps the derived resolver a complete,
// independently-callable SSOT rather than splitting its precedence across two
// functions.
runtime, model, authEnv := readWorkspaceDeriveInputs(ctx, workspaceID)
return ResolveLLMBillingModeDerived(ctx, workspaceID, runtime, model, authEnv)
}
// readWorkspaceDeriveInputs loads the workspace's stored runtime + selected
// model + the auth-env-var NAMES present in its secrets — the inputs
// DeriveProvider needs. Best-effort: any read error returns whatever was
// gathered (the derived resolver fails closed on incomplete inputs). The model
// is the MODEL workspace_secret (the canvas-picked id, written by setModelSecret
// / Create); runtime is the workspaces.runtime column (defaults claude-code).
// availableAuthEnv is the subset of secret KEYS that are recognized provider
// auth-env names (never values), so DeriveProvider's auth-env tie-break can fire
// the same way it does on the provision path.
func readWorkspaceDeriveInputs(ctx context.Context, workspaceID string) (runtime, model string, availableAuthEnv []string) {
var rt sql.NullString
if err := db.DB.QueryRowContext(ctx,
`SELECT runtime FROM workspaces WHERE id = $1`, workspaceID,
).Scan(&rt); err != nil {
if !errors.Is(err, sql.ErrNoRows) {
log.Printf("llm_billing_mode: read runtime for %s: %v (deriving with empty runtime)", workspaceID, err)
}
}
runtime = rt.String
if runtime == "" {
// Mirror the DB column default so an unset runtime still derives.
runtime = "claude-code"
}
// Gather model + auth-env-name keys from workspace_secrets in one pass.
authSet := authEnvNameSet()
rows, err := db.DB.QueryContext(ctx,
`SELECT key, encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id = $1`,
workspaceID,
).Scan(&wsOverride)
switch {
case errors.Is(err, sql.ErrNoRows):
// Workspace row missing — concurrent delete, or pre-create call. Don't
// silently flip; fall through to org default. Source stays org_default
// so operators can see the row-missing case is being handled as a
// fallback, not a workspace-explicit decision.
res.ResolvedMode = res.OrgDefault
res.Source = BillingModeSourceOrgDefault
if !isKnownBillingMode(orgMode) {
res.Source = BillingModeSourceConstantFallback
)
if err != nil {
log.Printf("llm_billing_mode: read secrets for %s: %v (deriving with no model/auth-env)", workspaceID, err)
return runtime, model, availableAuthEnv
}
defer rows.Close()
for rows.Next() {
var k string
var v []byte
var ver int
if rows.Scan(&k, &v, &ver) != nil {
continue
}
if k == "MODEL" {
if dec, derr := crypto.DecryptVersioned(v, ver); derr == nil {
model = string(dec)
}
continue
}
// Only the KEY matters for auth-env disambiguation (the value is the
// secret; we never decrypt it for this purpose). Record recognized
// provider auth-env names.
if _, ok := authSet[k]; ok {
availableAuthEnv = append(availableAuthEnv, k)
}
return res, nil
case err != nil:
// DB error — default-closed to platform_managed AND propagate the
// error so operators get a structured log line. The caller is
// expected to log and continue with the safe default.
res.ResolvedMode = LLMBillingModePlatformManaged
res.Source = BillingModeSourceConstantFallback
return res, fmt.Errorf("resolve workspace llm_billing_mode for %s: %w", workspaceID, err)
}
return runtime, model, availableAuthEnv
}
if wsOverride.Valid && isKnownBillingMode(wsOverride.String) {
mode := wsOverride.String
res.WorkspaceOverride = &mode
res.ResolvedMode = mode
res.Source = BillingModeSourceWorkspaceOverride
return res, nil
}
// authEnvNameSet is the union of every provider's auth_env names in the
// registry — the recognized set readWorkspaceDeriveInputs filters secret keys
// against. Loaded once from the registry so it stays in sync with the SSOT (no
// hardcoded auth-env vocabulary). Registry-load failure yields an empty set
// (derive then runs without the auth-env tie-break, which only matters for the
// oauth-vs-api overlap; safe — it errors to default-closed rather than guessing).
var (
authEnvNameSetOnce sync.Once
authEnvNameSetVal map[string]struct{}
)
// Override row present but the value is NULL or garbled. Fall through.
// If the value was non-NULL but garbled (CHECK constraint should prevent
// this, but defense in depth — a future migration could relax the check
// or another path could write the column directly), surface the raw
// override value so operators can spot the corrupt row.
if wsOverride.Valid {
raw := wsOverride.String
res.WorkspaceOverride = &raw
func authEnvNameSet() map[string]struct{} {
authEnvNameSetOnce.Do(func() {
authEnvNameSetVal = map[string]struct{}{}
m, err := providerRegistry()
if err != nil || m == nil {
return
}
for _, p := range m.Providers {
for _, e := range p.AuthEnv {
authEnvNameSetVal[e] = struct{}{}
}
}
})
return authEnvNameSetVal
}
// availableAuthEnvNames returns the recognized provider auth-env-var NAMES
// present (non-empty) in envVars — the DeriveProvider auth-env tie-break input.
// Never returns secret VALUES, only the env-var names. Used by the provision
// path (applyPlatformManagedLLMEnv), which already has the workspace env in
// hand, so it derives without a secrets DB round-trip.
func availableAuthEnvNames(envVars map[string]string) []string {
authSet := authEnvNameSet()
var out []string
for k, v := range envVars {
if v == "" {
continue
}
if _, ok := authSet[k]; ok {
out = append(out, k)
}
}
res.ResolvedMode = res.OrgDefault
res.Source = BillingModeSourceOrgDefault
if !isKnownBillingMode(orgMode) {
res.Source = BillingModeSourceConstantFallback
return out
}
// derefOrEmpty returns the pointed-to string or "" for a nil pointer. Used in
// log lines that surface an optional *string field.
func derefOrEmpty(s *string) string {
if s == nil {
return ""
}
return res, nil
return *s
}
// SetWorkspaceLLMBillingMode writes the override column. Pass mode=="" to
@@ -0,0 +1,232 @@
package handlers
// llm_billing_mode_derived_test.go — tests for the DERIVED billing-mode
// resolver (internal#718 P2-B). The platform-vs-byok decision now DERIVES the
// provider from (runtime, model) via the provider registry and keys off
// IsPlatform(derived) — it does NOT read a stored LLM_PROVIDER (supersedes
// #1966's stored-read approach) and does NOT read the org rung (retired,
// CTO 2026-05-27). `workspaces.llm_billing_mode` survives ONLY as an optional
// explicit operator override (first precedence).
//
// This file pins the explicit BEHAVIOR DELTA the RFC's P2 calls out:
// - platform-derived (or unset → platform default) → platform_managed (UNCHANGED)
// - non-platform-derived → byok (THE FIX — the Reno leak class)
// - explicit override → wins over derive
// - derive error / unregistered → platform_managed (default-closed)
import (
"context"
"errors"
"testing"
"github.com/DATA-DOG/go-sqlmock"
)
// expectOverrideQuery sets up the workspaces.llm_billing_mode override read
// (first precedence). value=="" means NULL (no override).
func expectOverrideQuery(m sqlmock.Sqlmock, wsID, value string) {
rows := sqlmock.NewRows([]string{"llm_billing_mode"})
if value == "" {
rows.AddRow(nil)
} else {
rows.AddRow(value)
}
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(rows)
}
func TestResolveLLMBillingModeDerived_BehaviorDelta(t *testing.T) {
ctx := context.Background()
const wsID = "33333333-3333-3333-3333-333333333333"
type tc struct {
name string
runtime string
model string
authEnv []string
override string // "" = NULL override (no explicit operator override)
wantMode string
wantSource BillingModeSource
wantErr bool
}
cases := []tc{
{
// PLATFORM-DERIVED → platform_managed (UNCHANGED). claude-code +
// a platform-namespaced model id derives to the closed `platform`
// provider → IsPlatform → platform_managed.
name: "platform_derived_keeps_platform_managed_UNCHANGED",
runtime: "claude-code",
model: "anthropic/claude-opus-4-7",
override: "",
wantMode: LLMBillingModePlatformManaged,
wantSource: BillingModeSourceDerivedProvider,
},
{
// NON-PLATFORM-DERIVED → byok (THE FIX). claude-code + the
// kimi-coding-native model derives to the non-platform kimi-coding
// provider → IsPlatform=false → byok. This is the Reno billing-leak
// class: pre-P2 it resolved platform_managed and ran on platform creds.
name: "non_platform_derived_resolves_byok_THE_FIX",
runtime: "claude-code",
model: "kimi-for-coding",
override: "",
wantMode: LLMBillingModeBYOK,
wantSource: BillingModeSourceDerivedProvider,
},
{
// NON-PLATFORM vendor on codex: gpt-5.5 derives to `openai` (BYOK).
name: "non_platform_openai_codex_byok",
runtime: "codex",
model: "gpt-5.5",
override: "",
wantMode: LLMBillingModeBYOK,
wantSource: BillingModeSourceDerivedProvider,
},
{
// PLATFORM-DERIVED on codex: openai/gpt-5.4 is platform-namespaced.
name: "platform_derived_codex_platform_managed",
runtime: "codex",
model: "openai/gpt-5.4",
override: "",
wantMode: LLMBillingModePlatformManaged,
wantSource: BillingModeSourceDerivedProvider,
},
{
// UNSET model → platform default (CTO-confirmed "unset → platform
// default"). No model means nothing to derive; default-closed.
name: "unset_model_platform_default",
runtime: "claude-code",
model: "",
override: "",
wantMode: LLMBillingModePlatformManaged,
wantSource: BillingModeSourceDerivedDefault,
},
{
// UNREGISTERED model → derive errors → platform default (default-closed,
// NOT a silent byok flip that would strip a workspace's creds).
name: "unregistered_model_derive_error_platform_default",
runtime: "claude-code",
model: "totally-made-up-model-xyz",
override: "",
wantMode: LLMBillingModePlatformManaged,
wantSource: BillingModeSourceDerivedDefault,
},
{
// UNKNOWN runtime → derive errors → platform default (default-closed).
name: "unknown_runtime_platform_default",
runtime: "no-such-runtime",
model: "claude-opus-4-7",
override: "",
wantMode: LLMBillingModePlatformManaged,
wantSource: BillingModeSourceDerivedDefault,
},
{
// EXPLICIT OVERRIDE wins over derive: a non-platform-deriving model
// kept on platform_managed by an operator override (escape hatch).
name: "explicit_override_platform_managed_wins_over_byok_derive",
runtime: "claude-code",
model: "kimi-for-coding", // would derive byok
override: LLMBillingModePlatformManaged,
wantMode: LLMBillingModePlatformManaged,
wantSource: BillingModeSourceWorkspaceOverride,
},
{
// EXPLICIT OVERRIDE byok wins over a platform-deriving model.
name: "explicit_override_byok_wins_over_platform_derive",
runtime: "claude-code",
model: "anthropic/claude-opus-4-7", // would derive platform_managed
override: LLMBillingModeBYOK,
wantMode: LLMBillingModeBYOK,
wantSource: BillingModeSourceWorkspaceOverride,
},
{
// EXPLICIT OVERRIDE disabled wins (no-LLM workspace).
name: "explicit_override_disabled_wins",
runtime: "claude-code",
model: "anthropic/claude-opus-4-7",
override: LLMBillingModeDisabled,
wantMode: LLMBillingModeDisabled,
wantSource: BillingModeSourceWorkspaceOverride,
},
{
// AUTH-ENV disambiguation: claude-code's anthropic-oauth (alias
// model "opus") vs anthropic-api both could match a bare alias; with
// CLAUDE_CODE_OAUTH_TOKEN present it derives anthropic-oauth → byok.
name: "auth_env_disambiguates_oauth_byok",
runtime: "claude-code",
model: "opus",
authEnv: []string{"CLAUDE_CODE_OAUTH_TOKEN"},
override: "",
wantMode: LLMBillingModeBYOK,
wantSource: BillingModeSourceDerivedProvider,
},
}
for _, c := range cases {
t.Run(c.name, func(t *testing.T) {
mock := setupTestDB(t)
expectOverrideQuery(mock, wsID, c.override)
res, err := ResolveLLMBillingModeDerived(ctx, wsID, c.runtime, c.model, c.authEnv)
if (err != nil) != c.wantErr {
t.Fatalf("err: got %v wantErr=%v", err, c.wantErr)
}
if res.ResolvedMode != c.wantMode {
t.Errorf("mode: got %q want %q", res.ResolvedMode, c.wantMode)
}
if res.Source != c.wantSource {
t.Errorf("source: got %q want %q", res.Source, c.wantSource)
}
if !isKnownBillingMode(res.ResolvedMode) {
t.Errorf("post-condition: resolved mode %q not a known enum", res.ResolvedMode)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
})
}
}
// TestResolveLLMBillingModeDerived_OverrideDBError_DefaultClosed asserts a DB
// error reading the override column defaults closed to platform_managed and
// propagates the error — never silently flips a workspace off platform creds.
func TestResolveLLMBillingModeDerived_OverrideDBError_DefaultClosed(t *testing.T) {
ctx := context.Background()
const wsID = "44444444-4444-4444-4444-444444444444"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnError(errors.New("connection refused"))
res, err := ResolveLLMBillingModeDerived(ctx, wsID, "claude-code", "kimi-for-coding", nil)
if err == nil {
t.Fatalf("expected propagated DB error, got nil")
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Errorf("default-closed: DB error must resolve platform_managed, got %q", res.ResolvedMode)
}
if res.Source != BillingModeSourceConstantFallback {
t.Errorf("source: got %q want %q", res.Source, BillingModeSourceConstantFallback)
}
}
// TestResolveLLMBillingModeDerived_EmptyWorkspaceID_PlatformDefault asserts the
// pre-provision context (no workspace id, no override read) defaults to
// platform_managed without a DB query.
func TestResolveLLMBillingModeDerived_EmptyWorkspaceID_PlatformDefault(t *testing.T) {
ctx := context.Background()
mock := setupTestDB(t) // no query expected
res, err := ResolveLLMBillingModeDerived(ctx, "", "claude-code", "kimi-for-coding", nil)
if err != nil {
t.Fatalf("unexpected err: %v", err)
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Errorf("empty workspace id must default platform_managed, got %q", res.ResolvedMode)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
@@ -36,10 +36,12 @@ import (
// GetWorkspaceLLMBillingMode handles GET /admin/workspaces/:id/llm-billing-mode.
//
// Reads the workspace override + the org-level default (from the same
// MOLECULE_LLM_BILLING_MODE env var the provisioner reads at strip-gate time —
// keeps the two paths consistent so the GET result matches what the strip
// gate would compute) and returns the structured resolution.
// internal#718 P2-B: the resolution now DERIVES the provider from the
// workspace's stored (runtime, model) via the registry (org rung retired). The
// passed orgMode is ignored by the resolver; it is left here only to avoid
// churning the call signature. The returned resolution matches what the
// provision-time strip gate computes (same derived resolver), so operators see
// the real platform-vs-byok decision + the derived provider in ProviderSelection.
func GetWorkspaceLLMBillingMode(c *gin.Context) {
workspaceID := strings.TrimSpace(c.Param("id"))
if !uuidRegex.MatchString(workspaceID) {
@@ -29,13 +29,42 @@ func init() {
const testWSID = "44444444-4444-4444-4444-444444444444"
func TestGetWorkspaceLLMBillingMode_HappyPath_InheritsOrgDefault(t *testing.T) {
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModeBYOK)
// expectDeriveShimQueries sets up the three reads the legacy-signature
// ResolveLLMBillingMode shim makes on a no-explicit-override path
// (internal#718 P2-B): the override read (NULL here), the workspaces.runtime
// read, and the workspace_secrets scan (for MODEL + auth-env names). model==""
// means no MODEL secret row.
func expectDeriveShimQueries(m sqlmock.Sqlmock, wsID, runtime, model string) {
nullOverride := func() {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
}
// Order: override(NULL) shim check, runtime, secrets, override(NULL) again
// (the derived resolver re-checks the override as a complete SSOT).
nullOverride()
m.ExpectQuery(`SELECT runtime FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow(runtime))
secretRows := sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"})
if model != "" {
// encryption_version 0 = plaintext passthrough (crypto.DecryptVersioned).
secretRows.AddRow("MODEL", []byte(model), 0)
}
m.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id = \$1`).
WithArgs(wsID).
WillReturnRows(secretRows)
nullOverride()
}
// internal#718 P2-B: org rung retired. A no-override workspace's mode is now
// DERIVED from its stored (runtime, model). A claude-code workspace with a
// non-platform-deriving model (kimi-for-coding) resolves byok via
// derived_provider — NOT the old "inherit org default".
func TestGetWorkspaceLLMBillingMode_HappyPath_DerivesByokFromModel(t *testing.T) {
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModeBYOK) // org env ignored now
mock := setupTestDB(t)
// Workspace has no override → resolver returns org_default = byok.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(testWSID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
expectDeriveShimQueries(mock, testWSID, "claude-code", "kimi-for-coding")
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
@@ -54,12 +83,15 @@ func TestGetWorkspaceLLMBillingMode_HappyPath_InheritsOrgDefault(t *testing.T) {
if res.ResolvedMode != LLMBillingModeBYOK {
t.Errorf("resolved mode: got %q want %q", res.ResolvedMode, LLMBillingModeBYOK)
}
if res.Source != BillingModeSourceOrgDefault {
t.Errorf("source: got %q want %q", res.Source, BillingModeSourceOrgDefault)
if res.Source != BillingModeSourceDerivedProvider {
t.Errorf("source: got %q want %q", res.Source, BillingModeSourceDerivedProvider)
}
if res.WorkspaceOverride != nil {
t.Errorf("expected nil override, got %v", *res.WorkspaceOverride)
}
if res.ProviderSelection == nil || *res.ProviderSelection != "kimi-coding" {
t.Errorf("expected derived provider kimi-coding, got %v", res.ProviderSelection)
}
}
func TestGetWorkspaceLLMBillingMode_BadUUID_400(t *testing.T) {
@@ -117,9 +149,9 @@ func TestPutWorkspaceLLMBillingMode_ExplicitNullClearsOverride(t *testing.T) {
mock.ExpectExec(`UPDATE workspaces SET llm_billing_mode = NULL WHERE id = \$1`).
WithArgs(testWSID).
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(testWSID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
// After clear, the post-write re-resolution DERIVES (internal#718 P2-B):
// no override + no MODEL secret → derived_default → platform_managed.
expectDeriveShimQueries(mock, testWSID, "claude-code", "")
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
@@ -142,8 +174,8 @@ func TestPutWorkspaceLLMBillingMode_ExplicitNullClearsOverride(t *testing.T) {
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Errorf("post-clear resolved: got %q want %q", res.ResolvedMode, LLMBillingModePlatformManaged)
}
if res.Source != BillingModeSourceOrgDefault {
t.Errorf("post-clear source: got %q want %q", res.Source, BillingModeSourceOrgDefault)
if res.Source != BillingModeSourceDerivedDefault {
t.Errorf("post-clear source: got %q want %q", res.Source, BillingModeSourceDerivedDefault)
}
if res.WorkspaceOverride != nil {
t.Errorf("post-clear override should be nil, got %v", *res.WorkspaceOverride)
@@ -1,10 +1,12 @@
package handlers
// llm_billing_mode_test.go — table-driven tests for the per-workspace
// resolver (internal#691). The cases below enumerate every documented
// branch in the default-closed contract; if one of them flips behavior
// later the test names will tell the reviewer exactly which RFC clause
// regressed.
// llm_billing_mode_test.go — tests for the LEGACY-signature resolver
// ResolveLLMBillingMode after internal#718 P2-B. The org rung is RETIRED: the
// legacy shim now reads the explicit override first, then DERIVES the provider
// from the workspace's stored (runtime, model) via the registry (no org
// default). The dedicated derived-resolver cases live in
// llm_billing_mode_derived_test.go; this file pins the legacy shim's DB-read
// sequence + that it routes through the derived semantics.
import (
"context"
@@ -14,35 +16,56 @@ import (
"github.com/DATA-DOG/go-sqlmock"
)
func TestResolveLLMBillingMode_TableDriven(t *testing.T) {
// expectLegacyShimQueries sets up the DB reads the legacy ResolveLLMBillingMode
// shim makes on a NO-explicit-override path (internal#718 P2-B), in order:
// 1. override read (NULL) — the shim's own precedence-1 check,
// 2. workspaces.runtime read,
// 3. workspace_secrets scan (MODEL + auth-env names),
// 4. override read AGAIN (NULL) — the derived resolver re-checks it so it is a
// complete, independently-callable SSOT.
//
// model=="" means no MODEL secret row.
func expectLegacyShimQueries(m sqlmock.Sqlmock, wsID, runtime, model string) {
nullOverride := func() {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
}
nullOverride()
m.ExpectQuery(`SELECT runtime FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow(runtime))
secretRows := sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"})
if model != "" {
secretRows.AddRow("MODEL", []byte(model), 0) // version 0 = plaintext
}
m.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id = \$1`).
WithArgs(wsID).
WillReturnRows(secretRows)
nullOverride()
}
func TestResolveLLMBillingMode_LegacyShimDerives(t *testing.T) {
ctx := context.Background()
const wsID = "11111111-1111-1111-1111-111111111111"
type want struct {
mode string
source BillingModeSource
// hasOverride asserts whether the resolver surfaced the override
// value in the result (nil pointer = clean inherit, non-nil = the
// row was present even if it ultimately fell through because it
// was garbled). Lets us distinguish "row missing, fell through"
// from "row present but garbled, fell through" — both resolve to
// the same mode but the resolver tells operators which case it was.
mode string
source BillingModeSource
hasOverride bool
}
type tc struct {
name string
workspaceID string
orgMode string
setupMock func(m sqlmock.Sqlmock)
want want
wantErr bool
name string
setupMock func(m sqlmock.Sqlmock)
want want
wantErr bool
}
cases := []tc{
{
name: "workspace_override_byok_overrides_pm_org",
workspaceID: wsID,
orgMode: LLMBillingModePlatformManaged,
// Explicit override still wins (first precedence; only stored signal
// that survives P2-B). No runtime/secrets read needed.
name: "explicit_override_byok_wins",
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
@@ -51,106 +74,60 @@ func TestResolveLLMBillingMode_TableDriven(t *testing.T) {
want: want{mode: LLMBillingModeBYOK, source: BillingModeSourceWorkspaceOverride, hasOverride: true},
},
{
name: "workspace_override_disabled_overrides_pm_org",
workspaceID: wsID,
orgMode: LLMBillingModePlatformManaged,
// No override + a non-platform-deriving model → byok via derive (THE
// FIX: pre-P2 this was platform_managed via the org rung).
name: "no_override_derives_byok_from_model",
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeDisabled))
expectLegacyShimQueries(m, wsID, "claude-code", "kimi-for-coding")
},
want: want{mode: LLMBillingModeDisabled, source: BillingModeSourceWorkspaceOverride, hasOverride: true},
want: want{mode: LLMBillingModeBYOK, source: BillingModeSourceDerivedProvider, hasOverride: false},
},
{
name: "workspace_override_null_inherits_byok_org",
workspaceID: wsID,
orgMode: LLMBillingModeBYOK,
// No override + a platform-namespaced model → platform_managed (UNCHANGED).
name: "no_override_derives_platform_from_model",
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
expectLegacyShimQueries(m, wsID, "claude-code", "anthropic/claude-opus-4-7")
},
want: want{mode: LLMBillingModeBYOK, source: BillingModeSourceOrgDefault, hasOverride: false},
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceDerivedProvider, hasOverride: false},
},
{
name: "workspace_override_null_inherits_pm_org",
workspaceID: wsID,
orgMode: LLMBillingModePlatformManaged,
// No override + no model → derived_default → platform_managed (unset → platform).
name: "no_override_no_model_platform_default",
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
expectLegacyShimQueries(m, wsID, "claude-code", "")
},
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceOrgDefault, hasOverride: false},
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceDerivedDefault, hasOverride: false},
},
{
name: "workspace_override_garbled_falls_through_to_pm_org_DEFAULT_CLOSED",
workspaceID: wsID,
orgMode: LLMBillingModePlatformManaged,
// Garbled override is NOT honored — falls through to derive
// (default-closed). Here no model → platform default.
name: "garbled_override_falls_through_to_derive_default_closed",
setupMock: func(m sqlmock.Sqlmock) {
// CHECK constraint would normally prevent this but if a future
// migration loosens it (or a direct UPDATE bypasses it on a
// non-PG driver in a test stub), a garbled value MUST NOT
// be honored as if it were valid. This is the default-closed
// safety axis the RFC calls out.
// override read 1 (garbled → not honored), runtime, secrets,
// override read 2 (garbled again, derived resolver re-check).
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow("byokk"))
m.ExpectQuery(`SELECT runtime FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("claude-code"))
m.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}))
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow("byokk"))
},
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceOrgDefault, hasOverride: true},
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceDerivedDefault, hasOverride: false},
},
{
name: "workspace_override_garbled_org_garbled_constant_fallback",
workspaceID: wsID,
orgMode: "garbled-or-empty",
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow("nonsense"))
},
// Both layers garbled → constant fallback. Source is constant_fallback
// so operators can see the org-default-was-also-bad case explicitly.
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceConstantFallback, hasOverride: true},
},
{
name: "workspace_row_missing_falls_through_to_org_byok",
workspaceID: wsID,
orgMode: LLMBillingModeBYOK,
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}))
},
want: want{mode: LLMBillingModeBYOK, source: BillingModeSourceOrgDefault, hasOverride: false},
},
{
name: "workspace_id_empty_pre_provision_org_only",
workspaceID: "",
orgMode: LLMBillingModeBYOK,
setupMock: func(m sqlmock.Sqlmock) { /* no DB read expected — empty ws id short-circuits */ },
want: want{mode: LLMBillingModeBYOK, source: BillingModeSourceOrgDefault, hasOverride: false},
},
{
name: "workspace_id_empty_org_garbled_constant_fallback",
workspaceID: "",
orgMode: "",
setupMock: func(m sqlmock.Sqlmock) { /* no DB read */ },
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceConstantFallback, hasOverride: false},
},
{
name: "db_error_default_closed_to_pm_with_error",
workspaceID: wsID,
orgMode: LLMBillingModeBYOK, // org says byok but DB errored — DO NOT honor org
// DB error on the override read → default-closed + propagated error.
name: "override_db_error_default_closed_with_error",
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnError(errors.New("connection refused"))
},
// Critical: even though orgMode=byok, a DB error means we can't
// confirm the workspace doesn't have an override, so we default
// to the closed mode. This is the safer of the two failures —
// silently flipping to org-byok on a DB error would leak the
// OAuth-keeping behavior to workspaces whose row says NULL.
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceConstantFallback, hasOverride: false},
wantErr: true,
},
@@ -161,7 +138,8 @@ func TestResolveLLMBillingMode_TableDriven(t *testing.T) {
mock := setupTestDB(t)
c.setupMock(mock)
res, err := ResolveLLMBillingMode(ctx, c.workspaceID, c.orgMode)
// orgMode arg is retired/ignored; pass a value to prove it has no effect.
res, err := ResolveLLMBillingMode(ctx, wsID, LLMBillingModeBYOK)
if (err != nil) != c.wantErr {
t.Fatalf("err: got %v wantErr=%v", err, c.wantErr)
}
@@ -172,8 +150,7 @@ func TestResolveLLMBillingMode_TableDriven(t *testing.T) {
t.Errorf("source: got %q want %q", res.Source, c.want.source)
}
if (res.WorkspaceOverride != nil) != c.want.hasOverride {
t.Errorf("hasOverride: got %v want %v (override=%v)",
res.WorkspaceOverride != nil, c.want.hasOverride, res.WorkspaceOverride)
t.Errorf("hasOverride: got %v want %v", res.WorkspaceOverride != nil, c.want.hasOverride)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
@@ -182,21 +159,48 @@ func TestResolveLLMBillingMode_TableDriven(t *testing.T) {
}
}
// TestResolveLLMBillingMode_EmptyWorkspaceID_PlatformDefault: pre-provision
// (no workspace id) defaults closed with no DB read (org rung retired, so the
// old "org_only" behavior is gone — it's now the platform default).
func TestResolveLLMBillingMode_EmptyWorkspaceID_PlatformDefault(t *testing.T) {
ctx := context.Background()
mock := setupTestDB(t) // no DB read expected
res, err := ResolveLLMBillingMode(ctx, "", LLMBillingModeBYOK)
if err != nil {
t.Fatalf("unexpected err: %v", err)
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Errorf("empty ws id must default platform_managed, got %q", res.ResolvedMode)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
// TestResolveLLMBillingMode_ResolvedModeIsAlwaysValid asserts the resolver's
// post-condition: the returned mode is ALWAYS one of the three known enum
// values, never an empty string and never a garbled passthrough. The strip
// gate downstream relies on this so it can switch on res.ResolvedMode
// without a separate is-valid check on every call site.
// values. The strip gate downstream relies on this so it can switch on
// res.ResolvedMode without a separate is-valid check on every call site.
func TestResolveLLMBillingMode_ResolvedModeIsAlwaysValid(t *testing.T) {
ctx := context.Background()
const wsID = "22222222-2222-2222-2222-222222222222"
// Throw a pathological row at the resolver: garbled override + garbled
// org default. Resolved mode must still be a recognized enum.
// Garbled override + no derivable model: must still resolve a known enum
// (platform_managed, default-closed). Query order: override(garbled),
// runtime, secrets, override(garbled again — derived resolver re-check).
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow("totally-bogus"))
mock.ExpectQuery(`SELECT runtime FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("claude-code"))
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}))
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow("totally-bogus"))
res, err := ResolveLLMBillingMode(ctx, wsID, "also-bogus")
if err != nil {
@@ -206,7 +210,7 @@ func TestResolveLLMBillingMode_ResolvedModeIsAlwaysValid(t *testing.T) {
t.Errorf("post-condition violated: resolved mode %q is not a known enum value", res.ResolvedMode)
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Errorf("default-closed contract: garbled-x-garbled must resolve to platform_managed, got %q", res.ResolvedMode)
t.Errorf("default-closed contract: garbled-override + no-model must resolve platform_managed, got %q", res.ResolvedMode)
}
}
@@ -97,7 +97,15 @@ func (h *MCPHandler) toolListPeers(ctx context.Context, workspaceID string) (str
const cols = `SELECT w.id, w.name, COALESCE(w.role,''), w.status, w.tier`
// Siblings
// Siblings — workspaces sharing the caller's parent.
//
// #1953 cross-tenant isolation: the OLD else-branch returned every
// workspace with parent_id IS NULL when the caller was itself an org root,
// i.e. every other tenant's org root (the workspaces table has no org_id
// column). That leaked peer identities across tenants via MCP list_peers.
// An org root has no siblings inside its own org, so the org-root caller
// now gets no siblings; its peers are its children, enumerated below. Only
// the parent_id-bound branch enumerates siblings, scoped to one tenant.
if parentID.Valid {
rows, err := h.database.QueryContext(ctx,
cols+` FROM workspaces w WHERE w.parent_id = $1 AND w.id != $2 AND w.status != 'removed'`,
@@ -107,15 +115,6 @@ func (h *MCPHandler) toolListPeers(ctx context.Context, workspaceID string) (str
log.Printf("MCP toolListPeers: sibling scan error: %v", scanErr)
}
}
} else {
rows, err := h.database.QueryContext(ctx,
cols+` FROM workspaces w WHERE w.parent_id IS NULL AND w.id != $1 AND w.status != 'removed'`,
workspaceID)
if err == nil {
if scanErr := scanPeers(rows); scanErr != nil {
log.Printf("MCP toolListPeers: sibling scan error: %v", scanErr)
}
}
}
// Children
@@ -48,6 +48,7 @@ type memoryV2Deps struct {
// call. Defining an interface here lets handler tests stub the plugin
// without spinning up an HTTP server.
type memoryPluginAPI interface {
UpsertNamespace(ctx context.Context, name string, body contract.NamespaceUpsert) (*contract.Namespace, error)
CommitMemory(ctx context.Context, namespace string, body contract.MemoryWrite) (*contract.MemoryWriteResponse, error)
Search(ctx context.Context, body contract.SearchRequest) (*contract.SearchResponse, error)
ForgetMemory(ctx context.Context, id string, body contract.ForgetRequest) error
@@ -117,6 +118,9 @@ func (h *MCPHandler) toolCommitMemoryV2(ctx context.Context, workspaceID string,
if !ok {
return "", fmt.Errorf("workspace %s cannot write to namespace %s", workspaceID, ns)
}
if _, err := h.memv2.plugin.UpsertNamespace(ctx, ns, contract.NamespaceUpsert{Kind: kindFromNamespace(ns)}); err != nil {
return "", fmt.Errorf("plugin upsert namespace: %w", err)
}
// SAFE-T1201: scrub credential-shaped strings BEFORE the plugin sees
// them. Non-negotiable; see memories.go:180.
@@ -171,6 +175,19 @@ func (h *MCPHandler) toolCommitMemoryV2(ctx context.Context, workspaceID string,
return string(out), nil
}
func kindFromNamespace(ns string) contract.NamespaceKind {
switch {
case strings.HasPrefix(ns, "workspace:"):
return contract.NamespaceKindWorkspace
case strings.HasPrefix(ns, "team:"):
return contract.NamespaceKindTeam
case strings.HasPrefix(ns, "org:"):
return contract.NamespaceKindOrg
default:
return contract.NamespaceKindCustom
}
}
// ─────────────────────────────────────────────────────────────────────────────
// search_memory
// ─────────────────────────────────────────────────────────────────────────────
@@ -20,11 +20,18 @@ import (
// --- stubs ---
type stubMemoryPlugin struct {
upsertFn func(ctx context.Context, name string, body contract.NamespaceUpsert) (*contract.Namespace, error)
commitFn func(ctx context.Context, ns string, body contract.MemoryWrite) (*contract.MemoryWriteResponse, error)
searchFn func(ctx context.Context, body contract.SearchRequest) (*contract.SearchResponse, error)
forgetFn func(ctx context.Context, id string, body contract.ForgetRequest) error
}
func (s *stubMemoryPlugin) UpsertNamespace(ctx context.Context, name string, body contract.NamespaceUpsert) (*contract.Namespace, error) {
if s.upsertFn != nil {
return s.upsertFn(ctx, name, body)
}
return &contract.Namespace{Name: name, Kind: body.Kind}, nil
}
func (s *stubMemoryPlugin) CommitMemory(ctx context.Context, ns string, body contract.MemoryWrite) (*contract.MemoryWriteResponse, error) {
if s.commitFn != nil {
return s.commitFn(ctx, ns, body)
@@ -159,7 +166,15 @@ func TestMemoryV2Available(t *testing.T) {
func TestCommitMemoryV2_HappyPathDefaultNamespace(t *testing.T) {
db, _, _ := sqlmock.New()
defer db.Close()
gotUpsertNS := ""
h := newV2Handler(t, db, &stubMemoryPlugin{
upsertFn: func(_ context.Context, name string, body contract.NamespaceUpsert) (*contract.Namespace, error) {
gotUpsertNS = name
if body.Kind != contract.NamespaceKindWorkspace {
t.Errorf("upsert kind = %q, want workspace", body.Kind)
}
return &contract.Namespace{Name: name, Kind: body.Kind}, nil
},
commitFn: func(_ context.Context, ns string, body contract.MemoryWrite) (*contract.MemoryWriteResponse, error) {
if ns != "workspace:root-1" {
t.Errorf("ns = %q, want default workspace:root-1", ns)
@@ -180,6 +195,9 @@ func TestCommitMemoryV2_HappyPathDefaultNamespace(t *testing.T) {
if !strings.Contains(got, `"id":"mem-1"`) {
t.Errorf("got = %s", got)
}
if gotUpsertNS != "workspace:root-1" {
t.Errorf("upsert namespace = %q, want workspace:root-1", gotUpsertNS)
}
}
func TestCommitMemoryV2_NamespaceParamUsed(t *testing.T) {
@@ -45,6 +45,9 @@ type fakePlugin struct {
forgetReq contract.ForgetRequest
}
func (f *fakePlugin) UpsertNamespace(ctx context.Context, name string, body contract.NamespaceUpsert) (*contract.Namespace, error) {
return &contract.Namespace{Name: name, Kind: body.Kind}, nil
}
func (f *fakePlugin) CommitMemory(ctx context.Context, ns string, body contract.MemoryWrite) (*contract.MemoryWriteResponse, error) {
return nil, errors.New("not implemented in fake")
}
@@ -511,11 +514,11 @@ func TestMemoriesV2_Forget_MissingMemoryID_400(t *testing.T) {
// DisplayName over UUID-prefix fallback (issue #2988).
func TestNamespaceLabelWithName_PrefersDisplayNameWhenSet(t *testing.T) {
cases := []struct {
name string
raw string
kind contract.NamespaceKind
display string
want string
name string
raw string
kind contract.NamespaceKind
display string
want string
}{
{"workspace with name", "workspace:abc-1234", contract.NamespaceKindWorkspace, "mac laptop", "Workspace (mac laptop)"},
{"team with name", "team:abc-1234", contract.NamespaceKindTeam, "Engineering", "Team (Engineering)"},
@@ -625,12 +628,12 @@ func TestParseLimit(t *testing.T) {
}{
{"", memoriesV2DefaultLimit},
{"10", 10},
{"0", memoriesV2DefaultLimit}, // ≤0 → default, not error
{"-5", memoriesV2DefaultLimit}, // negative → default
{"abc", memoriesV2DefaultLimit}, // non-numeric → default
{"99999", memoriesV2MaxLimit}, // over cap → clamped
{"100", memoriesV2MaxLimit}, // exactly cap → kept
{"99", 99}, // just under cap → kept
{"0", memoriesV2DefaultLimit}, // ≤0 → default, not error
{"-5", memoriesV2DefaultLimit}, // negative → default
{"abc", memoriesV2DefaultLimit}, // non-numeric → default
{"99999", memoriesV2MaxLimit}, // over cap → clamped
{"100", memoriesV2MaxLimit}, // exactly cap → kept
{"99", 99}, // just under cap → kept
}
for _, tc := range cases {
t.Run("raw="+tc.raw, func(t *testing.T) {
@@ -741,11 +744,11 @@ func TestWithMemoryV2_FluentReturnsReceiver(t *testing.T) {
func TestShortID(t *testing.T) {
cases := map[string]string{
"": "",
"short": "short",
"exactly8": "exactly8",
"longer-than-eight": "longer-t",
"abc-1234-5678-90ab": "abc-1234",
"": "",
"short": "short",
"exactly8": "exactly8",
"longer-than-eight": "longer-t",
"abc-1234-5678-90ab": "abc-1234",
}
for in, want := range cases {
if got := shortID(in); got != want {
@@ -0,0 +1,57 @@
package handlers
// model_registry_validation.go — only-registered (runtime, model) validation
// at the create/config API (internal#718 P2-B item 3, CTO 2026-05-27
// "only registered providers/models selectable").
//
// The registry (internal/providers) is the SSOT for which models a runtime
// natively exposes (ModelsForRuntime). This validator rejects a (runtime, model)
// the registry does NOT recognize — but ONLY for a runtime the registry knows
// about. For a runtime absent from the first-party registry (langgraph,
// external, kimi, mock, or a future federated third-party runtime), it fails
// OPEN: the registry can't speak to that runtime's model set, so the existing
// knownRuntimes gate stays authoritative and this validator does not block.
// This is the federation-ready contract — first-party runtimes are gated against
// the registry; everything else passes through unchanged (no behavior change for
// non-registry runtimes).
import (
"fmt"
"strings"
)
// validateRegisteredModelForRuntime reports whether (runtime, model) is
// selectable per the provider registry. Returns:
//
// (true, "") — allowed: model is registered for this runtime, OR the
// runtime is not in the registry (fail-open), OR model=="".
// (false, reason) — rejected: the runtime IS registered but the model is not
// in its native ModelsForRuntime set.
//
// model=="" is allowed here: the MODEL_REQUIRED gate owns the empty-model case,
// so this validator must not double-reject it.
func validateRegisteredModelForRuntime(runtime, model string) (bool, string) {
model = strings.TrimSpace(model)
if model == "" {
return true, "" // MODEL_REQUIRED owns this.
}
m, err := providerRegistry()
if err != nil || m == nil {
// Registry unavailable (build-time defect the gates catch). Fail open —
// do not block create on a registry-load failure.
return true, ""
}
models, err := m.ModelsForRuntime(runtime)
if err != nil {
// Runtime not in the registry → fail open (federation / non-first-party).
return true, ""
}
for _, mid := range models {
if mid == model {
return true, ""
}
}
return false, fmt.Sprintf(
"model %q is not a registered model for runtime %q; pick one of the runtime's registered models (provider-registry SSOT, internal#718)",
model, runtime)
}
@@ -0,0 +1,82 @@
package handlers
// model_registry_validation_test.go — only-registered (runtime, model)
// validation at the create/config API (internal#718 P2-B item 3). Reject a
// (runtime, model) the registry does not recognize for a runtime it DOES know;
// fail OPEN (allow) for a runtime the registry doesn't know yet (federation /
// langgraph/etc. not in the first-party registry) so the existing knownRuntimes
// gate stays authoritative there.
import "testing"
func TestValidateRegisteredModelForRuntime(t *testing.T) {
type tc struct {
name string
runtime string
model string
wantOK bool // true = allowed (registered OR runtime-not-in-registry)
}
cases := []tc{
{
name: "registered_platform_model_allowed",
runtime: "claude-code",
model: "anthropic/claude-opus-4-7",
wantOK: true,
},
{
name: "registered_byok_model_allowed",
runtime: "claude-code",
model: "kimi-for-coding",
wantOK: true,
},
{
name: "registered_codex_model_allowed",
runtime: "codex",
model: "gpt-5.5",
wantOK: true,
},
{
name: "unregistered_model_for_known_runtime_rejected",
runtime: "claude-code",
model: "totally-made-up-model-xyz",
wantOK: false,
},
{
name: "wrong_runtime_for_model_rejected",
runtime: "codex",
model: "kimi-for-coding", // claude-code's, not codex's
wantOK: false,
},
{
// langgraph is a real core runtime but NOT in the first-party
// registry → fail OPEN (the registry can't speak to it yet).
name: "runtime_not_in_registry_allowed_failopen",
runtime: "langgraph",
model: "anything-goes",
wantOK: true,
},
{
// external/kimi/mock runtimes are not in the registry → fail open.
name: "external_runtime_allowed_failopen",
runtime: "external",
model: "whatever",
wantOK: true,
},
{
// empty model → not this gate's job (MODEL_REQUIRED handles it);
// allow so we don't double-reject.
name: "empty_model_allowed_other_gate_owns_it",
runtime: "claude-code",
model: "",
wantOK: true,
},
}
for _, c := range cases {
t.Run(c.name, func(t *testing.T) {
ok, _ := validateRegisteredModelForRuntime(c.runtime, c.model)
if ok != c.wantOK {
t.Errorf("validateRegisteredModelForRuntime(%q,%q) ok=%v want %v", c.runtime, c.model, ok, c.wantOK)
}
})
}
}
@@ -0,0 +1,104 @@
package handlers
// org_scope.go — cross-tenant isolation helpers (#1953).
//
// The `workspaces` table has no `org_id` column; an "org" is the subtree of
// workspaces reachable through the `parent_id` chain from a single org root
// (a row with parent_id IS NULL). Several code paths historically computed an
// org-root sibling set as `WHERE parent_id IS NULL`, which matches EVERY
// tenant's org root and therefore leaks peer metadata / routing across tenants.
//
// This file centralises the org-scoping primitive so peer discovery, the MCP
// list_peers tool, and a2a routing all derive "the caller's org" the SAME way
// the OFFSEC-015 broadcast fix (commit 5a05302c, workspace_broadcast.go) does:
// a recursive CTE that walks the parent_id chain up to the org root. Keeping
// the CTE in one place means there is a single, testable source of truth for
// tenant isolation rather than four hand-copied queries that can drift.
//
// NOTE: this is the parent_id-chain scoping that the broadcast fix already
// ships. It is deliberately NOT an `org_id` column — adding that column is a
// separate architecture decision pending CTO sign-off. See #1953.
import (
"context"
"database/sql"
"errors"
)
// errNoOrgRoot is returned by orgRootID when the workspace id has no row (and
// therefore no resolvable org root). Callers translate this into a 404/not-found
// at their own layer; it is distinct from a transient DB error so a missing
// workspace never gets treated as "belongs to every org".
var errNoOrgRoot = errors.New("org root not found for workspace")
// orgRootSubtreeCTE is the recursive CTE — identical in shape to the OFFSEC-015
// broadcast fix — that walks UP the parent_id chain from a single workspace to
// its org root. The org root is the row on the chain whose parent_id IS NULL.
//
// $1 = workspace id to resolve
//
// The recursive member walks UP the parent_id chain: each step joins to the row
// whose id is the current row's parent_id. The topmost ancestor is the single
// chain row with parent_id IS NULL — and THAT row's own `id` is the org root.
//
// We select that parentless row's `id` (aliased root_id). We must NOT carry a
// fixed `id AS root_id` from the recursive seed: that value is just the input
// workspace id, so a non-root caller (e.g. a child delegating to a sibling)
// would resolve to ITSELF instead of its org root, and sameOrg() would wrongly
// report two genuinely same-org workspaces as different orgs and 403 a
// legitimate a2a route. A workspace that already IS an org root has a one-row
// chain whose id == itself, so it correctly resolves to itself.
const orgRootSubtreeCTE = `
WITH RECURSIVE org_chain AS (
SELECT id, parent_id
FROM workspaces
WHERE id = $1
UNION ALL
SELECT w.id, w.parent_id
FROM workspaces w
JOIN org_chain c ON w.id = c.parent_id
)
SELECT id AS root_id FROM org_chain WHERE parent_id IS NULL LIMIT 1
`
// orgRootID resolves the org root of `workspaceID` by walking the parent_id
// chain via orgRootSubtreeCTE. Returns errNoOrgRoot when the workspace (or its
// chain) yields no org root row, and the underlying error on any DB failure.
//
// This is the SAME lookup the broadcast handler performs inline; the three
// leak paths in #1953 call this instead of re-deriving "the org" from
// `parent_id IS NULL` (which spans all tenants).
func orgRootID(ctx context.Context, database *sql.DB, workspaceID string) (string, error) {
var root string
err := database.QueryRowContext(ctx, orgRootSubtreeCTE, workspaceID).Scan(&root)
if errors.Is(err, sql.ErrNoRows) {
return "", errNoOrgRoot
}
if err != nil {
return "", err
}
if root == "" {
return "", errNoOrgRoot
}
return root, nil
}
// sameOrg reports whether workspaces `a` and `b` share an org root, i.e. they
// belong to the same tenant. Used by a2a routing to reject resolving/dispatching
// to a workspace id outside the caller's org. Fail-CLOSED: any lookup error or
// missing org root yields (false, err) so a DB hiccup denies cross-tenant
// routing rather than allowing it.
func sameOrg(ctx context.Context, database *sql.DB, a, b string) (bool, error) {
if a == b {
return true, nil
}
rootA, err := orgRootID(ctx, database, a)
if err != nil {
return false, err
}
rootB, err := orgRootID(ctx, database, b)
if err != nil {
return false, err
}
return rootA == rootB, nil
}
@@ -83,6 +83,7 @@ func (h *WorkspaceHandler) gracefulPreRestart(ctx context.Context, workspaceID s
body, marshalErr := json.Marshal(payload)
if marshalErr != nil {
log.Printf("A2AGracefulRestart %s: json.Marshal payload failed: %v", workspaceID, marshalErr)
return
}
req, reqErr := http.NewRequestWithContext(signalCtx, http.MethodPost, url, bytes.NewReader(body))
@@ -245,6 +245,11 @@ func (h *SecretsHandler) Values(c *gin.Context) {
// provisioner path in workspace_provision.go so env-vars look identical
// whether the workspace was bootstrapped locally or remotely).
out := map[string]string{}
// Provenance side-channel (internal#711): which keys in `out` originated
// from global_secrets and were NOT overridden by a workspace_secrets row.
// Used by the provider-aware gate below so a non-platform workspace's
// remote pull never receives the platform's scope:global LLM credential.
globalKeys := map[string]struct{}{}
// Track decrypt failures so we can refuse the response with a list
// instead of returning a partial bundle that boots a broken agent.
var failedKeys []string
@@ -270,6 +275,7 @@ func (h *SecretsHandler) Values(c *gin.Context) {
continue
}
out[k] = string(decrypted)
globalKeys[k] = struct{}{}
}
}
if err := globalRows.Err(); err != nil {
@@ -294,6 +300,10 @@ func (h *SecretsHandler) Values(c *gin.Context) {
continue
}
out[k] = string(decrypted) // workspace override wins over global
// User explicitly re-set this via the canvas Secrets tab — it is
// no longer "the operator-store version", so drop the global
// provenance flag (mirrors loadWorkspaceSecrets).
delete(globalKeys, k)
}
}
if err := wsRows.Err(); err != nil {
@@ -309,6 +319,32 @@ func (h *SecretsHandler) Values(c *gin.Context) {
return
}
// internal#711: provider-aware gate on the remote-pull path. A workspace
// whose resolved billing mode is NOT platform_managed (byok / subscription)
// must NOT receive the platform's scope:global LLM credentials
// (CLAUDE_CODE_OAUTH_TOKEN + the rest of the bypass-key set). Those keys
// were merged from global_secrets above; here we drop any that are still
// of global provenance (a workspace override survives, since its flag was
// cleared). Symmetric with applyPlatformManagedLLMEnv's strip on the
// provision/restart env path — both injection vectors are now gated.
//
// Default-closed: ResolveLLMBillingMode collapses any DB error / NULL /
// garbled value to platform_managed, so a transient failure leaves the
// existing (global-inheriting) behavior in place rather than stripping a
// platform_managed workspace's creds.
orgMode := strings.ToLower(strings.TrimSpace(os.Getenv("MOLECULE_LLM_BILLING_MODE")))
res, resolveErr := ResolveLLMBillingMode(ctx, workspaceID, orgMode)
if resolveErr != nil {
log.Printf("secrets.Values: resolve billing mode workspace=%s err=%v (defaulting to platform_managed)", workspaceID, resolveErr)
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
for k := range globalKeys {
if isPlatformManagedDirectLLMBypassKey(k) {
delete(out, k)
}
}
}
c.JSON(http.StatusOK, out)
}
@@ -865,6 +865,12 @@ func TestSecretsValues_LegacyWorkspaceGrandfathered(t *testing.T) {
WithArgs(testWsID).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("WS_KEY", []byte("ws_plainvalue"), 0))
// internal#711: Values now resolves billing mode to gate the global LLM-cred
// merge. Neither key here is a platform-managed LLM bypass key, so the mode
// is immaterial to the assertions — but the resolver query must be mocked.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(testWsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModePlatformManaged))
w := httptest.NewRecorder()
c := secretsValuesRequest(w, "") // no auth — grandfathered
@@ -942,6 +948,12 @@ func TestSecretsValues_ValidTokenReturnsDecryptedMerge(t *testing.T) {
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("ONLY_WS", []byte("ws_val"), 0).
AddRow("SHARED_KEY", []byte("ws_wins"), 0))
// internal#711: billing-mode resolver query. None of these keys is a
// platform-managed LLM bypass key, so the resolved mode does not affect the
// merge assertions; platform_managed keeps the existing pass-through.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(testWsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModePlatformManaged))
w := httptest.NewRecorder()
c := secretsValuesRequest(w, "Bearer good-token")
@@ -963,6 +975,68 @@ func TestSecretsValues_ValidTokenReturnsDecryptedMerge(t *testing.T) {
}
}
// TestSecretsValues_ByokStripsGlobalLLMCred is the internal#711 regression
// guard for the remote-pull injection vector. A non-platform (byok) workspace
// that pulls its secrets via GET /workspaces/:id/secrets/values must NOT
// receive the platform's scope:global CLAUDE_CODE_OAUTH_TOKEN — that key is
// of global_secrets provenance and is dropped by the provider-aware gate.
// Its OWN ANTHROPIC_API_KEY (a workspace_secrets row) survives, and unrelated
// non-LLM global secrets are untouched.
func TestSecretsValues_ByokStripsGlobalLLMCred(t *testing.T) {
mock := setupTestDB(t)
handler := NewSecretsHandler(nil)
mock.ExpectQuery(`SELECT COUNT\(\*\) FROM workspace_auth_tokens`).
WithArgs(testWsID).
WillReturnRows(sqlmock.NewRows([]string{"count"}).AddRow(1))
mock.ExpectQuery(`SELECT t\.id, t\.workspace_id.*FROM workspace_auth_tokens t.*JOIN workspaces`).
WithArgs(sqlmock.AnyArg()).
WillReturnRows(sqlmock.NewRows([]string{"id", "workspace_id"}).AddRow("tok-1", testWsID))
mock.ExpectExec(`UPDATE workspace_auth_tokens SET last_used_at`).
WithArgs("tok-1").
WillReturnResult(sqlmock.NewResult(0, 1))
// global_secrets holds the platform's scope:global OAuth token + a
// non-LLM operator global (should be untouched).
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM global_secrets`).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("CLAUDE_CODE_OAUTH_TOKEN", []byte("PLATFORM-GLOBAL-OAUTH"), 0).
AddRow("SENTRY_DSN", []byte("https://sentry.example/123"), 0))
// The workspace brought its OWN Anthropic API key via the Secrets tab.
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id`).
WithArgs(testWsID).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("ANTHROPIC_API_KEY", []byte("CUSTOMER-OWN-ANTHROPIC-KEY"), 0))
// Resolver: this workspace is byok.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(testWsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
w := httptest.NewRecorder()
c := secretsValuesRequest(w, "Bearer good-token")
handler.Values(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var body map[string]string
_ = json.Unmarshal(w.Body.Bytes(), &body)
// 1. Platform global OAuth token stripped — the leak is closed on the pull path.
if got, ok := body["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN = %q present — platform scope:global token must be stripped for byok pull", got)
}
// 2. The workspace's own LLM key survives.
if body["ANTHROPIC_API_KEY"] != "CUSTOMER-OWN-ANTHROPIC-KEY" {
t.Fatalf("ANTHROPIC_API_KEY = %q, want the workspace's own key preserved", body["ANTHROPIC_API_KEY"])
}
// 3. Unrelated non-LLM global secrets are untouched.
if body["SENTRY_DSN"] != "https://sentry.example/123" {
t.Fatalf("SENTRY_DSN = %q, want non-LLM globals untouched", body["SENTRY_DSN"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
func TestSecretsValues_InvalidWorkspaceID(t *testing.T) {
setupTestDB(t)
handler := NewSecretsHandler(nil)
@@ -428,6 +428,33 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
return
}
// internal#718 P2-B: ONLY-REGISTERED validation at the create boundary.
// For a runtime the provider registry knows (first-party:
// claude-code/codex/hermes/openclaw) this checks the (runtime, model) pair
// against the registry's native model set. Fails OPEN for runtimes the
// registry doesn't know (langgraph/external/kimi/mock/federated) so
// non-first-party flows are UNCHANGED. Skipped for external workspaces.
//
// P2 ENFORCEMENT MODE = WARN, not hard-reject (deliberate, scoped). The
// legacy colon-namespaced BYOK model vocabulary ("anthropic:claude-opus-4-7"
// etc.) is still live across the create/import/template corpus and is NOT
// yet reconciled into the registry's exact-id model sets — that convergence
// is P3 (canvas only-offers-registered) + P4 (template codegen). Hard-
// rejecting an unregistered (runtime, model) now would 422 those legitimate
// existing flows, a large behavior change outside P2's scope (P2's behavior
// delta is the billing/credential flip, below). So P2 surfaces the
// unregistered pair as a queryable warning + an X-Molecule-Model-Unregistered
// response header (operator/canvas signal) and lets create proceed; the gate
// flips to hard-reject (uncomment the 422 below) once P3/P4 land the
// vocabulary convergence. The registry model set is code-generated from the
// canonical providers.yaml (PR-A), so the check stays in sync with the SSOT.
if !isExternal {
if ok, why := validateRegisteredModelForRuntime(payload.Runtime, payload.Model); !ok {
log.Printf("Create: WARN unregistered model (runtime=%q model=%q): %s [internal#718 P2 warn-mode; hard-reject gated on P3/P4 vocabulary convergence]", payload.Runtime, payload.Model, why)
c.Header("X-Molecule-Model-Unregistered", "true")
}
}
ctx := c.Request.Context()
// Convert empty role to NULL
@@ -75,3 +75,21 @@ func formatMissingEnvError(missing []string) string {
strings.Join(missing, ", "),
)
}
// formatMissingBYOKCredentialError builds the user-facing message for a
// provision failure caused by a non-platform (byok/subscription) workspace
// that has no usable LLM credential of its own (internal#711). The platform's
// scope:global LLM credentials are NOT a valid fallback for a non-platform
// workspace — resolving to them would bill the platform's Anthropic credits —
// so the provision fails closed here rather than starting the workspace on
// stripped/absent creds. Rendered verbatim in the canvas Events tab.
func formatMissingBYOKCredentialError(mode string) string {
return fmt.Sprintf(
"this workspace's LLM billing mode is %q (not platform-managed) but it has no LLM credential of its own. "+
"Add a workspace-scoped credential (e.g. CLAUDE_CODE_OAUTH_TOKEN or your provider's API key) under "+
"Config → Secrets, or switch the workspace to platform-managed billing via "+
"/admin/workspaces/:id/llm-billing-mode, then retry. The platform's shared LLM credentials are not "+
"used for non-platform workspaces.",
mode,
)
}
@@ -943,16 +943,70 @@ func applyRuntimeModelEnv(envVars map[string]string, runtime, model string) {
// MOLECULE_LLM_BILLING_MODE_RESOLVED so an in-container debug check can
// answer "what mode is this workspace running under" without DB queries
// (RFC Observability hot-spot).
func applyPlatformManagedLLMEnv(ctx context.Context, envVars map[string]string, workspaceID, runtime, model string) {
orgMode := strings.ToLower(strings.TrimSpace(os.Getenv("MOLECULE_LLM_BILLING_MODE")))
res, resolveErr := ResolveLLMBillingMode(ctx, workspaceID, orgMode)
//
// internal#711 — PROVIDER-AWARE GLOBAL-LLM-CRED GATE. The platform's
// LLM credentials (CLAUDE_CODE_OAUTH_TOKEN + the rest of the
// platformManagedDirectLLMBypassKeys set) live in `global_secrets` and
// are merged into EVERY workspace's env by loadWorkspaceSecrets — that
// merge is provenance-blind. Pre-fix, the non-platform (byok/disabled)
// early-return left envVars untouched, so a BYOK / subscription
// workspace that brought NO LLM credential of its own still inherited
// the platform's scope:global CLAUDE_CODE_OAUTH_TOKEN and ran Opus on
// the platform's (Molecule's) Anthropic credits (Reno Stars SEO +
// Marketing agents, confirmed live 2026-05-27).
//
// The gate: on the non-platform path we strip every platform-managed
// LLM key whose PROVENANCE is `global_secrets` (the globalKeys set).
// A workspace's OWN LLM credential — set via the canvas Secrets tab,
// i.e. a `workspace_secrets` row — has had its global provenance flag
// dropped by loadWorkspaceSecrets, so it is NOT in globalKeys and
// survives. Net effect: platform global LLM creds reach a workspace
// ONLY when its resolved mode is platform_managed; a non-platform
// workspace resolves to its own (workspace-scoped) credential or none.
//
// The boolean return reports whether, after the gate, the workspace
// still has at least one usable LLM credential. The caller
// (prepareProvisionContext) uses it to FAIL CLOSED — a non-platform
// workspace with no usable LLM credential is aborted with a clear
// MISSING_BYOK_CREDENTIAL error at provision time rather than being
// started on (now-stripped) platform creds.
// platformLLMEnvResult is the structured outcome of applyPlatformManagedLLMEnv.
// ResolvedMode is the per-workspace billing/provider mode the resolver
// landed on. HasUsableLLMCred reports whether — AFTER the provider-aware
// global-cred gate — the workspace still has at least one platform-managed
// LLM credential key in its env (its own, workspace-scoped one). Only the
// non-platform path consults HasUsableLLMCred for the fail-closed decision;
// the platform_managed path always returns true (it forces the CP proxy
// usage token, which IS the usable credential).
type platformLLMEnvResult struct {
ResolvedMode string
HasUsableLLMCred bool
// Source records which layer decided the mode (internal#718 P2-B):
// derived_provider (registry derivation), derived_default (derive failed →
// platform default), workspace_override (explicit operator pin), or
// constant_fallback (DB error). Surfaced for observability + asserted by the
// behavior-delta tests so a regression of "derived, not stored" flips red.
Source BillingModeSource
}
func applyPlatformManagedLLMEnv(ctx context.Context, envVars map[string]string, globalKeys map[string]struct{}, workspaceID, runtime, model string) platformLLMEnvResult {
// internal#718 P2-B: the platform-vs-byok decision now DERIVES the provider
// from (runtime, model) via the registry and keys off IsPlatform(derived) —
// NOT a stored LLM_PROVIDER and NOT the org rung. This path already carries
// runtime + model + the workspace env, so it calls the DERIVED resolver
// directly (no DB round-trip for runtime/model). availableAuthEnv is the set
// of recognized provider auth-env-var NAMES present in envVars (the same
// disambiguation input the registry uses to split oauth-vs-api). The org-env
// MOLECULE_LLM_BILLING_MODE is NO LONGER read into the decision (retired).
availableAuthEnv := availableAuthEnvNames(envVars)
res, resolveErr := ResolveLLMBillingModeDerived(ctx, workspaceID, runtime, model, availableAuthEnv)
if resolveErr != nil {
// resolveErr != nil ⇒ resolver hit a DB error AND already defaulted
// res.ResolvedMode to platform_managed. Log + proceed; the safe default
// is already in place, no early return needed.
log.Printf("workspace_provision: resolve billing mode workspace=%s err=%v (defaulting to platform_managed)", workspaceID, resolveErr)
}
log.Printf("workspace_provision: billing mode workspace=%s resolved=%s source=%s org_default=%s", workspaceID, res.ResolvedMode, res.Source, res.OrgDefault)
log.Printf("workspace_provision: billing mode workspace=%s resolved=%s source=%s derived_provider=%s", workspaceID, res.ResolvedMode, res.Source, derefOrEmpty(res.ProviderSelection))
// internal#703: MOLECULE_LLM_BILLING_MODE in the container must reflect the
// RESOLVED per-workspace mode, not a hardcoded literal. Pre-fix this var was
// only emitted (hardcoded "platform_managed") on the strip path below, so a
@@ -966,18 +1020,36 @@ func applyPlatformManagedLLMEnv(ctx context.Context, envVars map[string]string,
// pulling logs or hitting the admin route.
envVars["MOLECULE_LLM_BILLING_MODE_RESOLVED"] = res.ResolvedMode
if res.ResolvedMode != LLMBillingModePlatformManaged {
// byok or disabled — DO NOT strip vendor keys, DO NOT force-route to CP,
// DO NOT override the workspace own ANTHROPIC_BASE_URL / OAuth token.
// Leave envVars alone so CLAUDE_CODE_OAUTH_TOKEN / vendor API keys
// pulled from workspace_secrets survive into the container, and the
// workspace talks to its own provider directly (internal#703).
return
// byok or disabled — DO NOT force-route to CP, DO NOT override the
// workspace's own ANTHROPIC_BASE_URL / OAuth token.
//
// internal#711: but DO strip platform-origin LLM credentials. The
// platform's scope:global CLAUDE_CODE_OAUTH_TOKEN (+ the rest of the
// bypass-key set) was merged into envVars by loadWorkspaceSecrets
// from global_secrets; without this strip a BYOK workspace that
// brought no LLM credential of its own would inherit the platform's
// global token and bill the platform's Anthropic credits. The strip
// is PROVENANCE-AWARE: only keys still flagged as global_secrets
// origin are removed; a workspace's own LLM cred (a workspace_secrets
// row — provenance flag already dropped by loadWorkspaceSecrets)
// survives so the workspace talks to its own provider directly.
stripGlobalOriginLLMCreds(envVars, globalKeys)
return platformLLMEnvResult{
ResolvedMode: res.ResolvedMode,
HasUsableLLMCred: hasAnyPlatformManagedLLMKey(envVars),
Source: res.Source,
}
}
baseURL := firstNonEmptyEnv("MOLECULE_LLM_BASE_URL", "OPENAI_BASE_URL")
anthropicBaseURL := firstNonEmptyEnv("MOLECULE_LLM_ANTHROPIC_BASE_URL", "ANTHROPIC_BASE_URL")
token := firstNonEmptyEnv("MOLECULE_LLM_USAGE_TOKEN", "OPENAI_API_KEY")
if baseURL == "" || token == "" {
return
// Proxy not configured (boot race / misconfig). On the platform_managed
// path the workspace IS entitled to platform creds, so we do NOT strip
// here — but we report HasUsableLLMCred from whatever survived so the
// caller's fail-closed branch (non-platform only) is never reached on
// this path.
return platformLLMEnvResult{ResolvedMode: res.ResolvedMode, HasUsableLLMCred: true, Source: res.Source}
}
stripPlatformManagedLLMBypassEnv(envVars)
@@ -1006,6 +1078,10 @@ func applyPlatformManagedLLMEnv(ctx context.Context, envVars map[string]string,
envVars["MOLECULE_MODEL"] = defaultModel
}
}
// platform_managed: the CP proxy usage token (injected as ANTHROPIC_API_KEY
// / OPENAI_API_KEY above) IS the usable credential, so the workspace is
// never fail-closed on this path.
return platformLLMEnvResult{ResolvedMode: res.ResolvedMode, HasUsableLLMCred: true, Source: res.Source}
}
func stripPlatformManagedLLMBypassEnv(envVars map[string]string) {
@@ -1014,6 +1090,41 @@ func stripPlatformManagedLLMBypassEnv(envVars map[string]string) {
}
}
// stripGlobalOriginLLMCreds removes platform-managed LLM credential keys
// (CLAUDE_CODE_OAUTH_TOKEN + the rest of platformManagedDirectLLMBypassKeys)
// from envVars ONLY when they originated from the operator-controlled
// `global_secrets` table (i.e. their key is present in globalKeys).
//
// internal#711 provider-aware gate. A platform global LLM credential is the
// platform's own credential and must never be the credential a non-platform
// (byok / subscription) workspace runs on. loadWorkspaceSecrets drops the
// global-provenance flag for any key the workspace re-set via the canvas
// Secrets tab (a workspace_secrets row), so a workspace's OWN LLM credential
// is NOT in globalKeys and survives this strip — only the inherited platform
// global creds are removed.
func stripGlobalOriginLLMCreds(envVars map[string]string, globalKeys map[string]struct{}) {
for key := range platformManagedDirectLLMBypassKeys {
if _, fromGlobal := globalKeys[key]; fromGlobal {
delete(envVars, key)
}
}
}
// hasAnyPlatformManagedLLMKey reports whether envVars still carries at least
// one non-empty platform-managed LLM credential key after the provider-aware
// gate. Used by the non-platform fail-closed branch: a byok/subscription
// workspace with no surviving (workspace-scoped) LLM credential must be
// aborted with MISSING_BYOK_CREDENTIAL rather than started credential-less or
// on stripped platform creds.
func hasAnyPlatformManagedLLMKey(envVars map[string]string) bool {
for key := range platformManagedDirectLLMBypassKeys {
if strings.TrimSpace(envVars[key]) != "" {
return true
}
}
return false
}
func runtimeUsesAnthropicNativeProxy(runtime string) bool {
return strings.EqualFold(strings.TrimSpace(runtime), "claude-code")
}
@@ -193,7 +193,35 @@ func (h *WorkspaceHandler) prepareProvisionContext(
// continue to rely on workspace_secrets / org-import persona-env
// merge for their git auth.
applyAgentGitHTTPCreds(envVars, payload.Role)
applyPlatformManagedLLMEnv(ctx, envVars, workspaceID, payload.Runtime, payload.Model)
// internal#711: provider-aware LLM-credential resolution. On a non-platform
// (byok/subscription) workspace this strips the platform's scope:global LLM
// creds inherited from global_secrets and reports whether the workspace
// still has a usable (workspace-scoped) LLM credential of its own.
llmRes := applyPlatformManagedLLMEnv(ctx, envVars, globalSecretKeys, workspaceID, payload.Runtime, payload.Model)
// Fail closed for a BYOK workspace with no usable LLM credential: do NOT
// start it on the platform's (now-stripped) global creds. Mirror the
// "model+provider+credential REQUIRED at create" spirit (internal#711)
// with an actionable error surfaced at provision time.
//
// Scoped to byok specifically (NOT disabled): "byok" means "the user
// intends to run an LLM on their own credential" — a missing one is a
// misconfiguration worth surfacing loudly. "disabled" means "this
// workspace runs no platform-billed LLM at all" (terminal / file work, or
// a runtime that talks to a non-bypass-key endpoint); stripping the
// inherited platform globals is sufficient there and aborting would
// regress a legitimate no-LLM workspace. The strip above already ran for
// both non-platform modes.
//
// The bypass-key check is intentionally broad — any surviving bypass key
// (the workspace's own, of workspace_secrets provenance) clears it.
if llmRes.ResolvedMode == LLMBillingModeBYOK && !llmRes.HasUsableLLMCred {
msg := formatMissingBYOKCredentialError(llmRes.ResolvedMode)
log.Printf("Provisioner: ABORT workspace=%s — byok billing mode has no usable LLM credential (MISSING_BYOK_CREDENTIAL, internal#711)", workspaceID)
return nil, &provisionAbort{
Msg: msg,
Extra: map[string]interface{}{"error": msg, "code": "MISSING_BYOK_CREDENTIAL", "billing_mode": llmRes.ResolvedMode, "issue": "711"},
}
}
applyRuntimeModelEnv(envVars, payload.Runtime, payload.Model)
if payload.Role != "" {
envVars["MOLECULE_AGENT_ROLE"] = payload.Role
@@ -494,6 +494,57 @@ func TestPrepareProvisionContext_WorkspaceSecretWinsOverPersonaToken(t *testing.
}
}
// TestPrepareProvisionContext_ByokWithOnlyGlobalOAuthFailsClosed is the
// internal#711 end-to-end guard for the live Reno Stars leak. A byok
// workspace whose ONLY LLM credential is the platform's scope:global
// CLAUDE_CODE_OAUTH_TOKEN (inherited from global_secrets, no workspace
// override) must:
//
// 1. have that platform token STRIPPED from the prepared env (no leak), and
// 2. ABORT the provision with the MISSING_BYOK_CREDENTIAL code rather than
// start the workspace on the platform's credits.
//
// This is the discriminating end-to-end test: pre-fix prepared.EnvVars would
// carry CLAUDE_CODE_OAUTH_TOKEN=<platform token> and the provision would
// succeed, running Opus on Molecule's Anthropic credits.
func TestPrepareProvisionContext_ByokWithOnlyGlobalOAuthFailsClosed(t *testing.T) {
const wsID = "352e3c2b-0546-4e9c-b487-1e2ff1cf29fc" // Reno Stars SEO agent
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
mock := setupTestDB(t)
// global_secrets carries the platform's scope:global OAuth token.
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM global_secrets`).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("CLAUDE_CODE_OAUTH_TOKEN", []byte("PLATFORM-GLOBAL-OAUTH"), 0))
// Workspace set NO secrets of its own.
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}))
// Resolver: workspace override = byok.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
handler := NewWorkspaceHandler(&captureBroadcaster{}, nil, "http://localhost:8080", t.TempDir())
payload := models.CreateWorkspacePayload{
Name: "Reno Stars SEO",
Runtime: "claude-code",
Tier: 1,
}
prepared, abort := handler.prepareProvisionContext(
context.Background(), wsID, "/nonexistent", nil, payload, false)
if abort == nil {
t.Fatalf("expected MISSING_BYOK_CREDENTIAL abort, got success (prepared=%v) — the leak would still ship", prepared)
}
if code, _ := abort.Extra["code"].(string); code != "MISSING_BYOK_CREDENTIAL" {
t.Fatalf("abort.Extra[code] = %v, want MISSING_BYOK_CREDENTIAL", abort.Extra["code"])
}
if mode, _ := abort.Extra["billing_mode"].(string); mode != LLMBillingModeBYOK {
t.Fatalf("abort.Extra[billing_mode] = %v, want %q", abort.Extra["billing_mode"], LLMBillingModeBYOK)
}
}
// TestReadOrLazyHealInboundSecret pins the four branches of the
// shared lazy-heal helper directly. Each call site (chat_files,
// registry) has its own integration test, but those go through the
@@ -972,7 +1023,7 @@ func TestApplyPlatformManagedLLMEnv_NonClaudeRuntimeDefaultsOpenAIProxyWhenNoWor
t.Setenv("MOLECULE_LLM_DEFAULT_MODEL", "moonshot/kimi-k2.6")
envVars := map[string]string{}
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "codex", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "codex", "")
applyRuntimeModelEnv(envVars, "codex", "")
if got := envVars["OPENAI_BASE_URL"]; got != "https://api.example.test/api/v1/internal/llm/openai/v1" {
@@ -1002,7 +1053,7 @@ func TestApplyPlatformManagedLLMEnv_StripsWorkspaceOpenAIKeyForClaudeCode(t *tes
"OPENAI_BASE_URL": "https://api.openai.com/v1",
"MODEL": "openai/gpt-5.5",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "claude-code", "")
if _, ok := envVars["OPENAI_API_KEY"]; ok {
t.Fatalf("OPENAI_API_KEY should be stripped for claude-code platform-managed mode")
@@ -1028,7 +1079,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeUsesAnthropicProxyOverOAuth(t *tes
"CLAUDE_CODE_OAUTH_TOKEN": "user-oauth-token",
"MODEL": "sonnet",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "claude-code", "")
if _, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN should be stripped in platform-managed mode")
@@ -1051,7 +1102,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeInjectsAnthropicProxyWhenNoWorkspa
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{}
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "minimax/MiniMax-M2.7")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "claude-code", "minimax/MiniMax-M2.7")
if got := envVars["ANTHROPIC_BASE_URL"]; got != "https://api.example.test/api/v1/internal/llm/anthropic/v1" {
t.Fatalf("ANTHROPIC_BASE_URL = %q", got)
@@ -1074,7 +1125,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeStripsVendorBYOK(t *testing.T) {
"MINIMAX_API_KEY": "user-minimax-key",
"MODEL": "MiniMax-M2.7",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "claude-code", "")
if _, ok := envVars["MINIMAX_API_KEY"]; ok {
t.Fatalf("MINIMAX_API_KEY should be stripped in platform-managed mode")
@@ -1090,20 +1141,38 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeStripsVendorBYOK(t *testing.T) {
}
}
// internal#718 P2-B: byok is now DERIVED, not org-env-driven. A claude-code
// workspace with NO explicit override + a non-platform-deriving model
// (kimi-for-coding → kimi-coding) resolves byok and must NOT get the CP proxy
// creds injected. (Pre-P2 this was driven by the org env MOLECULE_LLM_BILLING_MODE
// with an empty workspace id; that mechanism is retired.)
func TestApplyPlatformManagedLLMEnv_NoopsOutsidePlatformManaged(t *testing.T) {
t.Setenv("MOLECULE_LLM_BILLING_MODE", "byok")
const wsID = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
mock := setupTestDB(t)
// No explicit override → derive from (claude-code, kimi-for-coding) → byok.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
t.Setenv("MOLECULE_LLM_BILLING_MODE", "platform_managed") // org env ignored now
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{}
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "")
res := applyPlatformManagedLLMEnv(context.Background(), envVars, nil, wsID, "claude-code", "kimi-for-coding")
if res.ResolvedMode != LLMBillingModeBYOK {
t.Fatalf("resolved mode = %q, want byok (derived from non-platform model)", res.ResolvedMode)
}
if _, ok := envVars["OPENAI_API_KEY"]; ok {
t.Fatalf("OPENAI_API_KEY should not be set outside platform-managed mode")
}
if _, ok := envVars["MOLECULE_LLM_USAGE_TOKEN"]; ok {
t.Fatalf("MOLECULE_LLM_USAGE_TOKEN should not be set outside platform-managed mode")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_ClaudeCodeByokKeepsOwnProviderEnv is the
@@ -1137,7 +1206,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeByokKeepsOwnProviderEnv(t *testing
"CLAUDE_CODE_OAUTH_TOKEN": "user-oauth-token",
"MODEL": "sonnet",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, wsID, "claude-code", "")
// 1. OAuth token intact — not stripped.
if got := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; got != "user-oauth-token" {
@@ -1168,6 +1237,312 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeByokKeepsOwnProviderEnv(t *testing
}
}
// TestApplyPlatformManagedLLMEnv_ByokStripsGlobalOriginOAuthToken is the
// internal#711 regression guard for the live 2026-05-27 leak (Reno Stars SEO
// + Marketing claude-code agents). A non-platform (byok) workspace that
// brought NO LLM credential of its own, but which inherited the platform's
// scope:global CLAUDE_CODE_OAUTH_TOKEN from global_secrets (provenance =
// globalKeys), must have that platform token STRIPPED — not run on it.
//
// Pre-fix the byok early-return left envVars untouched, so the platform's
// global OAuth token survived into the container and the agent ran Opus on
// the platform's Anthropic credits. The fix gates the global-cred merge on
// provider==platform: a non-platform workspace keeps only its own
// (workspace_secrets) creds, of which there are none here.
func TestApplyPlatformManagedLLMEnv_ByokStripsGlobalOriginOAuthToken(t *testing.T) {
const wsID = "352e3c2b-0546-4e9c-b487-1e2ff1cf29fc" // Reno Stars SEO agent
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_ANTHROPIC_BASE_URL", "https://api.example.test/api/v1/internal/llm/anthropic")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
// The ONLY LLM credential in env is the platform's scope:global OAuth
// token, merged from global_secrets (so its key is in globalKeys). The
// workspace set none of its own.
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "PLATFORM-GLOBAL-OAUTH-TOKEN",
"MODEL": "opus",
}
globalKeys := map[string]struct{}{"CLAUDE_CODE_OAUTH_TOKEN": {}}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, globalKeys, wsID, "claude-code", "")
// 1. The platform global OAuth token must be STRIPPED — the leak is closed.
if got, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN = %q present — platform scope:global token must be stripped for a byok workspace", got)
}
// 2. No CP proxy creds forced (byok = workspace talks to its own provider).
if got, ok := envVars["ANTHROPIC_API_KEY"]; ok {
t.Fatalf("ANTHROPIC_API_KEY must NOT be injected for byok, got %q", got)
}
// 3. Resolver reports byok with NO usable LLM credential → caller fails closed.
if res.ResolvedMode != LLMBillingModeBYOK {
t.Fatalf("ResolvedMode = %q, want %q", res.ResolvedMode, LLMBillingModeBYOK)
}
if res.HasUsableLLMCred {
t.Fatalf("HasUsableLLMCred = true, want false (only the stripped platform global token was present)")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// =========================================================================
// internal#718 P2-B BEHAVIOR DELTA — billing/credential decision DERIVES the
// provider (no stored LLM_PROVIDER, no override). These three tests are the
// explicit delta the RFC calls out, exercised through the real provision path
// (applyPlatformManagedLLMEnv) with the registry derivation driving the mode:
// - platform-derived → platform_managed → platform creds (UNCHANGED)
// - non-platform-derived → byok → #1963 strip + fail-closed (THE FIX)
// - unset model → platform default (CTO-confirmed)
// All use NO explicit override (override read returns NULL) so the DERIVATION
// is what decides — this is what supersedes #1966's stored-LLM_PROVIDER read.
// =========================================================================
// PLATFORM-DERIVED → UNCHANGED. A claude-code workspace with a platform-
// namespaced model (anthropic/claude-opus-4-7) derives to the closed `platform`
// provider → platform_managed → CP proxy creds injected, exactly as before.
func TestApplyPlatformManagedLLMEnv_DERIVED_PlatformModelKeepsPlatformCreds(t *testing.T) {
const wsID = "11111111-2222-3333-4444-555555555555"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil)) // NO override → derive
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModeBYOK) // org env IGNORED now
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_ANTHROPIC_BASE_URL", "https://api.example.test/api/v1/internal/llm/anthropic")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, nil, wsID, "claude-code", "anthropic/claude-opus-4-7")
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Fatalf("platform-derived model must resolve platform_managed, got %q (source=%s)", res.ResolvedMode, res.Source)
}
if res.Source != BillingModeSourceDerivedProvider {
t.Errorf("source: got %q want derived_provider", res.Source)
}
// Platform path injects the CP proxy creds (UNCHANGED behavior).
if got := envVars["ANTHROPIC_API_KEY"]; got != "tenant-admin-token" {
t.Errorf("platform path must inject the CP proxy token as ANTHROPIC_API_KEY, got %q", got)
}
if !res.HasUsableLLMCred {
t.Errorf("platform path always has a usable cred (the proxy token)")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// NON-PLATFORM-DERIVED → byok + STRIP + FAIL-CLOSED signal (THE FIX, the Reno
// billing-leak class). A claude-code workspace with a non-platform model
// (kimi-for-coding → kimi-coding) and NO override + NO own cred, inheriting only
// the platform's scope:global OAuth token, now DERIVES byok → #1963 strips the
// global token → HasUsableLLMCred=false → caller fails closed. Pre-P2 this same
// workspace resolved platform_managed (via the never-written org rung) and ran
// on the platform's credits. This is the discriminating delta test.
func TestApplyPlatformManagedLLMEnv_DERIVED_NonPlatformModelStripsAndFailsClosed(t *testing.T) {
const wsID = "99999999-8888-7777-6666-555555555555"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil)) // NO override → derive
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged) // org env IGNORED now
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
// Only LLM cred is the platform's scope:global OAuth token (globalKeys).
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "PLATFORM-GLOBAL-OAUTH-TOKEN",
}
globalKeys := map[string]struct{}{"CLAUDE_CODE_OAUTH_TOKEN": {}}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, globalKeys, wsID, "claude-code", "kimi-for-coding")
// 1. DERIVED byok (NOT the old platform_managed default).
if res.ResolvedMode != LLMBillingModeBYOK {
t.Fatalf("non-platform-derived model must resolve byok, got %q (source=%s) — THE FIX regressed", res.ResolvedMode, res.Source)
}
if res.Source != BillingModeSourceDerivedProvider {
t.Errorf("source: got %q want derived_provider", res.Source)
}
// 2. #1963 strip: the platform global OAuth token is removed (leak closed).
if got, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN = %q present — must be stripped for a derived-byok workspace (Reno leak)", got)
}
// 3. No CP proxy creds forced.
if got, ok := envVars["ANTHROPIC_API_KEY"]; ok {
t.Fatalf("ANTHROPIC_API_KEY must NOT be injected for byok, got %q", got)
}
// 4. No usable cred → caller (prepareProvisionContext) fails closed.
if res.HasUsableLLMCred {
t.Fatalf("HasUsableLLMCred = true, want false (only the stripped platform global token was present)")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// UNSET model → PLATFORM DEFAULT (CTO-confirmed "unset → platform default").
// No model means nothing to derive; the workspace defaults closed to
// platform_managed and keeps the platform creds (UNCHANGED for the no-model case).
func TestApplyPlatformManagedLLMEnv_DERIVED_UnsetModelPlatformDefault(t *testing.T) {
const wsID = "00000000-1111-2222-3333-444444444444"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil)) // NO override
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModeBYOK) // org env IGNORED now
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_ANTHROPIC_BASE_URL", "https://api.example.test/api/v1/internal/llm/anthropic")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, nil, wsID, "claude-code", "")
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Fatalf("unset model must default platform_managed, got %q (source=%s)", res.ResolvedMode, res.Source)
}
if res.Source != BillingModeSourceDerivedDefault {
t.Errorf("source: got %q want derived_default", res.Source)
}
if got := envVars["ANTHROPIC_API_KEY"]; got != "tenant-admin-token" {
t.Errorf("unset-model platform default must inject the CP proxy token, got %q", got)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_ByokKeepsWorkspaceOwnOAuthEvenWithGlobal is
// the discriminating companion to the strip test: a byok workspace that DID
// set its own CLAUDE_CODE_OAUTH_TOKEN via the canvas Secrets tab (a
// workspace_secrets row) keeps it. loadWorkspaceSecrets drops the global
// provenance flag on a workspace override, so the key is NOT in globalKeys
// and the provenance-aware strip leaves it alone. Proves the fix strips only
// platform-origin creds, never the customer's own.
func TestApplyPlatformManagedLLMEnv_ByokKeepsWorkspaceOwnOAuthEvenWithGlobal(t *testing.T) {
const wsID = "6b66de8d-9337-4fb4-be8d-6d49dca0d809" // Reno Stars Marketing agent
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
// Workspace set its OWN OAuth token — loadWorkspaceSecrets would have
// dropped its global provenance flag, so globalKeys does NOT contain it.
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "CUSTOMER-OWN-OAUTH-TOKEN",
"MODEL": "opus",
}
globalKeys := map[string]struct{}{} // not from global_secrets
res := applyPlatformManagedLLMEnv(context.Background(), envVars, globalKeys, wsID, "claude-code", "")
if got := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; got != "CUSTOMER-OWN-OAUTH-TOKEN" {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN = %q, want the workspace's own token left intact", got)
}
if !res.HasUsableLLMCred {
t.Fatalf("HasUsableLLMCred = false, want true (workspace brought its own credential)")
}
if res.ResolvedMode != LLMBillingModeBYOK {
t.Fatalf("ResolvedMode = %q, want %q", res.ResolvedMode, LLMBillingModeBYOK)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_DisabledStripsGlobalButReportsNoCred proves
// that "disabled" mode also strips the platform's global LLM creds (the leak
// is closed for disabled too), and reports HasUsableLLMCred=false. The
// caller's fail-closed abort is scoped to byok only, so a disabled workspace
// with no LLM cred still boots (for terminal / non-LLM work); here we pin the
// function-level strip + report.
func TestApplyPlatformManagedLLMEnv_DisabledStripsGlobalButReportsNoCred(t *testing.T) {
const wsID = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeDisabled))
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "PLATFORM-GLOBAL-OAUTH-TOKEN",
}
globalKeys := map[string]struct{}{"CLAUDE_CODE_OAUTH_TOKEN": {}}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, globalKeys, wsID, "claude-code", "")
if _, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN must be stripped for disabled mode too")
}
if res.ResolvedMode != LLMBillingModeDisabled {
t.Fatalf("ResolvedMode = %q, want %q", res.ResolvedMode, LLMBillingModeDisabled)
}
if res.HasUsableLLMCred {
t.Fatalf("HasUsableLLMCred = true, want false")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_PlatformManagedStillReceivesGlobalCreds is
// the no-regression guard for the OTHER side of the gate (internal#711): a
// platform-managed workspace MUST still receive the platform's creds. Here
// the proxy IS configured, so the contract is the existing one — the global
// OAuth token is replaced by the proxy usage token (HasUsableLLMCred=true).
func TestApplyPlatformManagedLLMEnv_PlatformManagedStillReceivesGlobalCreds(t *testing.T) {
const wsID = "99999999-9999-9999-9999-999999999999"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModePlatformManaged))
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_ANTHROPIC_BASE_URL", "https://api.example.test/api/v1/internal/llm/anthropic")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "PLATFORM-GLOBAL-OAUTH-TOKEN",
"MODEL": "opus",
}
globalKeys := map[string]struct{}{"CLAUDE_CODE_OAUTH_TOKEN": {}}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, globalKeys, wsID, "claude-code", "")
// Platform-managed routes through the CP proxy: OAuth stripped, proxy creds forced.
if _, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN should be stripped + replaced by the proxy token for platform_managed")
}
if got := envVars["ANTHROPIC_API_KEY"]; got != "tenant-admin-token" {
t.Fatalf("ANTHROPIC_API_KEY = %q, want proxy usage token for platform_managed", got)
}
if !res.HasUsableLLMCred {
t.Fatalf("HasUsableLLMCred = false, want true for platform_managed (proxy token is the credential)")
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Fatalf("ResolvedMode = %q, want %q", res.ResolvedMode, LLMBillingModePlatformManaged)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_PlatformManagedStillEmitsResolvedMode is the
// no-regression companion: a workspace that resolves to platform_managed must
// still strip + force the proxy AND emit MOLECULE_LLM_BILLING_MODE=
@@ -1189,7 +1564,7 @@ func TestApplyPlatformManagedLLMEnv_PlatformManagedStillEmitsResolvedMode(t *tes
"CLAUDE_CODE_OAUTH_TOKEN": "user-oauth-token",
"MODEL": "sonnet",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, wsID, "claude-code", "")
// OAuth stripped, proxy forced — unchanged platform_managed contract.
if _, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
@@ -501,10 +501,12 @@ func TestWorkspaceCreate_WithSecrets_Persists(t *testing.T) {
// while persisting a secret causes the entire transaction to roll back and
// the handler to return 500. The workspace row must NOT be committed.
func TestWorkspaceCreate_SecretPersistFails_RollsBack(t *testing.T) {
// internal#691: see TestExtended_SecretsSet — same default-closed reasoning.
// This test is asserting the rollback path on DB failure, not the strip gate;
// keep the org in byok so the OPENAI_API_KEY write reaches the INSERT.
t.Setenv("MOLECULE_LLM_BILLING_MODE", "byok")
// internal#718 P2-B: this test asserts the rollback path on DB failure, not
// the strip gate. The create-time secret gate keys off the DERIVED mode now
// (org rung retired). An explicit byok override makes the workspace byok in a
// single resolver read (precedence-1 short-circuit), so the OPENAI_API_KEY
// write is allowed and reaches the INSERT-and-fail path this test exercises.
t.Setenv("MOLECULE_LLM_BILLING_MODE", "platform_managed") // org env ignored now
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
@@ -513,14 +515,11 @@ func TestWorkspaceCreate_SecretPersistFails_RollsBack(t *testing.T) {
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WillReturnResult(sqlmock.NewResult(0, 1))
// internal#691: Create() now resolves billing mode per-workspace before
// the secret-strip gate. The workspace row was just inserted in the same
// transaction so it isn't readable from a separate query yet; the
// resolver expects the SELECT and the mock returns no row → falls back
// to the org default (byok, set above) so the OPENAI_API_KEY write
// reaches the INSERT-and-fail path this test exercises.
// Create() resolves billing mode per-workspace before the secret-strip gate.
// An explicit byok override short-circuits the resolver (precedence 1) so the
// OPENAI_API_KEY write is allowed and reaches the INSERT-and-fail path.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}))
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
mock.ExpectExec("INSERT INTO workspace_secrets").
WillReturnError(sql.ErrConnDone) // DB failure while writing secret
mock.ExpectRollback() // workspace insert must be rolled back
@@ -2048,6 +2047,111 @@ func TestWorkspaceCreate_188_NoTemplateNoRuntime_NowMODEL_REQUIRED(t *testing.T)
}
}
// internal#718 P2-B: only-registered validation in P2 WARN-mode. A known
// (registry) runtime with a model NOT in its registered set is allowed to
// proceed (201) but flagged with the X-Molecule-Model-Unregistered response
// header + a queryable warning log. (Hard-reject is gated on P3/P4 vocabulary
// convergence — see the create handler comment; flipping to 422 there is a
// one-line change once the legacy colon-form model vocabulary is reconciled.)
func TestWorkspaceCreate_718_UnregisteredModelWarnsButProceeds(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
mock.ExpectExec("INSERT INTO workspace_secrets").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO canvas_layouts").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Bad Model","runtime":"claude-code","model":"totally-made-up-xyz"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusCreated {
t.Fatalf("unregistered-model create (warn-mode): expected 201, got %d: %s", w.Code, w.Body.String())
}
if w.Header().Get("X-Molecule-Model-Unregistered") != "true" {
t.Errorf("expected X-Molecule-Model-Unregistered=true header on unregistered model, got %q", w.Header().Get("X-Molecule-Model-Unregistered"))
}
}
// A REGISTERED model on a registry runtime sets NO unregistered header.
func TestWorkspaceCreate_718_RegisteredModelNoWarnHeader(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
mock.ExpectExec("INSERT INTO workspace_secrets").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO canvas_layouts").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
// claude-opus-4-7 IS a registered claude-code model (anthropic-api).
body := `{"name":"Good Model","runtime":"claude-code","model":"claude-opus-4-7"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusCreated {
t.Fatalf("registered-model create: expected 201, got %d: %s", w.Code, w.Body.String())
}
if w.Header().Get("X-Molecule-Model-Unregistered") != "" {
t.Errorf("registered model must NOT set the unregistered header, got %q", w.Header().Get("X-Molecule-Model-Unregistered"))
}
}
// internal#718 P2-B: a runtime NOT in the registry (mock — a known core runtime
// absent from the first-party provider registry) fails OPEN — the
// only-registered gate does not block it (federation / non-first-party path
// unchanged). It proceeds past the gate to the normal create flow.
func TestWorkspaceCreate_718_NonRegistryRuntimeFailsOpen(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
mock.ExpectExec("INSERT INTO canvas_layouts").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
// "mock" is a known core runtime but NOT in the first-party registry;
// any model passes the only-registered gate (fail-open).
body := `{"name":"Mock Agent","runtime":"mock","model":"canned-replies"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusCreated {
t.Fatalf("non-registry runtime should fail open (201), got %d: %s", w.Code, w.Body.String())
}
}
// Explicit runtime, no template → honored, 201 (no template resolution
// needed; runtimeExplicitlyRequested true but already resolved).
func TestWorkspaceCreate_188_ExplicitRuntimeNoTemplate_OK(t *testing.T) {
@@ -0,0 +1,259 @@
package providers
import (
"fmt"
"sort"
"strings"
)
// PlatformProviderName is the single, closed, core-only provider key that
// denotes Molecule-managed billing (no tenant key; the platform owns the
// upstream credential + the bill). It is a CLOSED set BY CONSTRUCTION: a
// third-party / contributed runtime manifest can introduce its own providers
// (BYOK by definition), but it can never name one `platform` and thereby
// forge platform billing — the merge/validation layer reserves this key for
// the core catalog (internal#718 federation refinement, CTO 2026-05-27).
// DeriveProvider treats it like any other native provider for resolution;
// the closed-set guarantee is enforced at manifest registration/merge, not
// here. isPlatformProvider is the single predicate billing/credential
// emission keys off the DERIVED provider (P2; not wired in P0).
const PlatformProviderName = "platform"
// IsPlatform reports whether this provider is the closed, core-only
// platform-managed provider. Billing + credential-emission decisions key off
// this predicate applied to a DERIVED provider (P2), so a model can never be
// platform-billed unless DeriveProvider resolves it to the closed platform
// entry. Any BYOK / third-party provider returns false -> fail-closed
// without the tenant's own key.
func (p Provider) IsPlatform() bool {
return p.Name == PlatformProviderName
}
// DeriveProvider resolves the SINGLE owning Provider for a (runtime, model)
// pair against the merged registry Manifest. It is the P0 foundation of
// internal#718: every model->provider decision point will eventually derive
// through this one function instead of one of the ~9 hardcoded, disagreeing
// vocabularies. In P0 NOTHING in production calls it (additive, zero behavior
// change) — it is exercised only by tests + the codegen artifact.
//
// It is written as a method on Manifest (a pure function of the merged
// registry) so a future FEDERATED registry — core catalog UNION validated
// per-runtime contributed manifests — works through the identical code path:
// DeriveProvider neither knows nor cares whether a runtime/provider is
// first-party or contributed; it only sees the merged Manifest.
//
// Resolution (fail-closed at every step — never silently default):
//
// 1. The runtime must be known. An unknown runtime errors (it never falls
// through to "any provider in the catalog").
// 2. The candidate set is the runtime's NATIVE provider set ONLY (the
// `runtimes:` block). A provider absent from the runtime's native set is
// never selectable for that runtime, even if its catalog regex matches.
// 3. EXACT model-id match is authoritative (CTO 2026-05-27 "disambiguate by
// exact model id"): if the model id appears verbatim in exactly one
// native provider ref's Models list, that provider wins outright — this
// resolves the kimi namespace split (moonshot/kimi-k2.6 -> platform vs
// bare kimi-for-coding -> kimi-coding) deterministically and overrides
// any broader prefix match.
// 4. Otherwise, fall back to model_prefix_match among the native providers.
// 5. If >1 native provider still matches, disambiguate by auth env: keep
// only the providers whose auth_env intersects availableAuthEnv. If
// exactly one survives, it wins.
// 6. If still >1 (or 0) -> error. Overlap is an ambiguity the registry data
// must resolve; none is an unregistered (unselectable) model. Both
// fail-closed with a zero-value Provider.
//
// availableAuthEnv is the set of auth-env-var NAMES (never secret values)
// present for the workspace — exactly the disambiguation input the canvas
// uses today to split anthropic-oauth (CLAUDE_CODE_OAUTH_TOKEN) from
// anthropic-api (ANTHROPIC_API_KEY). It may be nil; nil simply means the
// auth-env tie-break cannot fire (an overlap then errors rather than guesses).
func (m *Manifest) DeriveProvider(runtime, model string, availableAuthEnv []string) (Provider, error) {
model = strings.TrimSpace(model)
if model == "" {
return Provider{}, fmt.Errorf("providers: model is required")
}
native, ok := m.Runtimes[runtime]
if !ok {
return Provider{}, fmt.Errorf("providers: unknown runtime %q", runtime)
}
byName := make(map[string]Provider, len(m.Providers))
for _, p := range m.Providers {
byName[p.Name] = p
}
// Step 3: exact model-id match against each native provider ref's Models.
// Authoritative — a verbatim id beats any prefix. If two native refs both
// list the same id, that is a manifest ambiguity we surface rather than
// silently pick (LoadManifest already forbids a provider ref appearing
// twice in one runtime, but two DIFFERENT providers listing the same id
// is not load-rejected, so guard it here).
var exact []Provider
for _, ref := range native.Providers {
for _, mid := range ref.Models {
if mid == model {
if p, ok := byName[ref.Name]; ok {
exact = append(exact, p)
}
break
}
}
}
if len(exact) == 1 {
return exact[0], nil
}
if len(exact) > 1 {
return Provider{}, fmt.Errorf(
"providers: model %q for runtime %q is exact-listed by %d native providers (%s) — manifest ambiguity",
model, runtime, len(exact), strings.Join(providerNames(exact), ", "))
}
// Step 4: prefix match among native providers only.
var matched []Provider
for _, ref := range native.Providers {
p, ok := byName[ref.Name]
if !ok {
continue
}
if p.MatchesModel(model) {
matched = append(matched, p)
}
}
switch len(matched) {
case 1:
return matched[0], nil
case 0:
return Provider{}, fmt.Errorf(
"providers: no native provider for runtime %q owns model %q (unregistered/unselectable)",
runtime, model)
}
// Step 5: >1 prefix match — disambiguate by available auth env.
if len(availableAuthEnv) > 0 {
avail := make(map[string]struct{}, len(availableAuthEnv))
for _, e := range availableAuthEnv {
avail[e] = struct{}{}
}
var byAuth []Provider
for _, p := range matched {
for _, want := range p.AuthEnv {
if _, ok := avail[want]; ok {
byAuth = append(byAuth, p)
break
}
}
}
if len(byAuth) == 1 {
return byAuth[0], nil
}
if len(byAuth) > 1 {
matched = byAuth // narrowed but still ambiguous; report the narrowed set
}
}
// Step 6: still ambiguous -> error (never silently pick).
return Provider{}, fmt.Errorf(
"providers: model %q for runtime %q overlaps %d providers (%s) and auth env did not disambiguate — resolve in the registry",
model, runtime, len(matched), strings.Join(providerNames(matched), ", "))
}
// Upstream is the result of ResolveUpstream: the proxy's upstream-vendor key
// (the 4-name vocabulary {openai, moonshot, anthropic, minimax} the proxy's
// resolveLLMProviderTarget switch dispatches on to pick the upstream base URL +
// key) plus the model id to send upstream (the namespace SUFFIX). Provider is
// the catalog entry the namespace resolved to (its base_url_template /
// base_url_anthropic / auth_env are the SINGLE source for the upstream target).
type Upstream struct {
// Vendor is the proxy upstream-vendor key (Provider.UpstreamVendor). It is
// the axis resolveLLMProviderTarget dispatches on; for "anthropic-api" it is
// "anthropic" (the entry NAME and the upstream VENDOR legitimately differ).
Vendor string
// Model is the id to send upstream — the namespace suffix (e.g. the
// "kimi-k2.6" of "moonshot/kimi-k2.6").
Model string
// Provider is the resolved catalog entry. Its base_url_* / auth_env are the
// one source for the upstream target — there is no parallel routing block.
Provider Provider
}
// ResolveUpstream is the SINGLE registry resolution the LLM proxy uses to pick
// the upstream vendor + base URL + auth for a wire model id (internal#718 P1,
// CONVERGED 2026-05-27). It replaces the proxy's hardcoded inferLLMProvider
// switch AND the earlier two-derivation shape (DeriveUpstreamForModel + a
// separate proxy_routing data block): there is now ONE resolution over the
// EXISTING vendor provider entries — no duplicate routing vocabulary.
//
// Resolution = the platform model id's NAMESPACE. A platform model id is
// `vendor/model` (or the BYOK colon form `vendor:model`); the namespace token
// NAMES the backing provider, whose catalog entry carries the upstream
// base_url_* + auth_env. The upstream vendor key is the entry's UpstreamVendor
// (a property of the entry, recorded once on the entry — NOT a parallel
// routing block). VERIFIED FACT (internal#718, 2026-05-27): all platform model
// ids in providers.yaml are namespaced; ZERO are bare — so namespace
// resolution covers 100% of live proxy traffic.
//
// It is DELIBERATELY separate from DeriveProvider:
// - DeriveProvider is runtime-SCOPED and speaks the REGISTRY vocabulary
// (platform/anthropic-api/kimi-coding/…); for a platform model it returns
// `platform` (the proxy ITSELF), which is useless for upstream routing.
// - ResolveUpstream is runtime-AGNOSTIC (the proxy serves platform models
// across runtimes, with no single runtime) and speaks the proxy's 4-name
// UPSTREAM vocabulary — exactly what selects the upstream base URL + key.
//
// Resolution (fail-closed; never a silent default):
//
// 1. Namespace split: for each separator "/" then ":" (the proxy's loop
// order), cut the id. If the prefix token EQUALS some provider entry's
// UpstreamVendor, that entry wins: Vendor = its UpstreamVendor, Model = the
// SUFFIX. The first separator that yields a known vendor wins ("/" before
// ":"), matching the proxy verbatim.
// 2. Otherwise the id is BARE. Bare ids are VESTIGIAL at the proxy: zero live
// platform traffic is bare (every platform model id is namespaced), so the
// converged path does NOT resolve them — it returns an error and the proxy
// falls back to its documented, retained legacy switch (inferLLMProviderLegacy).
// This is INTENTIONAL: P0 tightened bare `kimi-*` to the kimi-coding
// gateway in the registry, which is NOT a valid proxy upstream, so routing
// bare ids through the shared registry matcher would misroute. Namespace-
// only resolution sidesteps that without a moonshot special-case or a new
// bare→vendor data block.
//
// Callers that need the legacy bare behavior keep the legacy switch as a
// documented vestigial fallback (see internal/handlers/llm_proxy.go).
func (m *Manifest) ResolveUpstream(model string) (Upstream, error) {
// NOTE: model is pre-trimmed by every production caller
// (resolveLLMProviderTargetForProtocol trims + rejects empty before calling
// inferLLMProvider). No TrimSpace here — the prior copy was unreachable in
// prod and is the review nit being dropped in the convergence.
if model == "" {
return Upstream{}, fmt.Errorf("providers: model is required")
}
for _, sep := range []string{"/", ":"} {
before, after, found := strings.Cut(model, sep)
if !found {
continue
}
for _, p := range m.Providers {
if v := p.UpstreamVendor; v != "" && v == before {
return Upstream{Vendor: v, Model: after, Provider: p}, nil
}
}
}
return Upstream{}, fmt.Errorf(
"providers: %q is not an upstream-namespaced model id (vendor/model); bare ids are vestigial at the proxy and resolve via the legacy fallback", model)
}
// providerNames returns the sorted names of a provider slice for stable,
// deterministic error messages (test assertions + operator readability).
func providerNames(ps []Provider) []string {
out := make([]string, 0, len(ps))
for _, p := range ps {
out = append(out, p.Name)
}
sort.Strings(out)
return out
}
@@ -0,0 +1,520 @@
package providers
import (
"strings"
"testing"
)
// TestDeriveProvider_RealManifest exercises DeriveProvider against the
// embedded baseline manifest — the cases the brief (internal#718 P0)
// enumerates. DeriveProvider resolves the SINGLE owning provider for a
// (runtime, model) pair using the runtime's NATIVE set, restricted by:
// 1. exact model-id match (the runtime native ref's Models list is the
// authoritative disambiguator — CTO 2026-05-27 "disambiguate by exact
// model id"), then
// 2. model_prefix_match among native providers, then
// 3. auth-env disambiguation when >1 native provider still matches.
//
// It ERRORS on overlap (>=2 unresolved) and on none — never silently picks.
func TestDeriveProvider_RealManifest(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
cases := []struct {
name string
runtime string
model string
authEnv []string
expect string // provider name DeriveProvider must return
}{
// --- kimi serving split (the central P0 data fix) ---------------
// Platform/proxy path: the moonshot-namespaced id routes to the
// `platform` provider (proxy -> moonshot upstream) for claude-code.
// This is the "kimi-k2.6 -> moonshot (proxy)" CTO decision expressed
// via the platform namespace.
{"claude-code platform moonshot/kimi-k2.6", "claude-code", "moonshot/kimi-k2.6", []string{"ANTHROPIC_API_KEY"}, "platform"},
// BYOK gateway path: bare kimi ids route to the kimi-coding gateway
// (api.kimi.com/coding) for claude-code — "kimi-for-coding ->
// kimi-coding" CTO decision.
{"claude-code byok kimi-for-coding", "claude-code", "kimi-for-coding", []string{"KIMI_API_KEY"}, "kimi-coding"},
{"claude-code byok kimi-k2.5", "claude-code", "kimi-k2.5", []string{"KIMI_API_KEY"}, "kimi-coding"},
{"claude-code byok kimi-k2", "claude-code", "kimi-k2", []string{"KIMI_API_KEY"}, "kimi-coding"},
// --- platform-model -> platform (closed set) --------------------
{"claude-code platform anthropic ns", "claude-code", "anthropic/claude-opus-4-7", []string{"ANTHROPIC_API_KEY"}, "platform"},
{"codex platform openai ns", "codex", "openai/gpt-5.4", []string{"MOLECULE_LLM_USAGE_TOKEN"}, "platform"},
{"hermes platform moonshot ns", "hermes", "moonshot/kimi-k2.6", []string{"ANTHROPIC_API_KEY"}, "platform"},
// --- anthropic alias + authEnv disambiguation (oauth vs api) -----
// Bare aliases are OAuth-only when the OAuth token is the available
// auth env (matches canvas env-gating). Versioned ids are the API
// provider.
{"claude-code oauth opus", "claude-code", "opus", []string{"CLAUDE_CODE_OAUTH_TOKEN"}, "anthropic-oauth"},
{"claude-code oauth sonnet", "claude-code", "sonnet", []string{"CLAUDE_CODE_OAUTH_TOKEN"}, "anthropic-oauth"},
{"claude-code oauth haiku", "claude-code", "haiku", []string{"CLAUDE_CODE_OAUTH_TOKEN"}, "anthropic-oauth"},
{"claude-code api opus versioned", "claude-code", "claude-opus-4-7", []string{"ANTHROPIC_API_KEY"}, "anthropic-api"},
{"claude-code api sonnet versioned", "claude-code", "claude-sonnet-4-6", []string{"ANTHROPIC_API_KEY"}, "anthropic-api"},
// --- other runtimes' native sets --------------------------------
{"codex byok gpt-5.5", "codex", "gpt-5.5", []string{"OPENAI_API_KEY"}, "openai"},
{"claude-code minimax", "claude-code", "MiniMax-M2.7", []string{"MINIMAX_API_KEY"}, "minimax"},
{"openclaw byok colon", "openclaw", "moonshot:kimi-k2.6", []string{"KIMI_API_KEY"}, "kimi-coding"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got, err := m.DeriveProvider(tc.runtime, tc.model, tc.authEnv)
if err != nil {
t.Fatalf("DeriveProvider(%q, %q, %v) error = %v", tc.runtime, tc.model, tc.authEnv, err)
}
if got.Name != tc.expect {
t.Errorf("DeriveProvider(%q, %q, %v) = %q, want %q", tc.runtime, tc.model, tc.authEnv, got.Name, tc.expect)
}
})
}
}
// TestDeriveProvider_UnregisteredErrors: a model no native provider owns
// for the runtime must ERROR (never silently default). This is the
// "only-registered-selectable" invariant — fail-closed.
func TestDeriveProvider_UnregisteredErrors(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
cases := []struct {
runtime string
model string
}{
// gpt-* is OpenAI — not in claude-code's native set.
{"claude-code", "gpt-5.5"},
// deepseek is a catalog provider but in NO runtime's native set.
{"claude-code", "deepseek-v4-pro"},
// codex is OpenAI-only — a kimi id is unregistered for it.
{"codex", "kimi-for-coding"},
// a slug no provider in the manifest matches at all.
{"claude-code", "totally-made-up-model-xyz"},
}
for _, tc := range cases {
p, err := m.DeriveProvider(tc.runtime, tc.model, nil)
if err == nil {
t.Errorf("DeriveProvider(%q, %q) expected unregistered error, got provider %q", tc.runtime, tc.model, p.Name)
}
if p.Name != "" {
t.Errorf("DeriveProvider(%q, %q) on error must return a zero Provider, got %q", tc.runtime, tc.model, p.Name)
}
}
}
// TestDeriveProvider_UnknownRuntimeErrors: fail-closed on an unknown
// runtime (never falls through to "all providers").
func TestDeriveProvider_UnknownRuntimeErrors(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
p, err := m.DeriveProvider("does-not-exist", "claude-opus-4-7", nil)
if err == nil {
t.Errorf("DeriveProvider(unknown runtime) expected error, got provider %q", p.Name)
}
if !strings.Contains(strings.ToLower(err.Error()), "runtime") {
t.Errorf("DeriveProvider(unknown runtime) error = %q, want it to name the runtime problem", err.Error())
}
}
// TestDeriveProvider_PlatformIsClosed proves a third-party-style provider
// can never be derived as `platform`. `platform` is a CLOSED core-only set:
// only models a native runtime's `platform` ref lists (vendor-namespaced)
// derive to platform. A BYOK id, even one a runtime natively supports,
// derives to its BYOK provider, never to platform.
func TestDeriveProvider_PlatformIsClosed(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
// kimi-for-coding is a BYOK id natively supported by claude-code; it
// must derive to kimi-coding (BYOK), NOT platform — even though
// `platform` is in claude-code's native set.
got, err := m.DeriveProvider("claude-code", "kimi-for-coding", []string{"KIMI_API_KEY"})
if err != nil {
t.Fatalf("DeriveProvider(claude-code, kimi-for-coding) error = %v", err)
}
if got.Name == "platform" {
t.Fatal("BYOK kimi-for-coding must not derive to the closed platform provider")
}
if got.Name != "kimi-coding" {
t.Errorf("DeriveProvider(claude-code, kimi-for-coding) = %q, want kimi-coding", got.Name)
}
}
// craftedManifest is a tiny well-formed manifest with a DELIBERATE prefix
// overlap between two native providers, used to exercise DeriveProvider's
// overlap-error path and the auth-env disambiguation path without depending
// on the real manifest staying overlap-free (it is, by the load guard).
const craftedOverlapManifest = `
schema_version: 1
providers:
- name: prov-a
display_name: "Provider A"
protocol: openai
auth_mode: anthropic_api
auth_env: [A_API_KEY]
model_prefix_match: "^shared-"
- name: prov-b
display_name: "Provider B"
protocol: openai
auth_mode: anthropic_api
auth_env: [B_API_KEY]
model_prefix_match: "^shared-"
runtimes:
testrt:
providers:
- name: prov-a
models: [a-only-model]
- name: prov-b
models: [b-only-model]
`
// TestDeriveProvider_OverlapErrors proves DeriveProvider ERRORS when >=2
// native providers match the same slug and auth-env cannot disambiguate —
// it never silently picks one. This is the load-time-overlap guard's
// runtime counterpart at derivation time.
func TestDeriveProvider_OverlapErrors(t *testing.T) {
m, err := parseManifest([]byte(craftedOverlapManifest))
if err != nil {
t.Fatalf("parseManifest(crafted) error = %v", err)
}
// "shared-x" matches BOTH prov-a and prov-b via prefix; no exact-id
// resolves it; no auth env is supplied -> unresolved overlap -> error.
p, err := m.DeriveProvider("testrt", "shared-x", nil)
if err == nil {
t.Fatalf("DeriveProvider expected overlap error, got provider %q", p.Name)
}
if !strings.Contains(strings.ToLower(err.Error()), "overlap") &&
!strings.Contains(strings.ToLower(err.Error()), "ambiguous") {
t.Errorf("overlap error = %q, want it to name overlap/ambiguity", err.Error())
}
if p.Name != "" {
t.Errorf("on overlap error DeriveProvider must return zero Provider, got %q", p.Name)
}
}
// TestDeriveProvider_AuthEnvDisambiguates proves auth-env breaks an
// otherwise-ambiguous prefix overlap: when two native providers match the
// same slug but exactly one's auth_env intersects the available env set,
// DeriveProvider resolves to that one.
func TestDeriveProvider_AuthEnvDisambiguates(t *testing.T) {
m, err := parseManifest([]byte(craftedOverlapManifest))
if err != nil {
t.Fatalf("parseManifest(crafted) error = %v", err)
}
// Only B_API_KEY is available -> the shared prefix resolves to prov-b.
got, err := m.DeriveProvider("testrt", "shared-x", []string{"B_API_KEY"})
if err != nil {
t.Fatalf("DeriveProvider(authEnv=B_API_KEY) error = %v", err)
}
if got.Name != "prov-b" {
t.Errorf("DeriveProvider(authEnv=B_API_KEY) = %q, want prov-b", got.Name)
}
// Only A_API_KEY -> prov-a.
got, err = m.DeriveProvider("testrt", "shared-x", []string{"A_API_KEY"})
if err != nil {
t.Fatalf("DeriveProvider(authEnv=A_API_KEY) error = %v", err)
}
if got.Name != "prov-a" {
t.Errorf("DeriveProvider(authEnv=A_API_KEY) = %q, want prov-a", got.Name)
}
// Both keys available -> still ambiguous -> error (auth env doesn't
// narrow to one).
p, err := m.DeriveProvider("testrt", "shared-x", []string{"A_API_KEY", "B_API_KEY"})
if err == nil {
t.Errorf("DeriveProvider(both keys) expected overlap error, got %q", p.Name)
}
}
// TestDeriveProvider_KimiPrefixFallback proves the kimi serving split holds
// on the PREFIX-FALLBACK path too — not only for exact-listed ids. A bare
// kimi id that is NOT in any runtime's exact Models list (e.g. a new
// kimi-latest the gateway serves but the template hasn't enumerated) must
// still resolve to the kimi-coding gateway for claude-code, NOT error
// "unregistered". This catches the false-overlap data bug: before the YAML
// tightening, kimi-coding's regex was too narrow (coding-suffixed ids only)
// and moonshot's was too broad (claimed bare kimi-k2*), so a bare kimi id
// resolved to NEITHER native provider for claude-code.
func TestDeriveProvider_KimiPrefixFallback(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
for _, model := range []string{"kimi-latest", "kimi-thinking-preview"} {
got, err := m.DeriveProvider("claude-code", model, []string{"KIMI_API_KEY"})
if err != nil {
t.Errorf("DeriveProvider(claude-code, %q) prefix-fallback error = %v; want kimi-coding", model, err)
continue
}
if got.Name != "kimi-coding" {
t.Errorf("DeriveProvider(claude-code, %q) = %q, want kimi-coding (gateway serves any kimi id)", model, got.Name)
}
}
}
// TestDeriveProvider_ExactIdBeatsPrefix proves the exact model-id match in
// the runtime native set is authoritative over a prefix match — the CTO
// "disambiguate by exact model id" rule. A model id listed under provider P
// for runtime R derives to P even if another native provider's prefix would
// also match it.
func TestDeriveProvider_ExactIdBeatsPrefix(t *testing.T) {
const yaml = `
schema_version: 1
providers:
- name: gateway
display_name: "Gateway"
protocol: anthropic
auth_mode: third_party_anthropic_compat
auth_env: [GW_KEY]
model_prefix_match: "^never-matches-anything$"
- name: broad
display_name: "Broad"
protocol: openai
auth_mode: anthropic_api
auth_env: [BROAD_KEY]
model_prefix_match: "^kimi-"
runtimes:
rt:
providers:
- name: gateway
models: [kimi-k2.5]
- name: broad
models: [kimi-other]
`
m, err := parseManifest([]byte(yaml))
if err != nil {
t.Fatalf("parseManifest error = %v", err)
}
// kimi-k2.5 is EXACT-listed under `gateway` for rt, but `broad`'s
// ^kimi- prefix also matches it. Exact id wins -> gateway.
got, err := m.DeriveProvider("rt", "kimi-k2.5", nil)
if err != nil {
t.Fatalf("DeriveProvider error = %v", err)
}
if got.Name != "gateway" {
t.Errorf("exact-id should beat prefix: got %q, want gateway", got.Name)
}
}
// TestResolveUpstream_RealManifest exercises the SINGLE runtime-AGNOSTIC
// proxy-upstream resolution (internal#718 P1, CONVERGED) against the embedded
// baseline. ResolveUpstream is the ONE resolution over the EXISTING vendor
// provider entries (no proxy_routing block): it maps a model id's NAMESPACE
// token to the entry whose upstream_vendor equals it, answering "which UPSTREAM
// vendor owns this wire model id" in the proxy's 4-name vocabulary {openai,
// moonshot, anthropic, minimax}, with NO runtime context. The byte-identical
// equivalence guard lives in the handlers package (against the live
// inferLLMProvider oracle); this test pins the resolution's own semantics:
// namespace split, separator order, suffix-stripping, and the
// bare-id-is-vestigial (errors) contract.
func TestResolveUpstream_RealManifest(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
cases := []struct {
name string
model string
wantVendor string
wantResolved string
wantProvider string // catalog entry the namespace resolved to
wantErr bool
}{
// --- namespace split — the LIVE traffic shape (vendor/model + vendor:model)
// jrs SEO's LIVE platform model + sibling — MUST stay on moonshot.
{"platform moonshot slash", "moonshot/kimi-k2.6", "moonshot", "kimi-k2.6", "moonshot", false},
{"platform moonshot colon (openclaw)", "moonshot:kimi-k2.6", "moonshot", "kimi-k2.6", "moonshot", false},
// anthropic namespace resolves to the anthropic-api ENTRY (name != vendor).
{"platform anthropic ns", "anthropic/claude-opus-4-7", "anthropic", "claude-opus-4-7", "anthropic-api", false},
{"platform openai ns", "openai/gpt-5.4", "openai", "gpt-5.4", "openai", false},
{"platform minimax ns", "minimax/MiniMax-M2.7", "minimax", "MiniMax-M2.7", "minimax", false},
{"openai ns gpt-4o", "openai/gpt-4o", "openai", "gpt-4o", "openai", false},
// --- bare ids are VESTIGIAL at the proxy: ResolveUpstream errors (the
// proxy falls back to its legacy switch for these). No live bare traffic.
{"bare kimi -> err (vestigial, legacy fallback)", "kimi-k2.6", "", "", "", true},
{"bare claude -> err (vestigial)", "claude-3-5-sonnet", "", "", "", true},
{"bare minimax -> err (vestigial)", "minimax-m1", "", "", "", true},
{"bare gpt -> err (vestigial)", "gpt-5.5", "", "", "", true},
{"alias sonnet -> err (vestigial)", "sonnet", "", "", "", true},
{"unknown bare id -> err (vestigial)", "totally-made-up-xyz", "", "", "", true},
// non-allowlisted namespace token ("kimi-coding" is no entry's
// upstream_vendor) does NOT resolve; the whole id is then bare -> err.
// (The proxy's legacy fallback routes "kimi-coding/kimi-k2" to moonshot,
// preserving the prior behavior — proven by the handlers equivalence test.)
{"kimi-coding/ ns not a vendor -> err (legacy fallback)", "kimi-coding/kimi-k2", "", "", "", true},
// --- empty -------------------------------------------------------
{"empty -> err", "", "", "", "", true},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
up, err := m.ResolveUpstream(tc.model)
if tc.wantErr {
if err == nil {
t.Fatalf("ResolveUpstream(%q) = %+v, want error", tc.model, up)
}
if up.Vendor != "" || up.Model != "" || up.Provider.Name != "" {
t.Errorf("ResolveUpstream(%q) on error must return zero Upstream, got %+v", tc.model, up)
}
return
}
if err != nil {
t.Fatalf("ResolveUpstream(%q) error = %v", tc.model, err)
}
if up.Vendor != tc.wantVendor {
t.Errorf("ResolveUpstream(%q) vendor = %q, want %q", tc.model, up.Vendor, tc.wantVendor)
}
if up.Model != tc.wantResolved {
t.Errorf("ResolveUpstream(%q) model = %q, want %q", tc.model, up.Model, tc.wantResolved)
}
if up.Provider.Name != tc.wantProvider {
t.Errorf("ResolveUpstream(%q) provider = %q, want %q", tc.model, up.Provider.Name, tc.wantProvider)
}
})
}
}
// TestResolveUpstream_SeparatorOrder pins the proxy's "/" then ":" separator
// order: an id containing BOTH must split on "/" first (the proxy's loop
// order), so the "/"-prefix vendor wins.
func TestResolveUpstream_SeparatorOrder(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
// "moonshot/foo:bar" cuts on "/" first -> before="moonshot", after="foo:bar".
up, err := m.ResolveUpstream("moonshot/foo:bar")
if err != nil || up.Vendor != "moonshot" || up.Model != "foo:bar" {
t.Fatalf("separator order: got (%+v, err=%v), want vendor=moonshot model=foo:bar", up, err)
}
}
// TestResolveUpstream_ResolvesToProviderEntry proves the SINGLE-SOURCE
// invariant of the convergence: ResolveUpstream returns the EXISTING vendor
// provider entry, and that entry carries the upstream base URLs + auth — there
// is no parallel routing data block. The proxy dials the entry's base_url_*;
// the test pins them so a future entry edit that breaks the live upstream is
// caught here, not in production.
func TestResolveUpstream_ResolvesToProviderEntry(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
cases := []struct {
model string
wantProvider string
wantBaseURL string // base_url_template on the resolved entry
wantBaseURLAnthro string // base_url_anthropic on the resolved entry
wantAuthEnvContain string // an auth_env name the entry must carry
}{
{"moonshot/kimi-k2.6", "moonshot", "https://api.moonshot.ai/v1", "https://api.moonshot.ai/anthropic/v1", "MOONSHOT_API_KEY"},
{"anthropic/claude-opus-4-7", "anthropic-api", "https://api.anthropic.com/v1", "https://api.anthropic.com/v1", "ANTHROPIC_API_KEY"},
{"minimax/MiniMax-M2.7", "minimax", "https://api.minimax.io/v1", "https://api.minimax.io/anthropic/v1", "MINIMAX_API_KEY"},
{"openai/gpt-5.4", "openai", "https://api.openai.com/v1", "", "OPENAI_API_KEY"},
}
for _, tc := range cases {
up, err := m.ResolveUpstream(tc.model)
if err != nil {
t.Fatalf("ResolveUpstream(%q) error = %v", tc.model, err)
}
if up.Provider.Name != tc.wantProvider {
t.Errorf("%q: provider = %q, want %q", tc.model, up.Provider.Name, tc.wantProvider)
}
if up.Provider.BaseURLTemplate != tc.wantBaseURL {
t.Errorf("%q: base_url_template = %q, want %q", tc.model, up.Provider.BaseURLTemplate, tc.wantBaseURL)
}
if up.Provider.BaseURLAnthropic != tc.wantBaseURLAnthro {
t.Errorf("%q: base_url_anthropic = %q, want %q", tc.model, up.Provider.BaseURLAnthropic, tc.wantBaseURLAnthro)
}
found := false
for _, e := range up.Provider.AuthEnv {
if e == tc.wantAuthEnvContain {
found = true
break
}
}
if !found {
t.Errorf("%q: auth_env %v missing %q", tc.model, up.Provider.AuthEnv, tc.wantAuthEnvContain)
}
}
}
// TestParseManifest_RejectsDuplicateUpstreamVendor proves the convergence's
// load-time invariant: two entries cannot claim the same upstream_vendor (the
// namespace token must resolve to exactly one entry). Replaces the prior
// closed-catch-all / vendorless-proxy_routing guards.
func TestParseManifest_RejectsDuplicateUpstreamVendor(t *testing.T) {
const dupVendor = `
schema_version: 1
providers:
- name: prov-a
display_name: "Provider A"
protocol: openai
auth_mode: anthropic_api
auth_env: [A_API_KEY]
model_prefix_match: "^a-"
upstream_vendor: shared-vendor
- name: prov-b
display_name: "Provider B"
protocol: openai
auth_mode: anthropic_api
auth_env: [B_API_KEY]
model_prefix_match: "^b-"
upstream_vendor: shared-vendor
runtimes:
testrt:
providers:
- name: prov-a
models: [a-only]
`
_, err := parseManifest([]byte(dupVendor))
if err == nil {
t.Fatal("manifest with two entries claiming the same upstream_vendor must fail to load")
}
if !strings.Contains(strings.ToLower(err.Error()), "upstream_vendor") &&
!strings.Contains(strings.ToLower(err.Error()), "unique") {
t.Errorf("duplicate-vendor error = %q, want it to name the upstream_vendor uniqueness problem", err.Error())
}
}
// TestResolveUpstream_OnlyRoutingEntriesCarryVendor documents the data shape:
// in the real manifest, EXACTLY the four upstream entries carry upstream_vendor,
// they are {anthropic, openai, moonshot, minimax}, and each is unique. This is
// the converged single-source-of-truth assertion (was TestProxyRoutingClosedCatchAll).
func TestResolveUpstream_OnlyRoutingEntriesCarryVendor(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
got := map[string]string{} // vendor -> entry name
for _, p := range m.Providers {
if p.UpstreamVendor == "" {
continue
}
if prev, dup := got[p.UpstreamVendor]; dup {
t.Fatalf("upstream_vendor %q claimed by both %q and %q", p.UpstreamVendor, prev, p.Name)
}
got[p.UpstreamVendor] = p.Name
}
want := map[string]string{
"anthropic": "anthropic-api",
"openai": "openai",
"moonshot": "moonshot",
"minimax": "minimax",
}
if len(got) != len(want) {
t.Fatalf("upstream_vendor entries = %v, want exactly %v", got, want)
}
for v, name := range want {
if got[v] != name {
t.Errorf("upstream_vendor %q -> entry %q, want %q", v, got[v], name)
}
}
}
@@ -0,0 +1,96 @@
// Code generated by cmd/gen-providers; DO NOT EDIT.
//
// Source of truth: internal/providers/providers.yaml (schema_version 1).
// Regenerate with: go generate ./... (or: go run ./cmd/gen-providers)
// The verify-providers-gen CI workflow fails RED if this file drifts from
// providers.yaml or is hand-edited. internal#718 P0 — checked-in + drift-
// gated ONLY; no production path imports this package yet (that is P1+).
package gen
// SchemaVersion is the providers.yaml schema this artifact was generated
// against. It is the semver'd contract version (the MAJOR component for the
// public extension contract; see internal/providers/README.md).
const SchemaVersion = 1
// Fingerprint is a stable content hash of the generated projection (schema
// version + provider catalog + runtime native sets). It changes iff the
// registry DATA changes (comment-only YAML edits do not churn it).
const Fingerprint = "faffcbe59bb9f38c"
// GenProvider is the generated projection of one provider catalog entry —
// the subset a downstream consumer needs to derive + display a provider.
type GenProvider struct {
Name string
DisplayName string
Protocol string
AuthMode string
AuthEnv []string
ModelPrefixMatch string
// IsPlatform marks the closed, core-only platform-managed provider.
IsPlatform bool
// UpstreamVendor is the proxy's upstream-vendor key for this entry
// (internal#718 P1, CONVERGED); empty for providers the proxy does not
// route to an upstream vendor. ResolveUpstream maps a model id's namespace
// token to the entry whose UpstreamVendor equals it.
UpstreamVendor string
}
// GenRuntimeRef is one native provider a runtime supports + its exact models.
type GenRuntimeRef struct {
Name string
Models []string
}
// Providers is the full provider catalog, in providers.yaml declaration order.
var Providers = []GenProvider{
{Name: "anthropic-api", DisplayName: "Anthropic API", Protocol: "anthropic", AuthMode: "anthropic_api", AuthEnv: []string{"ANTHROPIC_API_KEY", "ANTHROPIC_AUTH_TOKEN"}, ModelPrefixMatch: "^claude", IsPlatform: false, UpstreamVendor: "anthropic"},
{Name: "anthropic-oauth", DisplayName: "Claude Code subscription", Protocol: "anthropic", AuthMode: "oauth", AuthEnv: []string{"CLAUDE_CODE_OAUTH_TOKEN"}, ModelPrefixMatch: "^(sonnet|opus|haiku)$", IsPlatform: false},
{Name: "openai", DisplayName: "OpenAI", Protocol: "openai", AuthMode: "anthropic_api", AuthEnv: []string{"OPENAI_API_KEY"}, ModelPrefixMatch: "^gpt-", IsPlatform: false, UpstreamVendor: "openai"},
{Name: "moonshot", DisplayName: "Moonshot (Kimi)", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"MOONSHOT_API_KEY", "KIMI_API_KEY"}, ModelPrefixMatch: "^moonshot[:/-]", IsPlatform: false, UpstreamVendor: "moonshot"},
{Name: "minimax", DisplayName: "MiniMax", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"MINIMAX_API_KEY", "ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_API_KEY"}, ModelPrefixMatch: "(?i)^minimax-m", IsPlatform: false, UpstreamVendor: "minimax"},
{Name: "platform", DisplayName: "Platform", Protocol: "anthropic", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"ANTHROPIC_API_KEY", "MOLECULE_LLM_USAGE_TOKEN"}, ModelPrefixMatch: "^platform/", IsPlatform: true},
{Name: "xiaomi-mimo", DisplayName: "Xiaomi MiMo", Protocol: "anthropic", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_API_KEY"}, ModelPrefixMatch: "^mimo-", IsPlatform: false},
{Name: "zai", DisplayName: "Z.ai (GLM)", Protocol: "anthropic", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"GLM_API_KEY", "ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_API_KEY"}, ModelPrefixMatch: "(?i)^glm-", IsPlatform: false},
{Name: "kimi-coding", DisplayName: "Moonshot Kimi (coding-tuned)", Protocol: "anthropic", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"KIMI_API_KEY", "ANTHROPIC_API_KEY", "ANTHROPIC_AUTH_TOKEN"}, ModelPrefixMatch: "^kimi-", IsPlatform: false},
{Name: "deepseek", DisplayName: "DeepSeek", Protocol: "anthropic", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"DEEPSEEK_API_KEY", "ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_API_KEY"}, ModelPrefixMatch: "^deepseek-", IsPlatform: false},
{Name: "google", DisplayName: "Google Gemini", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"GEMINI_API_KEY", "GOOGLE_API_KEY"}, ModelPrefixMatch: "^gemini-", IsPlatform: false},
{Name: "alibaba", DisplayName: "Alibaba Qwen (DashScope)", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"DASHSCOPE_API_KEY", "ALIBABA_API_KEY"}, ModelPrefixMatch: "^qwen-", IsPlatform: false},
{Name: "nousresearch", DisplayName: "Nous Research (Hermes)", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"NOUSRESEARCH_API_KEY"}, ModelPrefixMatch: "^nousresearch/", IsPlatform: false},
{Name: "openrouter", DisplayName: "OpenRouter (any model)", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"OPENROUTER_API_KEY"}, ModelPrefixMatch: "^openrouter/", IsPlatform: false},
{Name: "huggingface", DisplayName: "Hugging Face Inference", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"HUGGINGFACE_API_KEY", "HF_TOKEN"}, ModelPrefixMatch: "^huggingface/", IsPlatform: false},
{Name: "ai-gateway", DisplayName: "Vercel AI Gateway", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"AI_GATEWAY_API_KEY"}, ModelPrefixMatch: "^ai-gateway/", IsPlatform: false},
{Name: "opencode-zen", DisplayName: "OpenCode Zen", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"OPENCODE_ZEN_API_KEY"}, ModelPrefixMatch: "^opencode-zen/", IsPlatform: false},
{Name: "opencode-go", DisplayName: "OpenCode Go", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"OPENCODE_GO_API_KEY"}, ModelPrefixMatch: "^opencode-go/", IsPlatform: false},
{Name: "kilocode", DisplayName: "Kilo Code", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"KILOCODE_API_KEY"}, ModelPrefixMatch: "^kilocode/", IsPlatform: false},
{Name: "minimax-cn", DisplayName: "MiniMax China", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"MINIMAX_API_KEY", "ANTHROPIC_AUTH_TOKEN"}, ModelPrefixMatch: "^minimax-cn/", IsPlatform: false},
{Name: "ollama-cloud", DisplayName: "Ollama Cloud", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"OLLAMA_CLOUD_API_KEY"}, ModelPrefixMatch: "^ollama-cloud/", IsPlatform: false},
{Name: "ollama", DisplayName: "Ollama (self-hosted)", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"OLLAMA_HOST"}, ModelPrefixMatch: "^ollama/", IsPlatform: false},
{Name: "nvidia", DisplayName: "NVIDIA NIM", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"NVIDIA_API_KEY"}, ModelPrefixMatch: "^nvidia/", IsPlatform: false},
{Name: "arcee", DisplayName: "Arcee", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"ARCEE_API_KEY"}, ModelPrefixMatch: "^arcee/", IsPlatform: false},
{Name: "custom", DisplayName: "Custom OpenAI-compat endpoint", Protocol: "openai", AuthMode: "third_party_anthropic_compat", AuthEnv: []string{"CUSTOM_API_KEY", "OPENAI_API_KEY"}, ModelPrefixMatch: "^custom/", IsPlatform: false},
}
// Runtimes maps each runtime to its native provider+model set, runtime names
// sorted for a deterministic artifact.
var Runtimes = map[string][]GenRuntimeRef{
"claude-code": {
{Name: "anthropic-oauth", Models: []string{"sonnet", "opus", "haiku"}},
{Name: "anthropic-api", Models: []string{"claude-sonnet-4-6", "claude-opus-4-7", "claude-haiku-4-5"}},
{Name: "kimi-coding", Models: []string{"kimi-for-coding", "kimi-k2.5", "kimi-k2"}},
{Name: "minimax", Models: []string{"MiniMax-M2", "MiniMax-M2.7", "MiniMax-M2.7-highspeed"}},
{Name: "platform", Models: []string{"anthropic/claude-opus-4-7", "anthropic/claude-sonnet-4-6", "moonshot/kimi-k2.6", "moonshot/kimi-k2.5", "minimax/MiniMax-M2.7", "minimax/MiniMax-M2.7-highspeed"}},
},
"codex": {
{Name: "openai", Models: []string{"gpt-5.5", "gpt-5.4", "gpt-5.4-mini", "gpt-5.3-codex", "gpt-5.3-codex-spark", "gpt-5.2"}},
{Name: "platform", Models: []string{"openai/gpt-5.4", "openai/gpt-5.4-mini"}},
},
"hermes": {
{Name: "kimi-coding", Models: []string{"kimi-coding/kimi-k2"}},
{Name: "platform", Models: []string{"moonshot/kimi-k2.6", "moonshot/kimi-k2.5"}},
},
"openclaw": {
{Name: "kimi-coding", Models: []string{"moonshot:kimi-k2.6", "moonshot:kimi-k2.5"}},
{Name: "platform", Models: []string{"moonshot/kimi-k2.6", "moonshot/kimi-k2.5"}},
},
}
@@ -0,0 +1,85 @@
package gen
import (
"testing"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/providers"
)
// TestGeneratedProjectionMatchesManifest proves the checked-in artifact is a
// FAITHFUL projection of the live manifest — not just byte-stable, but
// semantically correct. The byte-level drift gate (cmd/gen-providers
// TestArtifactInSync) proves "regen produces this file"; this proves "this
// file's DATA equals the loader's data", so a consumer reading the artifact
// (P1+) sees exactly what the loader sees.
func TestGeneratedProjectionMatchesManifest(t *testing.T) {
m, err := providers.LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
if SchemaVersion != providers.SchemaVersion() {
t.Errorf("generated SchemaVersion = %d, manifest = %d", SchemaVersion, providers.SchemaVersion())
}
if len(Providers) != len(m.Providers) {
t.Fatalf("generated %d providers, manifest has %d", len(Providers), len(m.Providers))
}
for i, gp := range Providers {
mp := m.Providers[i]
if gp.Name != mp.Name {
t.Errorf("provider[%d] name: gen=%q manifest=%q", i, gp.Name, mp.Name)
}
if gp.ModelPrefixMatch != mp.ModelPrefixMatch {
t.Errorf("provider %q model_prefix_match: gen=%q manifest=%q", gp.Name, gp.ModelPrefixMatch, mp.ModelPrefixMatch)
}
if gp.AuthMode != mp.AuthMode {
t.Errorf("provider %q auth_mode: gen=%q manifest=%q", gp.Name, gp.AuthMode, mp.AuthMode)
}
if gp.IsPlatform != mp.IsPlatform() {
t.Errorf("provider %q IsPlatform: gen=%v manifest=%v", gp.Name, gp.IsPlatform, mp.IsPlatform())
}
}
if len(Runtimes) != len(m.Runtimes) {
t.Fatalf("generated %d runtimes, manifest has %d", len(Runtimes), len(m.Runtimes))
}
for rt, native := range m.Runtimes {
genRefs, ok := Runtimes[rt]
if !ok {
t.Errorf("runtime %q missing from generated artifact", rt)
continue
}
if len(genRefs) != len(native.Providers) {
t.Errorf("runtime %q: gen has %d refs, manifest has %d", rt, len(genRefs), len(native.Providers))
continue
}
for i, ref := range native.Providers {
if genRefs[i].Name != ref.Name {
t.Errorf("runtime %q ref[%d] name: gen=%q manifest=%q", rt, i, genRefs[i].Name, ref.Name)
}
if len(genRefs[i].Models) != len(ref.Models) {
t.Errorf("runtime %q ref %q models count: gen=%d manifest=%d", rt, ref.Name, len(genRefs[i].Models), len(ref.Models))
}
}
}
}
// TestExactlyOnePlatformProvider guards the closed-set invariant in the
// generated projection: the platform-managed provider is a single, core-only
// entry. A federation merge that introduced a second IsPlatform=true provider
// (a forged platform) would flip this red.
func TestExactlyOnePlatformProvider(t *testing.T) {
count := 0
for _, p := range Providers {
if p.IsPlatform {
count++
if p.Name != "platform" {
t.Errorf("IsPlatform provider has unexpected name %q (platform is core-only, name must be %q)", p.Name, "platform")
}
}
}
if count != 1 {
t.Errorf("expected exactly 1 platform provider in the generated catalog, got %d", count)
}
}
@@ -0,0 +1,125 @@
package providers
import (
"go/build"
"os"
"path/filepath"
"strings"
"testing"
)
// gen_import_boundary_test.go — arch-lint-equivalent boundary gate
// (internal#718 P2-A, CTO 2026-05-27 "arch-lint so prod doesn't import the raw
// gen package incorrectly").
//
// molecule-controlplane enforces this with go-arch-lint: the
// internal/providers/gen component is absent from every other component's
// mayDependOn list, so a production package importing the raw generated
// projection fails CI. molecule-core has no go-arch-lint regime, so we pin the
// SAME invariant with a behavior-based AST gate (the established core pattern —
// see derive_provider_drift_test.go / class1_ast_gate_test.go).
//
// Invariant: NO production (non-test) Go file in workspace-server may import
// internal/providers/gen, EXCEPT inside internal/providers itself (the loader's
// own parity test wiring) — and even there only test files. The generated
// projection is checked-in + drift-gated DATA; production code derives through
// the loader (internal/providers DeriveProvider / IsPlatform), never the raw
// gen literals. P2-B wires the billing decision onto the loader, not gen.
const genImportPath = "git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/providers/gen"
func TestNoProductionImportOfGenPackage(t *testing.T) {
// Walk up to the workspace-server module root (this test runs with cwd =
// internal/providers).
root := moduleRoot(t)
var offenders []string
walkErr := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}
if info.IsDir() {
base := info.Name()
// Skip vendored / non-source trees.
if base == "vendor" || base == "node_modules" || base == ".git" || base == "testdata" {
return filepath.SkipDir
}
return nil
}
if !strings.HasSuffix(path, ".go") {
return nil
}
// Test files are exempt — the loader's own gen parity test
// (gen/registry_gen_test.go) legitimately imports the loader, and any
// test may cross boundaries to assert on the projection.
if strings.HasSuffix(path, "_test.go") {
return nil
}
// The gen package's own files import nothing internal; skip the dir
// itself so we never flag generated code referencing its own path in a
// comment-derived parse (build.ImportDir reads real imports only, but be
// explicit).
dir := filepath.Dir(path)
if filepath.Base(dir) == "gen" && strings.HasSuffix(filepath.Dir(dir), filepath.Join("internal", "providers")) {
return nil
}
pkg, perr := build.ImportDir(dir, build.ImportComment)
if perr != nil {
// A dir with build-tagged-out files or no buildable package for the
// default tags is not an offender; skip quietly.
return nil //nolint:nilerr // unbuildable dir is not a boundary violation
}
for _, imp := range pkg.Imports {
if imp == genImportPath {
rel, _ := filepath.Rel(root, dir)
offenders = append(offenders, rel)
}
}
return nil
})
if walkErr != nil {
t.Fatalf("walk module tree: %v", walkErr)
}
if len(offenders) > 0 {
t.Errorf("production packages import the raw generated projection %q: %v\n"+
"Production code must derive through the loader (internal/providers "+
"DeriveProvider / IsPlatform), never the raw gen literals. The gen "+
"package is checked-in + drift-gated DATA only (internal#718).",
genImportPath, dedupe(offenders))
}
}
// moduleRoot returns the workspace-server module root by walking up from the
// test's cwd (internal/providers) until it finds go.mod.
func moduleRoot(t *testing.T) string {
t.Helper()
dir, err := os.Getwd()
if err != nil {
t.Fatalf("getwd: %v", err)
}
for {
if _, statErr := os.Stat(filepath.Join(dir, "go.mod")); statErr == nil {
return dir
}
parent := filepath.Dir(dir)
if parent == dir {
t.Fatalf("could not locate go.mod above %s", dir)
}
dir = parent
}
}
func dedupe(in []string) []string {
seen := map[string]struct{}{}
var out []string
for _, s := range in {
if _, ok := seen[s]; ok {
continue
}
seen[s] = struct{}{}
out = append(out, s)
}
return out
}
@@ -0,0 +1,364 @@
// Package providers is the molecule-core SIDE of the LLM provider registry
// SSOT (internal#718 P2-A, CTO 2026-05-27 "Distribution = SDK via codegen +
// verify-CI"). It is a load-time mirror of the canonical loader that lives in
// molecule-controlplane internal/providers — same parse, same validation, same
// DeriveProvider/IsPlatform/ResolveUpstream API.
//
// CANONICAL SSOT = molecule-controlplane internal/providers/providers.yaml.
// This package embeds a SYNCED COPY of that file (providers.yaml here is a
// byte-for-byte mirror of the canonical, NOT a second authoring surface). The
// CTO-decided distribution model for a multi-repo registry is
// "codegen-checked-into-each-repo + verify-CI": every consumer repo carries the
// generated projection and a drift gate, so a registry change in CP must be
// re-synced here (the sync-providers-yaml verify gate goes RED if this copy
// drifts from the canonical). molecule-core has no Go module dependency on
// controlplane, so a synced+gated copy is the blessed path (a shared Go module
// is not viable across the two repos today).
//
// P2-A is ADDITIVE, ZERO behavior change (the P0 shape mirrored): the loader +
// DeriveProvider land here, plus the generated artifact (cmd/gen-providers) and
// the verify-providers-gen drift gate, but NO production code path imports this
// package yet. P2-B wires the billing/credential decision onto DeriveProvider.
//
// Distribution model mirrors molecule-controlplane internal/providers: go:embed
// the YAML into the binary so a boot-time Load never touches the network.
package providers
import (
_ "embed"
"fmt"
"regexp"
"gopkg.in/yaml.v3"
)
// schemaVersion is the providers.yaml schema this package knows how to
// parse. It is the MAJOR component of the semver'd extension contract
// (internal#718: the manifest is a first-class versioned public artifact;
// breaking the field set is a governed API break). Bumped only on a breaking
// field-set change; Load fails closed on a mismatch so an older binary cannot
// silently consume a newer manifest (mirrors internal/envs). See
// internal/providers/README.md for the contract + compatibility policy.
const schemaVersion = 1
// SchemaVersion exposes the schema/contract MAJOR version the loader knows
// how to parse. It is the version the codegen artifact (cmd/gen-providers)
// and any future conformance suite pin against. Public so the generator and
// external conformance tooling read the same constant the loader enforces.
func SchemaVersion() int { return schemaVersion }
//go:embed providers.yaml
var embeddedYAML []byte
// Protocol is the wire format the proxy speaks to a provider's upstream.
type Protocol string
const (
// ProtocolOpenAI is the OpenAI chat-completions wire format.
ProtocolOpenAI Protocol = "openai"
// ProtocolAnthropic is the Anthropic messages wire format.
ProtocolAnthropic Protocol = "anthropic"
)
// Provider is one entry in the canonical manifest. It is the superset
// schema from RFC §2 — each consumer reads the subset it needs (the
// proxy reads protocol/base_url/auth_env, the canvas reads
// display_name/vendor_logo/model_prefix_match, the adapter reads
// auth_mode/auth_token_env/base_url). Field names mirror the YAML keys.
type Provider struct {
// Name is the stable key (intended to align with
// llm_price_catalog.provider; see the DRIFT NOTE in providers.yaml).
Name string `yaml:"name"`
// DisplayName is the canvas dropdown label.
DisplayName string `yaml:"display_name"`
// VendorLogo is the canvas asset key.
VendorLogo string `yaml:"vendor_logo"`
// Protocol is the proxy wire format: "openai" or "anthropic".
Protocol Protocol `yaml:"protocol"`
// AuthMode is one of "anthropic_api", "oauth",
// "third_party_anthropic_compat".
AuthMode string `yaml:"auth_mode"`
// BaseURLTemplate is the openai-protocol base URL (empty = SDK/CLI
// default).
BaseURLTemplate string `yaml:"base_url_template"`
// BaseURLAnthropic is the anthropic-protocol base URL where the
// provider exposes one (empty otherwise).
BaseURLAnthropic string `yaml:"base_url_anthropic"`
// AuthEnv is the list of env var NAMES accepted (never secret
// values); any one being set satisfies auth.
AuthEnv []string `yaml:"auth_env"`
// AuthTokenEnv is the env var the adapter projects the vendor key
// into (defaults to ANTHROPIC_AUTH_TOKEN when empty).
AuthTokenEnv string `yaml:"auth_token_env"`
// ModelPrefixMatch is the RE2 regex that unifies the proxy's
// inferLLMProvider prefixes, the canvas BARE_VENDOR_PATTERNS, and
// the adapter model_prefixes.
ModelPrefixMatch string `yaml:"model_prefix_match"`
// ModelAliases are canvas shortcut ids (e.g. sonnet/opus/haiku).
ModelAliases []string `yaml:"model_aliases"`
// Deprecated greys the provider out in the canvas (RFC §8.2)
// without breaking saved workspace configs. Optional; default false.
Deprecated bool `yaml:"deprecated"`
// UpstreamVendor is the proxy's upstream-vendor key for this entry — the
// 4-name vocabulary {openai, moonshot, anthropic, minimax} the proxy's
// resolveLLMProviderTarget switch dispatches on to pick the upstream base
// URL + key (internal#718 P1, CONVERGED). It is set ONLY on the entries the
// proxy routes to an upstream vendor; empty for every other catalog entry.
//
// It is a single PROPERTY of the entry, not a parallel routing block: the
// upstream-vendor IDENTITY of a provider (e.g. "anthropic-api"'s upstream is
// the "anthropic" vendor) is a fact about that one entry. ResolveUpstream
// reads it to map a model id's NAMESPACE token to the backing provider,
// whose base_url_* / auth_env (already on this same entry) are the SINGLE
// source for the upstream target. The token may differ from Name (the entry
// "anthropic-api" has UpstreamVendor "anthropic"); for moonshot/openai/
// minimax the entry name and the upstream vendor coincide.
UpstreamVendor string `yaml:"upstream_vendor"`
// re is the compiled ModelPrefixMatch. Compiled at Load (so a bad
// regex fails the whole manifest, per RFC §8.5) and reused by
// MatchesModel. Nil only for a zero-value Provider not produced by
// Load, in which case MatchesModel compiles on demand.
re *regexp.Regexp
}
// RuntimeProviderRef is one provider a runtime natively supports, plus the
// exact model ids that runtime exposes for it. RFC #340 (CTO correction
// 2026-05-26): the manifest is constrained to each runtime's NATIVE support
// matrix, NOT the 24-provider superset. A provider absent from every
// runtime's native set is over-offer drift the canvas must not surface and
// the proxy must not route (matches cp#334 "use native endpoint, don't
// translate").
type RuntimeProviderRef struct {
// Name references a Provider.Name. Load fails closed if it does not
// resolve, so a typo can never silently drop a model from a runtime.
Name string `yaml:"name"`
// Models is the exact set of model ids this runtime exposes for the
// referenced provider (extracted verbatim from the runtime template's
// config.yaml runtime_config.models block). Empty is a manifest error:
// a native provider with zero models offers nothing.
Models []string `yaml:"models"`
}
// RuntimeNativeSet is the native provider+model matrix for a single runtime.
type RuntimeNativeSet struct {
// Providers is the runtime's native provider set (each with its exact
// model ids). Exactly the set the canvas may offer and the proxy may
// route for this runtime — no more, no fewer.
Providers []RuntimeProviderRef `yaml:"providers"`
}
// Manifest is the parsed providers.yaml: the provider catalog plus the
// per-runtime native constraint layer. Returned by LoadManifest; Load
// remains for callers that only need the flat provider slice.
type Manifest struct {
// Providers is the full provider catalog (protocol, base_url, auth).
Providers []Provider
// Runtimes maps a runtime name (claude-code, hermes, codex, openclaw)
// to its native provider+model set. The SSOT for "which providers and
// models does runtime R natively support".
Runtimes map[string]RuntimeNativeSet
}
type manifest struct {
SchemaVersion int `yaml:"schema_version"`
Providers []Provider `yaml:"providers"`
Runtimes map[string]RuntimeNativeSet `yaml:"runtimes"`
}
// Load parses the embedded providers.yaml and returns the manifest's
// provider slice. It validates the schema version, that every entry has
// the required fields populated, and that every model_prefix_match is a
// compilable RE2 regex. Errors are returned (never panic) so callers
// decide their own fallback (the proxy keeps a legacy switch; see RFC
// §6). Load does not touch the network.
//
// Load is the flat-slice accessor retained for PR-1 callers that only need
// the provider catalog. Callers needing the per-runtime native constraint
// layer use LoadManifest.
func Load() ([]Provider, error) {
m, err := LoadManifest()
if err != nil {
return nil, err
}
return m.Providers, nil
}
// LoadManifest parses the embedded providers.yaml into a Manifest: the
// provider catalog plus the per-runtime native support matrix (RFC #340).
// It performs all of Load's validation AND validates the runtimes block:
// every provider name a runtime references must resolve to a real provider
// entry, and every referenced provider must carry at least one model id.
// Fails closed (never panic, never network) so a typo'd provider ref or an
// empty native set is a load error, not a silent over/under-offer.
func LoadManifest() (*Manifest, error) {
return parseManifest(embeddedYAML)
}
// parseManifest is the byte-level seam LoadManifest delegates to. Split out
// so the validation branches (bad schema version, unknown provider ref,
// empty native set, duplicate ref, model-less ref) are unit-testable
// against crafted YAML without mutating the embedded baseline.
func parseManifest(raw []byte) (*Manifest, error) {
var m manifest
if err := yaml.Unmarshal(raw, &m); err != nil {
return nil, fmt.Errorf("providers: parse manifest: %w", err)
}
if m.SchemaVersion != schemaVersion {
return nil, fmt.Errorf("providers: manifest schema_version %d, loader expects %d", m.SchemaVersion, schemaVersion)
}
if len(m.Providers) == 0 {
return nil, fmt.Errorf("providers: manifest has no providers")
}
seen := make(map[string]struct{}, len(m.Providers))
out := make([]Provider, 0, len(m.Providers))
for i := range m.Providers {
p := m.Providers[i]
if err := p.validate(); err != nil {
return nil, fmt.Errorf("providers: entry %d (%q): %w", i, p.Name, err)
}
if _, dup := seen[p.Name]; dup {
return nil, fmt.Errorf("providers: duplicate provider name %q", p.Name)
}
seen[p.Name] = struct{}{}
re, err := regexp.Compile(p.ModelPrefixMatch)
if err != nil {
return nil, fmt.Errorf("providers: entry %q model_prefix_match %q: %w", p.Name, p.ModelPrefixMatch, err)
}
p.re = re
out = append(out, p)
}
// upstream_vendor validation (internal#718 P1, CONVERGED). It is optional
// (set only on the entries the proxy routes to an upstream), but it must be
// UNIQUE across the catalog: ResolveUpstream maps a model id's namespace
// token to the ONE entry whose UpstreamVendor equals it, so two entries
// claiming the same vendor would make the namespace token ambiguous (a
// non-deterministic upstream). Fail closed so a typo can never produce two
// entries owning the same upstream vendor.
vendorOwner := make(map[string]string, len(out))
for i := range out {
v := out[i].UpstreamVendor
if v == "" {
continue
}
if prev, dup := vendorOwner[v]; dup {
return nil, fmt.Errorf("providers: entries %q and %q both declare upstream_vendor %q — it must be unique (the namespace token resolves to exactly one entry)", prev, out[i].Name, v)
}
vendorOwner[v] = out[i].Name
}
if len(m.Runtimes) == 0 {
return nil, fmt.Errorf("providers: manifest declares no runtimes")
}
for rt, native := range m.Runtimes {
if len(native.Providers) == 0 {
return nil, fmt.Errorf("providers: runtime %q has an empty native provider set", rt)
}
refSeen := make(map[string]struct{}, len(native.Providers))
for _, ref := range native.Providers {
if _, ok := seen[ref.Name]; !ok {
return nil, fmt.Errorf("providers: runtime %q references unknown provider %q", rt, ref.Name)
}
if _, dup := refSeen[ref.Name]; dup {
return nil, fmt.Errorf("providers: runtime %q references provider %q twice", rt, ref.Name)
}
refSeen[ref.Name] = struct{}{}
if len(ref.Models) == 0 {
return nil, fmt.Errorf("providers: runtime %q provider %q has no model ids", rt, ref.Name)
}
}
}
return &Manifest{Providers: out, Runtimes: m.Runtimes}, nil
}
// ProvidersForRuntime returns the providers runtime rt natively supports,
// in the manifest's declared order. An unknown runtime returns a non-nil
// error and a nil slice — it never falls through to "all providers", so a
// caller that fat-fingers a runtime name fails loud rather than offering
// the whole catalog.
func (m *Manifest) ProvidersForRuntime(rt string) ([]Provider, error) {
native, ok := m.Runtimes[rt]
if !ok {
return nil, fmt.Errorf("providers: unknown runtime %q", rt)
}
byName := make(map[string]Provider, len(m.Providers))
for _, p := range m.Providers {
byName[p.Name] = p
}
out := make([]Provider, 0, len(native.Providers))
for _, ref := range native.Providers {
// Resolution is guaranteed by LoadManifest's validation, but guard
// anyway so a hand-built Manifest can't panic here.
if p, ok := byName[ref.Name]; ok {
out = append(out, p)
}
}
return out, nil
}
// ModelsForRuntime returns the exact model ids runtime rt natively exposes,
// flattened across all its native providers, in manifest-declared order.
// An unknown runtime returns a non-nil error and a nil slice (never the
// whole catalog). This is the SSOT the canvas dropdown (PR-4) and the proxy
// router (PR-3) both consume so they can never offer/route a model the
// runtime can't natively run.
func (m *Manifest) ModelsForRuntime(rt string) ([]string, error) {
native, ok := m.Runtimes[rt]
if !ok {
return nil, fmt.Errorf("providers: unknown runtime %q", rt)
}
var out []string
for _, ref := range native.Providers {
out = append(out, ref.Models...)
}
return out, nil
}
// validate checks the required-field invariants for a single entry.
func (p *Provider) validate() error {
if p.Name == "" {
return fmt.Errorf("name is required")
}
switch p.Protocol {
case ProtocolOpenAI, ProtocolAnthropic:
default:
return fmt.Errorf("protocol must be %q or %q, got %q", ProtocolOpenAI, ProtocolAnthropic, p.Protocol)
}
if p.AuthMode == "" {
return fmt.Errorf("auth_mode is required")
}
if len(p.AuthEnv) == 0 {
return fmt.Errorf("auth_env must be non-empty")
}
if p.DisplayName == "" {
return fmt.Errorf("display_name is required")
}
if p.ModelPrefixMatch == "" {
return fmt.Errorf("model_prefix_match is required")
}
return nil
}
// MatchesModel reports whether the given model slug is owned by this
// provider per its ModelPrefixMatch regex. A Provider produced by Load
// uses its precompiled regex. A zero-value Provider (one constructed
// directly, not via Load) compiles on demand; if the pattern is invalid
// or empty it never matches.
func (p Provider) MatchesModel(slug string) bool {
re := p.re
if re == nil {
if p.ModelPrefixMatch == "" {
return false
}
compiled, err := regexp.Compile(p.ModelPrefixMatch)
if err != nil {
return false
}
re = compiled
}
return re.MatchString(slug)
}
@@ -0,0 +1,687 @@
# Canonical providers manifest — single source of truth (SSOT) baseline.
#
# RFC: molecule-ai/molecule-controlplane#340 "Canonical Providers Manifest".
# This file is PR-1: the git-tracked baseline only. NOTHING imports the
# loader yet — no consumer is wired (proxy switch, canvas dropdown, and
# adapter registry are migrated in later PRs). Reverting PR-1 = delete
# this file + providers.go + providers_test.go. Zero runtime behavior
# change.
#
# It transcribes the UNION of the four places that independently define
# "which LLM providers exist" today, so later PRs can converge them:
# 1. Proxy — internal/handlers/llm_proxy.go
# resolveLLMProviderTargetForProtocol (4-arm switch:
# openai/moonshot/anthropic/minimax) + inferLLMProvider
# (prefix table: minimax / kimi->moonshot / claude->anthropic
# / default->openai).
# 2. Canvas — molecule-core/canvas/src/components/ProviderModelSelector.tsx
# VENDOR_LABELS (28 rows) + BARE_VENDOR_PATTERNS.
# 3. Adapter — molecule-ai-workspace-template-claude-code/config.yaml
# `providers:` block (8 entries) + adapter.py _BUILTIN_PROVIDERS.
# The same block is copy-pasted into the seo-agent template.
# 4. DB — migrations 037_llm_usage_billing + 039_minimax_llm_price_catalog
# seed llm_price_catalog with providers
# openai / anthropic / moonshot / minimax.
#
# Schema (RFC §2 superset; each consumer reads the subset it needs):
# name stable key (intended == llm_price_catalog.provider)
# display_name canvas dropdown label
# vendor_logo canvas asset key
# protocol openai | anthropic (proxy wire format)
# auth_mode anthropic_api | oauth | third_party_anthropic_compat
# base_url_template base URL for the openai-protocol surface (null = CLI/SDK default)
# base_url_anthropic base URL for the anthropic-protocol surface (where applicable)
# auth_env env var names accepted (NAMES ONLY — never secrets); any one satisfies auth
# auth_token_env env var the adapter projects the vendor key INTO (default ANTHROPIC_AUTH_TOKEN)
# model_prefix_match RE2 regex unifying proxy inferLLMProvider prefixes +
# canvas BARE_VENDOR_PATTERNS + adapter model_prefixes
# model_aliases canvas shortcut ids (sonnet/opus/haiku, etc.)
# deprecated optional bool (RFC §8.2; default false)
# upstream_vendor OPTIONAL (internal#718 P1, CONVERGED 2026-05-27). The
# proxy's upstream-vendor key for this entry — the 4-name
# vocabulary {openai, moonshot, anthropic, minimax} the
# proxy's resolveLLMProviderTarget switch dispatches on to
# pick the upstream base URL + key. Present ONLY on the
# entries the proxy routes to an upstream; absent everywhere
# else. It is a single PROPERTY of the entry (like protocol
# or base_url_template), NOT a parallel routing block: the
# upstream-vendor identity of "anthropic-api" is the
# "anthropic" vendor; for moonshot/openai/minimax the entry
# name and the vendor coincide. Manifest.ResolveUpstream is
# the ONE resolution over these entries — it maps a platform
# model id's NAMESPACE token (every live platform id is
# `vendor/model`) to the entry whose upstream_vendor equals
# it, then reads that entry's base_url_* / auth_env (the
# SINGLE source) for the upstream target. Bare ids are
# vestigial at the proxy (no live bare traffic) and resolve
# via the proxy's retained legacy fallback, not here.
# Must be UNIQUE across the catalog (load fails closed
# otherwise — the namespace token must resolve to one entry).
#
# DRIFT NOTE on `name` vs DB `provider`: the RFC suggests name == the
# llm_price_catalog.provider column. The DB actually seeds the row
# `anthropic` (not `anthropic-api`), and has no rows for the OAuth /
# platform / third-party providers. PR-1 keeps the RFC's `anthropic-api`
# key and records the mismatch here; reconciling the join key is a
# later-PR / migration concern, not a PR-1 routing change.
schema_version: 1
providers:
# ===========================================================================
# Anthropic — native. proxy + canvas + adapter + DB all know it.
# ===========================================================================
- name: anthropic-api
display_name: "Anthropic API"
vendor_logo: "anthropic"
protocol: anthropic
auth_mode: anthropic_api
base_url_template: "https://api.anthropic.com/v1"
base_url_anthropic: "https://api.anthropic.com/v1"
auth_env: [ANTHROPIC_API_KEY, ANTHROPIC_AUTH_TOKEN]
auth_token_env: ANTHROPIC_API_KEY
# Proxy inferLLMProvider matches HasPrefix "claude"; canvas matches /^claude-/i.
model_prefix_match: "^claude"
model_aliases: []
# internal#718 P1 (CONVERGED): the proxy's upstream-vendor key. ResolveUpstream
# maps the `anthropic/` namespace token to THIS entry, then dials this entry's
# base_url_anthropic / base_url_template + auth (the SINGLE source). The vendor
# key is "anthropic" (NOT the registry provider name "anthropic-api"). The
# anthropic-oauth entry carries NO upstream_vendor — OAuth never traverses the
# proxy (the CLI talks to Anthropic directly). Bare `claude*` ids are vestigial
# at the proxy (no live bare traffic) and resolve via the legacy fallback.
upstream_vendor: anthropic
# Claude Code subscription via OAuth. Adapter + canvas know it; proxy
# never routes OAuth (the CLI talks to Anthropic directly). No base URL.
- name: anthropic-oauth
display_name: "Claude Code subscription"
vendor_logo: "anthropic"
protocol: anthropic
auth_mode: oauth
base_url_template: null
base_url_anthropic: null
auth_env: [CLAUDE_CODE_OAUTH_TOKEN]
auth_token_env: CLAUDE_CODE_OAUTH_TOKEN
# Matched by exact alias, not prefix — the bare ids sonnet/opus/haiku
# only count as OAuth when CLAUDE_CODE_OAUTH_TOKEN is the auth env
# (canvas gates on env; the manifest expresses the alias set here).
model_prefix_match: "^(sonnet|opus|haiku)$"
model_aliases: [sonnet, opus, haiku]
# ===========================================================================
# OpenAI — proxy default arm + DB catalog + canvas. NOT in the adapter
# template (claude-code template is Anthropic-protocol only).
# ===========================================================================
- name: openai
display_name: "OpenAI"
vendor_logo: "openai"
protocol: openai
auth_mode: anthropic_api # OpenAI is openai-protocol; auth is a bearer API key.
base_url_template: "https://api.openai.com/v1"
base_url_anthropic: null # OpenAI exposes only the OpenAI protocol surface.
auth_env: [OPENAI_API_KEY]
auth_token_env: OPENAI_API_KEY
# Proxy treats openai as the DEFAULT (catch-all) arm of inferLLMProvider;
# there is no explicit prefix today. Canvas matches /^gpt-/i. Encode the
# canvas prefix so the explicit slugs route; the proxy's catch-all
# behavior is a routing decision for PR-3, not the manifest.
model_prefix_match: "^gpt-"
model_aliases: []
# internal#718 P1 (CONVERGED): the proxy's upstream-vendor key. ResolveUpstream
# maps the `openai/` namespace token to THIS entry. openai is ALSO the proxy's
# historical catch-all (the switch's `default:` arm) for bare/unknown ids —
# but the catch-all is a VESTIGIAL bare-id behavior (no live bare traffic), so
# it lives in the retained legacy fallback (inferLLMProviderLegacy), NOT as a
# registry data flag. Live `openai/<m>` ids resolve here by namespace.
upstream_vendor: openai
# ===========================================================================
# Moonshot (Kimi) — proxy arm + DB catalog + canvas label "moonshot".
# Distinct from the adapter's `kimi-coding` gateway (different host + auth
# header); both are retained — see kimi-coding below.
# ===========================================================================
- name: moonshot
display_name: "Moonshot (Kimi)"
vendor_logo: "moonshot"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: "https://api.moonshot.ai/v1"
base_url_anthropic: "https://api.moonshot.ai/anthropic/v1"
auth_env: [MOONSHOT_API_KEY, KIMI_API_KEY]
auth_token_env: ANTHROPIC_API_KEY
# internal#718 P0 (CTO 2026-05-27, EMPIRICALLY VERIFIED): the moonshot
# endpoint (api.moonshot.ai) and the kimi-coding gateway
# (api.kimi.com/coding) serve DIFFERENT models on DIFFERENT hosts —
# moonshot serves the moonshot-namespaced ids (the proxy's platform path
# resolves `moonshot/kimi-k2.6` here and 404s `kimi-for-coding`), while
# the bare kimi-* ids are served by the separate `kimi-coding` gateway
# below (which 404s on api.moonshot.ai). They are NOT a single owner.
# `moonshot` therefore owns ONLY the moonshot-prefixed ids:
# * "moonshot/..." — the proxy/platform-namespaced form (claude-code +
# hermes + openclaw platform refs route here),
# * "moonshot:..." — openclaw's colon-namespaced BYOK form,
# * "moonshot-..." — a bare moonshot-v1* model id.
# It deliberately does NOT claim bare kimi-* (those are kimi-coding's, per
# the corrected serving split). RE2 has no negative lookahead; the prefix
# is positively scoped to the moonshot namespace so the two regexes are
# disjoint and DeriveProvider resolves each bare/namespaced id to exactly
# one owner. This removes the false kimi-* overlap RFC#340/PR-1 flagged.
model_prefix_match: "^moonshot[:/-]"
model_aliases: []
# internal#718 P1 (CONVERGED): the proxy's upstream-vendor key. ResolveUpstream
# maps the `moonshot/` (slash) + `moonshot:` (openclaw colon) namespace tokens
# to THIS entry — jrs SEO's LIVE `moonshot/kimi-k2.6` + sibling `moonshot/...`
# ids dial this entry's base_url (api.moonshot.ai). The vendor key coincides
# with the entry name here.
# NOTE on bare kimi-* (the convergence's key clarification): a BARE `kimi*`
# id is NOT routed by this registry resolution. DeriveProvider (registry
# semantics, P0) resolves bare kimi-* to the `kimi-coding` gateway
# (api.kimi.com/coding) — which is NOT a valid proxy upstream — so routing a
# bare kimi id through the shared registry matcher would MISROUTE. Bare ids
# are vestigial at the proxy (zero live bare traffic; every platform id is
# namespaced), so the converged path does not resolve them at all; a bare
# `kimi*` falls through to the proxy's retained legacy switch, which routes
# it to moonshot exactly as before (byte-identical). No moonshot bare-prefix
# data block is recreated.
upstream_vendor: moonshot
# ===========================================================================
# MiniMax — proxy arm + DB catalog (7 models) + adapter + canvas.
# ===========================================================================
- name: minimax
display_name: "MiniMax"
vendor_logo: "minimax"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: "https://api.minimax.io/v1"
base_url_anthropic: "https://api.minimax.io/anthropic/v1"
# Adapter template uses api.minimax.io/anthropic (no /v1); proxy uses
# /anthropic/v1. Manifest follows the proxy's value (the routing layer);
# the adapter base_url is reconciled in PR-5.
auth_env: [MINIMAX_API_KEY, ANTHROPIC_AUTH_TOKEN, ANTHROPIC_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# Proxy: HasPrefix "minimax" (case-insensitive lower). Catalog ids are
# mixed-case "MiniMax-M2.7" — every catalog/canvas id starts "MiniMax-M".
# Anchored on "-m" (not bare "-") so it does NOT also claim the
# `minimax-cn/` slash-prefixed China sibling below (RE2 has no negative
# lookahead; the more-specific China entry owns its slash-prefix).
model_prefix_match: "(?i)^minimax-m"
model_aliases: []
# internal#718 P1 (CONVERGED): the proxy's upstream-vendor key. ResolveUpstream
# maps the `minimax/` namespace token to THIS entry — claude-code's LIVE
# `minimax/MiniMax-M2.7(-highspeed)` platform ids dial this entry's base_url
# (api.minimax.io). The `minimax-cn` China sibling carries NO upstream_vendor
# (the proxy has no arm for it; a bare minimax-cn id is vestigial and falls to
# the legacy fallback, unchanged). Bare `minimax*` ids are vestigial at the
# proxy and resolve via the legacy fallback (which keeps the broader
# HasPrefix "minimax" behavior verbatim), not here.
upstream_vendor: minimax
# ===========================================================================
# Platform — Molecule-managed LLM proxy. Adapter + canvas know it. It is
# the PROXY ITSELF as seen from a workspace, so the manifest entry is the
# client-facing endpoint, not an upstream vendor. proxy switch has no
# "platform" arm (it routes the underlying vendor model instead).
# ===========================================================================
- name: platform
display_name: "Platform"
vendor_logo: "molecule"
protocol: anthropic
auth_mode: third_party_anthropic_compat
# Dual-surface: the platform proxy exposes BOTH the OpenAI-compat
# (/openai/v1/chat/completions) and Anthropic-compat (/anthropic/v1/messages)
# wire formats. Anthropic-protocol runtimes (claude-code) use
# base_url_anthropic; OpenAI-protocol runtimes (hermes/codex/openclaw) use
# base_url_template. Previously both pointed at the anthropic surface — a
# PR-1 simplification when only claude-code referenced platform.
base_url_template: "https://api.moleculesai.app/api/v1/internal/llm/openai/v1"
base_url_anthropic: "https://api.moleculesai.app/api/v1/internal/llm/anthropic/v1"
auth_env: [ANTHROPIC_API_KEY, MOLECULE_LLM_USAGE_TOKEN]
auth_token_env: ANTHROPIC_API_KEY
# Adapter routes kimi- / moonshot/ through platform by default. No bare
# vendor prefix of its own; it multiplexes other vendors' slugs. Match
# the explicit "platform/" slash-prefix only so it never steals another
# vendor's bare slug.
model_prefix_match: "^platform/"
model_aliases: []
# ===========================================================================
# Xiaomi MiMo — adapter + canvas (two canvas keys: "xiaomi-mimo" AND
# "xiaomi", both labelled "Xiaomi MiMo"). proxy has no arm; DB has no rows.
# ===========================================================================
- name: xiaomi-mimo
display_name: "Xiaomi MiMo"
vendor_logo: "xiaomi"
protocol: anthropic
auth_mode: third_party_anthropic_compat
base_url_template: "https://api.xiaomimimo.com/anthropic"
base_url_anthropic: "https://api.xiaomimimo.com/anthropic"
auth_env: [ANTHROPIC_AUTH_TOKEN, ANTHROPIC_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# Adapter prefix "mimo-"; canvas /^mimo-/i. proxy routing TBD (PR-3).
# NOTE: canvas has a duplicate "xiaomi" VENDOR_LABELS key aliasing the
# same vendor — collapsed into this one entry.
model_prefix_match: "^mimo-"
model_aliases: []
# ===========================================================================
# Z.ai (GLM) — adapter + canvas. proxy has no arm; DB has no rows.
# ===========================================================================
- name: zai
display_name: "Z.ai (GLM)"
vendor_logo: "zai"
protocol: anthropic
auth_mode: third_party_anthropic_compat
base_url_template: "https://api.z.ai/api/anthropic"
base_url_anthropic: "https://api.z.ai/api/anthropic"
auth_env: [GLM_API_KEY, ANTHROPIC_AUTH_TOKEN, ANTHROPIC_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# Adapter prefix "glm-" (lowercased match catches GLM-4.6); canvas /^GLM-/i.
# canvas-only + adapter-only today; proxy routing TBD (PR-3).
model_prefix_match: "(?i)^glm-"
model_aliases: []
# ===========================================================================
# Kimi For Coding — adapter ("kimi-coding") + canvas
# ("kimi-coding"="Moonshot Kimi (coding-tuned)"). Distinct host
# (api.kimi.com/coding/) + x-api-key auth from the `moonshot` entry above.
# DB seeds moonshot/kimi-for-coding as alias_for kimi-k2.6.
# ===========================================================================
- name: kimi-coding
display_name: "Moonshot Kimi (coding-tuned)"
vendor_logo: "moonshot"
protocol: anthropic
auth_mode: third_party_anthropic_compat
base_url_template: "https://api.kimi.com/coding/"
base_url_anthropic: "https://api.kimi.com/coding/"
auth_env: [KIMI_API_KEY, ANTHROPIC_API_KEY, ANTHROPIC_AUTH_TOKEN]
# x-api-key header (NOT bearer) per kimi.com's Claude Code integration doc.
auth_token_env: ANTHROPIC_API_KEY
# internal#718 P0 (CTO 2026-05-27, EMPIRICALLY VERIFIED): the
# api.kimi.com/coding gateway is the owner of the BARE kimi-* ids. Per
# kimi.com's official Claude Code integration doc + the claude-code
# template's `kimi-coding` provider (model_prefixes: [kimi-]), this
# gateway authenticates with KIMI_API_KEY (sk-kimi-*) on the x-api-key
# header and "routes to the served K2.6 model regardless of the model
# name on the wire" — so every bare kimi-* id (kimi-for-coding,
# kimi-k2.6, kimi-k2.5, kimi-k2, kimi-latest, ...) is served HERE, while
# api.moonshot.ai 404s these. This OWNS bare "kimi-"; the moonshot-
# namespaced ids (moonshot/, moonshot:, moonshot-) belong to `moonshot`
# above. The two regexes are now disjoint (no negative lookahead needed),
# removing the false kimi-* overlap that RFC#340/PR-1 deferred — each id
# resolves to exactly one owner. Registry-data-only: NO production reader
# consumes model_prefix_match yet (the proxy keeps its own hardcoded
# inferLLMProvider), so this cannot change live routing.
model_prefix_match: "^kimi-"
model_aliases: []
# ===========================================================================
# DeepSeek — adapter + canvas. proxy has no arm; DB has no rows.
# ===========================================================================
- name: deepseek
display_name: "DeepSeek"
vendor_logo: "deepseek"
protocol: anthropic
auth_mode: third_party_anthropic_compat
base_url_template: "https://api.deepseek.com/anthropic"
base_url_anthropic: "https://api.deepseek.com/anthropic"
auth_env: [DEEPSEEK_API_KEY, ANTHROPIC_AUTH_TOKEN, ANTHROPIC_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# Adapter prefix "deepseek-"; canvas /^deepseek-/i. adapter+canvas only;
# proxy routing TBD (PR-3).
model_prefix_match: "^deepseek-"
model_aliases: []
# ===========================================================================
# CANVAS-ONLY vendors — present in ProviderModelSelector VENDOR_LABELS but
# NOT routed by the proxy, NOT in the adapter template, NOT in the DB.
# This is exactly the "canvas offered a provider the proxy can't route"
# drift the RFC targets. Transcribed here so PR-3/PR-4 converge them;
# base_url/auth are best-effort placeholders pending real routing in PR-3.
# Each is marked `proxy routing TBD`. model_prefix_match is the canvas
# heuristic (slash-prefix vendor key) where one exists, else a slash-prefix
# on the vendor key itself.
# ===========================================================================
- name: google
display_name: "Google Gemini"
vendor_logo: "google"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [GEMINI_API_KEY, GOOGLE_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. canvas /^gemini-/i.
# canvas also has a duplicate "gemini" label key aliasing the same vendor.
model_prefix_match: "^gemini-"
model_aliases: []
- name: alibaba
display_name: "Alibaba Qwen (DashScope)"
vendor_logo: "alibaba"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [DASHSCOPE_API_KEY, ALIBABA_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. canvas /^qwen-/i.
model_prefix_match: "^qwen-"
model_aliases: []
- name: nousresearch
display_name: "Nous Research (Hermes)"
vendor_logo: "nousresearch"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [NOUSRESEARCH_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. Slash-prefix id
# (e.g. nousresearch/hermes-4-70b).
model_prefix_match: "^nousresearch/"
model_aliases: []
- name: openrouter
display_name: "OpenRouter (any model)"
vendor_logo: "openrouter"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: "https://openrouter.ai/api/v1"
base_url_anthropic: null
auth_env: [OPENROUTER_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. Wildcard: openrouter/<model>.
model_prefix_match: "^openrouter/"
model_aliases: []
- name: huggingface
display_name: "Hugging Face Inference"
vendor_logo: "huggingface"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [HUGGINGFACE_API_KEY, HF_TOKEN]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. Wildcard: huggingface/<model>.
model_prefix_match: "^huggingface/"
model_aliases: []
- name: ai-gateway
display_name: "Vercel AI Gateway"
vendor_logo: "ai-gateway"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [AI_GATEWAY_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD.
model_prefix_match: "^ai-gateway/"
model_aliases: []
- name: opencode-zen
display_name: "OpenCode Zen"
vendor_logo: "opencode-zen"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [OPENCODE_ZEN_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD.
model_prefix_match: "^opencode-zen/"
model_aliases: []
- name: opencode-go
display_name: "OpenCode Go"
vendor_logo: "opencode-go"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [OPENCODE_GO_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD.
model_prefix_match: "^opencode-go/"
model_aliases: []
- name: kilocode
display_name: "Kilo Code"
vendor_logo: "kilocode"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [KILOCODE_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD.
model_prefix_match: "^kilocode/"
model_aliases: []
- name: minimax-cn
display_name: "MiniMax China"
vendor_logo: "minimax"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: "https://api.minimaxi.com/v1"
base_url_anthropic: "https://api.minimaxi.com/anthropic"
auth_env: [MINIMAX_API_KEY, ANTHROPIC_AUTH_TOKEN]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. China endpoint sibling of `minimax`
# (api.minimaxi.com). Matched only by the explicit slash-prefix so it does
# NOT collide with `minimax`'s (?i)^minimax- in the overlap guard.
model_prefix_match: "^minimax-cn/"
model_aliases: []
- name: ollama-cloud
display_name: "Ollama Cloud"
vendor_logo: "ollama"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [OLLAMA_CLOUD_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD.
model_prefix_match: "^ollama-cloud/"
model_aliases: []
- name: ollama
display_name: "Ollama (self-hosted)"
vendor_logo: "ollama"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: "http://localhost:11434/v1"
base_url_anthropic: null
auth_env: [OLLAMA_HOST]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. Self-hosted; no key (host only).
model_prefix_match: "^ollama/"
model_aliases: []
- name: nvidia
display_name: "NVIDIA NIM"
vendor_logo: "nvidia"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: "https://integrate.api.nvidia.com/v1"
base_url_anthropic: null
auth_env: [NVIDIA_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD.
model_prefix_match: "^nvidia/"
model_aliases: []
- name: arcee
display_name: "Arcee"
vendor_logo: "arcee"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null
base_url_anthropic: null
auth_env: [ARCEE_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD.
model_prefix_match: "^arcee/"
model_aliases: []
- name: custom
display_name: "Custom OpenAI-compat endpoint"
vendor_logo: "custom"
protocol: openai
auth_mode: third_party_anthropic_compat
base_url_template: null # operator-supplied via workspace runtime config
base_url_anthropic: null
auth_env: [CUSTOM_API_KEY, OPENAI_API_KEY]
auth_token_env: ANTHROPIC_AUTH_TOKEN
# canvas-only today; proxy routing TBD. Wildcard free-text: custom/<model>.
model_prefix_match: "^custom/"
model_aliases: []
# =============================================================================
# RUNTIME NATIVE SUPPORT MATRIX (RFC #340 — CTO correction 2026-05-26)
# =============================================================================
# The `providers:` list above is the full catalog (the union of proxy /
# canvas / adapter / DB). It is NOT the support matrix. We do NOT support
# every model on every provider.
#
# This `runtimes:` block is the SSOT for "which providers + models does
# runtime R NATIVELY support". It constrains the catalog to each runtime's
# native support matrix — the INVERSE of a superset. Canvas (PR-4) offers
# ONLY a runtime's native models; the proxy (PR-3) routes ONLY native models
# with NO protocol translation (matches the cp#334 "use the native endpoint,
# don't translate" fix). A catalog provider that appears in NO runtime's
# native set is over-offer drift: it stays in `providers:` only if another
# runtime legitimately uses it, otherwise it is the drift this RFC prunes.
#
# AUTHORITATIVE MATRIX (provider level), encoded EXACTLY below:
# claude-code -> anthropic (oauth + api), kimi (kimi-coding), minimax
# hermes -> kimi (kimi-coding)
# codex -> openai
# openclaw -> kimi (kimi-coding)
#
# Each runtime entry lists native provider NAMES (referencing `providers:`
# above; Load fails closed on an unknown ref) plus the EXACT model ids that
# runtime exposes for that provider. Model ids are transcribed verbatim from
# each runtime template's config.yaml `runtime_config.models` block
# (git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-<rt>),
# pruned to the native matrix above.
#
# DRIFT PRUNED (templates declare these, they are NOT in the native matrix,
# so they are deliberately absent from the runtimes block below — flagged in
# the RFC, carried in `providers:` only where another runtime needs them):
# * claude-code template also declares: xiaomi-mimo (mimo-*), zai (GLM-*),
# deepseek (deepseek-*). Outside {anthropic, kimi, minimax} -> pruned.
# * codex template also declares: minimax-token-plan (codex-minimax-*).
# Outside {openai} -> pruned. (Template itself notes the MiniMax
# token-plan leg 404s on /v1/responses — a vendor gap, reinforcing the
# prune.)
# * openclaw template also declares: minimax (the default!), openai (gpt-*),
# groq, openrouter. Outside {kimi} -> pruned. NOTE: openclaw's *default*
# model is minimax:MiniMax-M2.7, NOT kimi — the CTO matrix narrows
# openclaw to its native Kimi path (moonshot: prefix + KIMI_API_KEY ->
# api.kimi.com/coding gateway). See RFC #340 update for the rationale.
# * hermes template declares ~30 providers (nous, openrouter, anthropic,
# gemini, deepseek, zai, minimax, alibaba, xiaomi, arcee, nvidia, ...).
# The CTO matrix narrows hermes to {kimi} only -> all others pruned from
# the native set.
runtimes:
# claude-code: native Anthropic-API / Claude-Code endpoints. Anthropic is
# split across two manifest providers (oauth + api) because the runtime
# exposes both auth paths natively; both count as "anthropic".
claude-code:
providers:
- name: anthropic-oauth
models: [sonnet, opus, haiku]
- name: anthropic-api
# BYOK versioned API ids (platform-namespaced ids live under `platform`)
models:
- claude-sonnet-4-6
- claude-opus-4-7
- claude-haiku-4-5
- name: kimi-coding
# BYOK kimi-coding gateway ids (platform-namespaced under `platform`)
models:
- kimi-for-coding
- kimi-k2.5
- kimi-k2
- name: minimax
# BYOK MiniMax ids (platform-namespaced ids live under `platform`)
models:
- MiniMax-M2
- MiniMax-M2.7
- MiniMax-M2.7-highspeed
# Platform-managed (no tenant key; Molecule owns billing). The
# vendor/model-namespaced ids the proxy resolves to the upstream vendor.
# Canonical for the template's `provider: platform` model entries — the
# drift gate (molecule-ci validate-workspace-template) enforces the
# template can offer no platform model absent from this set.
- name: platform
models:
- anthropic/claude-opus-4-7
- anthropic/claude-sonnet-4-6
- moonshot/kimi-k2.6
- moonshot/kimi-k2.5
- minimax/MiniMax-M2.7
- minimax/MiniMax-M2.7-highspeed
# hermes: native Kimi only (kimi-coding gateway). hermes-agent owns its own
# broad provider matrix, but the CTO native matrix for the Molecule
# platform constrains it to kimi.
hermes:
providers:
- name: kimi-coding
models: [kimi-coding/kimi-k2]
# Platform-managed Kimi (hermes's native platform family). Routed via
# the proxy OpenAI-compat surface; see the template's
# scripts/derive-platform-llm.sh.
- name: platform
models:
- moonshot/kimi-k2.6
- moonshot/kimi-k2.5
# codex: OpenAI — BYOK (subscription + API key, both map to the `openai`
# manifest provider) + platform-managed (the `platform` ref below, served
# via the proxy Responses surface).
codex:
providers:
- name: openai
models:
- gpt-5.5
- gpt-5.4
- gpt-5.4-mini
- gpt-5.3-codex
- gpt-5.3-codex-spark
- gpt-5.2
# Platform-managed OpenAI. NOW servable: the proxy exposes the OpenAI
# Responses surface (/internal/llm/openai/v1/responses) that the Codex
# CLI (0.130+, Responses-API-only) requires. The codex template adapter
# routes these via that surface (provider_config.py platform provider,
# auth_mode=openai_compat_responses, wire_api=responses). Default
# mirrors the deploy's MOLECULE_LLM_DEFAULT_MODEL (openai/gpt-5.4-mini).
- name: platform
models:
- openai/gpt-5.4
- openai/gpt-5.4-mini
# openclaw: native Kimi only. openclaw's moonshot: model prefix + a
# KIMI_API_KEY (sk-kimi-*) routes to api.kimi.com/coding (kimi-for-coding),
# which is the native Kimi path. Default minimax / openai / groq / openrouter
# legs are pruned per the CTO matrix.
openclaw:
providers:
- name: kimi-coding
models:
- moonshot:kimi-k2.6
- moonshot:kimi-k2.5
# Platform-managed Kimi. Note the slash form (moonshot/...) here vs the
# BYOK colon form (moonshot:...) above — openclaw's adapter uses colon
# ids natively; the platform path normalizes to the proxy's slash form.
- name: platform
models:
- moonshot/kimi-k2.6
- moonshot/kimi-k2.5
@@ -0,0 +1,207 @@
package providers
import (
"testing"
)
// TestLoadParses asserts the embedded manifest parses and is non-empty.
func TestLoadParses(t *testing.T) {
ps, err := Load()
if err != nil {
t.Fatalf("Load() error = %v", err)
}
if len(ps) == 0 {
t.Fatal("Load() returned an empty provider slice")
}
}
// TestRequiredFieldsPopulated asserts every entry has the fields the
// validate invariant requires (name, protocol, auth_mode, auth_env,
// display_name, model_prefix_match), and that protocol is one of the
// two legal wire formats.
func TestRequiredFieldsPopulated(t *testing.T) {
ps, err := Load()
if err != nil {
t.Fatalf("Load() error = %v", err)
}
for _, p := range ps {
if p.Name == "" {
t.Errorf("provider with display_name %q has empty name", p.DisplayName)
}
if p.DisplayName == "" {
t.Errorf("provider %q has empty display_name", p.Name)
}
if p.AuthMode == "" {
t.Errorf("provider %q has empty auth_mode", p.Name)
}
if len(p.AuthEnv) == 0 {
t.Errorf("provider %q has empty auth_env", p.Name)
}
if p.ModelPrefixMatch == "" {
t.Errorf("provider %q has empty model_prefix_match", p.Name)
}
switch p.Protocol {
case ProtocolOpenAI, ProtocolAnthropic:
default:
t.Errorf("provider %q has invalid protocol %q", p.Name, p.Protocol)
}
}
}
// TestUniqueNames asserts provider names are unique (Load enforces this;
// this test guards the manifest data itself).
func TestUniqueNames(t *testing.T) {
ps, err := Load()
if err != nil {
t.Fatalf("Load() error = %v", err)
}
seen := make(map[string]bool, len(ps))
for _, p := range ps {
if seen[p.Name] {
t.Errorf("duplicate provider name %q", p.Name)
}
seen[p.Name] = true
}
}
// providerByName is a test helper.
func providerByName(t *testing.T, ps []Provider, name string) Provider {
t.Helper()
for _, p := range ps {
if p.Name == name {
return p
}
}
t.Fatalf("provider %q not found in manifest", name)
return Provider{}
}
// TestMatchesModel maps representative slugs from each source (proxy
// prefixes, canvas BARE_VENDOR_PATTERNS, adapter model_prefixes, DB
// catalog ids) to the provider that should own them.
func TestMatchesModel(t *testing.T) {
ps, err := Load()
if err != nil {
t.Fatalf("Load() error = %v", err)
}
cases := []struct {
slug string
expect string // provider name that must match
}{
// Moonshot vs Kimi-coding — corrected serving split (internal#718
// P0, CTO 2026-05-27, empirically verified): the BYOK api.kimi.com/
// coding gateway owns the BARE kimi-* ids; the moonshot endpoint owns
// the moonshot-namespaced/prefixed ids. Bare kimi-k2.6 / kimi-k2.5 /
// kimi-for-coding therefore belong to kimi-coding; only the explicit
// moonshot/ (proxy/platform) and moonshot- (bare moonshot model)
// prefixes belong to moonshot.
{"kimi-k2.6", "kimi-coding"},
{"kimi-k2.5", "kimi-coding"},
{"kimi-latest", "kimi-coding"},
{"moonshot/kimi-k2.6", "moonshot"},
{"moonshot-v1-128k", "moonshot"},
// Anthropic — proxy "claude"->anthropic + DB claude-* + canvas /^claude-/.
{"claude-sonnet-4-6", "anthropic-api"},
{"claude-opus-4-7", "anthropic-api"},
{"claude-haiku-4-5-20251001", "anthropic-api"},
// Anthropic OAuth aliases.
{"sonnet", "anthropic-oauth"},
{"opus", "anthropic-oauth"},
{"haiku", "anthropic-oauth"},
// MiniMax — DB MiniMax-M2.7 (mixed case) + canvas /^MiniMax-/.
{"MiniMax-M2.7", "minimax"},
{"MiniMax-M2", "minimax"},
{"minimax-m2.5", "minimax"},
// OpenAI — DB gpt-5.x + canvas /^gpt-/.
{"gpt-5.5", "openai"},
{"gpt-5.4-mini", "openai"},
// Xiaomi MiMo — adapter mimo- + canvas /^mimo-/.
{"mimo-v2.5-pro", "xiaomi-mimo"},
// Z.ai GLM — adapter glm- + canvas /^GLM-/ (mixed case).
{"GLM-4.6", "zai"},
{"glm-4.5", "zai"},
// DeepSeek.
{"deepseek-v4-pro", "deepseek"},
// Kimi coding-tuned gateway (distinct from moonshot).
{"kimi-for-coding", "kimi-coding"},
// Canvas-only slash-prefixed vendors.
{"openrouter/anthropic/claude-3.5", "openrouter"},
{"huggingface/mistralai/Mistral-7B", "huggingface"},
{"custom/my-local-model", "custom"},
{"gemini-2.5-pro", "google"},
{"qwen-3-max", "alibaba"},
{"nousresearch/hermes-4-70b", "nousresearch"},
}
for _, tc := range cases {
p := providerByName(t, ps, tc.expect)
if !p.MatchesModel(tc.slug) {
t.Errorf("slug %q: expected provider %q to match, but it did not (regex %q)", tc.slug, tc.expect, p.ModelPrefixMatch)
}
}
}
// TestNoAmbiguousModelMatch is the RFC §8.5 overlap guard: no two
// providers may claim the same representative slug. A bad regex that
// over-broadly matches another vendor's ids breaks routing across three
// runtimes, so we catch overlap at PR-1 load time.
func TestNoAmbiguousModelMatch(t *testing.T) {
ps, err := Load()
if err != nil {
t.Fatalf("Load() error = %v", err)
}
// Representative slug corpus spanning every source. Each slug must be
// claimed by exactly one provider.
corpus := []string{
"kimi-k2.6", "kimi-k2.5", "moonshot-v1-128k", "moonshot/kimi-k2.6",
"claude-sonnet-4-6", "claude-opus-4-7", "claude-haiku-4-5-20251001",
"sonnet", "opus", "haiku",
"MiniMax-M2.7", "MiniMax-M2", "minimax-m2.5", "MiniMax-M2.7-highspeed",
"gpt-5.5", "gpt-5.4", "gpt-5.4-mini",
"mimo-v2.5-pro", "mimo-v2-flash",
"GLM-4.6", "glm-4.5",
"deepseek-v4-pro", "deepseek-v4-flash",
"kimi-for-coding",
"openrouter/x", "huggingface/y", "custom/z",
"gemini-2.5-pro", "qwen-3-max", "nousresearch/hermes-4-70b",
"ai-gateway/m", "opencode-zen/m", "opencode-go/m", "kilocode/m",
"minimax-cn/m2", "ollama-cloud/m", "ollama/llama4", "nvidia/m", "arcee/m",
"platform/anything",
}
for _, slug := range corpus {
var matched []string
for _, p := range ps {
if p.MatchesModel(slug) {
matched = append(matched, p.Name)
}
}
if len(matched) > 1 {
t.Errorf("slug %q ambiguously matched %d providers: %v", slug, len(matched), matched)
}
}
}
// TestMatchesModelZeroValue exercises the lazy on-demand compile path of
// a Provider not produced by Load.
func TestMatchesModelZeroValue(t *testing.T) {
p := Provider{ModelPrefixMatch: "^claude-"}
if !p.MatchesModel("claude-opus-4-7") {
t.Error("zero-value Provider should match claude-opus-4-7")
}
if p.MatchesModel("gpt-5.5") {
t.Error("zero-value Provider should not match gpt-5.5")
}
bad := Provider{ModelPrefixMatch: "([unterminated"}
if bad.MatchesModel("anything") {
t.Error("Provider with an invalid regex must never match")
}
empty := Provider{}
if empty.MatchesModel("anything") {
t.Error("Provider with an empty regex must never match")
}
}
@@ -0,0 +1,420 @@
package providers
import (
"sort"
"strings"
"testing"
)
// runtimeNativeProviders is the authoritative per-runtime native provider
// matrix from RFC #340 (CTO correction 2026-05-26): the manifest is
// constrained to what each runtime NATIVELY supports, not a 24-provider
// superset. Provider-level expectations; the model-id-level assertions
// live in TestModelsForRuntime_ModelIDs.
//
// Each runtime also natively supports the `platform` provider (Molecule
// platform-managed LLM: no tenant key, platform owns billing) for the subset
// of its native vendors the proxy can serve — kimi for hermes/openclaw,
// openai for codex, anthropic+kimi+minimax for claude-code.
//
// claude-code -> anthropic (oauth+api), kimi (kimi-coding), minimax, platform
// hermes -> kimi (kimi-coding), platform
// codex -> openai, platform
// openclaw -> kimi (kimi-coding), platform
var runtimeNativeProviders = map[string][]string{
"claude-code": {"anthropic-api", "anthropic-oauth", "kimi-coding", "minimax", "platform"},
"hermes": {"kimi-coding", "platform"},
"codex": {"openai", "platform"}, // platform openai via the proxy Responses surface
"openclaw": {"kimi-coding", "platform"},
}
func sortedCopy(in []string) []string {
out := append([]string(nil), in...)
sort.Strings(out)
return out
}
// TestProvidersForRuntime_ExactNativeSet asserts ProvidersForRuntime
// returns EXACTLY the native provider set for each runtime — no more
// (over-offer drift), no fewer (under-route). Exact set equality, not
// substring/superset, per feedback_assert_exact_not_substring.
func TestProvidersForRuntime_ExactNativeSet(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
for rt, want := range runtimeNativeProviders {
got, err := m.ProvidersForRuntime(rt)
if err != nil {
t.Fatalf("ProvidersForRuntime(%q) error = %v", rt, err)
}
var gotNames []string
for _, p := range got {
gotNames = append(gotNames, p.Name)
}
gotNames = sortedCopy(gotNames)
wantSorted := sortedCopy(want)
if len(gotNames) != len(wantSorted) {
t.Fatalf("ProvidersForRuntime(%q) = %v, want exactly %v", rt, gotNames, wantSorted)
}
for i := range wantSorted {
if gotNames[i] != wantSorted[i] {
t.Fatalf("ProvidersForRuntime(%q) = %v, want exactly %v", rt, gotNames, wantSorted)
}
}
}
}
// TestModelsForRuntime_ExactModelIDs is the brief's central assertion:
// ModelsForRuntime returns EXACTLY the native model-id set for each
// runtime. Encodes the model IDs extracted from each template config.yaml.
func TestModelsForRuntime_ExactModelIDs(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
cases := map[string][]string{
// claude-code: anthropic (oauth aliases + versioned API ids +
// platform-namespaced) + kimi (kimi-coding gateway + platform) +
// minimax (BYOK + platform-namespaced).
"claude-code": {
// anthropic OAuth aliases
"sonnet", "opus", "haiku",
// anthropic API versioned
"claude-sonnet-4-6", "claude-opus-4-7", "claude-haiku-4-5",
// anthropic via platform proxy (namespaced)
"anthropic/claude-opus-4-7", "anthropic/claude-sonnet-4-6",
// kimi (kimi-coding gateway)
"kimi-for-coding", "kimi-k2.5", "kimi-k2",
// kimi via platform proxy
"moonshot/kimi-k2.6", "moonshot/kimi-k2.5",
// minimax BYOK
"MiniMax-M2", "MiniMax-M2.7", "MiniMax-M2.7-highspeed",
// minimax via platform proxy
"minimax/MiniMax-M2.7", "minimax/MiniMax-M2.7-highspeed",
},
// hermes: kimi (BYOK gateway) + platform-managed kimi.
"hermes": {
"kimi-coding/kimi-k2",
"moonshot/kimi-k2.6", "moonshot/kimi-k2.5",
},
// codex: openai BYOK + platform-managed openai (served via the proxy
// Responses surface; codex CLI 0.130+ is Responses-API-only).
"codex": {
"gpt-5.5", "gpt-5.4", "gpt-5.4-mini",
"gpt-5.3-codex", "gpt-5.3-codex-spark", "gpt-5.2",
"openai/gpt-5.4", "openai/gpt-5.4-mini",
},
// openclaw: kimi BYOK (moonshot: prefix -> KIMI_API_KEY ->
// api.kimi.com/coding gateway) + platform-managed kimi (moonshot/).
"openclaw": {
"moonshot:kimi-k2.6", "moonshot:kimi-k2.5",
"moonshot/kimi-k2.6", "moonshot/kimi-k2.5",
},
}
for rt, want := range cases {
got, err := m.ModelsForRuntime(rt)
if err != nil {
t.Fatalf("ModelsForRuntime(%q) error = %v", rt, err)
}
gotSorted := sortedCopy(got)
wantSorted := sortedCopy(want)
if len(gotSorted) != len(wantSorted) {
t.Fatalf("ModelsForRuntime(%q) returned %d ids %v, want %d %v",
rt, len(gotSorted), gotSorted, len(wantSorted), wantSorted)
}
for i := range wantSorted {
if gotSorted[i] != wantSorted[i] {
t.Fatalf("ModelsForRuntime(%q) = %v, want exactly %v", rt, gotSorted, wantSorted)
}
}
}
}
// TestModelsForRuntime_UnknownRuntime: an unknown runtime returns an error
// (and an empty slice). Fail-direction proof — a runtime not in the matrix
// must not silently return the whole catalog.
func TestModelsForRuntime_UnknownRuntime(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
got, err := m.ModelsForRuntime("does-not-exist")
if err == nil {
t.Errorf("ModelsForRuntime(unknown) expected error, got nil (returned %v)", got)
}
if len(got) != 0 {
t.Errorf("ModelsForRuntime(unknown) expected empty slice, got %v", got)
}
}
// TestProvidersForRuntime_UnknownRuntime: same fail-closed contract for the
// provider-level accessor.
func TestProvidersForRuntime_UnknownRuntime(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
got, err := m.ProvidersForRuntime("does-not-exist")
if err == nil {
t.Errorf("ProvidersForRuntime(unknown) expected error, got nil (returned %v)", got)
}
if len(got) != 0 {
t.Errorf("ProvidersForRuntime(unknown) expected empty slice, got %v", got)
}
}
// TestNonNativeModelAbsentFromEveryRuntime is the drift-prune proof: a model
// that no runtime natively supports must NOT be returned by ModelsForRuntime
// for ANY runtime. These ids are template-declared drift the RFC prunes:
// - gemini-2.5-pro (canvas/hermes-only, no native CTO matrix entry)
// - GLM-4.6 (zai; claude-code template declares it but it's outside the
// anthropic/kimi/minimax native set)
// - deepseek-v4-pro (claude-code template declares it; outside native set)
// - mimo-v2.5-pro (xiaomi; claude-code template declares it; outside set)
// - openai:gpt-4o (openclaw template declares it; outside the kimi-only set)
func TestNonNativeModelAbsentFromEveryRuntime(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
driftModels := []string{
"gemini-2.5-pro",
"GLM-4.6",
"deepseek-v4-pro",
"mimo-v2.5-pro",
"openai:gpt-4o",
"qwen3-max",
"nousresearch/hermes-4-70b",
}
for rt := range runtimeNativeProviders {
got, err := m.ModelsForRuntime(rt)
if err != nil {
t.Fatalf("ModelsForRuntime(%q) error = %v", rt, err)
}
present := make(map[string]bool, len(got))
for _, id := range got {
present[id] = true
}
for _, drift := range driftModels {
if present[drift] {
t.Errorf("runtime %q must NOT offer non-native drift model %q, but it did", rt, drift)
}
}
}
}
// minimalValidManifest is a tiny well-formed manifest used as the base for
// the fail-direction tests below. Each negative test mutates one field and
// asserts parseManifest rejects it — proving the load-time guards are
// load-bearing, not vacuously satisfied by the embedded baseline.
const minimalValidManifest = `
schema_version: 1
providers:
- name: openai
display_name: "OpenAI"
protocol: openai
auth_mode: anthropic_api
auth_env: [OPENAI_API_KEY]
model_prefix_match: "^gpt-"
runtimes:
codex:
providers:
- name: openai
models: [gpt-5.5]
`
// TestParseManifest_ValidBaseline proves the minimal manifest parses, so the
// negative tests below isolate exactly the field they each mutate.
func TestParseManifest_ValidBaseline(t *testing.T) {
m, err := parseManifest([]byte(minimalValidManifest))
if err != nil {
t.Fatalf("parseManifest(valid) error = %v", err)
}
models, err := m.ModelsForRuntime("codex")
if err != nil || len(models) != 1 || models[0] != "gpt-5.5" {
t.Fatalf("ModelsForRuntime(codex) = %v, err = %v; want [gpt-5.5]", models, err)
}
}
// TestParseManifest_FailDirection is the load-bearing-guard proof: each case
// breaks the manifest in one way and asserts the matching error fires. If a
// future edit removes a guard, the corresponding case flips red.
func TestParseManifest_FailDirection(t *testing.T) {
cases := []struct {
name string
yaml string
wantErr string
}{
{
name: "unknown provider ref",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
runtimes:
codex:
providers:
- {name: typo-provider, models: [gpt-5.5]}
`,
wantErr: "unknown provider",
},
{
name: "empty native set",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
runtimes:
codex:
providers: []
`,
wantErr: "empty native provider set",
},
{
name: "provider ref with no models",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
runtimes:
codex:
providers:
- {name: openai, models: []}
`,
wantErr: "no model ids",
},
{
name: "duplicate provider ref",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
runtimes:
codex:
providers:
- {name: openai, models: [gpt-5.5]}
- {name: openai, models: [gpt-5.4]}
`,
wantErr: "twice",
},
{
name: "no runtimes block",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
`,
wantErr: "no runtimes",
},
{
name: "wrong schema version",
yaml: `
schema_version: 99
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
runtimes:
codex:
providers:
- {name: openai, models: [gpt-5.5]}
`,
wantErr: "schema_version",
},
{
name: "malformed yaml",
yaml: "schema_version: 1\nproviders: [oops: not-a-list",
wantErr: "parse manifest",
},
{
name: "no providers",
yaml: `
schema_version: 1
providers: []
runtimes:
codex:
providers:
- {name: openai, models: [gpt-5.5]}
`,
wantErr: "no providers",
},
{
name: "duplicate provider name",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
- {name: openai, display_name: "OpenAI dup", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
runtimes:
codex:
providers:
- {name: openai, models: [gpt-5.5]}
`,
wantErr: "duplicate provider name",
},
{
name: "uncompilable model_prefix_match",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", protocol: openai, auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "([unterminated"}
runtimes:
codex:
providers:
- {name: openai, models: [gpt-5.5]}
`,
wantErr: "model_prefix_match",
},
{
name: "missing required field (protocol)",
yaml: `
schema_version: 1
providers:
- {name: openai, display_name: "OpenAI", auth_mode: anthropic_api, auth_env: [OPENAI_API_KEY], model_prefix_match: "^gpt-"}
runtimes:
codex:
providers:
- {name: openai, models: [gpt-5.5]}
`,
wantErr: "protocol must be",
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
_, err := parseManifest([]byte(tc.yaml))
if err == nil {
t.Fatalf("parseManifest(%s) expected error containing %q, got nil", tc.name, tc.wantErr)
}
if !strings.Contains(err.Error(), tc.wantErr) {
t.Fatalf("parseManifest(%s) error = %q, want substring %q", tc.name, err.Error(), tc.wantErr)
}
})
}
}
// TestRuntimes_AllProviderRefsResolve guards manifest integrity: every
// provider name referenced in a runtime's native set must resolve to a real
// provider entry. A typo'd provider ref must fail Load, not silently drop a
// model. (Load-time validation; this asserts the loaded manifest is clean.)
func TestRuntimes_AllProviderRefsResolve(t *testing.T) {
m, err := LoadManifest()
if err != nil {
t.Fatalf("LoadManifest() error = %v", err)
}
known := make(map[string]bool, len(m.Providers))
for _, p := range m.Providers {
known[p.Name] = true
}
if len(m.Runtimes) == 0 {
t.Fatal("manifest declares no runtimes")
}
for rt, native := range m.Runtimes {
for _, ref := range native.Providers {
if !known[ref.Name] {
t.Errorf("runtime %q references unknown provider %q", rt, ref.Name)
}
}
}
}
@@ -0,0 +1,45 @@
package providers
import (
"crypto/sha256"
"encoding/hex"
"testing"
)
// sync_canonical_test.go — hermetic half of the canonical↔synced-copy drift
// gate (internal#718 P2-A).
//
// molecule-core's providers.yaml is a SYNCED COPY of the canonical SSOT in
// molecule-controlplane internal/providers/providers.yaml. The live cross-repo
// byte-compare lives in the sync-providers-yaml CI workflow (it fetches the
// canonical from CP and diffs). This test is the HERMETIC backstop: it pins the
// sha256 of the embedded synced copy to the value the canonical produced at sync
// time, so a HAND-EDIT of core's copy (or a partial sync) flips red locally and
// in `go test ./...` even when CI cannot reach controlplane.
//
// When the canonical legitimately changes, the sync procedure is:
// 1. Copy controlplane internal/providers/providers.yaml verbatim over this
// copy.
// 2. `go generate ./...` to regenerate the artifact (verify-providers-gen).
// 3. Update canonicalProvidersYAMLSHA256 below to the new sha (the failure
// message prints the observed sha to paste in).
// The deliberate constant bump is the human checkpoint that a registry change
// was consciously re-synced into core, not silently forked.
// canonicalProvidersYAMLSHA256 is the sha256 of the canonical providers.yaml as
// synced from molecule-controlplane. Bumped deliberately on each re-sync (see
// file doc). Cross-checked live by the sync-providers-yaml CI workflow.
const canonicalProvidersYAMLSHA256 = "48a669210494f3fded2315eb59a5549bc7632676e6d2e29db58a67273184ce76"
func TestSyncedYAMLMatchesCanonicalSHA(t *testing.T) {
sum := sha256.Sum256(embeddedYAML)
got := hex.EncodeToString(sum[:])
if got != canonicalProvidersYAMLSHA256 {
t.Fatalf("embedded providers.yaml sha256 = %s, pinned canonical = %s\n"+
"If you intentionally re-synced the canonical from molecule-controlplane, "+
"update canonicalProvidersYAMLSHA256 to %s and regenerate (`go generate ./...`).\n"+
"If you did NOT mean to edit core's copy, revert it — the canonical SSOT is "+
"molecule-controlplane internal/providers/providers.yaml, not this synced copy.",
got, canonicalProvidersYAMLSHA256, got)
}
}