Compare commits

...

11 Commits

Author SHA1 Message Date
Molecule AI Dev Engineer A (Kimi) 99b7d21a48 docs(sop-checklist): sync issue_comment trigger comment with workflow reality
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 5s
security-review / approved (pull_request) Failing after 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 7s
sop-tier-check / tier-check (pull_request_review) Successful in 4s
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 5s
CI / all-required (pull_request) Successful in 28s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 5s
E2E Chat / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
qa-review / approved (pull_request_target) Successful in 4s
gate-check-v3 / gate-check (pull_request_target) Successful in 4s
review-check-tests / review-check.sh regression tests (pull_request) Successful in 9s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
sop-tier-check / tier-check (pull_request_target) Successful in 3s
security-review / approved (pull_request_target) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 1s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 59s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m2s
audit-force-merge / audit (pull_request_target) Successful in 5s
The sop-checklist.yml workflow subscribes only to issue_comment:[created]
(consolidated in PR #1345 / issue #1280 to reduce runner-slot occupancy).
The script header still claimed [created, edited, deleted], which could
mislead future maintainers into thinking edited/deleted events are handled.

No behavior change — comment-only.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 21:36:17 +00:00
Molecule AI Dev Engineer A (Kimi) 5c829c60c9 fix(ci): distinguish all-403 token-provisioning failures in review-check.sh
When the Gitea token owner is not a member of the qa/security team,
every team-membership probe returns 403. Previously the final error
message said "none are in team", which misled ops into verifying the
team roster when the real issue was token provisioning (Bug C).

Add tracking for all-403 vs mixed-response scenarios. When every
candidate returns 403, emit an explicit error naming the root cause
and the remediation (add token owner to team or switch tokens).

No behavior change — still fail-closed; only the diagnostic message
is improved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 21:36:17 +00:00
hongming 7cfec2d61f Merge pull request 'fix(workspace-server): provider-aware gate on platform scope:global LLM creds (internal#711)' (#1963) from fix/byok-global-llm-cred-leak-internal-711 into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
Block internal-flavored paths / Block forbidden paths (push) Successful in 11s
CI / Python Lint & Test (push) Successful in 5s
CI / Detect changes (push) Successful in 14s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Chat / detect-changes (push) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
Harness Replays / detect-changes (push) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 7s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 34s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
CI / Canvas (Next.js) (push) Successful in 3s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m49s
publish-workspace-server-image / build-and-push (push) Successful in 3m13s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 13s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m50s
Harness Replays / Harness Replays (push) Successful in 4s
CI / Canvas Deploy Reminder (push) Successful in 2s
E2E Chat / E2E Chat (push) Successful in 3m54s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 5m55s
CI / Platform (Go) (push) Successful in 5m54s
CI / all-required (push) Successful in 7m12s
publish-workspace-server-image / Production auto-deploy (push) Successful in 6m21s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 9s
ci-required-drift / drift (push) Successful in 1m14s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 6s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 12s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 8s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m44s
main-red-watchdog / watchdog (push) Successful in 27s
gate-check-v3 / gate-check (push) Successful in 23s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m57s
2026-05-27 20:18:18 +00:00
agent-platform-engineer 585b3d6ed0 fix(workspace-server): provider-aware gate on platform scope:global LLM creds (internal#711)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 14s
E2E API Smoke Test / detect-changes (pull_request) Successful in 20s
E2E Chat / detect-changes (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 17s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 15s
Harness Replays / detect-changes (pull_request) Successful in 13s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 9s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 18s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 57s
gate-check-v3 / gate-check (pull_request) Successful in 7s
qa-review / approved (pull_request) Failing after 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 7s
sop-checklist / all-items-acked (pull_request) Successful in 7s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Successful in 19s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m39s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 7m40s
Harness Replays / Harness Replays (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m7s
CI / Platform (Go) (pull_request) Successful in 5m52s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 15m9s
audit-force-merge / audit (pull_request) Successful in 9s
A workspace whose resolved LLM billing mode is NOT platform_managed
(byok / subscription) was still being injected with the platform's
scope:global CLAUDE_CODE_OAUTH_TOKEN and ran on the platform's Anthropic
credits. Confirmed live 2026-05-27 on the Reno Stars tenant: the SEO
(352e3c2b-...) and Marketing (6b66de8d-...) claude-code agents had no
workspace-scoped LLM credential, yet ran MODEL=opus directly on
api.anthropic.com using the platform's global OAuth token.

Root cause: loadWorkspaceSecrets merges ALL global_secrets into every
workspace's env provenance-blind. applyPlatformManagedLLMEnv's
non-platform (byok/disabled) path then early-returned WITHOUT stripping
those inherited platform globals — so a workspace with no LLM credential
of its own kept the platform's scope:global CLAUDE_CODE_OAUTH_TOKEN.
The same leak existed on the remote-pull path (GET
/workspaces/:id/secrets/values), which also merged globals unconditionally.

Fix (provider-aware, both injection vectors):
- applyPlatformManagedLLMEnv now takes the global-provenance key set and,
  on the non-platform path, strips every platform-managed LLM bypass key
  (CLAUDE_CODE_OAUTH_TOKEN + the rest) that originated from global_secrets.
  A workspace's OWN LLM cred (a workspace_secrets row — provenance flag
  dropped by loadWorkspaceSecrets) is NOT in the global set and survives.
- secrets.Values applies the same provenance-aware gate before returning
  the merged bundle to a remote agent.
- Fail closed: a byok workspace left with no usable LLM credential aborts
  provision with code MISSING_BYOK_CREDENTIAL instead of starting on the
  (now-stripped) platform creds. Scoped to byok; disabled mode strips but
  still boots (no-LLM workspaces are legitimate).
- platform_managed path is unchanged (it still receives + force-routes the
  platform creds via the CP proxy), and the LLM-proxy anthropic path is
  untouched.

Tests (all green; go build/test ./... + -tags=integration build pass):
- ByokStripsGlobalOriginOAuthToken — platform global token stripped, no cred.
- ByokKeepsWorkspaceOwnOAuthEvenWithGlobal — workspace's own token survives.
- DisabledStripsGlobalButReportsNoCred — disabled strips but does not abort.
- PlatformManagedStillReceivesGlobalCreds — no regression on platform path.
- PrepareProvisionContext_ByokWithOnlyGlobalOAuthFailsClosed — e2e abort.
- SecretsValues_ByokStripsGlobalLLMCred — remote-pull path gated.

Note: open PR #1930 (refactor/drop-org-tier-llm-billing-mode, internal#691
follow-up) changes ResolveLLMBillingMode's signature in the same files.
This change is built on current main and is orthogonal in intent; whichever
merges second needs a mechanical 1-line resolver-call adjustment (drop the
orgMode arg). #1930 does NOT fix this leak.

Refs internal#711

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 12:55:58 -07:00
hongming 9deb8e9ea6 Merge pull request 'fix(security): scope peer discovery + a2a routing to caller org (#1953)' (#1954) from fix/1953-scope-peer-discovery-a2a-to-org into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 17s
Block internal-flavored paths / Block forbidden paths (push) Successful in 9s
CI / Python Lint & Test (push) Successful in 9s
CI / Detect changes (push) Successful in 17s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Chat / detect-changes (push) Successful in 15s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
publish-workspace-server-image / build-and-push (push) Successful in 3m23s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 54s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 38s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Harness Replays / detect-changes (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
CI / Canvas (Next.js) (push) Successful in 5s
CI / Shellcheck (E2E scripts) (push) Successful in 20s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m31s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 5m28s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 3s
CI / Platform (Go) (push) Successful in 4m55s
Harness Replays / Harness Replays (push) Successful in 9s
CI / all-required (push) Successful in 9m15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Successful in 6m44s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m48s
E2E Chat / E2E Chat (push) Successful in 4m13s
publish-workspace-server-image / Production auto-deploy (push) Successful in 8m4s
CI / Canvas Deploy Reminder (push) Successful in 1s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 10s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
main-red-watchdog / watchdog (push) Successful in 44s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 6m45s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m46s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 9s
gate-check-v3 / gate-check (push) Successful in 41s
ci-required-drift / drift (push) Successful in 1m3s
2026-05-27 17:51:46 +00:00
core-be 69391595f3 fix(e2e): delete child before parent in test_api delete/round-trip (#1953)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
CI / Detect changes (pull_request) Successful in 13s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 27s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 6s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 47s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
review-check-tests / review-check.sh regression tests (pull_request) Successful in 9s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 7s
qa-review / approved (pull_request) Failing after 7s
security-review / approved (pull_request) Failing after 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m5s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 20s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m12s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m3s
Harness Replays / Harness Replays (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m4s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 6m38s
CI / Platform (Go) (pull_request) Successful in 5m56s
CI / all-required (pull_request) Successful in 8m1s
audit-force-merge / audit (pull_request) Successful in 10s
The #1953 fixture re-seed made Summarizer a CHILD of Echo (same-org) so
the peer-discovery assertions exercise legit same-org enumeration. But
Test 21 still deleted the PARENT (Echo) first and asserted the other
workspace survives (count=1). CascadeDelete walks the recursive parent_id
CTE, so deleting Echo also removed its child Summarizer -> "List after
delete" saw 0, and Test 22 then hit 410 Gone deleting an already-removed
Summarizer ("got: {error: workspace removed}").

Fix: capture Summarizer's bundle, delete the CHILD (Summarizer) first
(child delete does not cascade upward so Echo survives -> count=1), then
delete the parent Echo in the round-trip block and re-import the captured
bundle. Cross-tenant isolation and the same-org parent/child relationship
are unchanged; only the delete ordering is corrected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:42:44 +00:00
hongming 46606801c6 Merge pull request 'fix(ci): add explicit utf-8 encoding to Python open() calls' (#1920) from fix/python-open-encoding into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 11s
publish-workspace-server-image / build-and-push (push) Successful in 6m8s
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 8s
CI / Python Lint & Test (push) Successful in 4s
E2E Chat / detect-changes (push) Successful in 9s
E2E API Smoke Test / detect-changes (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
CI / all-required (push) Successful in 8m28s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 6s
review-check-tests / review-check.sh regression tests (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m5s
publish-workspace-server-image / Production auto-deploy (push) Successful in 4m52s
main-red-watchdog / watchdog (push) Successful in 57s
CI / Platform (Go) (push) Successful in 6s
CI / Canvas (Next.js) (push) Successful in 5s
CI / Shellcheck (E2E scripts) (push) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 6s
E2E Chat / E2E Chat (push) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6s
gate-check-v3 / gate-check (push) Successful in 1m12s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 15s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m2s
ci-required-drift / drift (push) Successful in 1m10s
CI / Canvas Deploy Reminder (push) Successful in 5s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 6s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m31s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 7m11s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 5s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 17s
2026-05-27 17:01:54 +00:00
hongming cd671e1263 Merge pull request 'fix(memory): upsert namespace before v2 commit' (#1925) from fix/memory-v2-upsert-namespace-20260526 into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Blocked by required conditions
CI / Canvas (Next.js) (push) Blocked by required conditions
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Waiting to run
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Waiting to run
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
publish-workspace-server-image / build-and-push (push) Successful in 3m2s
Harness Replays / detect-changes (push) Successful in 3s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 10s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 20s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m52s
Harness Replays / Harness Replays (push) Successful in 4s
CI / all-required (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
publish-workspace-server-image / Production auto-deploy (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
2026-05-27 16:43:49 +00:00
core-be 51f74e9d8a fix(security): correct org-root CTE so same-org a2a routing works (#1953)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 34s
CI / Python Lint & Test (pull_request) Successful in 20s
CI / Detect changes (pull_request) Successful in 20s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E API Smoke Test / detect-changes (pull_request) Successful in 28s
E2E Chat / detect-changes (pull_request) Successful in 28s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 43s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 8s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 1m2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m30s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 18s
gate-check-v3 / gate-check (pull_request) Successful in 8s
qa-review / approved (pull_request) Successful in 12s
security-review / approved (pull_request) Failing after 10s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 9s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 7m21s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 11s
E2E Chat / E2E Chat (pull_request) Successful in 25s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 12s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m36s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m37s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 5m46s
CI / all-required (pull_request) Successful in 28m17s
The #1953 sameOrg() guard over-blocked legitimate SAME-ORG a2a routing:
orgRootSubtreeCTE carried `id AS root_id` from the recursive SEED, so a
non-root workspace resolved to ITSELF instead of its topmost ancestor.
sameOrg(child, root) therefore compared child-id vs root-id, reported the
pair as DIFFERENT orgs, and 403'd a legitimate same-org delegation. The
cross-org case was unaffected (two distinct roots already resolve to
different ids), so isolation stayed closed — but real same-org delegation
broke. Caught only by the real-Postgres integration suite: the sqlmock
unit tests hand-feed sameOrg() a root_id row and so structurally cannot
exercise the CTE.

Fix: select the parentless chain row's own `id` (aliased root_id) instead
of the seed-carried value. A node that already IS an org root has a
one-row chain and still resolves to itself.

Why the two required checks were red:

- handlers-postgres-integration (real CTE): the executeDelegation
  success-path fixtures seeded source AND target both parent_id=NULL —
  two DISTINCT org roots, i.e. a CROSS-tenant pair that only ever
  "communicated" via the OLD leaky root-sibling behavior #1953 closes.
  Re-seeded target as a CHILD of source (same org). With the same-org
  fixture, the CTE bug surfaced and is now fixed; all 5 ExecuteDelegation
  tests pass (success + failure paths). Added
  TestIntegration_SameOrg_RealCTE_ResolvesAncestorChain as the real-SQL
  regression gate for root→child→grandchild resolution + cross-org denial.

- e2e-api (test_api.sh): created Echo + Summarizer both as org roots and
  asserted they appear in each other's /registry/:id/peers — that
  enumeration WAS the cross-tenant leak (org root seeing another org
  root). Re-created Summarizer as a child of Echo so the peer assertions
  exercise legitimate same-org parent/child enumeration.

Cross-tenant isolation remains closed (all cross-org negative tests pass);
same-org peers + a2a now work. go build ./... + go test ./internal/handlers/...
green; integration suite green.

Refs #1953

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 09:41:44 -07:00
core-be 6211d27bc7 fix(security): scope peer discovery + a2a routing to caller org (#1953)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 13s
Harness Replays / detect-changes (pull_request) Successful in 5s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 40s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 10s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 56s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Successful in 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
security-review / approved (pull_request) Failing after 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 8m6s
CI / Canvas (Next.js) (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
E2E Chat / E2E Chat (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m36s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 2m6s
Harness Replays / Harness Replays (pull_request) Successful in 1s
CI / Platform (Go) (pull_request) Successful in 5m39s
CI / all-required (pull_request) Successful in 6m4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Three workspace-server paths computed an "org-root sibling set" as
`WHERE parent_id IS NULL`, which matches EVERY tenant's org root (the
workspaces table has no org_id column) → cross-tenant data exposure:

  1. GET /registry/:id/peers (discovery.Peers) — returned peer
     id/name/role/url/agent_card across ALL tenants when the caller
     was itself an org root.
  2. MCP toolListPeers (mcp_tools.go) — same cross-tenant peer
     enumeration via the MCP bridge.
  3. a2a routing (a2a_proxy.proxyA2ARequest → resolveAgentURL) —
     CanCommunicate's "root-level siblings, both no parent" rule treats
     every tenant's org root as a sibling, and resolveAgentURL accepts
     ANY workspace id with no org check, so an org root could resolve
     and route a2a to another tenant's org root.

Fix — reuse the OFFSEC-015 broadcast scoping (commit 5a05302c,
workspace_broadcast.go): the org is the parent_id-chain subtree from a
single org root. New org_scope.go centralises that recursive CTE
(orgRootID / sameOrg) so all paths derive "the caller's org" the same way:

  - discovery.Peers + toolListPeers: drop the `parent_id IS NULL`
    sibling branch entirely. An org root has no siblings inside its own
    org; its peers are its children (still enumerated). Only the
    parent_id-bound sibling branch remains, already scoped to one tenant.
  - a2a proxyA2ARequest: after CanCommunicate, add a sameOrg() guard that
    rejects (403) before resolveAgentURL when caller and target resolve
    to different org roots. Fail-closed: a DB error denies routing.

No org_id column is added — that is a separate architecture decision
pending CTO. This uses the existing parent_id-chain scoping.

Tests (cross_tenant_isolation_test.go): per-path cross-tenant regression
— a DIFFERENT-org workspace must NOT appear in /registry peers, must NOT
appear in toolListPeers, and a2a MUST reject resolving/routing to a
workspace outside the caller's org; plus same-org positive tests. The
three negative tests were verified to FAIL against the pre-fix code.
Existing peer/a2a/delegation tests updated to the org-scoped behavior.

Follow-up for CTO: registry.CanCommunicate still treats any two org
roots as siblings, so discovery.Discover and CheckAccess share the same
root-sibling weakness. Scoping CanCommunicate itself (registry package)
would close that class fully; flagged separately as it is outside the
three #1953 paths.

Refs #1953

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 08:45:27 -07:00
claude-ceo-assistant 42b16b33fb fix(memory): upsert namespace before v2 commit
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 10s
CI / Detect changes (pull_request) Successful in 13s
CI / Python Lint & Test (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 14s
E2E Chat / detect-changes (pull_request) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
CI / Canvas (Next.js) (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
E2E Chat / E2E Chat (pull_request) Successful in 10s
Harness Replays / Harness Replays (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m58s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m45s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 6m7s
CI / all-required (pull_request) Successful in 13m6s
security-review / approved (pull_request) Refired via /security-recheck by unknown
qa-review / approved (pull_request) Refired via /qa-recheck by unknown
gate-check-v3 / gate-check (pull_request) Successful in 31s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 22s
sop-tier-check / tier-check (pull_request) Successful in 14s
audit-force-merge / audit (pull_request) Successful in 7s
2026-05-26 12:38:50 -07:00
22 changed files with 1343 additions and 100 deletions
+15 -1
View File
@@ -296,7 +296,15 @@ fi
# 403 → token owner is not in this team (Gitea 1.22.6 'Must be a team
# member' constraint — see follow-up issue for token-provisioning)
# 404 → not a member
# Track whether every candidate returned 403 (token owner not in team).
# When this happens the root cause is a token-provisioning issue, not a
# reviewer-eligibility issue — surface it clearly so ops don't waste time
# verifying team roster (Bug C / RFC#324 follow-up).
_ALL_CANDIDATES_403="yes"
_CANDIDATE_COUNT=0
for U in $CANDIDATES; do
_CANDIDATE_COUNT=$((_CANDIDATE_COUNT + 1))
CODE=$(curl -sS -o "$TEAM_PROBE_TMP" -w '%{http_code}' \
-K "$CURL_AUTH_FILE" "${API}/teams/${TEAM_ID}/members/${U}")
debug "probe ${U} in team ${TEAM} (id=${TEAM_ID}) → HTTP ${CODE}"
@@ -317,14 +325,20 @@ for U in $CANDIDATES; do
continue
;;
404)
_ALL_CANDIDATES_403="no"
debug "${U} not a member of ${TEAM}"
;;
*)
_ALL_CANDIDATES_403="no"
echo "::warning::team-probe for ${U} in ${TEAM} returned unexpected HTTP ${CODE}"
cat "$TEAM_PROBE_TMP" >&2
;;
esac
done
echo "::error::${TEAM}-review awaiting non-author APPROVE from ${TEAM} team (candidates: $(echo "$CANDIDATES" | tr '\n' ',' | sed 's/,$//') — none are in team)"
if [ "$_ALL_CANDIDATES_403" = "yes" ] && [ "$_CANDIDATE_COUNT" -gt 0 ]; then
echo "::error::${TEAM}-review FAILED — every candidate returned 403 (token owner is not a member of the ${TEAM} team). This is a TOKEN PROVISIONING issue, not a reviewer-eligibility issue. Add the token owner to the '${TEAM}' Gitea team (id=${TEAM_ID}) or use a token whose owner is already in that team."
else
echo "::error::${TEAM}-review awaiting non-author APPROVE from ${TEAM} team (candidates: $(echo "$CANDIDATES" | tr '\n' ',' | sed 's/,$//') — none are in team)"
fi
exit 1
+2 -2
View File
@@ -6,8 +6,8 @@
# RFC#351 Step 2 of 6 (implementation MVP).
#
# Invoked by .gitea/workflows/sop-checklist.yml on:
# - pull_request_target: [opened, edited, synchronize, reopened]
# - issue_comment: [created, edited, deleted]
# - pull_request_target: [opened, edited, synchronize, reopened, labeled, unlabeled]
# - issue_comment: [created] # edited/deleted omitted (Gitea 1.22.6 job-parsing quirk)
#
# Flow:
# 1. Load .gitea/sop-checklist-config.yaml (from BASE ref — trusted).
+43 -25
View File
@@ -73,7 +73,15 @@ else
fi
# Test 4: Create workspace B (needs bearer — tokens now exist in DB)
R=$(acurl -X POST "$BASE/workspaces" -H "Content-Type: application/json" -d '{"name":"Summarizer Agent","tier":1,"runtime":"external","external":true}')
# #1953 cross-tenant isolation: Summarizer is created as a CHILD of Echo so the
# two live in the SAME org (Echo is the org root; Summarizer hangs off it via
# parent_id). The peer-discovery tests below assert same-org peer enumeration
# (Echo sees its child, the child sees its parent). Previously both were created
# parent_id=NULL — two DISTINCT org roots — and "peers" only listed each other
# via the `WHERE parent_id IS NULL` branch that returned every tenant's org root.
# That branch WAS the cross-tenant leak (#1953) and is now removed, so two org
# roots no longer see each other; the assertions must run inside one org.
R=$(acurl -X POST "$BASE/workspaces" -H "Content-Type: application/json" -d "{\"name\":\"Summarizer Agent\",\"tier\":1,\"runtime\":\"external\",\"external\":true,\"parent_id\":\"$ECHO_ID\"}")
check "POST /workspaces (create summarizer)" '"status":"awaiting_agent"' "$R"
SUM_ID=$(echo "$R" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")
@@ -133,21 +141,23 @@ check "Heartbeat updated uptime" '"uptime_seconds":120' "$R"
R=$(curl -s "$BASE/registry/discover/$ECHO_ID")
check "GET /registry/discover/:id (missing caller rejected)" 'X-Workspace-ID header is required' "$R"
# Test 12: Discover (from sibling — allowed)
# Test 12: Discover (from same-org child — allowed)
R=$(curl -s "$BASE/registry/discover/$ECHO_ID" -H "X-Workspace-ID: $SUM_ID" -H "Authorization: Bearer $SUM_TOKEN")
check "GET /registry/discover/:id (sibling)" '"url"' "$R"
check "GET /registry/discover/:id (same-org)" '"url"' "$R"
# Test 13: Peers (root siblings see each other)
# Test 13: Peers — same-org parent/child see each other (#1953). Echo is the org
# root and lists its child Summarizer; Summarizer lists its parent Echo. A
# cross-org workspace would NOT appear here (see cross_tenant_isolation_test.go).
R=$(curl -s "$BASE/registry/$ECHO_ID/peers" -H "Authorization: Bearer $ECHO_TOKEN")
check "GET /registry/:id/peers (has summarizer)" '"Summarizer' "$R"
R=$(curl -s "$BASE/registry/$SUM_ID/peers" -H "Authorization: Bearer $SUM_TOKEN")
check "GET /registry/:id/peers (has echo)" '"Echo Agent"' "$R"
# Test 14: Check access (root siblings)
# Test 14: Check access (same-org parent↔child — allowed)
R=$(curl -s -X POST "$BASE/registry/check-access" -H "Content-Type: application/json" \
-d "{\"caller_id\":\"$ECHO_ID\",\"target_id\":\"$SUM_ID\"}")
check "POST /registry/check-access (siblings allowed)" '"allowed":true' "$R"
check "POST /registry/check-access (same-org allowed)" '"allowed":true' "$R"
# Test 15: PATCH workspace (update position)
R=$(acurl -X PATCH "$BASE/workspaces/$ECHO_ID" -H "Content-Type: application/json" -d '{"x":100,"y":200}')
@@ -289,32 +299,40 @@ R=$(curl -s "$BASE/workspaces" -H "Authorization: Bearer $ECHO_TOKEN")
check "current_task in list response" '"current_task"' "$R"
# Test 21: Delete
R=$(acurl -X DELETE "$BASE/workspaces/$ECHO_ID?confirm=true" \
-H "Authorization: Bearer $ECHO_TOKEN" \
-H "X-Confirm-Name: Echo Agent v2")
check "DELETE /workspaces/:id" '"status":"removed"' "$R"
R=$(curl -s "$BASE/workspaces" -H "Authorization: Bearer $SUM_TOKEN")
COUNT=$(echo "$R" | python3 -c "import sys,json; print(len(json.load(sys.stdin)))")
check "List after delete (count=1)" "1" "$COUNT"
# Test 22: Bundle round-trip — export → delete → import → verify same config
echo ""
echo "--- Bundle Round-Trip Test ---"
# Export the summarizer workspace (#165 / PR #167 — admin-gated)
# #1953: Summarizer is now a CHILD of Echo (same-org, for the peer-discovery
# tests above). DELETE on the *parent* (Echo) cascade-removes its descendants
# (CascadeDelete walks the recursive `parent_id` CTE), so deleting Echo first
# would also remove Summarizer and the "one survives" assertion would see 0.
# Delete the CHILD (Summarizer) here instead: a child delete does NOT cascade
# upward, so the parent Echo survives and count=1 holds. The bundle round-trip
# below needs Summarizer's exported config, so capture it BEFORE this delete.
BUNDLE=$(curl -s "$BASE/bundles/export/$SUM_ID" -H "Authorization: Bearer $SUM_TOKEN")
check "GET /bundles/export/:id" '"name":"Summarizer Agent"' "$BUNDLE"
# Capture original config for comparison
ORIG_NAME=$(echo "$BUNDLE" | python3 -c "import sys,json; print(json.load(sys.stdin)['name'])")
ORIG_TIER=$(echo "$BUNDLE" | python3 -c "import sys,json; print(json.load(sys.stdin)['tier'])")
# Delete the workspace — use SUM_TOKEN (per-workspace) for WorkspaceAuth
# and ADMIN_TOKEN for the AdminAuth layer.
R=$(curl -s -X DELETE "$BASE/workspaces/$SUM_ID?confirm=true" \
R=$(acurl -X DELETE "$BASE/workspaces/$SUM_ID?confirm=true" \
-H "Authorization: Bearer $SUM_TOKEN" \
-H "X-Confirm-Name: Summarizer Agent")
check "DELETE /workspaces/:id" '"status":"removed"' "$R"
# Parent Echo must survive a child delete — list as Echo and expect count=1.
R=$(curl -s "$BASE/workspaces" -H "Authorization: Bearer $ECHO_TOKEN")
COUNT=$(echo "$R" | python3 -c "import sys,json; print(len(json.load(sys.stdin)))")
check "List after delete (count=1)" "1" "$COUNT"
# Test 22: Bundle round-trip — export → delete → import → verify same config.
# Summarizer's bundle was captured above; now delete the parent Echo (the only
# remaining workspace) so the import lands in a clean org, then re-import the
# Summarizer bundle.
echo ""
echo "--- Bundle Round-Trip Test ---"
# Delete the remaining parent Echo — use ECHO_TOKEN (per-workspace) for
# WorkspaceAuth and ADMIN_TOKEN for the AdminAuth layer.
R=$(acurl -X DELETE "$BASE/workspaces/$ECHO_ID?confirm=true" \
-H "Authorization: Bearer $ECHO_TOKEN" \
-H "X-Confirm-Name: Echo Agent v2")
check "Delete before re-import" '"status":"removed"' "$R"
# After deleting both workspaces, all per-workspace tokens are revoked.
@@ -375,6 +375,30 @@ func (h *WorkspaceHandler) proxyA2ARequest(ctx context.Context, workspaceID stri
Response: gin.H{"error": "access denied: workspaces cannot communicate per hierarchy rules"},
}
}
// #1953 cross-tenant isolation. CanCommunicate alone does NOT enforce
// org boundaries: its "root-level siblings — both have no parent" rule
// treats every tenant's org root as a sibling, so a caller that is an
// org root could resolve and route a2a to another tenant's org root
// (and resolveAgentURL accepts ANY workspace id with no org check).
// Gate on the SAME parent_id-chain org scoping the OFFSEC-015 broadcast
// fix uses: reject before resolveAgentURL when caller and target are in
// different orgs. Fail-closed — a DB error denies cross-org routing.
ok, err := sameOrg(ctx, db.DB, callerID, workspaceID)
if err != nil {
log.Printf("ProxyA2A: org-scope check failed %s → %s: %v — denying", callerID, workspaceID, err)
return 0, nil, &proxyA2AError{
Status: http.StatusForbidden,
Response: gin.H{"error": "access denied: org isolation check failed"},
}
}
if !ok {
log.Printf("ProxyA2A: cross-org routing denied %s → %s (#1953)", callerID, workspaceID)
return 0, nil, &proxyA2AError{
Status: http.StatusForbidden,
Response: gin.H{"error": "access denied: target workspace is in a different org"},
}
}
}
// Budget enforcement: reject A2A calls when the workspace has exceeded its
@@ -437,6 +437,10 @@ func TestProxyA2A_CallerIDPropagated(t *testing.T) {
WithArgs("ws-target").
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow("ws-target", "ws-parent"))
// #1953 cross-tenant guard: same-org check after CanCommunicate. Both
// workspaces resolve to the same org root → routing allowed.
mockSameOrg(mock, "ws-caller", "ws-target", true)
expectBudgetCheck(mock, "ws-target")
// Expect activity log with source_id set
@@ -465,6 +469,24 @@ func TestProxyA2A_CallerIDPropagated(t *testing.T) {
}
}
// mockSameOrg sets up the two org-root recursive-CTE expectations that the
// #1953 cross-tenant guard in proxyA2ARequest runs after CanCommunicate passes.
// sameOrg=true returns the SAME root_id for both caller and target (same tenant);
// sameOrg=false returns different root_ids (cross-tenant → routing must be denied).
func mockSameOrg(mock sqlmock.Sqlmock, caller, target string, sameOrg bool) {
callerRoot := "org-root-shared"
targetRoot := "org-root-shared"
if !sameOrg {
targetRoot = "org-root-other-tenant"
}
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(callerRoot))
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(target).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(targetRoot))
}
// mockCanCommunicate sets up sqlmock expectations for CanCommunicate(caller, target).
// allowed=true sets up rows that satisfy the access policy (siblings under same parent).
// allowed=false sets up rows that don't (different parents).
@@ -659,6 +681,9 @@ func TestProxyA2A_CallerIDDerivedFromBearer(t *testing.T) {
WithArgs("ws-target").
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow("ws-target", "ws-parent"))
// 3b. #1953 cross-tenant guard — same org root → routing allowed.
mockSameOrg(mock, "ws-caller", "ws-target", true)
expectBudgetCheck(mock, "ws-target")
// 4. activity_logs INSERT — verify source_id arg is the derived ws-caller
@@ -0,0 +1,427 @@
package handlers
// cross_tenant_isolation_test.go — #1953 regression tests.
//
// Three workspace-server paths historically derived an "org-root sibling set"
// as `WHERE parent_id IS NULL`, which matches EVERY tenant's org root (the
// workspaces table has no org_id column) → cross-tenant data exposure:
//
// 1. GET /registry/:id/peers (discovery.Peers)
// 2. MCP toolListPeers (mcp_tools.toolListPeers)
// 3. a2a routing (a2a_proxy.proxyA2ARequest → resolveAgentURL)
//
// These tests assert that a workspace in a DIFFERENT org is never returned as a
// peer and that a2a refuses to resolve/route to a workspace outside the caller's
// org, while same-org peers/targets still work. They reuse the SAME parent_id-
// chain org scoping the OFFSEC-015 broadcast fix introduced (org_scope.go).
import (
"bytes"
"context"
"database/sql"
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/db"
"github.com/DATA-DOG/go-sqlmock"
"github.com/gin-gonic/gin"
)
// dbHandleForTest returns the global sqlmock-backed *sql.DB that setupTestDB
// installs, for tests that need to hand a *sql.DB to a component (e.g.
// MCPHandler.database, sameOrg) rather than relying on the package-global.
func dbHandleForTest() *sql.DB { return db.DB }
// peerColsForIsolation matches queryPeerMaps' SELECT column set.
var peerColsForIsolation = []string{
"id", "name", "role", "tier", "status", "agent_card", "url", "parent_id", "active_tasks",
}
// -------------------------------------------------------------------------
// Path 1: GET /registry/:id/peers — discovery.Peers
// -------------------------------------------------------------------------
// TestPeers_CrossTenant_OrgRootNotLeaked is the core #1953 regression for the
// discovery path. The caller is an org root (parent_id IS NULL). Pre-fix the
// handler ran `SELECT ... WHERE w.parent_id IS NULL AND w.id != $1`, returning
// every OTHER tenant's org root as a "sibling" peer. Post-fix an org-root caller
// issues NO sibling query — its only peers are its own children. If the handler
// regressed and issued the cross-tenant sibling query, sqlmock would report an
// unexpected query (the expectation below is intentionally NOT registered) and
// the test fails.
func TestPeers_CrossTenant_OrgRootNotLeaked(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
handler := NewDiscoveryHandler()
// Behavioural leak test: register the OLD leaky `parent_id IS NULL` sibling
// query so that IF the handler still issues it, it returns another tenant's
// org root (org-b-root). The fix removes that query for an org-root caller,
// so org-b-root must never appear in the output. Unordered matching makes
// the leaky-sibling expectation optional — the fix simply never consumes it.
mock.MatchExpectationsInOrder(false)
caller := "org-a-root" // parent_id IS NULL — an org root for tenant A
// parent_id lookup → NULL (caller is an org root)
mock.ExpectQuery("SELECT parent_id FROM workspaces WHERE id =").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"parent_id"}).AddRow(nil))
// LEAKY sibling query (pre-fix). Returns a DIFFERENT tenant's org root.
// The fix must NOT issue this query; if it does, org-b-root leaks into the
// peer list and the output assertion below fails.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id IS NULL AND w.id != \\$1").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows(peerColsForIsolation).
AddRow("org-b-root", "Org B Root", "lead", 0, "online", []byte("null"), "http://b-root", nil, 0))
// Children query — caller's own org-A children only. Return one child.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.id != \\$2").
WithArgs(caller, caller).
WillReturnRows(sqlmock.NewRows(peerColsForIsolation).
AddRow("org-a-child", "Org A Child", "worker", 1, "online", []byte("null"), "http://a-child", caller, 0))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: caller}}
c.Request = httptest.NewRequest("GET", "/registry/"+caller+"/peers", nil)
handler.Peers(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var peers []map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &peers); err != nil {
t.Fatalf("failed to parse response: %v", err)
}
// The other-tenant org root must NEVER appear; only the same-org child.
for _, p := range peers {
if id, _ := p["id"].(string); id == "org-b-root" {
t.Fatalf("cross-tenant leak (#1953): org-b-root appeared in org-a-root's peer list: %v", peers)
}
}
if len(peers) != 1 {
t.Fatalf("expected exactly 1 peer (same-org child), got %d: %v", len(peers), peers)
}
// NOTE: ExpectationsWereMet is intentionally NOT asserted — the leaky
// sibling expectation is deliberately left unconsumed by the fixed path.
}
// TestPeers_SameOrg_SiblingsStillWork is the positive companion: a non-root
// child caller still sees its same-org siblings, children, and parent. This
// guards against the fix over-scoping and breaking legitimate intra-org
// discovery.
func TestPeers_SameOrg_SiblingsStillWork(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
handler := NewDiscoveryHandler()
caller := "org-a-child-1"
parent := "org-a-root"
mock.ExpectQuery("SELECT parent_id FROM workspaces WHERE id =").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"parent_id"}).AddRow(parent))
// Siblings — scoped to the shared parent (one tenant).
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.id != \\$2").
WithArgs(parent, caller).
WillReturnRows(sqlmock.NewRows(peerColsForIsolation).
AddRow("org-a-child-2", "Org A Sibling", "worker", 1, "online", []byte("null"), "http://a-sib", parent, 0))
// Children — none.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.id != \\$2 AND w.status").
WithArgs(caller, caller).
WillReturnRows(sqlmock.NewRows(peerColsForIsolation))
// Parent.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.id = \\$1 AND w.id != \\$2 AND w.status").
WithArgs(parent, caller).
WillReturnRows(sqlmock.NewRows(peerColsForIsolation).
AddRow(parent, "Org A Root", "lead", 0, "online", []byte("null"), "http://a-root", nil, 0))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: caller}}
c.Request = httptest.NewRequest("GET", "/registry/"+caller+"/peers", nil)
handler.Peers(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var peers []map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &peers); err != nil {
t.Fatalf("failed to parse response: %v", err)
}
// Sibling + parent = 2 same-org peers.
if len(peers) != 2 {
t.Fatalf("expected 2 same-org peers (sibling + parent), got %d: %v", len(peers), peers)
}
names := map[string]bool{}
for _, p := range peers {
names[fmt.Sprint(p["name"])] = true
}
if !names["Org A Sibling"] || !names["Org A Root"] {
t.Errorf("expected same-org sibling + parent in peer list, got %v", names)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// -------------------------------------------------------------------------
// Path 2: MCP toolListPeers — mcp_tools.toolListPeers
// -------------------------------------------------------------------------
// mcpPeerCols matches toolListPeers' SELECT column set.
var mcpPeerCols = []string{"id", "name", "role", "status", "tier"}
// TestToolListPeers_CrossTenant_OrgRootNotLeaked is the #1953 regression for
// the MCP path. Same shape as the discovery test: an org-root caller must NOT
// enumerate other tenants' org roots. The cross-tenant `parent_id IS NULL`
// sibling query is intentionally not registered, so if it runs sqlmock fails.
func TestToolListPeers_CrossTenant_OrgRootNotLeaked(t *testing.T) {
mock := setupTestDB(t)
mock.MatchExpectationsInOrder(false)
h := &MCPHandler{database: dbHandleForTest()}
caller := "org-a-root"
mock.ExpectQuery("SELECT parent_id FROM workspaces WHERE id =").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"parent_id"}).AddRow(nil))
// LEAKY sibling query (pre-fix). Returns another tenant's org root. The fix
// must NOT issue this for an org-root caller; if it does, org-b-root leaks
// into the output and the assertion below fails. Left optional via
// unordered matching, so the fixed path simply never consumes it.
mock.ExpectQuery("WHERE w.parent_id IS NULL AND w.id != \\$1").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows(mcpPeerCols).
AddRow("org-b-root", "Org B Root", "lead", "online", 0))
// Children — caller's own org-A children only.
mock.ExpectQuery("WHERE w.parent_id = \\$1 AND w.status").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows(mcpPeerCols).
AddRow("org-a-child", "Org A Child", "worker", "online", 1))
out, err := h.toolListPeers(context.Background(), caller)
if err != nil {
t.Fatalf("toolListPeers returned error: %v", err)
}
if strings.Contains(out, "org-b-root") || strings.Contains(out, "Org B Root") {
t.Fatalf("cross-tenant leak (#1953): another tenant's org root appeared in toolListPeers output:\n%s", out)
}
if !strings.Contains(out, "org-a-child") {
t.Errorf("same-org child missing from toolListPeers output:\n%s", out)
}
// ExpectationsWereMet intentionally NOT asserted — leaky sibling expectation
// is deliberately left unconsumed by the fixed path.
}
// TestToolListPeers_SameOrg_SiblingsStillWork — positive companion for the MCP
// path: a non-root child still enumerates its same-org siblings + children + parent.
func TestToolListPeers_SameOrg_SiblingsStillWork(t *testing.T) {
mock := setupTestDB(t)
h := &MCPHandler{database: dbHandleForTest()}
caller := "org-a-child-1"
parent := "org-a-root"
mock.ExpectQuery("SELECT parent_id FROM workspaces WHERE id =").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"parent_id"}).AddRow(parent))
// Siblings — scoped to shared parent.
mock.ExpectQuery("WHERE w.parent_id = \\$1 AND w.id != \\$2 AND w.status").
WithArgs(parent, caller).
WillReturnRows(sqlmock.NewRows(mcpPeerCols).
AddRow("org-a-child-2", "Org A Sibling", "worker", "online", 1))
// Children — none.
mock.ExpectQuery("WHERE w.parent_id = \\$1 AND w.status").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows(mcpPeerCols))
// Parent.
mock.ExpectQuery("WHERE w.id = \\$1 AND w.status").
WithArgs(parent).
WillReturnRows(sqlmock.NewRows(mcpPeerCols).
AddRow(parent, "Org A Root", "lead", "online", 0))
out, err := h.toolListPeers(context.Background(), caller)
if err != nil {
t.Fatalf("toolListPeers returned error: %v", err)
}
if !strings.Contains(out, "Org A Sibling") || !strings.Contains(out, "Org A Root") {
t.Errorf("expected same-org sibling + parent in toolListPeers output:\n%s", out)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// -------------------------------------------------------------------------
// Path 3: a2a routing — a2a_proxy.proxyA2ARequest / resolveAgentURL
// -------------------------------------------------------------------------
// TestProxyA2A_CrossTenant_RoutingDenied is the #1953 regression for a2a
// routing. Caller and target are both org roots (parent_id IS NULL) belonging
// to DIFFERENT tenants. Pre-fix, CanCommunicate's "root-level siblings" rule
// waved this through and resolveAgentURL routed to the foreign tenant. Post-fix
// the org-scope guard resolves each to a different org root and returns 403
// BEFORE resolveAgentURL/dispatch.
func TestProxyA2A_CrossTenant_RoutingDenied(t *testing.T) {
mock := setupTestDB(t)
mr := setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
caller := "org-a-root"
target := "org-b-root" // different tenant
// A URL exists for the target; the guard must deny BEFORE it is used.
mr.Set(fmt.Sprintf("ws:%s:url", target), "http://localhost:1")
// CanCommunicate: both root-level (parent_id NULL) → its weak "root-level
// siblings" rule ALLOWS this. The org guard must catch it afterward.
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id = ").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(caller, nil))
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id = ").
WithArgs(target).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(target, nil))
// #1953 org-scope guard: caller resolves to org-a-root, target to org-b-root
// → different orgs → 403. (Each org root resolves to itself.)
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(caller))
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(target).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(target))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: target}}
body := `{"method":"message/send","params":{"message":{"role":"user","parts":[{"text":"cross-tenant"}]}}}`
c.Request = httptest.NewRequest("POST", "/workspaces/"+target+"/a2a", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
c.Request.Header.Set("X-Workspace-ID", caller)
handler.ProxyA2A(c)
if w.Code != http.StatusForbidden {
t.Fatalf("expected 403 for cross-tenant a2a routing, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("body not JSON: %v", err)
}
if msg, _ := resp["error"].(string); !strings.Contains(msg, "different org") {
t.Errorf("expected cross-org denial message, got %v", resp["error"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestResolveAgentURL_CrossTenant_RejectedViaSameOrg is a direct unit test of
// the sameOrg primitive that gates resolveAgentURL: a target in a different org
// must be reported as NOT same-org, so the a2a guard rejects it before
// resolveAgentURL is ever called.
func TestResolveAgentURL_CrossTenant_RejectedViaSameOrg(t *testing.T) {
mock := setupTestDB(t)
caller := "org-a-root"
target := "org-b-root"
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(caller))
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(target).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(target))
ok, err := sameOrg(context.Background(), dbHandleForTest(), caller, target)
if err != nil {
t.Fatalf("sameOrg returned unexpected error: %v", err)
}
if ok {
t.Errorf("expected cross-tenant workspaces to be reported as DIFFERENT orgs, got sameOrg=true")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestProxyA2A_SameOrg_RoutingAllowed — positive companion for a2a: two
// same-org siblings route successfully (mirrors TestProxyA2A_CallerIDPropagated
// but named to document the #1953 same-org allow path).
func TestProxyA2A_SameOrg_RoutingAllowed(t *testing.T) {
mock := setupTestDB(t)
mr := setupTestRedis(t)
allowLoopbackForTest(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
waitForHandlerAsyncBeforeDBCleanup(t, handler)
caller := "org-a-child-1"
target := "org-a-child-2"
parent := "org-a-root"
agentServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
fmt.Fprint(w, `{"jsonrpc":"2.0","id":"1","result":{}}`)
}))
defer agentServer.Close()
mr.Set(fmt.Sprintf("ws:%s:url", target), agentServer.URL)
// CanCommunicate — siblings under shared parent.
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id = ").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(caller, parent))
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id = ").
WithArgs(target).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(target, parent))
// #1953 org guard — both resolve to the same org root → allowed.
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(caller).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(parent))
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(target).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(parent))
expectBudgetCheck(mock, target)
mock.ExpectExec("INSERT INTO activity_logs").WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: target}}
body := `{"method":"message/send","params":{"message":{"role":"user","parts":[{"text":"same-org"}]}}}`
c.Request = httptest.NewRequest("POST", "/workspaces/"+target+"/a2a", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
c.Request.Header.Set("X-Workspace-ID", caller)
handler.ProxyA2A(c)
time.Sleep(50 * time.Millisecond) // allow the async logA2ASuccess INSERT to flush
if w.Code != http.StatusOK {
t.Fatalf("expected 200 for same-org a2a routing, got %d: %s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
@@ -140,7 +140,14 @@ func buildHTTPResponse(statusCode int, body string) []byte {
}
// setupIntegrationFixtures inserts the rows executeDelegation requires:
// - workspaces: source and target (siblings, parent_id=NULL so CanCommunicate=true)
// - workspaces: source (org root) + target as its CHILD, so both live in the
// SAME org. CanCommunicate=true (parent↔child) AND the #1953 sameOrg() guard
// in proxyA2ARequest passes (both resolve to the same org root). A real
// delegation happens INSIDE one org. (Previously both were parent_id=NULL —
// two DISTINCT org roots — which only "communicated" via CanCommunicate's
// root-sibling rule; #1953 added a sameOrg() guard that now denies routing
// between two org roots as cross-tenant, so the success-path tests below
// must use a same-org source/target pair.)
// - activity_logs: the 'delegate' row that updateDelegationStatus UPDATE will find
// - delegations: the ledger row that recordLedgerStatus will UPDATE
//
@@ -148,13 +155,14 @@ func buildHTTPResponse(statusCode int, body string) []byte {
func setupIntegrationFixtures(t *testing.T, conn *sql.DB) func() {
t.Helper()
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
sourceID := integrationTestSourceID // org root (parent_id NULL); target hangs off it
for _, ws := range []struct {
id string
name string
parentID *string
}{
{integrationTestSourceID, "test-source", nil},
{integrationTestTargetID, "test-target", nil},
{integrationTestTargetID, "test-target", &sourceID}, // child of source → same org
} {
if _, err := conn.ExecContext(ctx,
`INSERT INTO workspaces (id, name, parent_id) VALUES ($1::uuid, $2, $3) ON CONFLICT (id) DO NOTHING`,
@@ -510,6 +518,94 @@ func TestIntegration_ExecuteDelegation_RedisDown_FallsBackToDB(t *testing.T) {
}
}
// TestIntegration_SameOrg_RealCTE_ResolvesAncestorChain is the regression gate
// for the org_scope.go recursive-CTE bug (#1953 follow-up). The sqlmock unit
// tests feed sameOrg() a pre-computed root_id row, so they CANNOT catch a wrong
// CTE — they assume it already returns the right value. Only a real Postgres
// run exercises orgRootSubtreeCTE itself.
//
// The bug: the CTE carried `id AS root_id` from the recursive SEED, so a
// non-root workspace resolved to ITSELF instead of its topmost ancestor. That
// made sameOrg() return false for two genuinely same-org workspaces and 403 a
// legitimate same-org a2a route (over-block). This test seeds a real
// root → child → grandchild chain plus a separate org root, and asserts:
// - every node in the chain resolves to the SAME org root (root, child, grandchild)
// - two workspaces in the same chain are sameOrg (incl. grandchild ↔ root)
// - a workspace in a DIFFERENT chain is NOT sameOrg (cross-tenant stays closed)
func TestIntegration_SameOrg_RealCTE_ResolvesAncestorChain(t *testing.T) {
conn := integrationDB(t)
const (
rootA = "11111111-1111-1111-1111-111111111111"
childA = "22222222-2222-2222-2222-222222222222"
grandchildA = "33333333-3333-3333-3333-333333333333"
rootB = "44444444-4444-4444-4444-444444444444"
)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
t.Cleanup(func() {
c2, cancel2 := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel2()
// Delete leaf-first to respect the parent_id self-FK.
for _, id := range []string{grandchildA, childA, rootA, rootB} {
conn.ExecContext(c2, `DELETE FROM workspaces WHERE id = $1`, id)
}
})
// Insert parent-before-child to satisfy the self-referential FK.
seed := []struct {
id, name string
parent *string
}{
{rootA, "org-a-root", nil},
{childA, "org-a-child", strPtr(rootA)},
{grandchildA, "org-a-grandchild", strPtr(childA)},
{rootB, "org-b-root", nil},
}
for _, s := range seed {
if _, err := conn.ExecContext(ctx,
`INSERT INTO workspaces (id, name, parent_id) VALUES ($1::uuid, $2, $3) ON CONFLICT (id) DO NOTHING`,
s.id, s.name, s.parent); err != nil {
t.Fatalf("seed %s: %v", s.name, err)
}
}
// Every node in chain A must resolve to rootA via the REAL CTE.
for _, id := range []string{rootA, childA, grandchildA} {
got, err := orgRootID(ctx, conn, id)
if err != nil {
t.Fatalf("orgRootID(%s): %v", id, err)
}
if got != rootA {
t.Errorf("orgRootID(%s) = %q, want rootA %q (CTE must walk to topmost ancestor)", id, got, rootA)
}
}
// Same-org positives — including the grandchild↔root pair that the buggy
// CTE got wrong.
for _, pair := range [][2]string{{childA, grandchildA}, {rootA, grandchildA}, {rootA, childA}} {
ok, err := sameOrg(ctx, conn, pair[0], pair[1])
if err != nil {
t.Fatalf("sameOrg(%s,%s): %v", pair[0], pair[1], err)
}
if !ok {
t.Errorf("sameOrg(%s,%s) = false, want true (same org chain)", pair[0], pair[1])
}
}
// Cross-org negative — isolation must stay closed.
for _, pair := range [][2]string{{rootA, rootB}, {grandchildA, rootB}, {childA, rootB}} {
ok, err := sameOrg(ctx, conn, pair[0], pair[1])
if err != nil {
t.Fatalf("sameOrg(%s,%s): %v", pair[0], pair[1], err)
}
if ok {
t.Errorf("sameOrg(%s,%s) = true, want false (different orgs — cross-tenant must stay denied)", pair[0], pair[1])
}
}
}
// extractHostPort parses "http://127.0.0.1:PORT/" and returns "127.0.0.1:PORT".
func extractHostPort(rawURL string) string {
// Simple parse: strip "http://" prefix and trailing slash.
@@ -1059,13 +1059,25 @@ func expectExecuteDelegationBase(mock sqlmock.Sqlmock) {
WillReturnResult(sqlmock.NewResult(0, 1))
// CanCommunicate: getWorkspaceRef(source) + getWorkspaceRef(target).
// Both are root-level workspaces (parent_id=NULL) → root-level siblings → allowed.
// Source and target are siblings under one shared parent (one tenant) →
// CanCommunicate allowed. (#1953: they must NOT both be parent_id=NULL —
// two distinct org roots are now treated as DIFFERENT orgs and routing
// between them is denied. A real delegation happens inside one org.)
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id = ").
WithArgs(testDeliverySourceID).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testDeliverySourceID, nil))
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testDeliverySourceID, "ws-org-root-159"))
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id = ").
WithArgs(testDeliveryTargetID).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testDeliveryTargetID, nil))
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testDeliveryTargetID, "ws-org-root-159"))
// #1953 cross-tenant guard: same-org check after CanCommunicate. Both
// resolve to the same org root → routing allowed.
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(testDeliverySourceID).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow("ws-org-root-159"))
mock.ExpectQuery("WITH RECURSIVE org_chain AS").
WithArgs(testDeliveryTargetID).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow("ws-org-root-159"))
// resolveAgentURL: test callers always set the URL in Redis (mr.Set ws:{id}:url),
// so resolveAgentURL gets a cache hit and never falls back to DB.
@@ -237,7 +237,17 @@ func (h *DiscoveryHandler) Peers(c *gin.Context) {
var peers []map[string]interface{}
// Siblings
// Siblings — workspaces sharing the caller's parent.
//
// #1953 cross-tenant isolation: the OLD code's else-branch handled the
// org-root caller (parent_id IS NULL) by returning EVERY workspace with
// parent_id IS NULL — i.e. every other tenant's org root, since the
// workspaces table has no org_id column. That leaked peer identities/URLs
// across tenants. An org root has no siblings inside its own org (each
// tenant is a distinct org root), so the org-root caller now gets an empty
// sibling set; its real peers are its children, returned below. Only the
// parent_id-bound branch enumerates siblings, and that is already scoped to
// one parent (one tenant).
if parentID.Valid {
siblings, _ := queryPeerMaps(`
SELECT w.id, w.name, COALESCE(w.role, ''), w.tier, w.status,
@@ -246,14 +256,6 @@ func (h *DiscoveryHandler) Peers(c *gin.Context) {
FROM workspaces w WHERE w.parent_id = $1 AND w.id != $2 AND w.status != 'removed'`,
parentID.String, workspaceID)
peers = append(peers, siblings...)
} else {
siblings, _ := queryPeerMaps(`
SELECT w.id, w.name, COALESCE(w.role, ''), w.tier, w.status,
COALESCE(w.agent_card, 'null'::jsonb), COALESCE(w.url, ''),
w.parent_id, w.active_tasks
FROM workspaces w WHERE w.parent_id IS NULL AND w.id != $1 AND w.status != 'removed'`,
workspaceID)
peers = append(peers, siblings...)
}
// Children — exclude self defensively. A child row whose parent_id
@@ -223,10 +223,10 @@ func TestPeers_RootWorkspace_NoPeers(t *testing.T) {
peerCols := []string{"id", "name", "role", "tier", "status", "agent_card", "url", "parent_id", "active_tasks"}
// Siblings (other root-level workspaces) — none
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id IS NULL AND w.id != \\$1").
WithArgs("ws-root-alone").
WillReturnRows(sqlmock.NewRows(peerCols))
// #1953: an org-root caller (parent_id IS NULL) now issues NO sibling
// query at all. The old `WHERE w.parent_id IS NULL` sibling read returned
// EVERY tenant's org root (cross-tenant leak); an org root has no siblings
// inside its own org, so the handler skips the sibling read entirely.
// Children — none. #383 added explicit `w.id != $2` self-filter.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.id != \\$2").
@@ -453,6 +453,14 @@ func TestExtended_DiscoverMissingHeader(t *testing.T) {
// ---------- TestPeers (Extended) ----------
// TestExtended_Peers verifies a root-level (org-root) workspace's peer view.
//
// #1953: previously a root-level caller issued `WHERE w.parent_id IS NULL`
// for siblings, which returned EVERY other tenant's org root as a "peer"
// (cross-tenant leak, since the workspaces table has no org_id column). After
// the fix an org root has no cross-tenant siblings; its only peers are its own
// children. This test asserts the child is returned and that NO sibling query
// is issued (no `parent_id IS NULL` read).
func TestExtended_Peers(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
@@ -463,17 +471,14 @@ func TestExtended_Peers(t *testing.T) {
WithArgs("ws-peer").
WillReturnRows(sqlmock.NewRows([]string{"parent_id"}).AddRow(nil))
// Expect root-level siblings query (parent IS NULL, excluding self)
mock.ExpectQuery("SELECT w.id, w.name").
WithArgs("ws-peer").
WillReturnRows(sqlmock.NewRows([]string{"id", "name", "role", "tier", "status", "agent_card", "url", "parent_id", "active_tasks"}).
AddRow("ws-sibling", "Sibling Agent", "worker", 1, "online", []byte("null"), "http://localhost:9001", nil, 0))
// NO root-level sibling query is issued for an org-root caller anymore.
// Expect children query (workspaces with parent_id = ws-peer, excluding self)
// Query now binds (parent_id, self_id) for the self-filter guard added in #383.
// Children query (workspaces with parent_id = ws-peer, excluding self).
// Query binds (parent_id, self_id) for the self-filter guard added in #383.
mock.ExpectQuery("SELECT w.id, w.name").
WithArgs("ws-peer", "ws-peer").
WillReturnRows(sqlmock.NewRows([]string{"id", "name", "role", "tier", "status", "agent_card", "url", "parent_id", "active_tasks"}))
WillReturnRows(sqlmock.NewRows([]string{"id", "name", "role", "tier", "status", "agent_card", "url", "parent_id", "active_tasks"}).
AddRow("ws-child", "Child Agent", "worker", 1, "online", []byte("null"), "http://localhost:9001", "ws-peer", 0))
// No parent query since workspace is root-level
@@ -493,10 +498,10 @@ func TestExtended_Peers(t *testing.T) {
t.Fatalf("failed to parse response: %v", err)
}
if len(resp) != 1 {
t.Fatalf("expected 1 peer, got %d", len(resp))
t.Fatalf("expected 1 peer (the child), got %d", len(resp))
}
if resp[0]["name"] != "Sibling Agent" {
t.Errorf("expected peer name 'Sibling Agent', got %v", resp[0]["name"])
if resp[0]["name"] != "Child Agent" {
t.Errorf("expected peer name 'Child Agent', got %v", resp[0]["name"])
}
if err := mock.ExpectationsWereMet(); err != nil {
@@ -97,7 +97,15 @@ func (h *MCPHandler) toolListPeers(ctx context.Context, workspaceID string) (str
const cols = `SELECT w.id, w.name, COALESCE(w.role,''), w.status, w.tier`
// Siblings
// Siblings — workspaces sharing the caller's parent.
//
// #1953 cross-tenant isolation: the OLD else-branch returned every
// workspace with parent_id IS NULL when the caller was itself an org root,
// i.e. every other tenant's org root (the workspaces table has no org_id
// column). That leaked peer identities across tenants via MCP list_peers.
// An org root has no siblings inside its own org, so the org-root caller
// now gets no siblings; its peers are its children, enumerated below. Only
// the parent_id-bound branch enumerates siblings, scoped to one tenant.
if parentID.Valid {
rows, err := h.database.QueryContext(ctx,
cols+` FROM workspaces w WHERE w.parent_id = $1 AND w.id != $2 AND w.status != 'removed'`,
@@ -107,15 +115,6 @@ func (h *MCPHandler) toolListPeers(ctx context.Context, workspaceID string) (str
log.Printf("MCP toolListPeers: sibling scan error: %v", scanErr)
}
}
} else {
rows, err := h.database.QueryContext(ctx,
cols+` FROM workspaces w WHERE w.parent_id IS NULL AND w.id != $1 AND w.status != 'removed'`,
workspaceID)
if err == nil {
if scanErr := scanPeers(rows); scanErr != nil {
log.Printf("MCP toolListPeers: sibling scan error: %v", scanErr)
}
}
}
// Children
@@ -48,6 +48,7 @@ type memoryV2Deps struct {
// call. Defining an interface here lets handler tests stub the plugin
// without spinning up an HTTP server.
type memoryPluginAPI interface {
UpsertNamespace(ctx context.Context, name string, body contract.NamespaceUpsert) (*contract.Namespace, error)
CommitMemory(ctx context.Context, namespace string, body contract.MemoryWrite) (*contract.MemoryWriteResponse, error)
Search(ctx context.Context, body contract.SearchRequest) (*contract.SearchResponse, error)
ForgetMemory(ctx context.Context, id string, body contract.ForgetRequest) error
@@ -117,6 +118,9 @@ func (h *MCPHandler) toolCommitMemoryV2(ctx context.Context, workspaceID string,
if !ok {
return "", fmt.Errorf("workspace %s cannot write to namespace %s", workspaceID, ns)
}
if _, err := h.memv2.plugin.UpsertNamespace(ctx, ns, contract.NamespaceUpsert{Kind: kindFromNamespace(ns)}); err != nil {
return "", fmt.Errorf("plugin upsert namespace: %w", err)
}
// SAFE-T1201: scrub credential-shaped strings BEFORE the plugin sees
// them. Non-negotiable; see memories.go:180.
@@ -171,6 +175,19 @@ func (h *MCPHandler) toolCommitMemoryV2(ctx context.Context, workspaceID string,
return string(out), nil
}
func kindFromNamespace(ns string) contract.NamespaceKind {
switch {
case strings.HasPrefix(ns, "workspace:"):
return contract.NamespaceKindWorkspace
case strings.HasPrefix(ns, "team:"):
return contract.NamespaceKindTeam
case strings.HasPrefix(ns, "org:"):
return contract.NamespaceKindOrg
default:
return contract.NamespaceKindCustom
}
}
// ─────────────────────────────────────────────────────────────────────────────
// search_memory
// ─────────────────────────────────────────────────────────────────────────────
@@ -20,11 +20,18 @@ import (
// --- stubs ---
type stubMemoryPlugin struct {
upsertFn func(ctx context.Context, name string, body contract.NamespaceUpsert) (*contract.Namespace, error)
commitFn func(ctx context.Context, ns string, body contract.MemoryWrite) (*contract.MemoryWriteResponse, error)
searchFn func(ctx context.Context, body contract.SearchRequest) (*contract.SearchResponse, error)
forgetFn func(ctx context.Context, id string, body contract.ForgetRequest) error
}
func (s *stubMemoryPlugin) UpsertNamespace(ctx context.Context, name string, body contract.NamespaceUpsert) (*contract.Namespace, error) {
if s.upsertFn != nil {
return s.upsertFn(ctx, name, body)
}
return &contract.Namespace{Name: name, Kind: body.Kind}, nil
}
func (s *stubMemoryPlugin) CommitMemory(ctx context.Context, ns string, body contract.MemoryWrite) (*contract.MemoryWriteResponse, error) {
if s.commitFn != nil {
return s.commitFn(ctx, ns, body)
@@ -159,7 +166,15 @@ func TestMemoryV2Available(t *testing.T) {
func TestCommitMemoryV2_HappyPathDefaultNamespace(t *testing.T) {
db, _, _ := sqlmock.New()
defer db.Close()
gotUpsertNS := ""
h := newV2Handler(t, db, &stubMemoryPlugin{
upsertFn: func(_ context.Context, name string, body contract.NamespaceUpsert) (*contract.Namespace, error) {
gotUpsertNS = name
if body.Kind != contract.NamespaceKindWorkspace {
t.Errorf("upsert kind = %q, want workspace", body.Kind)
}
return &contract.Namespace{Name: name, Kind: body.Kind}, nil
},
commitFn: func(_ context.Context, ns string, body contract.MemoryWrite) (*contract.MemoryWriteResponse, error) {
if ns != "workspace:root-1" {
t.Errorf("ns = %q, want default workspace:root-1", ns)
@@ -180,6 +195,9 @@ func TestCommitMemoryV2_HappyPathDefaultNamespace(t *testing.T) {
if !strings.Contains(got, `"id":"mem-1"`) {
t.Errorf("got = %s", got)
}
if gotUpsertNS != "workspace:root-1" {
t.Errorf("upsert namespace = %q, want workspace:root-1", gotUpsertNS)
}
}
func TestCommitMemoryV2_NamespaceParamUsed(t *testing.T) {
@@ -45,6 +45,9 @@ type fakePlugin struct {
forgetReq contract.ForgetRequest
}
func (f *fakePlugin) UpsertNamespace(ctx context.Context, name string, body contract.NamespaceUpsert) (*contract.Namespace, error) {
return &contract.Namespace{Name: name, Kind: body.Kind}, nil
}
func (f *fakePlugin) CommitMemory(ctx context.Context, ns string, body contract.MemoryWrite) (*contract.MemoryWriteResponse, error) {
return nil, errors.New("not implemented in fake")
}
@@ -511,11 +514,11 @@ func TestMemoriesV2_Forget_MissingMemoryID_400(t *testing.T) {
// DisplayName over UUID-prefix fallback (issue #2988).
func TestNamespaceLabelWithName_PrefersDisplayNameWhenSet(t *testing.T) {
cases := []struct {
name string
raw string
kind contract.NamespaceKind
display string
want string
name string
raw string
kind contract.NamespaceKind
display string
want string
}{
{"workspace with name", "workspace:abc-1234", contract.NamespaceKindWorkspace, "mac laptop", "Workspace (mac laptop)"},
{"team with name", "team:abc-1234", contract.NamespaceKindTeam, "Engineering", "Team (Engineering)"},
@@ -625,12 +628,12 @@ func TestParseLimit(t *testing.T) {
}{
{"", memoriesV2DefaultLimit},
{"10", 10},
{"0", memoriesV2DefaultLimit}, // ≤0 → default, not error
{"-5", memoriesV2DefaultLimit}, // negative → default
{"abc", memoriesV2DefaultLimit}, // non-numeric → default
{"99999", memoriesV2MaxLimit}, // over cap → clamped
{"100", memoriesV2MaxLimit}, // exactly cap → kept
{"99", 99}, // just under cap → kept
{"0", memoriesV2DefaultLimit}, // ≤0 → default, not error
{"-5", memoriesV2DefaultLimit}, // negative → default
{"abc", memoriesV2DefaultLimit}, // non-numeric → default
{"99999", memoriesV2MaxLimit}, // over cap → clamped
{"100", memoriesV2MaxLimit}, // exactly cap → kept
{"99", 99}, // just under cap → kept
}
for _, tc := range cases {
t.Run("raw="+tc.raw, func(t *testing.T) {
@@ -741,11 +744,11 @@ func TestWithMemoryV2_FluentReturnsReceiver(t *testing.T) {
func TestShortID(t *testing.T) {
cases := map[string]string{
"": "",
"short": "short",
"exactly8": "exactly8",
"longer-than-eight": "longer-t",
"abc-1234-5678-90ab": "abc-1234",
"": "",
"short": "short",
"exactly8": "exactly8",
"longer-than-eight": "longer-t",
"abc-1234-5678-90ab": "abc-1234",
}
for in, want := range cases {
if got := shortID(in); got != want {
@@ -0,0 +1,104 @@
package handlers
// org_scope.go — cross-tenant isolation helpers (#1953).
//
// The `workspaces` table has no `org_id` column; an "org" is the subtree of
// workspaces reachable through the `parent_id` chain from a single org root
// (a row with parent_id IS NULL). Several code paths historically computed an
// org-root sibling set as `WHERE parent_id IS NULL`, which matches EVERY
// tenant's org root and therefore leaks peer metadata / routing across tenants.
//
// This file centralises the org-scoping primitive so peer discovery, the MCP
// list_peers tool, and a2a routing all derive "the caller's org" the SAME way
// the OFFSEC-015 broadcast fix (commit 5a05302c, workspace_broadcast.go) does:
// a recursive CTE that walks the parent_id chain up to the org root. Keeping
// the CTE in one place means there is a single, testable source of truth for
// tenant isolation rather than four hand-copied queries that can drift.
//
// NOTE: this is the parent_id-chain scoping that the broadcast fix already
// ships. It is deliberately NOT an `org_id` column — adding that column is a
// separate architecture decision pending CTO sign-off. See #1953.
import (
"context"
"database/sql"
"errors"
)
// errNoOrgRoot is returned by orgRootID when the workspace id has no row (and
// therefore no resolvable org root). Callers translate this into a 404/not-found
// at their own layer; it is distinct from a transient DB error so a missing
// workspace never gets treated as "belongs to every org".
var errNoOrgRoot = errors.New("org root not found for workspace")
// orgRootSubtreeCTE is the recursive CTE — identical in shape to the OFFSEC-015
// broadcast fix — that walks UP the parent_id chain from a single workspace to
// its org root. The org root is the row on the chain whose parent_id IS NULL.
//
// $1 = workspace id to resolve
//
// The recursive member walks UP the parent_id chain: each step joins to the row
// whose id is the current row's parent_id. The topmost ancestor is the single
// chain row with parent_id IS NULL — and THAT row's own `id` is the org root.
//
// We select that parentless row's `id` (aliased root_id). We must NOT carry a
// fixed `id AS root_id` from the recursive seed: that value is just the input
// workspace id, so a non-root caller (e.g. a child delegating to a sibling)
// would resolve to ITSELF instead of its org root, and sameOrg() would wrongly
// report two genuinely same-org workspaces as different orgs and 403 a
// legitimate a2a route. A workspace that already IS an org root has a one-row
// chain whose id == itself, so it correctly resolves to itself.
const orgRootSubtreeCTE = `
WITH RECURSIVE org_chain AS (
SELECT id, parent_id
FROM workspaces
WHERE id = $1
UNION ALL
SELECT w.id, w.parent_id
FROM workspaces w
JOIN org_chain c ON w.id = c.parent_id
)
SELECT id AS root_id FROM org_chain WHERE parent_id IS NULL LIMIT 1
`
// orgRootID resolves the org root of `workspaceID` by walking the parent_id
// chain via orgRootSubtreeCTE. Returns errNoOrgRoot when the workspace (or its
// chain) yields no org root row, and the underlying error on any DB failure.
//
// This is the SAME lookup the broadcast handler performs inline; the three
// leak paths in #1953 call this instead of re-deriving "the org" from
// `parent_id IS NULL` (which spans all tenants).
func orgRootID(ctx context.Context, database *sql.DB, workspaceID string) (string, error) {
var root string
err := database.QueryRowContext(ctx, orgRootSubtreeCTE, workspaceID).Scan(&root)
if errors.Is(err, sql.ErrNoRows) {
return "", errNoOrgRoot
}
if err != nil {
return "", err
}
if root == "" {
return "", errNoOrgRoot
}
return root, nil
}
// sameOrg reports whether workspaces `a` and `b` share an org root, i.e. they
// belong to the same tenant. Used by a2a routing to reject resolving/dispatching
// to a workspace id outside the caller's org. Fail-CLOSED: any lookup error or
// missing org root yields (false, err) so a DB hiccup denies cross-tenant
// routing rather than allowing it.
func sameOrg(ctx context.Context, database *sql.DB, a, b string) (bool, error) {
if a == b {
return true, nil
}
rootA, err := orgRootID(ctx, database, a)
if err != nil {
return false, err
}
rootB, err := orgRootID(ctx, database, b)
if err != nil {
return false, err
}
return rootA == rootB, nil
}
@@ -245,6 +245,11 @@ func (h *SecretsHandler) Values(c *gin.Context) {
// provisioner path in workspace_provision.go so env-vars look identical
// whether the workspace was bootstrapped locally or remotely).
out := map[string]string{}
// Provenance side-channel (internal#711): which keys in `out` originated
// from global_secrets and were NOT overridden by a workspace_secrets row.
// Used by the provider-aware gate below so a non-platform workspace's
// remote pull never receives the platform's scope:global LLM credential.
globalKeys := map[string]struct{}{}
// Track decrypt failures so we can refuse the response with a list
// instead of returning a partial bundle that boots a broken agent.
var failedKeys []string
@@ -270,6 +275,7 @@ func (h *SecretsHandler) Values(c *gin.Context) {
continue
}
out[k] = string(decrypted)
globalKeys[k] = struct{}{}
}
}
if err := globalRows.Err(); err != nil {
@@ -294,6 +300,10 @@ func (h *SecretsHandler) Values(c *gin.Context) {
continue
}
out[k] = string(decrypted) // workspace override wins over global
// User explicitly re-set this via the canvas Secrets tab — it is
// no longer "the operator-store version", so drop the global
// provenance flag (mirrors loadWorkspaceSecrets).
delete(globalKeys, k)
}
}
if err := wsRows.Err(); err != nil {
@@ -309,6 +319,32 @@ func (h *SecretsHandler) Values(c *gin.Context) {
return
}
// internal#711: provider-aware gate on the remote-pull path. A workspace
// whose resolved billing mode is NOT platform_managed (byok / subscription)
// must NOT receive the platform's scope:global LLM credentials
// (CLAUDE_CODE_OAUTH_TOKEN + the rest of the bypass-key set). Those keys
// were merged from global_secrets above; here we drop any that are still
// of global provenance (a workspace override survives, since its flag was
// cleared). Symmetric with applyPlatformManagedLLMEnv's strip on the
// provision/restart env path — both injection vectors are now gated.
//
// Default-closed: ResolveLLMBillingMode collapses any DB error / NULL /
// garbled value to platform_managed, so a transient failure leaves the
// existing (global-inheriting) behavior in place rather than stripping a
// platform_managed workspace's creds.
orgMode := strings.ToLower(strings.TrimSpace(os.Getenv("MOLECULE_LLM_BILLING_MODE")))
res, resolveErr := ResolveLLMBillingMode(ctx, workspaceID, orgMode)
if resolveErr != nil {
log.Printf("secrets.Values: resolve billing mode workspace=%s err=%v (defaulting to platform_managed)", workspaceID, resolveErr)
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
for k := range globalKeys {
if isPlatformManagedDirectLLMBypassKey(k) {
delete(out, k)
}
}
}
c.JSON(http.StatusOK, out)
}
@@ -865,6 +865,12 @@ func TestSecretsValues_LegacyWorkspaceGrandfathered(t *testing.T) {
WithArgs(testWsID).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("WS_KEY", []byte("ws_plainvalue"), 0))
// internal#711: Values now resolves billing mode to gate the global LLM-cred
// merge. Neither key here is a platform-managed LLM bypass key, so the mode
// is immaterial to the assertions — but the resolver query must be mocked.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(testWsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModePlatformManaged))
w := httptest.NewRecorder()
c := secretsValuesRequest(w, "") // no auth — grandfathered
@@ -942,6 +948,12 @@ func TestSecretsValues_ValidTokenReturnsDecryptedMerge(t *testing.T) {
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("ONLY_WS", []byte("ws_val"), 0).
AddRow("SHARED_KEY", []byte("ws_wins"), 0))
// internal#711: billing-mode resolver query. None of these keys is a
// platform-managed LLM bypass key, so the resolved mode does not affect the
// merge assertions; platform_managed keeps the existing pass-through.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(testWsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModePlatformManaged))
w := httptest.NewRecorder()
c := secretsValuesRequest(w, "Bearer good-token")
@@ -963,6 +975,68 @@ func TestSecretsValues_ValidTokenReturnsDecryptedMerge(t *testing.T) {
}
}
// TestSecretsValues_ByokStripsGlobalLLMCred is the internal#711 regression
// guard for the remote-pull injection vector. A non-platform (byok) workspace
// that pulls its secrets via GET /workspaces/:id/secrets/values must NOT
// receive the platform's scope:global CLAUDE_CODE_OAUTH_TOKEN — that key is
// of global_secrets provenance and is dropped by the provider-aware gate.
// Its OWN ANTHROPIC_API_KEY (a workspace_secrets row) survives, and unrelated
// non-LLM global secrets are untouched.
func TestSecretsValues_ByokStripsGlobalLLMCred(t *testing.T) {
mock := setupTestDB(t)
handler := NewSecretsHandler(nil)
mock.ExpectQuery(`SELECT COUNT\(\*\) FROM workspace_auth_tokens`).
WithArgs(testWsID).
WillReturnRows(sqlmock.NewRows([]string{"count"}).AddRow(1))
mock.ExpectQuery(`SELECT t\.id, t\.workspace_id.*FROM workspace_auth_tokens t.*JOIN workspaces`).
WithArgs(sqlmock.AnyArg()).
WillReturnRows(sqlmock.NewRows([]string{"id", "workspace_id"}).AddRow("tok-1", testWsID))
mock.ExpectExec(`UPDATE workspace_auth_tokens SET last_used_at`).
WithArgs("tok-1").
WillReturnResult(sqlmock.NewResult(0, 1))
// global_secrets holds the platform's scope:global OAuth token + a
// non-LLM operator global (should be untouched).
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM global_secrets`).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("CLAUDE_CODE_OAUTH_TOKEN", []byte("PLATFORM-GLOBAL-OAUTH"), 0).
AddRow("SENTRY_DSN", []byte("https://sentry.example/123"), 0))
// The workspace brought its OWN Anthropic API key via the Secrets tab.
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets WHERE workspace_id`).
WithArgs(testWsID).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("ANTHROPIC_API_KEY", []byte("CUSTOMER-OWN-ANTHROPIC-KEY"), 0))
// Resolver: this workspace is byok.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(testWsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
w := httptest.NewRecorder()
c := secretsValuesRequest(w, "Bearer good-token")
handler.Values(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var body map[string]string
_ = json.Unmarshal(w.Body.Bytes(), &body)
// 1. Platform global OAuth token stripped — the leak is closed on the pull path.
if got, ok := body["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN = %q present — platform scope:global token must be stripped for byok pull", got)
}
// 2. The workspace's own LLM key survives.
if body["ANTHROPIC_API_KEY"] != "CUSTOMER-OWN-ANTHROPIC-KEY" {
t.Fatalf("ANTHROPIC_API_KEY = %q, want the workspace's own key preserved", body["ANTHROPIC_API_KEY"])
}
// 3. Unrelated non-LLM global secrets are untouched.
if body["SENTRY_DSN"] != "https://sentry.example/123" {
t.Fatalf("SENTRY_DSN = %q, want non-LLM globals untouched", body["SENTRY_DSN"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
func TestSecretsValues_InvalidWorkspaceID(t *testing.T) {
setupTestDB(t)
handler := NewSecretsHandler(nil)
@@ -75,3 +75,21 @@ func formatMissingEnvError(missing []string) string {
strings.Join(missing, ", "),
)
}
// formatMissingBYOKCredentialError builds the user-facing message for a
// provision failure caused by a non-platform (byok/subscription) workspace
// that has no usable LLM credential of its own (internal#711). The platform's
// scope:global LLM credentials are NOT a valid fallback for a non-platform
// workspace — resolving to them would bill the platform's Anthropic credits —
// so the provision fails closed here rather than starting the workspace on
// stripped/absent creds. Rendered verbatim in the canvas Events tab.
func formatMissingBYOKCredentialError(mode string) string {
return fmt.Sprintf(
"this workspace's LLM billing mode is %q (not platform-managed) but it has no LLM credential of its own. "+
"Add a workspace-scoped credential (e.g. CLAUDE_CODE_OAUTH_TOKEN or your provider's API key) under "+
"Config → Secrets, or switch the workspace to platform-managed billing via "+
"/admin/workspaces/:id/llm-billing-mode, then retry. The platform's shared LLM credentials are not "+
"used for non-platform workspaces.",
mode,
)
}
@@ -943,7 +943,47 @@ func applyRuntimeModelEnv(envVars map[string]string, runtime, model string) {
// MOLECULE_LLM_BILLING_MODE_RESOLVED so an in-container debug check can
// answer "what mode is this workspace running under" without DB queries
// (RFC Observability hot-spot).
func applyPlatformManagedLLMEnv(ctx context.Context, envVars map[string]string, workspaceID, runtime, model string) {
//
// internal#711 — PROVIDER-AWARE GLOBAL-LLM-CRED GATE. The platform's
// LLM credentials (CLAUDE_CODE_OAUTH_TOKEN + the rest of the
// platformManagedDirectLLMBypassKeys set) live in `global_secrets` and
// are merged into EVERY workspace's env by loadWorkspaceSecrets — that
// merge is provenance-blind. Pre-fix, the non-platform (byok/disabled)
// early-return left envVars untouched, so a BYOK / subscription
// workspace that brought NO LLM credential of its own still inherited
// the platform's scope:global CLAUDE_CODE_OAUTH_TOKEN and ran Opus on
// the platform's (Molecule's) Anthropic credits (Reno Stars SEO +
// Marketing agents, confirmed live 2026-05-27).
//
// The gate: on the non-platform path we strip every platform-managed
// LLM key whose PROVENANCE is `global_secrets` (the globalKeys set).
// A workspace's OWN LLM credential — set via the canvas Secrets tab,
// i.e. a `workspace_secrets` row — has had its global provenance flag
// dropped by loadWorkspaceSecrets, so it is NOT in globalKeys and
// survives. Net effect: platform global LLM creds reach a workspace
// ONLY when its resolved mode is platform_managed; a non-platform
// workspace resolves to its own (workspace-scoped) credential or none.
//
// The boolean return reports whether, after the gate, the workspace
// still has at least one usable LLM credential. The caller
// (prepareProvisionContext) uses it to FAIL CLOSED — a non-platform
// workspace with no usable LLM credential is aborted with a clear
// MISSING_BYOK_CREDENTIAL error at provision time rather than being
// started on (now-stripped) platform creds.
// platformLLMEnvResult is the structured outcome of applyPlatformManagedLLMEnv.
// ResolvedMode is the per-workspace billing/provider mode the resolver
// landed on. HasUsableLLMCred reports whether — AFTER the provider-aware
// global-cred gate — the workspace still has at least one platform-managed
// LLM credential key in its env (its own, workspace-scoped one). Only the
// non-platform path consults HasUsableLLMCred for the fail-closed decision;
// the platform_managed path always returns true (it forces the CP proxy
// usage token, which IS the usable credential).
type platformLLMEnvResult struct {
ResolvedMode string
HasUsableLLMCred bool
}
func applyPlatformManagedLLMEnv(ctx context.Context, envVars map[string]string, globalKeys map[string]struct{}, workspaceID, runtime, model string) platformLLMEnvResult {
orgMode := strings.ToLower(strings.TrimSpace(os.Getenv("MOLECULE_LLM_BILLING_MODE")))
res, resolveErr := ResolveLLMBillingMode(ctx, workspaceID, orgMode)
if resolveErr != nil {
@@ -966,18 +1006,35 @@ func applyPlatformManagedLLMEnv(ctx context.Context, envVars map[string]string,
// pulling logs or hitting the admin route.
envVars["MOLECULE_LLM_BILLING_MODE_RESOLVED"] = res.ResolvedMode
if res.ResolvedMode != LLMBillingModePlatformManaged {
// byok or disabled — DO NOT strip vendor keys, DO NOT force-route to CP,
// DO NOT override the workspace own ANTHROPIC_BASE_URL / OAuth token.
// Leave envVars alone so CLAUDE_CODE_OAUTH_TOKEN / vendor API keys
// pulled from workspace_secrets survive into the container, and the
// workspace talks to its own provider directly (internal#703).
return
// byok or disabled — DO NOT force-route to CP, DO NOT override the
// workspace's own ANTHROPIC_BASE_URL / OAuth token.
//
// internal#711: but DO strip platform-origin LLM credentials. The
// platform's scope:global CLAUDE_CODE_OAUTH_TOKEN (+ the rest of the
// bypass-key set) was merged into envVars by loadWorkspaceSecrets
// from global_secrets; without this strip a BYOK workspace that
// brought no LLM credential of its own would inherit the platform's
// global token and bill the platform's Anthropic credits. The strip
// is PROVENANCE-AWARE: only keys still flagged as global_secrets
// origin are removed; a workspace's own LLM cred (a workspace_secrets
// row — provenance flag already dropped by loadWorkspaceSecrets)
// survives so the workspace talks to its own provider directly.
stripGlobalOriginLLMCreds(envVars, globalKeys)
return platformLLMEnvResult{
ResolvedMode: res.ResolvedMode,
HasUsableLLMCred: hasAnyPlatformManagedLLMKey(envVars),
}
}
baseURL := firstNonEmptyEnv("MOLECULE_LLM_BASE_URL", "OPENAI_BASE_URL")
anthropicBaseURL := firstNonEmptyEnv("MOLECULE_LLM_ANTHROPIC_BASE_URL", "ANTHROPIC_BASE_URL")
token := firstNonEmptyEnv("MOLECULE_LLM_USAGE_TOKEN", "OPENAI_API_KEY")
if baseURL == "" || token == "" {
return
// Proxy not configured (boot race / misconfig). On the platform_managed
// path the workspace IS entitled to platform creds, so we do NOT strip
// here — but we report HasUsableLLMCred from whatever survived so the
// caller's fail-closed branch (non-platform only) is never reached on
// this path.
return platformLLMEnvResult{ResolvedMode: res.ResolvedMode, HasUsableLLMCred: true}
}
stripPlatformManagedLLMBypassEnv(envVars)
@@ -1006,6 +1063,10 @@ func applyPlatformManagedLLMEnv(ctx context.Context, envVars map[string]string,
envVars["MOLECULE_MODEL"] = defaultModel
}
}
// platform_managed: the CP proxy usage token (injected as ANTHROPIC_API_KEY
// / OPENAI_API_KEY above) IS the usable credential, so the workspace is
// never fail-closed on this path.
return platformLLMEnvResult{ResolvedMode: res.ResolvedMode, HasUsableLLMCred: true}
}
func stripPlatformManagedLLMBypassEnv(envVars map[string]string) {
@@ -1014,6 +1075,41 @@ func stripPlatformManagedLLMBypassEnv(envVars map[string]string) {
}
}
// stripGlobalOriginLLMCreds removes platform-managed LLM credential keys
// (CLAUDE_CODE_OAUTH_TOKEN + the rest of platformManagedDirectLLMBypassKeys)
// from envVars ONLY when they originated from the operator-controlled
// `global_secrets` table (i.e. their key is present in globalKeys).
//
// internal#711 provider-aware gate. A platform global LLM credential is the
// platform's own credential and must never be the credential a non-platform
// (byok / subscription) workspace runs on. loadWorkspaceSecrets drops the
// global-provenance flag for any key the workspace re-set via the canvas
// Secrets tab (a workspace_secrets row), so a workspace's OWN LLM credential
// is NOT in globalKeys and survives this strip — only the inherited platform
// global creds are removed.
func stripGlobalOriginLLMCreds(envVars map[string]string, globalKeys map[string]struct{}) {
for key := range platformManagedDirectLLMBypassKeys {
if _, fromGlobal := globalKeys[key]; fromGlobal {
delete(envVars, key)
}
}
}
// hasAnyPlatformManagedLLMKey reports whether envVars still carries at least
// one non-empty platform-managed LLM credential key after the provider-aware
// gate. Used by the non-platform fail-closed branch: a byok/subscription
// workspace with no surviving (workspace-scoped) LLM credential must be
// aborted with MISSING_BYOK_CREDENTIAL rather than started credential-less or
// on stripped platform creds.
func hasAnyPlatformManagedLLMKey(envVars map[string]string) bool {
for key := range platformManagedDirectLLMBypassKeys {
if strings.TrimSpace(envVars[key]) != "" {
return true
}
}
return false
}
func runtimeUsesAnthropicNativeProxy(runtime string) bool {
return strings.EqualFold(strings.TrimSpace(runtime), "claude-code")
}
@@ -193,7 +193,35 @@ func (h *WorkspaceHandler) prepareProvisionContext(
// continue to rely on workspace_secrets / org-import persona-env
// merge for their git auth.
applyAgentGitHTTPCreds(envVars, payload.Role)
applyPlatformManagedLLMEnv(ctx, envVars, workspaceID, payload.Runtime, payload.Model)
// internal#711: provider-aware LLM-credential resolution. On a non-platform
// (byok/subscription) workspace this strips the platform's scope:global LLM
// creds inherited from global_secrets and reports whether the workspace
// still has a usable (workspace-scoped) LLM credential of its own.
llmRes := applyPlatformManagedLLMEnv(ctx, envVars, globalSecretKeys, workspaceID, payload.Runtime, payload.Model)
// Fail closed for a BYOK workspace with no usable LLM credential: do NOT
// start it on the platform's (now-stripped) global creds. Mirror the
// "model+provider+credential REQUIRED at create" spirit (internal#711)
// with an actionable error surfaced at provision time.
//
// Scoped to byok specifically (NOT disabled): "byok" means "the user
// intends to run an LLM on their own credential" — a missing one is a
// misconfiguration worth surfacing loudly. "disabled" means "this
// workspace runs no platform-billed LLM at all" (terminal / file work, or
// a runtime that talks to a non-bypass-key endpoint); stripping the
// inherited platform globals is sufficient there and aborting would
// regress a legitimate no-LLM workspace. The strip above already ran for
// both non-platform modes.
//
// The bypass-key check is intentionally broad — any surviving bypass key
// (the workspace's own, of workspace_secrets provenance) clears it.
if llmRes.ResolvedMode == LLMBillingModeBYOK && !llmRes.HasUsableLLMCred {
msg := formatMissingBYOKCredentialError(llmRes.ResolvedMode)
log.Printf("Provisioner: ABORT workspace=%s — byok billing mode has no usable LLM credential (MISSING_BYOK_CREDENTIAL, internal#711)", workspaceID)
return nil, &provisionAbort{
Msg: msg,
Extra: map[string]interface{}{"error": msg, "code": "MISSING_BYOK_CREDENTIAL", "billing_mode": llmRes.ResolvedMode, "issue": "711"},
}
}
applyRuntimeModelEnv(envVars, payload.Runtime, payload.Model)
if payload.Role != "" {
envVars["MOLECULE_AGENT_ROLE"] = payload.Role
@@ -494,6 +494,57 @@ func TestPrepareProvisionContext_WorkspaceSecretWinsOverPersonaToken(t *testing.
}
}
// TestPrepareProvisionContext_ByokWithOnlyGlobalOAuthFailsClosed is the
// internal#711 end-to-end guard for the live Reno Stars leak. A byok
// workspace whose ONLY LLM credential is the platform's scope:global
// CLAUDE_CODE_OAUTH_TOKEN (inherited from global_secrets, no workspace
// override) must:
//
// 1. have that platform token STRIPPED from the prepared env (no leak), and
// 2. ABORT the provision with the MISSING_BYOK_CREDENTIAL code rather than
// start the workspace on the platform's credits.
//
// This is the discriminating end-to-end test: pre-fix prepared.EnvVars would
// carry CLAUDE_CODE_OAUTH_TOKEN=<platform token> and the provision would
// succeed, running Opus on Molecule's Anthropic credits.
func TestPrepareProvisionContext_ByokWithOnlyGlobalOAuthFailsClosed(t *testing.T) {
const wsID = "352e3c2b-0546-4e9c-b487-1e2ff1cf29fc" // Reno Stars SEO agent
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
mock := setupTestDB(t)
// global_secrets carries the platform's scope:global OAuth token.
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM global_secrets`).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("CLAUDE_CODE_OAUTH_TOKEN", []byte("PLATFORM-GLOBAL-OAUTH"), 0))
// Workspace set NO secrets of its own.
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}))
// Resolver: workspace override = byok.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
handler := NewWorkspaceHandler(&captureBroadcaster{}, nil, "http://localhost:8080", t.TempDir())
payload := models.CreateWorkspacePayload{
Name: "Reno Stars SEO",
Runtime: "claude-code",
Tier: 1,
}
prepared, abort := handler.prepareProvisionContext(
context.Background(), wsID, "/nonexistent", nil, payload, false)
if abort == nil {
t.Fatalf("expected MISSING_BYOK_CREDENTIAL abort, got success (prepared=%v) — the leak would still ship", prepared)
}
if code, _ := abort.Extra["code"].(string); code != "MISSING_BYOK_CREDENTIAL" {
t.Fatalf("abort.Extra[code] = %v, want MISSING_BYOK_CREDENTIAL", abort.Extra["code"])
}
if mode, _ := abort.Extra["billing_mode"].(string); mode != LLMBillingModeBYOK {
t.Fatalf("abort.Extra[billing_mode] = %v, want %q", abort.Extra["billing_mode"], LLMBillingModeBYOK)
}
}
// TestReadOrLazyHealInboundSecret pins the four branches of the
// shared lazy-heal helper directly. Each call site (chat_files,
// registry) has its own integration test, but those go through the
@@ -972,7 +1023,7 @@ func TestApplyPlatformManagedLLMEnv_NonClaudeRuntimeDefaultsOpenAIProxyWhenNoWor
t.Setenv("MOLECULE_LLM_DEFAULT_MODEL", "moonshot/kimi-k2.6")
envVars := map[string]string{}
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "codex", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "codex", "")
applyRuntimeModelEnv(envVars, "codex", "")
if got := envVars["OPENAI_BASE_URL"]; got != "https://api.example.test/api/v1/internal/llm/openai/v1" {
@@ -1002,7 +1053,7 @@ func TestApplyPlatformManagedLLMEnv_StripsWorkspaceOpenAIKeyForClaudeCode(t *tes
"OPENAI_BASE_URL": "https://api.openai.com/v1",
"MODEL": "openai/gpt-5.5",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "claude-code", "")
if _, ok := envVars["OPENAI_API_KEY"]; ok {
t.Fatalf("OPENAI_API_KEY should be stripped for claude-code platform-managed mode")
@@ -1028,7 +1079,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeUsesAnthropicProxyOverOAuth(t *tes
"CLAUDE_CODE_OAUTH_TOKEN": "user-oauth-token",
"MODEL": "sonnet",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "claude-code", "")
if _, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN should be stripped in platform-managed mode")
@@ -1051,7 +1102,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeInjectsAnthropicProxyWhenNoWorkspa
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{}
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "minimax/MiniMax-M2.7")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "claude-code", "minimax/MiniMax-M2.7")
if got := envVars["ANTHROPIC_BASE_URL"]; got != "https://api.example.test/api/v1/internal/llm/anthropic/v1" {
t.Fatalf("ANTHROPIC_BASE_URL = %q", got)
@@ -1074,7 +1125,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeStripsVendorBYOK(t *testing.T) {
"MINIMAX_API_KEY": "user-minimax-key",
"MODEL": "MiniMax-M2.7",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "claude-code", "")
if _, ok := envVars["MINIMAX_API_KEY"]; ok {
t.Fatalf("MINIMAX_API_KEY should be stripped in platform-managed mode")
@@ -1096,7 +1147,7 @@ func TestApplyPlatformManagedLLMEnv_NoopsOutsidePlatformManaged(t *testing.T) {
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{}
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, "", "claude-code", "")
if _, ok := envVars["OPENAI_API_KEY"]; ok {
t.Fatalf("OPENAI_API_KEY should not be set outside platform-managed mode")
@@ -1137,7 +1188,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeByokKeepsOwnProviderEnv(t *testing
"CLAUDE_CODE_OAUTH_TOKEN": "user-oauth-token",
"MODEL": "sonnet",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, wsID, "claude-code", "")
// 1. OAuth token intact — not stripped.
if got := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; got != "user-oauth-token" {
@@ -1168,6 +1219,182 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeByokKeepsOwnProviderEnv(t *testing
}
}
// TestApplyPlatformManagedLLMEnv_ByokStripsGlobalOriginOAuthToken is the
// internal#711 regression guard for the live 2026-05-27 leak (Reno Stars SEO
// + Marketing claude-code agents). A non-platform (byok) workspace that
// brought NO LLM credential of its own, but which inherited the platform's
// scope:global CLAUDE_CODE_OAUTH_TOKEN from global_secrets (provenance =
// globalKeys), must have that platform token STRIPPED — not run on it.
//
// Pre-fix the byok early-return left envVars untouched, so the platform's
// global OAuth token survived into the container and the agent ran Opus on
// the platform's Anthropic credits. The fix gates the global-cred merge on
// provider==platform: a non-platform workspace keeps only its own
// (workspace_secrets) creds, of which there are none here.
func TestApplyPlatformManagedLLMEnv_ByokStripsGlobalOriginOAuthToken(t *testing.T) {
const wsID = "352e3c2b-0546-4e9c-b487-1e2ff1cf29fc" // Reno Stars SEO agent
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_ANTHROPIC_BASE_URL", "https://api.example.test/api/v1/internal/llm/anthropic")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
// The ONLY LLM credential in env is the platform's scope:global OAuth
// token, merged from global_secrets (so its key is in globalKeys). The
// workspace set none of its own.
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "PLATFORM-GLOBAL-OAUTH-TOKEN",
"MODEL": "opus",
}
globalKeys := map[string]struct{}{"CLAUDE_CODE_OAUTH_TOKEN": {}}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, globalKeys, wsID, "claude-code", "")
// 1. The platform global OAuth token must be STRIPPED — the leak is closed.
if got, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN = %q present — platform scope:global token must be stripped for a byok workspace", got)
}
// 2. No CP proxy creds forced (byok = workspace talks to its own provider).
if got, ok := envVars["ANTHROPIC_API_KEY"]; ok {
t.Fatalf("ANTHROPIC_API_KEY must NOT be injected for byok, got %q", got)
}
// 3. Resolver reports byok with NO usable LLM credential → caller fails closed.
if res.ResolvedMode != LLMBillingModeBYOK {
t.Fatalf("ResolvedMode = %q, want %q", res.ResolvedMode, LLMBillingModeBYOK)
}
if res.HasUsableLLMCred {
t.Fatalf("HasUsableLLMCred = true, want false (only the stripped platform global token was present)")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_ByokKeepsWorkspaceOwnOAuthEvenWithGlobal is
// the discriminating companion to the strip test: a byok workspace that DID
// set its own CLAUDE_CODE_OAUTH_TOKEN via the canvas Secrets tab (a
// workspace_secrets row) keeps it. loadWorkspaceSecrets drops the global
// provenance flag on a workspace override, so the key is NOT in globalKeys
// and the provenance-aware strip leaves it alone. Proves the fix strips only
// platform-origin creds, never the customer's own.
func TestApplyPlatformManagedLLMEnv_ByokKeepsWorkspaceOwnOAuthEvenWithGlobal(t *testing.T) {
const wsID = "6b66de8d-9337-4fb4-be8d-6d49dca0d809" // Reno Stars Marketing agent
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
// Workspace set its OWN OAuth token — loadWorkspaceSecrets would have
// dropped its global provenance flag, so globalKeys does NOT contain it.
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "CUSTOMER-OWN-OAUTH-TOKEN",
"MODEL": "opus",
}
globalKeys := map[string]struct{}{} // not from global_secrets
res := applyPlatformManagedLLMEnv(context.Background(), envVars, globalKeys, wsID, "claude-code", "")
if got := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; got != "CUSTOMER-OWN-OAUTH-TOKEN" {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN = %q, want the workspace's own token left intact", got)
}
if !res.HasUsableLLMCred {
t.Fatalf("HasUsableLLMCred = false, want true (workspace brought its own credential)")
}
if res.ResolvedMode != LLMBillingModeBYOK {
t.Fatalf("ResolvedMode = %q, want %q", res.ResolvedMode, LLMBillingModeBYOK)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_DisabledStripsGlobalButReportsNoCred proves
// that "disabled" mode also strips the platform's global LLM creds (the leak
// is closed for disabled too), and reports HasUsableLLMCred=false. The
// caller's fail-closed abort is scoped to byok only, so a disabled workspace
// with no LLM cred still boots (for terminal / non-LLM work); here we pin the
// function-level strip + report.
func TestApplyPlatformManagedLLMEnv_DisabledStripsGlobalButReportsNoCred(t *testing.T) {
const wsID = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeDisabled))
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "PLATFORM-GLOBAL-OAUTH-TOKEN",
}
globalKeys := map[string]struct{}{"CLAUDE_CODE_OAUTH_TOKEN": {}}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, globalKeys, wsID, "claude-code", "")
if _, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN must be stripped for disabled mode too")
}
if res.ResolvedMode != LLMBillingModeDisabled {
t.Fatalf("ResolvedMode = %q, want %q", res.ResolvedMode, LLMBillingModeDisabled)
}
if res.HasUsableLLMCred {
t.Fatalf("HasUsableLLMCred = true, want false")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_PlatformManagedStillReceivesGlobalCreds is
// the no-regression guard for the OTHER side of the gate (internal#711): a
// platform-managed workspace MUST still receive the platform's creds. Here
// the proxy IS configured, so the contract is the existing one — the global
// OAuth token is replaced by the proxy usage token (HasUsableLLMCred=true).
func TestApplyPlatformManagedLLMEnv_PlatformManagedStillReceivesGlobalCreds(t *testing.T) {
const wsID = "99999999-9999-9999-9999-999999999999"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModePlatformManaged))
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_ANTHROPIC_BASE_URL", "https://api.example.test/api/v1/internal/llm/anthropic")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "PLATFORM-GLOBAL-OAUTH-TOKEN",
"MODEL": "opus",
}
globalKeys := map[string]struct{}{"CLAUDE_CODE_OAUTH_TOKEN": {}}
res := applyPlatformManagedLLMEnv(context.Background(), envVars, globalKeys, wsID, "claude-code", "")
// Platform-managed routes through the CP proxy: OAuth stripped, proxy creds forced.
if _, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN should be stripped + replaced by the proxy token for platform_managed")
}
if got := envVars["ANTHROPIC_API_KEY"]; got != "tenant-admin-token" {
t.Fatalf("ANTHROPIC_API_KEY = %q, want proxy usage token for platform_managed", got)
}
if !res.HasUsableLLMCred {
t.Fatalf("HasUsableLLMCred = false, want true for platform_managed (proxy token is the credential)")
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Fatalf("ResolvedMode = %q, want %q", res.ResolvedMode, LLMBillingModePlatformManaged)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_PlatformManagedStillEmitsResolvedMode is the
// no-regression companion: a workspace that resolves to platform_managed must
// still strip + force the proxy AND emit MOLECULE_LLM_BILLING_MODE=
@@ -1189,7 +1416,7 @@ func TestApplyPlatformManagedLLMEnv_PlatformManagedStillEmitsResolvedMode(t *tes
"CLAUDE_CODE_OAUTH_TOKEN": "user-oauth-token",
"MODEL": "sonnet",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, nil, wsID, "claude-code", "")
// OAuth stripped, proxy forced — unchanged platform_managed contract.
if _, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {