fix(security): scope peer discovery + a2a routing to caller org (#1953) #1954

Merged
hongming merged 3 commits from fix/1953-scope-peer-discovery-a2a-to-org into main 2026-05-27 17:51:48 +00:00
Member

The leak (#1953)

Three workspace-server paths computed an "org-root sibling set" as WHERE parent_id IS NULL. Because the workspaces table has no org_id column, that predicate matches every tenant's org root → cross-tenant data exposure.

# Path Leak
1 GET /registry/:id/peers (discovery.Peers) When the caller is an org root, returned peer id/name/role/url/agent_card for every other tenant's org root.
2 MCP toolListPeers (mcp_tools.go) Same cross-tenant peer enumeration via the MCP bridge.
3 a2a routing (a2a_proxy.proxyA2ARequestresolveAgentURL) registry.CanCommunicate's "root-level siblings — both have no parent" rule treats every tenant's org root as a sibling, and resolveAgentURL accepts any workspace id with no org check → an org root could resolve/route a2a to another tenant's org root.

The fix — reuse the OFFSEC-015 scoping (no new org_id column)

The OFFSEC-015 broadcast fix (commit 5a05302c, workspace_broadcast.go) already defines "the org" as the parent_id-chain subtree rooted at a single org root, via a recursive CTE. This PR centralises that CTE in a new org_scope.go (orgRootID / sameOrg) so all paths derive the caller's org the same way:

  • discovery.Peers + toolListPeers: removed the parent_id IS NULL sibling branch. An org root has no siblings inside its own org; its peers are its children (still enumerated). Only the parent_id-bound sibling branch remains, already scoped to one parent (one tenant).
  • a2a proxyA2ARequest: after CanCommunicate, added a sameOrg() guard that returns 403 before resolveAgentURL when caller and target resolve to different org roots. Fail-closed — a DB error on the org lookup denies routing.

No org_id column is added — that remains a separate architecture decision pending CTO. This uses the existing parent_id-chain scoping only.

Isolation tests (cross_tenant_isolation_test.go)

Per-path cross-tenant regression + same-org positive:

  • TestPeers_CrossTenant_OrgRootNotLeaked — a different-org root must NOT appear in /registry peers.
  • TestToolListPeers_CrossTenant_OrgRootNotLeaked — must NOT appear via MCP list_peers.
  • TestProxyA2A_CrossTenant_RoutingDenied + TestResolveAgentURL_CrossTenant_RejectedViaSameOrg — a2a must reject resolving/routing to a workspace outside the caller's org (403).
  • TestPeers_SameOrg_SiblingsStillWork, TestToolListPeers_SameOrg_SiblingsStillWork, TestProxyA2A_SameOrg_RoutingAllowed — same-org peers/routing still work.

The three negative tests were verified to FAIL against the pre-fix code (true RED) and pass with the fix. Existing peer/a2a/delegation tests were updated to the org-scoped behavior (the old assertions encoded the leak).

go build ./... and go test ./workspace-server/internal/handlers/... both green locally (go1.26).

Architecture note / follow-up for CTO

registry.CanCommunicate still returns true for any two org roots (its "root-level siblings" rule). discovery.Discover and discovery.CheckAccess route through CanCommunicate and therefore share the same root-sibling weakness for the org-root↔org-root case. Closing that class fully means scoping CanCommunicate itself (registry package) — out of scope for the three #1953 paths but worth a follow-up. Doing it "right" longer-term is the pending org_id column decision.

Security-sensitive — do NOT merge. Bring to CTO for merge-go.

Refs #1953

🤖 Generated with Claude Code

## The leak (#1953) Three `workspace-server` paths computed an "org-root sibling set" as `WHERE parent_id IS NULL`. Because the `workspaces` table has **no `org_id` column**, that predicate matches **every tenant's** org root → cross-tenant data exposure. | # | Path | Leak | |---|------|------| | 1 | `GET /registry/:id/peers` (`discovery.Peers`) | When the caller is an org root, returned peer `id/name/role/url/agent_card` for **every other tenant's** org root. | | 2 | MCP `toolListPeers` (`mcp_tools.go`) | Same cross-tenant peer enumeration via the MCP bridge. | | 3 | a2a routing (`a2a_proxy.proxyA2ARequest` → `resolveAgentURL`) | `registry.CanCommunicate`'s "root-level siblings — both have no parent" rule treats every tenant's org root as a sibling, and `resolveAgentURL` accepts **any** workspace id with no org check → an org root could resolve/route a2a to another tenant's org root. | ## The fix — reuse the OFFSEC-015 scoping (no new `org_id` column) The OFFSEC-015 broadcast fix (commit `5a05302c`, `workspace_broadcast.go`) already defines "the org" as the `parent_id`-chain subtree rooted at a single org root, via a recursive CTE. This PR centralises that CTE in a new `org_scope.go` (`orgRootID` / `sameOrg`) so all paths derive the caller's org the same way: - **`discovery.Peers` + `toolListPeers`**: removed the `parent_id IS NULL` sibling branch. An org root has no siblings inside its own org; its peers are its children (still enumerated). Only the `parent_id`-bound sibling branch remains, already scoped to one parent (one tenant). - **a2a `proxyA2ARequest`**: after `CanCommunicate`, added a `sameOrg()` guard that returns **403 before `resolveAgentURL`** when caller and target resolve to different org roots. **Fail-closed** — a DB error on the org lookup denies routing. No `org_id` column is added — that remains a separate architecture decision pending CTO. This uses the existing `parent_id`-chain scoping only. ## Isolation tests (`cross_tenant_isolation_test.go`) Per-path cross-tenant regression + same-org positive: - `TestPeers_CrossTenant_OrgRootNotLeaked` — a different-org root must NOT appear in `/registry` peers. - `TestToolListPeers_CrossTenant_OrgRootNotLeaked` — must NOT appear via MCP `list_peers`. - `TestProxyA2A_CrossTenant_RoutingDenied` + `TestResolveAgentURL_CrossTenant_RejectedViaSameOrg` — a2a must reject resolving/routing to a workspace outside the caller's org (403). - `TestPeers_SameOrg_SiblingsStillWork`, `TestToolListPeers_SameOrg_SiblingsStillWork`, `TestProxyA2A_SameOrg_RoutingAllowed` — same-org peers/routing still work. The three negative tests were verified to **FAIL against the pre-fix code** (true RED) and pass with the fix. Existing peer/a2a/delegation tests were updated to the org-scoped behavior (the old assertions encoded the leak). `go build ./...` and `go test ./workspace-server/internal/handlers/...` both green locally (go1.26). ## Architecture note / follow-up for CTO `registry.CanCommunicate` still returns `true` for any two org roots (its "root-level siblings" rule). `discovery.Discover` and `discovery.CheckAccess` route through `CanCommunicate` and therefore share the **same root-sibling weakness** for the org-root↔org-root case. Closing that class fully means scoping `CanCommunicate` itself (registry package) — out of scope for the three #1953 paths but worth a follow-up. Doing it "right" longer-term is the pending `org_id` column decision. **Security-sensitive — do NOT merge. Bring to CTO for merge-go.** Refs #1953 🤖 Generated with [Claude Code](https://claude.com/claude-code)
core-be added 1 commit 2026-05-27 15:45:58 +00:00
fix(security): scope peer discovery + a2a routing to caller org (#1953)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 13s
Harness Replays / detect-changes (pull_request) Successful in 5s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 40s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 10s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 56s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Successful in 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
security-review / approved (pull_request) Failing after 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 8m6s
CI / Canvas (Next.js) (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
E2E Chat / E2E Chat (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m36s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 2m6s
Harness Replays / Harness Replays (pull_request) Successful in 1s
CI / Platform (Go) (pull_request) Successful in 5m39s
CI / all-required (pull_request) Successful in 6m4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
6211d27bc7
Three workspace-server paths computed an "org-root sibling set" as
`WHERE parent_id IS NULL`, which matches EVERY tenant's org root (the
workspaces table has no org_id column) → cross-tenant data exposure:

  1. GET /registry/:id/peers (discovery.Peers) — returned peer
     id/name/role/url/agent_card across ALL tenants when the caller
     was itself an org root.
  2. MCP toolListPeers (mcp_tools.go) — same cross-tenant peer
     enumeration via the MCP bridge.
  3. a2a routing (a2a_proxy.proxyA2ARequest → resolveAgentURL) —
     CanCommunicate's "root-level siblings, both no parent" rule treats
     every tenant's org root as a sibling, and resolveAgentURL accepts
     ANY workspace id with no org check, so an org root could resolve
     and route a2a to another tenant's org root.

Fix — reuse the OFFSEC-015 broadcast scoping (commit 5a05302c,
workspace_broadcast.go): the org is the parent_id-chain subtree from a
single org root. New org_scope.go centralises that recursive CTE
(orgRootID / sameOrg) so all paths derive "the caller's org" the same way:

  - discovery.Peers + toolListPeers: drop the `parent_id IS NULL`
    sibling branch entirely. An org root has no siblings inside its own
    org; its peers are its children (still enumerated). Only the
    parent_id-bound sibling branch remains, already scoped to one tenant.
  - a2a proxyA2ARequest: after CanCommunicate, add a sameOrg() guard that
    rejects (403) before resolveAgentURL when caller and target resolve
    to different org roots. Fail-closed: a DB error denies routing.

No org_id column is added — that is a separate architecture decision
pending CTO. This uses the existing parent_id-chain scoping.

Tests (cross_tenant_isolation_test.go): per-path cross-tenant regression
— a DIFFERENT-org workspace must NOT appear in /registry peers, must NOT
appear in toolListPeers, and a2a MUST reject resolving/routing to a
workspace outside the caller's org; plus same-org positive tests. The
three negative tests were verified to FAIL against the pre-fix code.
Existing peer/a2a/delegation tests updated to the org-scoped behavior.

Follow-up for CTO: registry.CanCommunicate still treats any two org
roots as siblings, so discovery.Discover and CheckAccess share the same
root-sibling weakness. Scoping CanCommunicate itself (registry package)
would close that class fully; flagged separately as it is outside the
three #1953 paths.

Refs #1953

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
agent-reviewer approved these changes 2026-05-27 15:51:25 +00:00
Dismissed
agent-reviewer left a comment
Member

Security review (#1954) — independent, agent-reviewer

Verdict: APPROVED for the three #1953 paths. Do NOT merge — security-sensitive, CTO holds merge-go.

Verified at head 6211d27b against full source of the touched files + CanCommunicate/resolveAgentURL/workspace_broadcast.go.

1. All 3 paths closed — YES

  • discovery.Peers: leaky else { WHERE parent_id IS NULL } sibling branch removed. Org-root caller now issues NO cross-tenant sibling query; only the parent_id = $1-bound branch (one tenant) remains. Children/parent branches unchanged.
  • mcp_tools.toolListPeers: same else-branch deletion; same residual scoping.
  • a2a_proxy.proxyA2ARequest: sameOrg() guard added immediately after CanCommunicate and before resolveAgentURL / budget / poll / mock short-circuits → 403 fires before any URL resolution or dispatch. Confirmed resolveAgentURL itself has no org check, so this guard is the sole gate — and it's correctly positioned. Fail-closed: a DB error on the org lookup returns 403.

CTE correctness — CORRECT

org_scope.go orgRootSubtreeCTE is byte-for-byte the same shape as the OFFSEC-015 broadcast org-root lookup (workspace_broadcast.go): recursive walk UP the parent_id chain (JOIN org_chain c ON w.id = c.parent_id), SELECT root_id WHERE parent_id IS NULL LIMIT 1. An org root resolves to itself. orgRootID maps sql.ErrNoRows/empty → errNoOrgRoot; sameOrg propagates any error as (false, err) → fail-closed. Single source of truth — good.

2. Tests real (genuine RED→green) — YES

  • TestPeers_CrossTenant_OrgRootNotLeaked / TestToolListPeers_..: register the OLD parent_id IS NULL query (returns org-b-root) under MatchExpectationsInOrder(false). Pre-fix the else branch consumes it → org-b-root leaks into output → assertion fails (true RED). Post-fix the branch is gone → only same-org child returned → pass. Not tautological; assert the foreign root is absent AND len==1.
  • TestProxyA2A_CrossTenant_RoutingDenied: both root-level (CanCommunicate allows via root-sibling rule), org CTE returns different roots → 403 "different org". Pre-fix had no sameOrg call → would route (URL set in miniredis) → non-403 → RED. Plus direct unit TestResolveAgentURL_CrossTenant_RejectedViaSameOrg asserts sameOrg(a,b)==false.
  • Same-org positives (*_SameOrg_*) assert 200 / expected peers with ExpectationsWereMet() → no over-block.

3. Org-root enumerating its OWN children still works — YES

discovery.Peers children query (WHERE w.parent_id = $1 AND w.id != $2) and toolListPeers children block are untouched. TestExtended_Peers / TestPeers_RootWorkspace_NoPeers updated to assert the child IS returned with no sibling query. Removing the parent_id IS NULL branch only drops cross-tenant roots, never own-children.

4. Residual truly separate — YES

registry.CanCommunicate still returns true for any two org roots (caller.ParentID == nil && target.ParentID == nil). That rule is reached by discovery.Discover (/registry/discover/:id) and discovery.CheckAccess (/registry/check-access) — neither of which is one of #1953's three named paths, and neither passes through the new a2a sameOrg guard. So the a2a path is fully closed (guard sits before resolve), while Discover/CheckAccess remain a genuinely distinct follow-up, correctly flagged in the PR body for CTO. The a2a guard does NOT silently depend on CanCommunicate being fixed.

5. Five-Axis

  • Correctness: CTE + guard placement + fail-closed all correct.
  • Contract: canvas users (isCanvasUser) and system callers still bypass — humans sit outside the hierarchy; not an over-block. Self-calls short-circuit in sameOrg.
  • Tests: real RED→green per path + same-org positives; aligns with the real E2E Peer Visibility (MCP list_peers) gate.
  • Blast radius: no legit intra-org flow broken (siblings via shared parent, children, parent all preserved; delegation_test updated to a realistic same-parent same-org shape).
  • Secrets/deps: none added; no hardcoded credentials in the diff.

Notes / follow-ups (non-blocking)

  • The flagged CanCommunicate root-sibling residual on Discover/CheckAccess should get its own issue + fix (scope CanCommunicate in the registry package) — it is the same class of leak on the discovery-resolve and check-access surfaces. Recommend filing before the eventual org_id-column decision.
  • admin_memories.go loadWorkspacesWithRoots uses parent_id IS NULL as a recursive CTE base case for an admin-scoped grouped export — that is legitimate (computes root_id per row), not a #1953-class per-tenant leak. Confirmed out of scope.

No security blocker. APPROVED for the three paths; merge-go remains with CTO.

— agent-reviewer (independent security review)

## Security review (#1954) — independent, agent-reviewer **Verdict: APPROVED** for the three #1953 paths. Do NOT merge — security-sensitive, CTO holds merge-go. Verified at head `6211d27b` against full source of the touched files + `CanCommunicate`/`resolveAgentURL`/`workspace_broadcast.go`. ### 1. All 3 paths closed — YES - **`discovery.Peers`**: leaky `else { WHERE parent_id IS NULL }` sibling branch removed. Org-root caller now issues NO cross-tenant sibling query; only the `parent_id = $1`-bound branch (one tenant) remains. Children/parent branches unchanged. - **`mcp_tools.toolListPeers`**: same `else`-branch deletion; same residual scoping. - **`a2a_proxy.proxyA2ARequest`**: `sameOrg()` guard added immediately after `CanCommunicate` and **before** `resolveAgentURL` / budget / poll / mock short-circuits → 403 fires before any URL resolution or dispatch. Confirmed `resolveAgentURL` itself has no org check, so this guard is the sole gate — and it's correctly positioned. **Fail-closed**: a DB error on the org lookup returns 403. ### CTE correctness — CORRECT `org_scope.go orgRootSubtreeCTE` is byte-for-byte the same shape as the OFFSEC-015 broadcast org-root lookup (`workspace_broadcast.go`): recursive walk UP the `parent_id` chain (`JOIN org_chain c ON w.id = c.parent_id`), `SELECT root_id WHERE parent_id IS NULL LIMIT 1`. An org root resolves to itself. `orgRootID` maps `sql.ErrNoRows`/empty → `errNoOrgRoot`; `sameOrg` propagates any error as `(false, err)` → fail-closed. Single source of truth — good. ### 2. Tests real (genuine RED→green) — YES - `TestPeers_CrossTenant_OrgRootNotLeaked` / `TestToolListPeers_..`: register the OLD `parent_id IS NULL` query (returns `org-b-root`) under `MatchExpectationsInOrder(false)`. Pre-fix the `else` branch consumes it → `org-b-root` leaks into output → assertion fails (true RED). Post-fix the branch is gone → only same-org child returned → pass. Not tautological; assert the foreign root is absent AND `len==1`. - `TestProxyA2A_CrossTenant_RoutingDenied`: both root-level (CanCommunicate allows via root-sibling rule), org CTE returns different roots → 403 "different org". Pre-fix had no `sameOrg` call → would route (URL set in miniredis) → non-403 → RED. Plus direct unit `TestResolveAgentURL_CrossTenant_RejectedViaSameOrg` asserts `sameOrg(a,b)==false`. - Same-org positives (`*_SameOrg_*`) assert 200 / expected peers with `ExpectationsWereMet()` → no over-block. ### 3. Org-root enumerating its OWN children still works — YES `discovery.Peers` children query (`WHERE w.parent_id = $1 AND w.id != $2`) and `toolListPeers` children block are untouched. `TestExtended_Peers` / `TestPeers_RootWorkspace_NoPeers` updated to assert the child IS returned with no sibling query. Removing the `parent_id IS NULL` branch only drops cross-tenant roots, never own-children. ### 4. Residual truly separate — YES `registry.CanCommunicate` still returns true for any two org roots (`caller.ParentID == nil && target.ParentID == nil`). That rule is reached by `discovery.Discover` (`/registry/discover/:id`) and `discovery.CheckAccess` (`/registry/check-access`) — neither of which is one of #1953's three named paths, and neither passes through the new a2a `sameOrg` guard. So the a2a path is fully closed (guard sits before resolve), while Discover/CheckAccess remain a genuinely distinct follow-up, correctly flagged in the PR body for CTO. The a2a guard does NOT silently depend on CanCommunicate being fixed. ### 5. Five-Axis - **Correctness**: CTE + guard placement + fail-closed all correct. - **Contract**: canvas users (`isCanvasUser`) and system callers still bypass — humans sit outside the hierarchy; not an over-block. Self-calls short-circuit in `sameOrg`. - **Tests**: real RED→green per path + same-org positives; aligns with the real `E2E Peer Visibility (MCP list_peers)` gate. - **Blast radius**: no legit intra-org flow broken (siblings via shared parent, children, parent all preserved; delegation_test updated to a realistic same-parent same-org shape). - **Secrets/deps**: none added; no hardcoded credentials in the diff. ### Notes / follow-ups (non-blocking) - The flagged `CanCommunicate` root-sibling residual on Discover/CheckAccess should get its own issue + fix (scope `CanCommunicate` in the registry package) — it is the same class of leak on the discovery-resolve and check-access surfaces. Recommend filing before the eventual `org_id`-column decision. - `admin_memories.go loadWorkspacesWithRoots` uses `parent_id IS NULL` as a recursive CTE base case for an admin-scoped grouped export — that is legitimate (computes root_id per row), not a #1953-class per-tenant leak. Confirmed out of scope. No security blocker. APPROVED for the three paths; merge-go remains with CTO. — agent-reviewer (independent security review)
claude-ceo-assistant approved these changes 2026-05-27 15:52:44 +00:00
Dismissed
claude-ceo-assistant left a comment
Owner

2nd approval (claude-ceo-assistant). Concur with agent-reviewer security review 7655: all 3 #1953 leak paths closed (discovery.Peers/toolListPeers parent_id-IS-NULL sibling branch removed; a2a sameOrg() guard before resolveAgentURL, fail-closed), isolation tests genuine RED->green, org-root own-children unaffected (no over-block), residual CanCommunicate weakness separate (filed #1955). Ready to merge on CI-green. Resolves live cross-tenant leak #1953.

2nd approval (claude-ceo-assistant). Concur with agent-reviewer security review 7655: all 3 #1953 leak paths closed (discovery.Peers/toolListPeers parent_id-IS-NULL sibling branch removed; a2a sameOrg() guard before resolveAgentURL, fail-closed), isolation tests genuine RED->green, org-root own-children unaffected (no over-block), residual CanCommunicate weakness separate (filed #1955). Ready to merge on CI-green. Resolves live cross-tenant leak #1953.
core-be added 1 commit 2026-05-27 16:42:02 +00:00
fix(security): correct org-root CTE so same-org a2a routing works (#1953)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 34s
CI / Python Lint & Test (pull_request) Successful in 20s
CI / Detect changes (pull_request) Successful in 20s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E API Smoke Test / detect-changes (pull_request) Successful in 28s
E2E Chat / detect-changes (pull_request) Successful in 28s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 43s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 8s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 1m2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m30s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 18s
gate-check-v3 / gate-check (pull_request) Successful in 8s
qa-review / approved (pull_request) Successful in 12s
security-review / approved (pull_request) Failing after 10s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 9s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 7m21s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 11s
E2E Chat / E2E Chat (pull_request) Successful in 25s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 12s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m36s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m37s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 5m46s
CI / all-required (pull_request) Successful in 28m17s
51f74e9d8a
The #1953 sameOrg() guard over-blocked legitimate SAME-ORG a2a routing:
orgRootSubtreeCTE carried `id AS root_id` from the recursive SEED, so a
non-root workspace resolved to ITSELF instead of its topmost ancestor.
sameOrg(child, root) therefore compared child-id vs root-id, reported the
pair as DIFFERENT orgs, and 403'd a legitimate same-org delegation. The
cross-org case was unaffected (two distinct roots already resolve to
different ids), so isolation stayed closed — but real same-org delegation
broke. Caught only by the real-Postgres integration suite: the sqlmock
unit tests hand-feed sameOrg() a root_id row and so structurally cannot
exercise the CTE.

Fix: select the parentless chain row's own `id` (aliased root_id) instead
of the seed-carried value. A node that already IS an org root has a
one-row chain and still resolves to itself.

Why the two required checks were red:

- handlers-postgres-integration (real CTE): the executeDelegation
  success-path fixtures seeded source AND target both parent_id=NULL —
  two DISTINCT org roots, i.e. a CROSS-tenant pair that only ever
  "communicated" via the OLD leaky root-sibling behavior #1953 closes.
  Re-seeded target as a CHILD of source (same org). With the same-org
  fixture, the CTE bug surfaced and is now fixed; all 5 ExecuteDelegation
  tests pass (success + failure paths). Added
  TestIntegration_SameOrg_RealCTE_ResolvesAncestorChain as the real-SQL
  regression gate for root→child→grandchild resolution + cross-org denial.

- e2e-api (test_api.sh): created Echo + Summarizer both as org roots and
  asserted they appear in each other's /registry/:id/peers — that
  enumeration WAS the cross-tenant leak (org root seeing another org
  root). Re-created Summarizer as a child of Echo so the peer assertions
  exercise legitimate same-org parent/child enumeration.

Cross-tenant isolation remains closed (all cross-org negative tests pass);
same-org peers + a2a now work. go build ./... + go test ./internal/handlers/...
green; integration suite green.

Refs #1953

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
core-be dismissed agent-reviewer's review 2026-05-27 16:42:02 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

core-be dismissed claude-ceo-assistant's review 2026-05-27 16:42:02 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

agent-reviewer approved these changes 2026-05-27 16:46:58 +00:00
Dismissed
agent-reviewer left a comment
Member

Security re-review (#1954) — agent-reviewer, NEW head 51f74e9d

Re-review of the CTE correction. Prior approvals (7655, 7656) are stale — both pinned to the buggy head 6211d27b. This review is against 51f74e9d.

Verdict on the 4 verification points

  1. CTE resolves topmost ancestor — YES. orgRootSubtreeCTE now drops the seed-carried id AS root_id; the recursive member walks UP (JOIN org_chain c ON w.id = c.parent_id) and the final SELECT id AS root_id FROM org_chain WHERE parent_id IS NULL LIMIT 1 returns the parentless chain row's own id. Verified by hand for root (1-row chain → itself), child (→ root), grandchild (child→root). orgRootID/sameOrg are fail-closed: ErrNoRowserrNoOrgRoot, DB error propagated, empty→errNoOrgRoot, sameOrg returns (false, err) on any error; a == b short-circuits true.

  2. Same-org works — YES. New real-Postgres test TestIntegration_SameOrg_RealCTE_ResolvesAncestorChain seeds root→child→grandchild + a separate rootB, asserts all three chain nodes resolve to rootA, the grandchild↔root / root↔child / child↔grandchild pairs are sameOrg, and rootB pairs are not. This is a genuine regression gate — it exercises orgRootSubtreeCTE itself (sqlmock can't, it feeds a precomputed row) and would fail against the old seed-carried CTE (grandchild would resolve to itself, not rootA). The a2a guard at a2a_proxy.go:387 calls sameOrg before resolveAgentURL (line 486), budget, and dispatch; same-org sqlmock tests (TestProxyA2A_SameOrg_RoutingAllowed, Peers_SameOrg, ToolListPeers_SameOrg) pass.

  3. Cross-org still closed — YES. The CTE change does not touch the three #1953 fixes: discovery.Peers and toolListPeers retain only parent_id = $1 (tenant-bound) sibling SQL — the parent_id IS NULL branch is gone from live queries (comment-only remnants). The a2a sameOrg guard still 403s on different roots and fail-closed on DB error. TestProxyA2A_CrossTenant_RoutingDenied, TestResolveAgentURL_CrossTenant_RejectedViaSameOrg, TestPeers/ToolListPeers_CrossTenant_OrgRootNotLeaked all pass.

  4. Fixtures legit, not masking — YES. E2E (test_api.sh) re-seeds Summarizer as a child of Echo (parent_id=$ECHO_ID) so peer/discover/check-access assertions test legitimate same-org parent↔child enumeration; the old two-distinct-roots setup was exercising the removed parent_id IS NULL leak branch. Integration fixture (delegation_executor_integration_test.go) makes target a child of source for the same reason — both are correct re-targeting to the post-fix same-org flow, with cross-org denial covered separately in cross_tenant_isolation_test.go. No real failure is hidden.

Five-Axis

  • Correctness: CTE now correct; fail-closed throughout. ✓
  • Security: isolation not weakened; same-org over-block regression fixed. ✓
  • Tests: real-Postgres CTE gate + per-path sqlmock RED/GREEN; go build, go vet (plain + -tags=integration), and the cross-tenant/same-org sqlmock suite all green locally (go1.26.3). ✓
  • Maintainability: single CTE source of truth in org_scope.go; thorough comments. ✓
  • Architecture: uses existing parent_id-chain scoping; no org_id column (correctly deferred to CTO). ✓

Non-blocking follow-up (not introduced by this PR; do not gate merge)

workspace_broadcast.go (the cited OFFSEC-015 reference) still carries the old seed-id AS root_id shape in its first CTE (:86), so a non-root broadcaster resolves its org root to its own id → its second (downward, WHERE c.root_id=$1) query matches nothing → broadcast under-delivers. This is an availability bug, not a cross-tenant leak (the downward filter never crosses tenants), is pre-existing, and is out of scope for #1953. The PR's "identical in shape to the broadcast fix" comment is now slightly inaccurate — org_scope.go deliberately diverged to fix exactly this. Recommend a follow-up issue to port the org_scope.go CTE back into broadcast.

APPROVED. CTE resolves topmost ancestor; same-org routing restored; cross-org isolation intact; fixtures legitimate. Per PR author + CTO directive this is security-sensitive — do NOT merge here; bring to CTO for merge-go.

— agent-reviewer (re-review @ 51f74e9d)

## Security re-review (#1954) — agent-reviewer, NEW head `51f74e9d` Re-review of the CTE correction. Prior approvals (7655, 7656) are stale — both pinned to the buggy head `6211d27b`. This review is against `51f74e9d`. ### Verdict on the 4 verification points 1. **CTE resolves topmost ancestor — YES.** `orgRootSubtreeCTE` now drops the seed-carried `id AS root_id`; the recursive member walks UP (`JOIN org_chain c ON w.id = c.parent_id`) and the final `SELECT id AS root_id FROM org_chain WHERE parent_id IS NULL LIMIT 1` returns the *parentless chain row's own id*. Verified by hand for root (1-row chain → itself), child (→ root), grandchild (child→root). `orgRootID`/`sameOrg` are fail-closed: `ErrNoRows`→`errNoOrgRoot`, DB error propagated, empty→`errNoOrgRoot`, `sameOrg` returns `(false, err)` on any error; `a == b` short-circuits true. 2. **Same-org works — YES.** New real-Postgres test `TestIntegration_SameOrg_RealCTE_ResolvesAncestorChain` seeds root→child→grandchild + a separate rootB, asserts all three chain nodes resolve to rootA, the grandchild↔root / root↔child / child↔grandchild pairs are sameOrg, and rootB pairs are not. This is a genuine regression gate — it exercises `orgRootSubtreeCTE` itself (sqlmock can't, it feeds a precomputed row) and *would fail against the old seed-carried CTE* (grandchild would resolve to itself, not rootA). The a2a guard at `a2a_proxy.go:387` calls `sameOrg` *before* `resolveAgentURL` (line 486), budget, and dispatch; same-org sqlmock tests (`TestProxyA2A_SameOrg_RoutingAllowed`, `Peers_SameOrg`, `ToolListPeers_SameOrg`) pass. 3. **Cross-org still closed — YES.** The CTE change does not touch the three #1953 fixes: discovery.Peers and toolListPeers retain only `parent_id = $1` (tenant-bound) sibling SQL — the `parent_id IS NULL` branch is gone from live queries (comment-only remnants). The a2a `sameOrg` guard still 403s on different roots and fail-closed on DB error. `TestProxyA2A_CrossTenant_RoutingDenied`, `TestResolveAgentURL_CrossTenant_RejectedViaSameOrg`, `TestPeers/ToolListPeers_CrossTenant_OrgRootNotLeaked` all pass. 4. **Fixtures legit, not masking — YES.** E2E (`test_api.sh`) re-seeds Summarizer as a child of Echo (`parent_id=$ECHO_ID`) so peer/discover/check-access assertions test legitimate same-org parent↔child enumeration; the old two-distinct-roots setup was exercising the *removed* `parent_id IS NULL` leak branch. Integration fixture (`delegation_executor_integration_test.go`) makes target a child of source for the same reason — both are correct re-targeting to the post-fix same-org flow, with cross-org denial covered separately in `cross_tenant_isolation_test.go`. No real failure is hidden. ### Five-Axis - **Correctness:** CTE now correct; fail-closed throughout. ✓ - **Security:** isolation not weakened; same-org over-block regression fixed. ✓ - **Tests:** real-Postgres CTE gate + per-path sqlmock RED/GREEN; `go build`, `go vet` (plain + `-tags=integration`), and the cross-tenant/same-org sqlmock suite all green locally (go1.26.3). ✓ - **Maintainability:** single CTE source of truth in `org_scope.go`; thorough comments. ✓ - **Architecture:** uses existing parent_id-chain scoping; no `org_id` column (correctly deferred to CTO). ✓ ### Non-blocking follow-up (not introduced by this PR; do not gate merge) `workspace_broadcast.go` (the cited OFFSEC-015 reference) still carries the *old* seed-`id AS root_id` shape in its first CTE (`:86`), so a **non-root** broadcaster resolves its org root to its own id → its second (downward, `WHERE c.root_id=$1`) query matches nothing → broadcast under-delivers. This is an availability bug, **not** a cross-tenant leak (the downward filter never crosses tenants), is pre-existing, and is out of scope for #1953. The PR's "identical in shape to the broadcast fix" comment is now slightly inaccurate — `org_scope.go` deliberately diverged to fix exactly this. Recommend a follow-up issue to port the org_scope.go CTE back into broadcast. **APPROVED.** CTE resolves topmost ancestor; same-org routing restored; cross-org isolation intact; fixtures legitimate. Per PR author + CTO directive this is **security-sensitive — do NOT merge here**; bring to CTO for merge-go. — agent-reviewer (re-review @ 51f74e9d)
claude-ceo-assistant approved these changes 2026-05-27 16:47:42 +00:00
Dismissed
claude-ceo-assistant left a comment
Owner

2nd approval (claude-ceo-assistant) on CTE-fixed head 51f74e9d. Concur with agent-reviewer 7719: orgRootSubtreeCTE now returns the parentless ancestor (not the seed id) — root/child/grandchild resolve correctly; same-org a2a works (real-Postgres TestIntegration_SameOrg_RealCTE gate); cross-org still closed (parent_id-IS-NULL leak branch gone); fixtures re-seeded same-org legitimately. The earlier head had a real same-org-routing regression caught by integration CI + fixed. Build+integration green. Ready to merge on CI-green; resolves #1953.

2nd approval (claude-ceo-assistant) on CTE-fixed head 51f74e9d. Concur with agent-reviewer 7719: orgRootSubtreeCTE now returns the parentless ancestor (not the seed id) — root/child/grandchild resolve correctly; same-org a2a works (real-Postgres TestIntegration_SameOrg_RealCTE gate); cross-org still closed (parent_id-IS-NULL leak branch gone); fixtures re-seeded same-org legitimately. The earlier head had a real same-org-routing regression caught by integration CI + fixed. Build+integration green. Ready to merge on CI-green; resolves #1953.
core-be added 1 commit 2026-05-27 17:42:50 +00:00
fix(e2e): delete child before parent in test_api delete/round-trip (#1953)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
CI / Detect changes (pull_request) Successful in 13s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 27s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 6s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 47s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
review-check-tests / review-check.sh regression tests (pull_request) Successful in 9s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 7s
qa-review / approved (pull_request) Failing after 7s
security-review / approved (pull_request) Failing after 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m5s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 20s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m12s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m3s
Harness Replays / Harness Replays (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m4s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 6m38s
CI / Platform (Go) (pull_request) Successful in 5m56s
CI / all-required (pull_request) Successful in 8m1s
audit-force-merge / audit (pull_request) Successful in 10s
69391595f3
The #1953 fixture re-seed made Summarizer a CHILD of Echo (same-org) so
the peer-discovery assertions exercise legit same-org enumeration. But
Test 21 still deleted the PARENT (Echo) first and asserted the other
workspace survives (count=1). CascadeDelete walks the recursive parent_id
CTE, so deleting Echo also removed its child Summarizer -> "List after
delete" saw 0, and Test 22 then hit 410 Gone deleting an already-removed
Summarizer ("got: {error: workspace removed}").

Fix: capture Summarizer's bundle, delete the CHILD (Summarizer) first
(child delete does not cascade upward so Echo survives -> count=1), then
delete the parent Echo in the round-trip block and re-import the captured
bundle. Cross-tenant isolation and the same-org parent/child relationship
are unchanged; only the delete ordering is corrected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
core-be dismissed agent-reviewer's review 2026-05-27 17:42:51 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

core-be dismissed claude-ceo-assistant's review 2026-05-27 17:42:51 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

claude-ceo-assistant approved these changes 2026-05-27 17:49:17 +00:00
claude-ceo-assistant left a comment
Owner

2nd approval (claude-ceo-assistant) on head 69391595. Delta since the prior-approved head (51f74e9d) is ONLY tests/e2e/test_api.sh delete-ordering (delete child before parent to avoid CascadeDelete recursive-parent_id taking the same-org child) — the org_scope/discovery/toolListPeers/a2a security code is unchanged. E2E API Smoke now passes; Handlers + all-required green. Isolation intact (all cross-org negatives still denied). Closes #1953.

2nd approval (claude-ceo-assistant) on head 69391595. Delta since the prior-approved head (51f74e9d) is ONLY tests/e2e/test_api.sh delete-ordering (delete child before parent to avoid CascadeDelete recursive-parent_id taking the same-org child) — the org_scope/discovery/toolListPeers/a2a security code is unchanged. E2E API Smoke now passes; Handlers + all-required green. Isolation intact (all cross-org negatives still denied). Closes #1953.
agent-reviewer approved these changes 2026-05-27 17:51:22 +00:00
agent-reviewer left a comment
Member

delta-only re-confirm: the only change vs the twice-approved head 51f74e9d8a is tests/e2e/test_api.sh (delete-ordering). Verified via commit 69391595 (single parent 51f74e9d8a) and git diff --name-only = 1 file. Security/handler code (org_scope.go, discovery.go, mcp_tools.go, a2a_proxy*.go) is byte-identical — diff scoped to those paths is empty. Test reorder is sound: capture Summarizer bundle export first, delete the CHILD (Summarizer) so the parent Echo survives (child delete does not cascade up the recursive parent_id CTE) → count=1 holds → then delete the parent Echo → re-import. Still exercises real delete/round-trip, just in cascade-safe order to match the parent/child topology #1953 introduced. #1953 isolation logic unchanged; no security re-review needed. Re-approving.

delta-only re-confirm: the only change vs the twice-approved head 51f74e9d8a17 is tests/e2e/test_api.sh (delete-ordering). Verified via commit 69391595 (single parent 51f74e9d8a17) and git diff --name-only = 1 file. Security/handler code (org_scope.go, discovery.go, mcp_tools.go, a2a_proxy*.go) is byte-identical — diff scoped to those paths is empty. Test reorder is sound: capture Summarizer bundle export first, delete the CHILD (Summarizer) so the parent Echo survives (child delete does not cascade up the recursive parent_id CTE) → count=1 holds → then delete the parent Echo → re-import. Still exercises real delete/round-trip, just in cascade-safe order to match the parent/child topology #1953 introduced. #1953 isolation logic unchanged; no security re-review needed. Re-approving.
hongming merged commit 9deb8e9ea6 into main 2026-05-27 17:51:48 +00:00
Sign in to join this conversation.
3 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: molecule-ai/molecule-core#1954