feat: approval-gate infrastructure for destructive ops (Phase 4) #2372

Merged
devops-engineer merged 1 commit from feat/platform-agent-approval-gate into main 2026-06-06 18:24:03 +00:00
Member

Phase 4 of the org-level platform agent (RFC #2360). Independent of the re-parenting work (off main).

Server-side approval gate so that destructive ops the user-driven concierge can trigger require human approval. The platform MCP is a client of these handlers, so enforcement lives here (at the trust boundary), not in the MCP.

  • migration: approval_requests.consumed_at (single-use) + request_hash (dedup) + partial index.
  • internal/approvals/policy.go: the one auditable gated-action map + IsGated.
  • requireApproval(): consumes a matching approved+unconsumed request (race-safe via conditional UPDATE … RETURNING / FOR UPDATE SKIP LOCKED) and proceeds; else creates/reuses a pending request, broadcasts to canvas, escalates to parent if any. gateDestructive() writes HTTP 202 pending.
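
To make the 202 behaviour concrete, here is a minimal sketch of a gate wrapper. It is not the PR's code: the real gateDestructive() targets gin handlers, while this version uses only net/http so it runs without dependencies. The names `approver`, `gate`, and `statusFor`, the `workspace_id` query parameter, and the JSON body shape are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// approver is a hypothetical seam standing in for requireApproval():
// it reports whether an approved, unconsumed request was consumed.
type approver func(workspaceID, action string) (approved bool, err error)

// gate wraps a destructive handler. Without a consumed approval it answers
// 202 Accepted and never calls next; on a gate error it fails closed.
func gate(action string, check approver, next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ok, err := check(r.URL.Query().Get("workspace_id"), action)
		if err != nil {
			// Fail closed: a broken gate must not let the operation through.
			http.Error(w, "approval check failed", http.StatusInternalServerError)
			return
		}
		if !ok {
			w.Header().Set("Content-Type", "application/json")
			w.WriteHeader(http.StatusAccepted)
			// Assumed response body; the real payload is not shown in the PR.
			json.NewEncoder(w).Encode(map[string]string{
				"status": "pending_approval",
				"action": action,
			})
			return
		}
		next(w, r)
	}
}

// statusFor runs one request through the gate and returns the HTTP status.
func statusFor(approved bool) int {
	h := gate("delete_workspace",
		func(string, string) (bool, error) { return approved, nil },
		func(w http.ResponseWriter, _ *http.Request) { w.WriteHeader(http.StatusNoContent) })
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodDelete, "/workspaces?workspace_id=ws-A", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor(false)) // 202
	fmt.Println(statusFor(true))  // 204
}
```

Keeping the check behind a function value means the handler wrapper can be unit-tested without a database, which matches the "non-gated passthrough" style of unit test listed under Tests.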

Matching is (workspace_id, action, request_hash) — an approval for "delete ws A" can't be replayed to "delete ws B", and retries reuse one pending row.
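
One way such a request_hash could be computed is sketched below. The PR does not show the actual digest, so the SHA-256 choice, the `RequestHash` name, the context-as-map representation, and the length-prefixed encoding are all assumptions; the point is only that the hash is stable across retries and changes when the context changes.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// RequestHash returns a stable digest of an action plus its context.
// Keys are sorted so map iteration order cannot change the result, and each
// field is length-prefixed so ("ab", "c") and ("a", "bc") cannot collide.
func RequestHash(action string, ctx map[string]string) string {
	keys := make([]string, 0, len(ctx))
	for k := range ctx {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	write := func(s string) { fmt.Fprintf(h, "%d:%s;", len(s), s) }
	write(action)
	for _, k := range keys {
		write(k)
		write(ctx[k])
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := RequestHash("delete_workspace", map[string]string{"target": "ws-A", "org": "o1"})
	retry := RequestHash("delete_workspace", map[string]string{"org": "o1", "target": "ws-A"})
	b := RequestHash("delete_workspace", map[string]string{"target": "ws-B", "org": "o1"})

	fmt.Println(a == retry) // true: a retry of the same op reuses the same hash
	fmt.Println(a == b)     // false: a different target yields a different hash
}
```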

Tests

  • Unit: policy, hash stability + context-sensitivity, non-gated passthrough.
  • Real-Postgres integration: full cycle — pending → dedup (1 row) → approve → consume → single-use (no replay) → context isolation. Verified locally (postgres:15, all migrations apply, green).
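
The cycle exercised by the integration test can be modelled in memory as follows. This is a simplified sketch, not the PR's implementation: a mutex-guarded map stands in for the approval_requests table and the row-level locking, and the `Gate`, `RequireApproval`, `Approve`, and `Rows` names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// key mirrors the matching tuple (workspace_id, action, request_hash).
type key struct{ workspaceID, action, requestHash string }

type request struct{ approved, consumed bool }

// Gate is an in-memory stand-in for the approval_requests table.
type Gate struct {
	mu   sync.Mutex
	rows map[key][]*request
}

func NewGate() *Gate { return &Gate{rows: make(map[key][]*request)} }

// RequireApproval consumes an approved, unconsumed request and returns true;
// otherwise it ensures exactly one pending row exists and returns false.
func (g *Gate) RequireApproval(ws, action, hash string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	k := key{ws, action, hash}
	pending := false
	for _, r := range g.rows[k] {
		if r.approved && !r.consumed {
			r.consumed = true // single-use: this approval cannot be replayed
			return true
		}
		if !r.approved {
			pending = true
		}
	}
	if !pending {
		g.rows[k] = append(g.rows[k], &request{})
	}
	return false
}

// Approve marks the pending request for the tuple as approved, if one exists.
func (g *Gate) Approve(ws, action, hash string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	for _, r := range g.rows[key{ws, action, hash}] {
		if !r.approved {
			r.approved = true
			return true
		}
	}
	return false
}

// Rows returns how many requests exist for the tuple.
func (g *Gate) Rows(ws, action, hash string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	return len(g.rows[key{ws, action, hash}])
}

func main() {
	g := NewGate()
	fmt.Println(g.RequireApproval("ws-A", "delete_workspace", "h1")) // false: pending created
	fmt.Println(g.RequireApproval("ws-A", "delete_workspace", "h1")) // false: retry reuses the row
	fmt.Println(g.Rows("ws-A", "delete_workspace", "h1"))            // 1
	g.Approve("ws-A", "delete_workspace", "h1")
	fmt.Println(g.RequireApproval("ws-B", "delete_workspace", "h1")) // false: context isolation
	fmt.Println(g.RequireApproval("ws-A", "delete_workspace", "h1")) // true: approval consumed
	fmt.Println(g.RequireApproval("ws-A", "delete_workspace", "h1")) // false: no replay
}
```

A single mutex makes the consume step trivially atomic here; in Postgres the same property has to come from the conditional update and row locking, which is why the PR tests it against a real database rather than a mock.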

Scope note

Infrastructure only — not yet wired into live handlers. Wiring needs a platform-agent caller marker so the gate fires only for concierge-initiated calls (not operator/CP deletes); that lands with Phase 3's runtime/MCP marker, so existing delete/secret flows are unchanged until then. This keeps the gate safe to land now.

🤖 Generated with Claude Code

devops-engineer added 1 commit 2026-06-06 17:43:44 +00:00
feat(workspace-server): approval-gate infrastructure for destructive ops (Phase 4)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 3s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 7s
Check migration collisions / Migration version collision check (pull_request) Successful in 18s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
qa-review / approved (pull_request_target) Failing after 4s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
gate-check-v3 / gate-check (pull_request_target) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Successful in 2s
security-review / approved (pull_request_target) Failing after 9s
CI / Canvas Deploy Status (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 55s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 58s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m26s
CI / Platform (Go) (pull_request) Successful in 4m9s
CI / all-required (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 6s
sop-tier-check / tier-check (pull_request_target) Failing after 7s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Has been cancelled
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
sop-tier-check / tier-check (pull_request_review) Failing after 11s
audit-force-merge / audit (pull_request_target) Successful in 21s
0b771d5770
Server-side gate so destructive org operations the user-driven platform agent can
trigger require a human approval (RFC docs/design/rfc-platform-agent.md). The
platform MCP is a CLIENT of these handlers, so enforcement lives here (the trust
boundary), not in the MCP.

- migration 20260606020000_approvals_consumed: approval_requests.consumed_at
  (single-use) + request_hash (dedup) + a partial index for the gate lookup.
- internal/approvals/policy.go: the one auditable map of gated actions
  (delete_workspace / deprovision / secret_write / org_token_mint) + IsGated.
- requireApproval(): consumes a matching approved+unconsumed request (race-safe
  via conditional UPDATE ... RETURNING / FOR UPDATE SKIP LOCKED) and proceeds,
  else creates/reuses a pending request (dedup by request_hash), broadcasts it to
  the canvas and escalates to the parent if any. gateDestructive() wraps it and
  writes HTTP 202 pending for gin handlers.
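
A single auditable policy map of this kind might look roughly like the sketch below. The four action strings are the ones listed above; the `Action` type, the constant names, and the map layout are assumptions, since the contents of internal/approvals/policy.go are not shown here.

```go
package main

import "fmt"

// Action names a destructive operation. The type and constant names are
// assumptions for this sketch; only the string values come from the commit.
type Action string

const (
	ActionDeleteWorkspace Action = "delete_workspace"
	ActionDeprovision     Action = "deprovision"
	ActionSecretWrite     Action = "secret_write"
	ActionOrgTokenMint    Action = "org_token_mint"
)

// gatedActions is the one place to audit which actions need approval.
var gatedActions = map[Action]struct{}{
	ActionDeleteWorkspace: {},
	ActionDeprovision:     {},
	ActionSecretWrite:     {},
	ActionOrgTokenMint:    {},
}

// IsGated reports whether an action must pass the approval gate.
func IsGated(a Action) bool {
	_, ok := gatedActions[a]
	return ok
}

func main() {
	fmt.Println(IsGated(ActionDeleteWorkspace)) // true
	fmt.Println(IsGated("list_workspaces"))     // false: non-gated actions pass through
}
```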

Matching is (workspace_id, action, request_hash) where request_hash is a stable
digest of the op + context, so an approval for 'delete ws A' can't be replayed to
'delete ws B', and retries reuse one pending row instead of flooding.

Tests: policy + hash-stability/context-sensitivity unit tests; gateDestructive
non-gated passthrough; and a real-Postgres integration test proving the full
cycle — pending -> dedup -> approve -> consume -> single-use (no replay) ->
context isolation (sqlmock cannot prove consume-once row state).

Infrastructure only — NOT yet wired into live handlers. Wiring requires a
platform-agent caller marker so the gate fires only for concierge-initiated calls
(not operator/CP flows); that lands with Phase 3's runtime/MCP marker so existing
delete/secret flows are unchanged until then.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
devops-engineer added the tier:medium label 2026-06-06 17:55:21 +00:00
molecule-code-reviewer approved these changes 2026-06-06 18:23:07 +00:00
molecule-code-reviewer left a comment
Member

Independent review confirmed: race-safe consume via FOR UPDATE SKIP LOCKED, dedup via conditional INSERT, context-isolated request_hash, best-effort broadcast, infra-only (no live wiring). Real-PG integration proves the full single-use cycle. Approving.

core-security approved these changes 2026-06-06 18:23:42 +00:00
core-security left a comment
Member

Security review: server-side trust boundary (not MCP), fail-closed, single-use approvals (consumed_at), replay-blocked + context-isolated. No change to existing destructive handlers. Approving.

devops-engineer merged commit 173881e67a into main 2026-06-06 18:24:03 +00:00

Reference: molecule-ai/molecule-core#2372