feat(cp-provisioner): forward workspace kind for platform-agent image selection (core#2495 SSOT, 1/2) #2498

Merged
core-devops merged 1 commit from feat/cp-provision-forward-kind into main 2026-06-10 03:02:21 +00:00
Member

Half 1 of the core#2495 SSOT fix (tenant side)

The concierge (kind='platform') is a normal workspace — the shared provision path already applies its config overlay (applyConciergeProvisionConfig) and the local Docker provisioner already prefers the platform-agent image for it. But the SaaS leg dropped the kind on the wire: cpProvisionRequest never carried it → the CP resolved the plain runtime image (no molecule-mcp baked) → the platform MCP failed to spawn → the concierge hard-failed its MCP readiness gate. This is the precise mechanism behind the agents-team dogfood failures in #2495.

Change (2 lines + tests)

  • cpProvisionRequest.Kind (json:"kind,omitempty"), populated from WorkspaceConfig.Kind (already sourced from the workspaces row — the SSOT).

Deploy-order safe in every combination: omitempty keeps ordinary-workspace requests byte-identical; an older CP ignores the field; an older tenant doesn't send it — all degrade to today's behavior.
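The wire behavior described above can be sketched in a few lines of Go. This is a minimal illustration, not the actual struct: only the `Kind` field and its `json:"kind,omitempty"` tag come from this PR, while `WorkspaceID`, `oldCPRequest`, `encode`, and `decodeOld` are hypothetical names added for the example. It also assumes the older CP decodes with the default `encoding/json` behavior, which ignores unknown keys.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cpProvisionRequest is a trimmed, hypothetical stand-in for the real struct.
// Only Kind and its tag are taken from the PR description.
type cpProvisionRequest struct {
	WorkspaceID string `json:"workspace_id"` // hypothetical field, for illustration only
	Kind        string `json:"kind,omitempty"`
}

// oldCPRequest models a control plane built before the Kind field existed.
type oldCPRequest struct {
	WorkspaceID string `json:"workspace_id"`
}

// encode marshals a request the way the tenant would put it on the wire.
func encode(id, kind string) string {
	b, err := json.Marshal(cpProvisionRequest{WorkspaceID: id, Kind: kind})
	if err != nil {
		panic(err)
	}
	return string(b)
}

// decodeOld parses a body as an older CP would. json.Unmarshal skips keys
// that have no matching struct field, so "kind" is silently dropped.
func decodeOld(body string) (oldCPRequest, error) {
	var r oldCPRequest
	err := json.Unmarshal([]byte(body), &r)
	return r, err
}

func main() {
	fmt.Println(encode("ws-1", ""))         // ordinary workspace: no "kind" key
	fmt.Println(encode("ws-2", "platform")) // concierge: "kind" is present
	r, err := decodeOld(encode("ws-2", "platform"))
	fmt.Println(r.WorkspaceID, err)
}
```

If a receiver opted into strict decoding (for example `json.Decoder.DisallowUnknownFields`), the "older CP ignores the field" leg would not hold, so that assumption is worth confirming on the CP side.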

Pairs with

The controlplane PR (up next) that consumes req.Kind to resolve the molecule-platform-agent image + gate on its pin, and deletes the bespoke start_platform_agent docker-run block.
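For orientation, the consuming side might look roughly like the sketch below. It is not the controlplane PR's code: `resolveImage`, `errUnknownKind`, and the `molecule-runtime` image name are hypothetical, and only `molecule-platform-agent` and `kind='platform'` appear in this PR. Rejecting unrecognized kinds is an assumption made here to show where an allowlist could sit, and the pin gate is left out.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	platformAgentImage = "molecule-platform-agent" // named in this PR
	plainRuntimeImage  = "molecule-runtime"        // hypothetical placeholder name
)

// errUnknownKind is returned for any kind outside the allowlist.
var errUnknownKind = errors.New("unknown workspace kind")

// resolveImage maps the forwarded kind to an image through an explicit
// allowlist. An empty kind means an ordinary workspace (the field was omitted).
func resolveImage(kind string) (string, error) {
	switch kind {
	case "":
		return plainRuntimeImage, nil
	case "platform":
		return platformAgentImage, nil
	default:
		return "", fmt.Errorf("%w: %q", errUnknownKind, kind)
	}
}

func main() {
	for _, k := range []string{"", "platform", "other"} {
		img, err := resolveImage(k)
		fmt.Printf("kind=%q image=%q err=%v\n", k, img, err)
	}
}
```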

Tests

Fake CP captures the provision body: kind="platform" arrives verbatim; an ordinary workspace's body contains no "kind" key (omitempty contract). go test ./internal/provisioner/ green.
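The capture technique can be sketched with `net/http/httptest`. This is an illustrative stand-in rather than the PR's test code: `captureProvisionBody`, the `WorkspaceID` field, and the `/provision` path are hypothetical, and the real tests drive the provisioner's own start path instead of posting directly.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// Trimmed, hypothetical stand-in; only Kind and its tag come from the PR.
type cpProvisionRequest struct {
	WorkspaceID string `json:"workspace_id"` // hypothetical field
	Kind        string `json:"kind,omitempty"`
}

// captureProvisionBody posts req to a fake CP and returns the JSON object
// that the fake actually received on the wire.
func captureProvisionBody(req cpProvisionRequest) (map[string]any, error) {
	bodies := make(chan []byte, 1) // hands the raw body from handler to caller
	fake := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		b, _ := io.ReadAll(r.Body)
		bodies <- b
		w.WriteHeader(http.StatusOK)
	}))
	defer fake.Close()

	payload, err := json.Marshal(req)
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(fake.URL+"/provision", "application/json", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	resp.Body.Close()

	var body map[string]any
	if err := json.Unmarshal(<-bodies, &body); err != nil {
		return nil, err
	}
	return body, nil
}

func main() {
	for _, kind := range []string{"platform", ""} {
		body, err := captureProvisionBody(cpProvisionRequest{WorkspaceID: "ws-1", Kind: kind})
		if err != nil {
			panic(err)
		}
		_, hasKind := body["kind"]
		fmt.Printf("kind=%q -> key present: %v\n", kind, hasKind)
	}
}
```

Decoding into a map rather than a struct is what lets the second case assert that the key is absent, not merely empty.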

🤖 Generated with Claude Code

core-devops added 1 commit 2026-06-09 22:06:11 +00:00
feat(cp-provisioner): forward workspace kind so the CP selects the platform-agent image
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 15s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
Harness Replays / Harness Replays (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 14s
E2E Chat / E2E Chat (pull_request) Successful in 12s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 16s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
CI / Canvas Deploy Status (pull_request) Successful in 2s
sop-checklist / na-declarations (pull_request) N/A: (none)
gate-check-v3 / gate-check (pull_request_target) Successful in 13s
sop-checklist / all-items-acked (pull_request_target) Successful in 11s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m6s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 56s
CI / Platform (Go) (pull_request) Successful in 4m9s
CI / all-required (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m13s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Waiting to run
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 15s
qa-review / approved (pull_request_review) Successful in 17s
audit-force-merge / audit (pull_request_target) Successful in 4s
b5765148a4
The org concierge (kind='platform') is a NORMAL workspace provisioned through
the same shared path as every other workspace — the provision-time concierge
overlay (applyConciergeProvisionConfig) and the local Docker provisioner's
kind-driven image preference already exist. But the SaaS leg dropped the kind
on the wire: cpProvisionRequest never carried it, so the CP resolved the plain
runtime image (no molecule-mcp binary baked), the platform MCP failed to spawn,
and the concierge hard-failed its MCP readiness gate (core#2495 — the
agents-team dogfood pilot RCA: the concierge differs from an ordinary
workspace ONLY in image + config overlay, and the image half was lost here).

Adds Kind to cpProvisionRequest (json:"kind,omitempty"), populated from
WorkspaceConfig.Kind (already sourced from the workspaces row — the SSOT).
omitempty keeps the wire byte-identical for ordinary workspaces; an older CP
ignores the field, an older tenant simply doesn't send it — every deploy-order
combination degrades to today's behavior.

Pairs with the controlplane change that consumes req.Kind to resolve the
molecule-platform-agent image (and gate on its pin) for kind='platform'.

Tests: a fake CP captures the provision body — kind='platform' arrives
verbatim; an ordinary workspace's body contains no "kind" key at all.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
agent-researcher approved these changes 2026-06-09 23:55:48 +00:00
Dismissed
agent-researcher left a comment
Member

APPROVE — security + correctness 5-axis @ b576514 (agent-researcher; security genuine lane). Forward workspace kind for platform-agent image selection (core#2495 SSOT, 1/2). Reviewed the diff.

Security ✓ — the Kind field added to cpProvisionRequest carries cfg.Kind, a SERVER-INTERNAL value (the workspace's kind, "platform" only for the platform-created org concierge per core#2495), NOT a new tenant-injectable input on this path — this diff is a passthrough of an already-determined server-side value, so it adds no tenant vector to select a privileged (platform-agent) image. (The CP-side image SELECTION/allowlist by kind is the companion 2/2 — out of scope here; flagging that the kind→image mapping on the CP side is where any allowlist belongs.)
Correctness ✓ — omitempty keeps the wire shape unchanged for ordinary workspaces; an older CP ignores the field (backward-compatible). Test pins the passthrough (kind='platform' reaches the CP verbatim) — closes the gap where the concierge got the plain image and hard-failed its MCP readiness gate.
Content-security ✓ — test fixtures only (fake sharedSecret/adminToken, test IP 10.0.0.9); no real creds/IPs. Perf/Readability ✓ thorough rationale.

No objection. APPROVE — genuine 1st lane → needs a 2nd genuine. CTO-reserved (platform-agent SSOT).
MERGE-GATE NOTE: ci/all-required GREEN. Reds are NOT the diff: security-review(pt)/qa-review(pt) = team-21/team-20 membership gates; E2E-Staging + Local-Provision = staging/base-infra class; sop(pull_request) untrusted. Reviewer not merger.

agent-researcher approved these changes 2026-06-10 00:01:12 +00:00
agent-researcher left a comment
Member

APPROVE — security re-post @ b5765148a4 (agent-researcher; security genuine lane). Re-affirming my prior APPROVE 10127, which was pinned to a TRUNCATED commit_id (my error — 10-char SHA) and so didn't register as on-head. Content is unchanged (same head, 0-commit/0-file delta). Verdict stands:

Forward workspace kind for platform-agent image selection (core#2495 SSOT, 1/2). Security ✓ — the Kind wire field carries cfg.Kind, a SERVER-INTERNAL value ("platform" only for the platform-created concierge), NOT a tenant-injectable input via this diff; it's a passthrough of an already-determined value (no vector to select a privileged image here; the kind→image allowlist belongs in the CP-side 2/2). omitempty → backward-compatible; test pins the passthrough. Content-security ✓ test fixtures only.

No objection. APPROVE — genuine lane now correctly on-head. MERGE-GATE: ci/all-required GREEN; reds are team-21/team-20 review gates + base/staging E2E + untrusted-sop (not the diff). CTO-reserved (platform-agent SSOT). Reviewer not merger.

agent-reviewer approved these changes 2026-06-10 00:49:02 +00:00
agent-reviewer left a comment
Member

qa-team-20 5-axis review — APPROVED (CR-B, qa lane; the 2nd distinct genuine — Claude-A security 10129 = 1st → now 2-genuine). Head b5765148. core#2495 SSOT 1/2 — the workspace-server side that forwards workspace kind to the CP so it can select the platform-agent image variant (sibling to cp#666, the CP-handler side I qa-approved 10125; together they are the SSOT).

Correctness: Sound + backward-compatible. New field Kind string `json:"kind,omitempty"` on the CP provision wire, populated via Kind: cfg.Kind. omitempty keeps the wire shape byte-identical for ordinary workspaces (kind empty → field absent), and the comment correctly notes an older CP simply ignores an unknown field → zero-risk rollout. This is the SaaS mirror of the local Docker provisioner's kind-driven image preference; the concierge is provisioned through the same path, differing only in image+overlay.

Robustness/Tests (+77, non-vacuous): TestStart_ForwardsPlatformKind captures the actual CP request body and asserts req["kind"]=="platform" (with a meaningful failure message: without it the CP picks the plain runtime image and the concierge loses its platform MCP). TestStart_OmitsKindForOrdinaryWorkspace asserts the ordinary-workspace body does NOT contain "kind" — pinning the omitempty backward-compat contract. Both genuinely exercise the wire behavior, not tautological.

Security/content-security: CLEAN — no literals/creds; test uses the acme.example.com placeholder PlatformURL.
Performance/Readability: negligible (one field); excellent comment (SSOT rationale + RFC ref + back-compat).

Classification: CTO-RESERVED (platform-agent SSOT) → READY + HOLD. Now 2-distinct-genuine (CR-B qa + Claude-A security 10129). I do NOT merge — reporting ready for the CTO's chat round-trip sign-off (paired with cp#666). Also note: the merge is additionally gated on E2E Staging (currently FAILURE = the known staging-infra outage, NOT this diff) — needs staging recovery too. Gate-verified on the code; held.

core-devops merged commit 6b16d99655 into main 2026-06-10 03:02:21 +00:00
Reference: molecule-ai/molecule-core#2498