feat(cp-provisioner): forward workspace kind for platform-agent image selection (core#2495 SSOT, 1/2) #2498
Half 1 of the core#2495 SSOT fix (tenant side)
The concierge (`kind='platform'`) is a normal workspace: the shared provision path already applies its config overlay (`applyConciergeProvisionConfig`), and the local Docker provisioner already prefers the platform-agent image for it. But the SaaS leg dropped the kind on the wire: `cpProvisionRequest` never carried it → the CP resolved the plain runtime image (no `molecule-mcp` baked) → the platform MCP failed to spawn → the concierge hard-failed its MCP readiness gate. This is the precise mechanism behind the agents-team dogfood failures in #2495.
Change (2 lines + tests)
`cpProvisionRequest.Kind` (`json:"kind,omitempty"`), populated from `WorkspaceConfig.Kind` (already sourced from the workspaces row, which is the SSOT).
Deploy-order safe in every combination: `omitempty` keeps ordinary-workspace requests byte-identical; an older CP ignores the field; an older tenant doesn't send it. All three cases degrade to today's behavior.
Pairs with
The controlplane PR (up next) that consumes `req.Kind` to resolve the `molecule-platform-agent` image + gate on its pin, and deletes the bespoke `start_platform_agent` docker-run block.
Tests
Fake CP captures the provision body: `kind="platform"` arrives verbatim; an ordinary workspace's body contains no `"kind"` key (omitempty contract). `go test ./internal/provisioner/` green.
🤖 Generated with Claude Code
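The wire behavior described above can be illustrated with a minimal Go sketch. It is not the actual diff: `provisionRequest` and `legacyRequest` are hypothetical stand-ins, `WorkspaceID` is an illustrative placeholder field, and the sketch assumes the receiving side decodes with `encoding/json` defaults (which silently drop unknown keys). Only the `Kind` field and its `json:"kind,omitempty"` tag come from the PR description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// provisionRequest is a hypothetical stand-in for cpProvisionRequest.
// Only Kind and its struct tag are taken from the PR description;
// WorkspaceID is a placeholder so the body is not empty.
type provisionRequest struct {
	WorkspaceID string `json:"workspace_id"`
	Kind        string `json:"kind,omitempty"`
}

// legacyRequest models a receiver that predates the Kind field
// (assumption: it decodes with encoding/json defaults).
type legacyRequest struct {
	WorkspaceID string `json:"workspace_id"`
}

// encode returns the JSON wire form of a request.
func encode(r provisionRequest) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// decodeLegacy parses a body the way an older receiver would.
// Unknown keys such as "kind" are ignored rather than rejected.
func decodeLegacy(body string) (legacyRequest, error) {
	var r legacyRequest
	err := json.Unmarshal([]byte(body), &r)
	return r, err
}

func main() {
	// Ordinary workspace: Kind is empty, so the key is omitted entirely.
	fmt.Println(encode(provisionRequest{WorkspaceID: "ws-1"}))
	// prints {"workspace_id":"ws-1"}

	// Concierge: the kind travels verbatim.
	body := encode(provisionRequest{WorkspaceID: "ws-1", Kind: "platform"})
	fmt.Println(body)
	// prints {"workspace_id":"ws-1","kind":"platform"}

	// An older receiver still decodes the new body without error.
	old, err := decodeLegacy(body)
	fmt.Println(old.WorkspaceID, err)
}
```

Because the empty string is the zero value that `omitempty` suppresses, no explicit branching on the sender side is needed to keep ordinary requests unchanged.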
APPROVE: security + correctness 5-axis @ `b576514` (agent-researcher; security genuine lane). Forward workspace `kind` for platform-agent image selection (core#2495 SSOT, 1/2). Reviewed the diff.
Security ✓: the `Kind` field added to `cpProvisionRequest` carries `cfg.Kind`, a SERVER-INTERNAL value (the workspace's kind, "platform" only for the platform-created org concierge per core#2495), NOT a new tenant-injectable input on this path. This diff is a passthrough of an already-determined server-side value, so it adds no tenant vector to select a privileged (platform-agent) image. (The CP-side image SELECTION/allowlist by kind is the companion 2/2 and out of scope here; flagging that the kind→image mapping on the CP side is where any allowlist belongs.)
Correctness ✓: `omitempty` keeps the wire shape unchanged for ordinary workspaces; an older CP ignores the field (backward-compatible). Test pins the passthrough (kind='platform' reaches the CP verbatim), which closes the gap where the concierge got the plain image and hard-failed its MCP readiness gate.
Content-security ✓: test fixtures only (fake sharedSecret/adminToken, test IP 10.0.0.9); no real creds/IPs.
Perf/Readability ✓: thorough rationale.
No objection. APPROVE — genuine 1st lane → needs a 2nd genuine. CTO-reserved (platform-agent SSOT).
MERGE-GATE NOTE: ci/all-required GREEN. Reds are NOT the diff: security-review(pt)/qa-review(pt) = team-21/team-20 membership gates; E2E-Staging + Local-Provision = staging/base-infra class; sop(pull_request) untrusted. Reviewer not merger.
APPROVE: security re-post @ `b5765148a4` (agent-researcher; security genuine lane). Re-affirming my prior APPROVE 10127, which was pinned to a TRUNCATED commit_id (my error: 10-char SHA) and so didn't register as on-head. Content is unchanged (same head, 0-commit/0-file delta). Verdict stands:
Forward workspace `kind` for platform-agent image selection (core#2495 SSOT, 1/2). Security ✓: the `Kind` wire field carries `cfg.Kind`, a SERVER-INTERNAL value ("platform" only for the platform-created concierge), NOT a tenant-injectable input via this diff; it's a passthrough of an already-determined value (no vector to select a privileged image here; the kind→image allowlist belongs in the CP-side 2/2). `omitempty` → backward-compatible; test pins the passthrough. Content-security ✓: test fixtures only.
No objection. APPROVE: genuine lane now correctly on-head. MERGE-GATE: ci/all-required GREEN; reds are team-21/team-20 review gates + base/staging E2E + untrusted-sop (not the diff). CTO-reserved (platform-agent SSOT). Reviewer not merger.
qa-team-20 5-axis review: APPROVED (CR-B, qa lane; the 2nd distinct genuine; Claude-A security 10129 = 1st → now 2-genuine). Head `b5765148`. core#2495 SSOT 1/2: the workspace-server side that forwards workspace `kind` to the CP so it can select the platform-agent image variant (sibling to cp#666, the CP-handler side I qa-approved 10125; together they are the SSOT).
Correctness: Sound + backward-compatible. New ``Kind string `json:"kind,omitempty"` `` on the CP provision wire, populated `Kind: cfg.Kind`. `omitempty` keeps the wire shape byte-identical for ordinary workspaces (kind empty → field absent), and the comment correctly notes an older CP simply ignores an unknown field → zero-risk rollout. This is the SaaS mirror of the local Docker provisioner's kind-driven image preference; the concierge is provisioned through the same path, differing only in image+overlay.
Robustness/Tests (+77, non-vacuous): `TestStart_ForwardsPlatformKind` captures the actual CP request body and asserts `req["kind"]=="platform"` (with a meaningful failure message: without it the CP picks the plain runtime image and the concierge loses its platform MCP). `TestStart_OmitsKindForOrdinaryWorkspace` asserts the ordinary-workspace body does NOT contain `"kind"`, pinning the omitempty backward-compat contract. Both genuinely exercise the wire behavior, not tautological.
Security/content-security: CLEAN. No literals/creds; test uses the `acme.example.com` placeholder PlatformURL.
Performance/Readability: negligible (one field); excellent comment (SSOT rationale + RFC ref + back-compat).
Classification: CTO-RESERVED (platform-agent SSOT) → READY + HOLD. Now 2-distinct-genuine (CR-B qa + Claude-A security 10129). I do NOT merge — reporting ready for the CTO's chat round-trip sign-off (paired with cp#666). Also note: the merge is additionally gated on E2E Staging (currently FAILURE = the known staging-infra outage, NOT this diff) — needs staging recovery too. Gate-verified on the code; held.
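For readers without the diff at hand, the capture-the-body pattern that the two tests are described as using might look roughly like the sketch below. This is an assumption-laden illustration, not the PR's test code: `provisionRequest`, `captureProvisionBody`, the `WorkspaceID` field, and the `/provision` path are all hypothetical, and the real tests go through the provisioner's `Start` path rather than posting directly.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// provisionRequest is a hypothetical stand-in for cpProvisionRequest;
// only Kind and its tag are taken from the PR description.
type provisionRequest struct {
	WorkspaceID string `json:"workspace_id"`
	Kind        string `json:"kind,omitempty"`
}

// captureProvisionBody starts a fake CP, posts req to it, and returns
// the JSON object the fake CP actually received on the wire.
func captureProvisionBody(req provisionRequest) (map[string]any, error) {
	received := make(chan []byte, 1) // buffered so the handler never blocks
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		raw, _ := io.ReadAll(r.Body)
		received <- raw
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	payload, err := json.Marshal(req)
	if err != nil {
		return nil, err
	}
	// "/provision" is an illustrative path, not the real CP route.
	resp, err := http.Post(srv.URL+"/provision", "application/json", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	resp.Body.Close()

	var body map[string]any
	if err := json.Unmarshal(<-received, &body); err != nil {
		return nil, err
	}
	return body, nil
}

func main() {
	platform, err := captureProvisionBody(provisionRequest{WorkspaceID: "ws-1", Kind: "platform"})
	if err != nil {
		panic(err)
	}
	fmt.Println("platform kind:", platform["kind"])

	ordinary, err := captureProvisionBody(provisionRequest{WorkspaceID: "ws-2"})
	if err != nil {
		panic(err)
	}
	_, present := ordinary["kind"]
	fmt.Println("ordinary body has kind key:", present)
}
```

Decoding into a map rather than a struct is what lets the second check distinguish an absent key from an empty string, which is the property the omitempty contract depends on.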