feat(workspaces): RFC #2948 Phase 1 — decouple template from runtime #2980
Closes #2948 (Phase 1)
Goal: decouple the workspace template from the runtime so a workspace can be created from a template that is not hard-wired to a single runtime image.
Changes
- Add `template` column to `workspaces` (migration `20260616000000_workspaces_template`).
- Propagate `template` through workspace CRUD, provision, restart, and provider-switch flows.
- Update `runtime_registry.go` to resolve template independently of runtime.
- Tests: `workspace_crud_test`, `runtime_registry_test`, `workspace_restart_test`, etc.

Companion PR: molecule-ai/molecule-controlplane#846 (CP side accepts and mirrors template).

SOP checklist
- `comprehensive-testing`: unit tests added/updated for CRUD, registry, restart, and drift gate; local Go test run passes
- `local-postgres-e2e`: N/A; schema migration covered by unit tests and migration collision check
- `staging-smoke`: N/A; not yet wired to staging fleet, smoke pending Phase 2
- `root-cause`: Closes #2948; schema/API conflated runtime image with template source
- `five-axis-review`: reviewed
- `no-backwards-compat`: additive; new template field, no shim
- `memory-consulted`: N/A; RFC #2948 implementation, no conflicting memory

fde8078bc0 to 0472ba980a

Status update from Engineer-A:
- `lint-continue-on-error-tracking` is green (mc#2981 tracker in place).
- template-delivery-e2e: workspace boots and config/prompts/model are delivered, but `agent-skills/seo-all` is empty. This is the pre-existing CP `TemplateAssets` gap (#32), not a regression from the template-field decoupling.
- E2E Staging Platform Boot: agent unreachable; needs investigation, may be an unrelated flake.

Need PM/driver decision: merge this PR as the template-field foundation while #32 is fixed separately, or expand scope to include CP `TemplateAssets` delivery.

Green on runtime checks; only ceremony/review gates remain. Please review/approve.
cc @Production Manager
0472ba980a to fc62250c70

ed9a301343 to 98fc391040

Note: this branch temporarily includes `576305c6` (the COPY --chmod drift-gate fix from #2990) so that `CI / Platform (Go)` passes while #2990 is still in review. Once #2990 merges to `main`, I will rebase this branch and drop the temporary commit.

Tracking this in the review-queue issue #2994; please use that issue to coordinate approvals/acks if needed.
1ae54c77cb to 2341b4d321

This PR is green on CI / all-required but blocked on process gates. It needs:
- Security review approval (`security-review / approved`).
- `reserved-path-review / reserved-path-review` approval.
- `/sop-ack` comments for all SOP-checklist items.

I cannot self-ack as the author. Please review/ack when convenient.
REQUEST_CHANGES after 5-axis review on head `81a149d1`.

Correctness blocker: the new PATCH /workspaces/:id/template endpoint persists `workspaces.template` and returns `needs_restart: true`, but a normal restart does not apply that persisted template on the Docker/local path. Restart loads `dbTemplate` and includes it in the provision payload, but `resolveRestartTemplate(...)` is called with only `body.Template/ApplyTemplate/RebuildConfig`; it never falls back to the persisted `dbTemplate`. So the natural flow `PATCH template=seo-agent` followed by `POST /restart` with no body template logs/reuses the existing config volume instead of applying the newly installed template. That contradicts the endpoint contract and the PR goal of persisting template through restart.

Please make restart/re-provision use the stored `dbTemplate` when no explicit restart-body template is supplied, or otherwise update the endpoint contract and add a regression test for PATCH template → restart applying the persisted template. Be careful to preserve the existing "reuse config volume" behavior for workspaces with empty template.

Other axes: the migration/additive schema and explicit-template identity resolver look reasonable; security is unchanged/admin-gated. The blocker is the persisted-template propagation path.
APPROVED. Fresh re-review on head `67640373`.

My prior blocker is resolved. The manual restart path now selects `COALESCE(template, '')` into `dbTemplate`, passes it to `resolveRestartTemplate`, and a plain restart with a persisted template resolves `configsDir/<dbTemplate>` instead of falling through to `existing-volume`. The resulting `templatePath` is passed to `RestartWorkspaceAutoOpts`, and the payload also carries `Template: dbTemplate`, so the persisted template now flows through the actual restart path.

The regression tests cover the important cases: persisted template fallback on a body-less restart, empty persisted template preserving the existing volume, and explicit body template overriding the stored value. The path resolution still uses `resolveInsideRoot`/`os.Stat`, so traversal and missing-template cases continue to fail closed into the existing-volume behavior.

5-axis: correctness issue fixed; robustness and idempotency are acceptable; no new secret/auth surface; performance impact is just local stat checks; readability is clear. Note: current Gitea status rows still show `CI / all-required` and some E2E/gate contexts failing on this head, so this is a code-review approval, not a statement that merge gates are green.

APPROVED after re-review at head `67640373b3`.

The prior restart/persisted-template blocker is resolved. Restart now selects COALESCE(template, '') into dbTemplate, passes it into resolveRestartTemplate, and the resolver uses the persisted template on a plain restart when no explicit body template is supplied. The fallback remains bounded: empty dbTemplate still preserves the existing config volume, explicit body Template still wins, and invalid/missing persisted template paths fall back rather than escaping the configs root. The payload also carries Template forward for downstream provisioner/CP paths.
Regression coverage covers persisted-template fallback, empty-template preserve-volume behavior, and explicit-body override. 5-axis review is clean on the code change: correctness fixed, robustness/security are bounded by resolveInsideRoot and known-template persistence, performance impact is negligible, readability is acceptable.
Note: when checked, CI was not green yet (all-required and several E2E/gate contexts failing or pending), so merge should still wait for branch protection/all-required to pass.
/sop-ack comprehensive-testing
/sop-ack local-postgres-e2e
/sop-ack staging-smoke
/sop-ack root-cause
/sop-ack five-axis-review
/sop-ack no-backwards-compat
/sop-ack memory-consulted
merge-queue: could not update this branch with `main`: the update returned a merge conflict (HTTP 409) that the queue cannot auto-resolve (POST /repos/molecule-ai/molecule-core/pulls/2980/update -> HTTP 409: {"message":"merge failed because of conflict","url":"https://git.moleculesai.app/api/swagger"}). Applied `merge-queue-hold` to unblock the queue (HOL guard). Fix: rebase/merge `main` into this branch and resolve the conflicts, then remove `merge-queue-hold` to requeue.

67640373b3 to f194b50c47

New commits pushed, approval review dismissed automatically according to repository settings
New commits pushed, approval review dismissed automatically according to repository settings
@security-team @molecule-ai/security: this PR is now green on CI / Platform (Go) and has CR2 + Researcher approvals. The only remaining blocking gate is `security-review / approved`. Please review/approve (or comment `/security-ack` / `/sop-n/a security-review`) so it can move forward. The change adds a `template` column/workspace field and sends `template` on the provision wire to the CP (companion CP PR #846).

APPROVE on `28761125`.

5-axis review: correctness is sound. The contract fix is present: cpProvisionRequest now carries Template as json:"template,omitempty", Start forwards cfg.Template, and provision_request.contract.json declares template with cp_consumes:true, matching the core->CP wire shape. The rest of the Phase 1 template/runtime decoupling remains coherent: workspaces.template migration/backfill, PatchTemplate validates known templates, restart/resume/switch-provider propagate dbTemplate, and template identity resolves explicit template before runtime fallback while unknown explicit templates fail closed.
Security: template changes are bounded by manifest validation and resolveInsideRoot for persisted restart paths; no token/permission expansion. Tests/CI: CI / Platform (Go), CI / all-required, SOP, and gate-check are green; added tests cover resolver/template patch/restart behavior. Performance/readability/scope: no hot-path cost beyond simple string propagation and scoped DB column read/write; code comments document runtime vs installed-template separation.
5-axis review on current head `28761125`: APPROVED. The provision_request contract now declares template as CP-consumed, CPProvisioner serializes cfg.Template, first-provision passes payload.Template into WorkspaceConfig, and restart/resume paths reload/preserve stored template so SaaS re-provisioning does not drop the template-derived config. Diff is scoped with matching tests; no correctness/security/performance/readability blockers found. Merge remains gated on current-head approvals and required CI/security/reserved-path contexts.
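To make the wire-shape point concrete, here is a small sketch of how a `json:"template,omitempty"` field behaves when serialized. Only the struct name `cpProvisionRequest` and the `Template` tag come from the reviews; the `WorkspaceID` and `Runtime` fields, their JSON names, and the `encode` helper are assumptions added so the example is self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cpProvisionRequest mirrors only the part of the core->CP payload the
// reviews mention. WorkspaceID and Runtime are hypothetical fields.
type cpProvisionRequest struct {
	WorkspaceID string `json:"workspace_id"`
	Runtime     string `json:"runtime"`
	// omitempty drops the key entirely when no template is set, so
	// template-less workspaces keep sending the same payload as before.
	Template string `json:"template,omitempty"`
}

// encode marshals the request and returns the JSON text.
func encode(r cpProvisionRequest) string {
	b, err := json.Marshal(r)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encode(cpProvisionRequest{WorkspaceID: "ws-1", Runtime: "example-runtime", Template: "seo-agent"}))
	fmt.Println(encode(cpProvisionRequest{WorkspaceID: "ws-1", Runtime: "example-runtime"}))
}
```

With `omitempty`, a receiver cannot tell "template explicitly cleared" from "template never set", since both omit the key. That is compatible with the additive, no-shim approach in the SOP checklist, provided the CP treats a missing key as "no template".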