ci(canvas): deterministic ordered canvas deploy + digest-pin (core#2226) #2233
Closes #2226
Problem
The standalone molecule-ai/canvas image had no ordered or verified deploy, unlike the platform (publish-workspace-server-image.yml: build → push :staging-<sha> → fleet redeploy → re-point :latest after /buildinfo verify). publish-canvas-image.yml only built and pushed :latest + :sha-<sha>, and docker-compose.yml:170 referenced canvas:latest unpinned (standing TODO: pin canvas ECR image digest). Tenants/hosts only picked up a new canvas as a side effect of the platform fleet-redeploy pulling the mutable :latest, which is non-deterministic and unverifiable, hence the advisory "Canvas Deploy Reminder".
Fix: mirror the platform's ordered deploy
publish-canvas-image.yml build-and-push now pushes :staging-<sha> + :staging-latest (+ legacy :sha-<sha> for back-compat) and no longer moves :latest, so an unpromoted/red build can never become the prod-blessed tag.
promote-canvas job (needs: build-and-push, if: push && main): waits for green main CI on this SHA via the same prod-auto-deploy.py wait-ci SSOT the platform deploy-production uses, then re-points :latest to the verified :staging-<sha> by digest (imagetools create, no rebuild). So :latest == the last CI-green canvas, and platform + canvas advance :latest off the identical signal/SHA. Honors the PROD_AUTO_DEPLOY_DISABLED kill-switch; reuses the platform's writable-HOME / continue-on-error patterns.
docker-compose.yml: canvas image pins via CANVAS_IMAGE_TAG (default latest = prod-blessed; set staging-<sha> or staging-<sha>@<digest> for a fully reproducible deploy). Resolves the standing TODO: pin canvas ECR image digest. Local-dev build: context unchanged.
ci.yml: replaced the advisory "Canvas Deploy Reminder" (which prescribed a manual docker compose pull canvas) with "Canvas Deploy Status", which records that the ordered deploy now handles it. (This job was never a required branch-protection context; it is advisory-only.)
New deploy sequence
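The promote-canvas gate described above can be sketched roughly as follows. This is an illustration, not the workflow's actual code: the function name `promote_command`, the truthiness rule for PROD_AUTO_DEPLOY_DISABLED, and the omission of the ECR registry host are all assumptions, and the digest is assumed to have been resolved from :staging-<sha> beforehand.

```python
from typing import List, Mapping, Optional

# Image name from the PR; the ECR registry host is omitted here (assumption).
REPO = "molecule-ai/canvas"


def promote_command(
    digest: str, ci_green: bool, env: Mapping[str, str]
) -> Optional[List[str]]:
    """Return the retag command for a CI-green build, or None to skip promotion."""
    # Assumption: any value other than empty/"0"/"false" trips the kill-switch.
    flag = env.get("PROD_AUTO_DEPLOY_DISABLED", "").strip().lower()
    disabled = flag not in ("", "0", "false")
    if disabled or not ci_green:
        return None
    # Re-point :latest at the already-pushed image by digest; nothing is rebuilt.
    source = f"{REPO}@{digest}"
    target = f"{REPO}:latest"
    return ["docker", "buildx", "imagetools", "create", "--tag", target, source]
```

Returning the argument list instead of running it keeps the gate easy to unit-test; a caller would pass the result to `subprocess.run(..., check=True)` only when it is not None.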
canvas/** merge to main → build + push :staging-<sha> / :staging-latest → promote-canvas waits for green main CI on the SHA → promote :latest → :staging-<sha> by digest. The image always exists in ECR before :latest is re-pointed (mirrors the platform build-then-deploy ordering).
How it's verified
The platform's per-tenant /buildinfo already proves the tenant canvas: prod tenants serve the canvas baked into platform-tenant (Dockerfile.tenant Stage 2 builds canvas/ at the same SHA), which is already ordered and /buildinfo-verified. This PR makes the standalone molecule-ai/canvas image (the co-located docker-compose canvas / canvas.moleculesai.app surface) equally deterministic: CI-green gate + immutable :staging-<sha> + digest-pin. The canvas process has no /buildinfo of its own today, so the served-SHA assertion is not yet possible for the standalone image (flagged below).
Validation
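To make the CANVAS_IMAGE_TAG pinning concrete, here is a small sketch of how the tag could resolve to an image reference. It assumes the compose file uses the standard `${CANVAS_IMAGE_TAG:-latest}` interpolation form (the PR only states that the default is latest); the helper name `canvas_image` and the sample SHA are hypothetical, and the registry host is omitted.

```python
from typing import Mapping

# Image name from the PR; the ECR registry host is omitted here (assumption).
REPO = "molecule-ai/canvas"


def canvas_image(env: Mapping[str, str]) -> str:
    """Resolve the canvas image reference the way ${CANVAS_IMAGE_TAG:-latest} would."""
    # With the ":-" form, Compose falls back to the default when the
    # variable is unset OR empty, so `or` mirrors that behaviour.
    tag = env.get("CANVAS_IMAGE_TAG") or "latest"
    return f"{REPO}:{tag}"


if __name__ == "__main__":
    sample_sha = "abc1234"  # hypothetical short SHA, for illustration only
    print(canvas_image({}))
    print(canvas_image({"CANVAS_IMAGE_TAG": f"staging-{sample_sha}"}))
    print(canvas_image({"CANVAS_IMAGE_TAG": f"staging-{sample_sha}@sha256:{'0' * 64}"}))
```

The last form carries both the tag and the digest; when a reference includes a digest, the digest is what determines the image that gets pulled, which is what makes that variant fully reproducible.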
python3 -c "yaml.safe_load(...)" parses all 3 edited files.
lint-workflow-yaml.py: clean (no Gitea-1.22.6-hostile shapes).
docker compose config validates; CANVAS_IMAGE_TAG default → :latest and override → :staging-<sha> both resolve; local-dev build: intact.
pytest test_prod_auto_deploy.py test_ci_required_drift.py test_ci_workflow_bookkeeping.py: 49 passed (covers the reused wait-ci + the job rename).
Flags / follow-ups (no CP-side dependency required)
redeploy-fleet is unchanged: it deploys platform-tenant (which already bakes canvas), so the prod tenant fleet's canvas is already ordered + verified. This PR closes the gap for the standalone canvas image.
A canvas /buildinfo endpoint would let promote-canvas assert the served SHA like the platform does. Tracked in #2226's "verify per-tenant via a canvas /buildinfo" item; not built here (it would touch the Next.js app and needs a deploy surface to poll).
The :staging-latest / :staging-<sha> tag scheme is new for this repo's canvas image; any external consumer pinning the legacy :sha-<sha> keeps working (still pushed).
🤖 Generated with Claude Code
QA approve. Ordered+digest-pinned canvas deploy (promote-canvas via the platform prod-auto-deploy SSOT); replaces advisory reminder. Found tenants bake canvas into platform-tenant (already ordered) so no CP change; only the standalone canvas image made deterministic. 49 tests pass, YAMLs+compose validate.
Security approve. CI/workflow+compose only; digest-pin tightens (not loosens) reproducibility; honors PROD_AUTO_DEPLOY_DISABLED. No auth surface change.
/qa-recheck
/security-recheck