ci(ecr): auto-apply canonical image lifecycle policy on prod ECR pushes #3137
What
Have the publish pipelines (which already run with prod-ECR push creds) apply and maintain the prod ECR image lifecycle policy automatically, so the prod ECR storage bill (~$56/mo, account 153263036946) stops growing, without any standing prod-access grant.
Why
The prod ECR repos under `153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/*` had lifecycle policies set out-of-band (no IaC managed them). Prod is bloated: `platform-tenant` alone has 70+ images / 12GB+ of untagged layers and superseded `sha-` tags that linger forever.
The publish workflows already run `aws ecr get-login-password` + `docker push` against prod ECR, so they hold the right creds + region. Adding an `aws ecr put-lifecycle-policy` right after each push applies/refreshes the policy on every build. That's the durable IaC fix.
Changes
- `scripts/ops/ensure-ecr-lifecycle.sh`: shared, idempotent, fail-soft helper. The canonical lifecycle policy JSON is SSOT in this one file. `put-lifecycle-policy` only declares policy (no deletes; ECR's own lifecycle engine does the expiry on its schedule). Always exits 0 so a policy error (transient ECR blip, IAM gap) never breaks a publish: it logs a `::warning::` and the policy reapplies next publish.
- `publish-workspace-server-image.yml`: calls the script for `molecule-ai/platform` (after the platform push) and `molecule-ai/platform-tenant` (after the tenant push). Staging ECR (004947743811) is intentionally not touched.
- `publish-canvas-image.yml`: calls the script for `molecule-ai/canvas` after push.
Policy (canonical, validated on operator account)
`sha-`/`v`/`latest`/`staging`/`main` prefixes
Verify
After merge, the next publish of each image applies the policy to its prod repo (the CI run is the verification). Out of band: `aws ecr get-lifecycle-policy --repository-name molecule-ai/platform-tenant` (needs prod creds).
Tested
`bash -n` clean; embedded JSON parses to 2 rules. `lint-curl-status-capture`, `lint-workflow-yaml`, `lint-publish-timeout` all pass; existing `scripts/ops` unittest suite (34 tests) still green.
🤖 Generated with Claude Code
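For readers who have not opened the diff, the fail-soft pattern described under Changes can be sketched roughly as below. This is a minimal illustration, not the contents of `scripts/ops/ensure-ecr-lifecycle.sh`: the function name `ensure_ecr_lifecycle`, the `AWS_BIN` override, and every value inside the policy JSON (retention days, image counts, which prefixes each rule selects) are assumptions made for the example. Only the `aws ecr put-lifecycle-policy` call, the `::warning::` log line, and the always-succeed behavior come from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a fail-soft ECR lifecycle helper.
# NOT the real scripts/ops/ensure-ecr-lifecycle.sh.

# Placeholder policy: two rules in ECR's lifecycle policy format.
# The retention numbers and prefix selection are assumed values.
POLICY_JSON='{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Expire untagged images after 7 days (assumed value)",
      "selection": {
        "tagStatus": "untagged",
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 7
      },
      "action": { "type": "expire" }
    },
    {
      "rulePriority": 2,
      "description": "Keep the 20 newest sha- images (assumed value)",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": ["sha-"],
        "countType": "imageCountMoreThan",
        "countNumber": 20
      },
      "action": { "type": "expire" }
    }
  ]
}'

# ensure_ecr_lifecycle REPO [REGION]
# Declares the policy on REPO. Any failure is downgraded to a CI warning
# and the function still returns 0, so a publish job is never broken.
# AWS_BIN is a hypothetical override for the CLI binary (dry runs, tests).
ensure_ecr_lifecycle() {
  local repo="$1"
  local region="${2:-us-east-2}"

  if ! "${AWS_BIN:-aws}" ecr put-lifecycle-policy \
        --repository-name "$repo" \
        --region "$region" \
        --lifecycle-policy-text "$POLICY_JSON" >/dev/null; then
    echo "::warning::could not apply ECR lifecycle policy to ${repo}; it will be retried on the next publish"
  fi
  return 0
}
```

In a workflow step this would be called right after the push, for example `ensure_ecr_lifecycle molecule-ai/canvas`. Because the call only declares policy, re-running it on every build is idempotent; the expiry itself is left to ECR's lifecycle engine.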
Reviewed: additive post-push ensure-ecr-lifecycle step, fail-soft (never breaks publish), canonical policy SSOT, lints pass. Durable prod-ECR cost guard. LGTM.