forked from molecule-ai/molecule-core
Independent review of #2358 surfaced three gaps that the original self-review missed. All three would manifest only on the FIRST real staging→main promotion through the new tail step, so they'd silently re-introduce the deploy-chain bug #2357 was supposed to fix. 1. **Missing `actions: write` permission.** `gh workflow run` POSTs to `/repos/.../actions/workflows/.../dispatches`, which requires the actions:write scope on GITHUB_TOKEN. The job had only contents:write + pull-requests:write, so the dispatch call would 403 on every run and the publish chain would still not fire. Adding the scope. 2. **No workflow-level concurrency block.** When CI + E2E Staging Canvas + E2E API Smoke + CodeQL all complete within seconds of each other on a green staging push (the typical case), four separate workflow_run events fire and four parallel auto-promote runs all reach the dispatch tail. They poll the same PR, all observe the same mergedAt, and all call `gh workflow run` — producing 2-4× redundant publish builds racing for the same `:staging-latest` retag and 2-4× canary-verify chains. Added `concurrency.group: auto-promote-staging, cancel-in-progress: false`. cancel-in-progress=false because killing a polling tail that's about to dispatch would re-introduce the original bug. 3. **PR closed-without-merge ties up a runner for 30 min.** If the merge queue rejects the PR (gates flip red post-approval), or an operator closes it manually, mergedAt stays null forever and the loop polls 60 × 30s burning a runner slot. Now also reads `state` in the same `gh pr view` call and breaks early when STATE=CLOSED. Verification on this PR is structural (workflow won't fire on a staging→main promotion until this lands AND a subsequent staging push triggers auto-promote). The actions:write fix in particular is unverifiable until the next real run — the prior #2358 fix has the same property, so we're stacking two unverifiable workflow edits. That's intentional rather than risky: stage 1 (#2358) was load-bearing for the deploy-chain restoration; stage 2 (this PR) hardens it before it actually matters. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| auto-promote-on-e2e.yml | ||
| auto-promote-staging.yml | ||
| auto-sync-main-to-staging.yml | ||
| auto-tag-runtime.yml | ||
| block-internal-paths.yml | ||
| canary-staging.yml | ||
| canary-verify.yml | ||
| check-merge-group-trigger.yml | ||
| ci.yml | ||
| codeql.yml | ||
| continuous-synth-e2e.yml | ||
| e2e-api.yml | ||
| e2e-staging-canvas.yml | ||
| e2e-staging-saas.yml | ||
| e2e-staging-sanity.yml | ||
| pr-guards.yml | ||
| promote-latest.yml | ||
| publish-canvas-image.yml | ||
| publish-runtime.yml | ||
| publish-workspace-server-image.yml | ||
| railway-pin-audit.yml | ||
| redeploy-tenants-on-main.yml | ||
| redeploy-tenants-on-staging.yml | ||
| retarget-main-to-staging.yml | ||
| runtime-pin-compat.yml | ||
| runtime-prbuild-compat.yml | ||
| secret-pattern-drift.yml | ||
| secret-scan.yml | ||
| sweep-cf-orphans.yml | ||
| sweep-cf-tunnels.yml | ||
| sweep-stale-e2e-orgs.yml | ||
| test-ops-scripts.yml | ||