Independent review of #2358 surfaced three gaps that the original
self-review missed. All three would manifest only on the FIRST real
staging→main promotion through the new tail step, so they'd silently
re-introduce the deploy-chain bug #2357 was supposed to fix.
1. **Missing `actions: write` permission.** `gh workflow run` POSTs to
`/repos/.../actions/workflows/.../dispatches`, which requires the
`actions: write` scope on `GITHUB_TOKEN`. The job had only
`contents: write` and `pull-requests: write`, so the dispatch call would
return 403 on every run and the publish chain would still not fire.
This PR adds the scope.
2. **No workflow-level concurrency block.** When CI, E2E Staging
Canvas, E2E API Smoke, and CodeQL all complete within seconds of each
other on a green staging push (the typical case), four separate
workflow_run events fire and four parallel auto-promote runs all
reach the dispatch tail. They poll the same PR, all observe the
same mergedAt, and all call `gh workflow run`, producing 2-4×
redundant publish builds racing for the same `:staging-latest`
retag and 2-4× canary-verify chains. Added a workflow-level
`concurrency` block with `group: auto-promote-staging` and
`cancel-in-progress: false`. `cancel-in-progress` is false because
killing a polling tail that's about to dispatch would re-introduce
the original bug.
3. **PR closed-without-merge ties up a runner for 30 min.** If the
merge queue rejects the PR (gates flip red post-approval), or an
operator closes it manually, mergedAt stays null forever and the
loop polls 60 × 30s, burning a runner slot for the full 30 minutes.
The loop now also reads `state` in the same `gh pr view` call and
breaks early when the state is `CLOSED`.
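
For reviewers, the three fixes above can be sketched as one workflow fragment. This is an illustration under stated assumptions, not the actual diff: the job name `promote`, the step name, the `PR_NUMBER` placeholder, and the dispatch target `publish.yml` are hypothetical. Only the trigger workflow names, the permission scopes, the concurrency settings, and the 60 × 30s polling budget come from the description above.

```yaml
# Sketch only. Names marked "hypothetical" are not taken from the real workflow.
name: auto-promote-staging

on:
  workflow_run:
    workflows: ["CI", "E2E Staging Canvas", "E2E API Smoke", "CodeQL"]
    types: [completed]

# Gap 2: serialize auto-promote runs instead of letting them all reach the
# dispatch tail in parallel. An in-progress run is never cancelled, so a tail
# that is about to dispatch is not killed.
concurrency:
  group: auto-promote-staging
  cancel-in-progress: false

jobs:
  promote:                      # hypothetical job name
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
      actions: write            # Gap 1: needed for `gh workflow run`
    steps:
      - name: Wait for merge, then dispatch publish   # hypothetical step
        env:
          GH_TOKEN: ${{ github.token }}
          GH_REPO: ${{ github.repository }}
          PR_NUMBER: "0"        # placeholder; the real PR lookup is not shown
        run: |
          # Poll up to 60 times, 30s apart (30 minutes total).
          for attempt in $(seq 1 60); do
            # One call returns both fields, e.g. "OPEN " or "MERGED <timestamp>".
            info=$(gh pr view "$PR_NUMBER" --json state,mergedAt \
              --jq '"\(.state) \(.mergedAt // "")"')
            state=${info%% *}
            merged_at=${info#* }

            if [ -n "$merged_at" ]; then
              # publish.yml is a hypothetical dispatch target.
              gh workflow run publish.yml --ref main
              exit 0
            fi

            # Gap 3: closed without merge, so mergedAt will never be set.
            if [ "$state" = "CLOSED" ]; then
              echo "PR $PR_NUMBER closed without merge; stopping early."
              exit 0
            fi

            sleep 30
          done
          echo "Timed out waiting for PR $PR_NUMBER to merge." >&2
          exit 1
```

Reading `state` and `mergedAt` in a single `gh pr view` call keeps the early-exit check free of extra API requests. Whether a closed-without-merge PR should end the job as success or failure is a judgment call; the sketch exits 0.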

Verification on this PR is structural only: the workflow won't fire on
a staging→main promotion until this lands AND a subsequent staging
push triggers auto-promote. The `actions: write` fix in particular is
unverifiable until the next real run. The prior #2358 fix has
the same property, so we're stacking two unverifiable workflow
edits. That's intentional rather than risky: stage 1 (#2358) was
load-bearing for the deploy-chain restoration; stage 2 (this PR)
hardens it before it actually matters.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>