From 8df8487bbebf5d82600970c9e8857173a27788b7 Mon Sep 17 00:00:00 2001
From: Hongming Wang
Date: Mon, 4 May 2026 19:26:29 -0700
Subject: [PATCH] fix(auto-promote): treat E2E completed/cancelled as defer, not failure
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Bug: the case statement at line 189 grouped completed/failure |
completed/cancelled | completed/timed_out into the same "abort + exit 1"
branch. cancelled ≠ failure — when per-SHA concurrency (memory:
feedback_concurrency_group_per_sha) cancels an older E2E run because a
newer push landed, the workflow blocked the whole auto-promote chain on
a non-failure.

Caught 2026-05-05 02:03 on sha 31f9a5e: E2E got cancelled by
concurrency, auto-promote :latest aborted with exit 1, the next
auto-promote-staging cycle had to manually clean up.

Split: failure/timed_out keep the abort path. cancelled gets its own
clean-defer branch (same shape as in_progress) — proceed=false without
exit 1, with a step-summary explaining likely concurrency supersession
and pointing operators at manual dispatch if they need that specific
SHA promoted.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .github/workflows/auto-promote-on-e2e.yml | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/.github/workflows/auto-promote-on-e2e.yml b/.github/workflows/auto-promote-on-e2e.yml
index 9fac7eae..82d771a6 100644
--- a/.github/workflows/auto-promote-on-e2e.yml
+++ b/.github/workflows/auto-promote-on-e2e.yml
@@ -186,7 +186,7 @@ jobs:
               echo "proceed=true" >> "$GITHUB_OUTPUT"
               echo "::notice::E2E green for this SHA — proceeding with promote"
               ;;
-            completed/failure|completed/cancelled|completed/timed_out)
+            completed/failure|completed/timed_out)
               echo "proceed=false" >> "$GITHUB_OUTPUT"
               {
                 echo "## ❌ Auto-promote aborted — E2E Staging SaaS failed"
@@ -198,6 +198,27 @@ jobs:
               } >> "$GITHUB_STEP_SUMMARY"
               exit 1
               ;;
+            completed/cancelled)
+              # cancelled ≠ failure. Per-SHA concurrency cancels older E2E
+              # runs when a newer push lands (memory:
+              # feedback_concurrency_group_per_sha) — the newer SHA will
+              # have its own E2E + promote chain. Treat the same as
+              # in_progress: defer without aborting, let the next E2E run
+              # promote when it lands.
+              #
+              # Caught 2026-05-05 02:03 on sha 31f9a5e — auto-promote
+              # blocked the whole chain because this case fell through to
+              # exit 1 instead of clean defer.
+              echo "proceed=false" >> "$GITHUB_OUTPUT"
+              {
+                echo "## ⏭ Auto-promote deferred — E2E Staging SaaS was cancelled"
+                echo
+                echo "E2E Staging SaaS for \`${SHA:0:7}\`: \`$RESULT\`"
+                echo "Likely per-SHA concurrency (newer push superseded this E2E run)."
+                echo "The newer SHA's E2E will fire its own promote when it lands."
+                echo "If you need this specific SHA promoted, manually dispatch."
+              } >> "$GITHUB_STEP_SUMMARY"
+              ;;
             in_progress/*|queued/*|requested/*|waiting/*|pending/*)
               echo "proceed=false" >> "$GITHUB_OUTPUT"
               {