fix(auto-promote): fail loud on 403 instead of silently degrading (#15)

Independent code review caught a Critical issue inherited from the
pre-extraction workflow: the branch-protection API call falls through
to '{}' on any non-200, then the empty-GATES check treats this as
"no gates configured (or API inaccessible)" and sets ok=true. Combined
with --ff-only being ancestry-only (not test-status), a green-but-
flaky source branch could ff-promote red commits to the target with
zero CI enforcement.

The conflation of three response classes is the bug:

  200 with .contexts[] populated  → honor the gates (correct)
  200 with empty .contexts        → "no gates configured" → ok=true (correct)
  404 (no branch protection)      → "no gates configured" → ok=true (correct)
  403 (token lacks permission)    → silently treated like 404 (BUG)

Use `gh api -i` to capture the HTTP status line and discriminate:

  - 200 → extract body, proceed to gate-check loop
  - 404 → legitimate fallback to --ff-only safety, log notice
  - 403/401 → fail loud with a concrete fix ("add administration: read
    to your caller's permissions block")
  - any other → fail loud with the response prefix for debugging

Also:

  - Update the README in the workflow header to document the
    administration: read requirement.
  - Add administration: read to molecule-ci's own self-caller
    (auto-promote-staging.yml) so its behavior is preserved.

Verified locally against four real API responses:

  - molecule-core/staging        → HTTP 200, 8 gates → loop runs
  - molecule-ci/main             → HTTP 200, 0 gates → ok=true (notice)
  - hackathon org-template/main  → HTTP 200, 0 gates → ok=true (notice)
  - this-repo-does-not-exist     → HTTP 404 → legitimate fallback path

Closes a Critical from the post-merge review of #14.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Hongming Wang 2026-04-28 11:11:59 -07:00 committed by GitHub
parent 236923196f
commit 8b0fbac78a
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
2 changed files with 78 additions and 8 deletions

View File

@ -16,8 +16,9 @@ name: Auto-promote branch (reusable)
# branches: [staging] # branches: [staging]
# workflow_dispatch: # workflow_dispatch:
# permissions: # permissions:
# contents: write # contents: write # push the fast-forward to to-branch
# statuses: read # statuses: read # read commit status checks
# administration: read # read branch protection (REQUIRED — see below)
# jobs: # jobs:
# promote: # promote:
# uses: Molecule-AI/molecule-ci/.github/workflows/auto-promote-branch.yml@main # uses: Molecule-AI/molecule-ci/.github/workflows/auto-promote-branch.yml@main
@ -28,6 +29,14 @@ name: Auto-promote branch (reusable)
# Repo-agnostic by design — gates are read from the consuming repo's # Repo-agnostic by design — gates are read from the consuming repo's
# branch protection at run time, not hardcoded here. # branch protection at run time, not hardcoded here.
# #
# `administration: read` is REQUIRED. Without it, the branch-protection
# API returns 403 and the workflow refuses to fast-forward (fail-loud),
# rather than silently degrading to --ff-only-only enforcement (which
# is ancestry-only, not test-status — a green-but-flaky branch would
# ff-promote red commits). If you intentionally want no-gate
# enforcement, leave from-branch unprotected — a 404 from the API is
# treated as "no gates configured" and falls back to --ff-only safety.
#
# Excluded-by-policy repos (molecule-core + molecule-controlplane per # Excluded-by-policy repos (molecule-core + molecule-controlplane per
# CEO directive 2026-04-24) simply do not adopt this workflow; the # CEO directive 2026-04-24) simply do not adopt this workflow; the
# reusable shape adds no surface area to repos that don't call it. # reusable shape adds no surface area to repos that don't call it.
@ -70,13 +79,72 @@ jobs:
run: | run: |
set -euo pipefail set -euo pipefail
# Try to read required gates from branch protection. Free-tier # Read required gates from branch protection. Three response
# private repos may 403; handle that gracefully. # classes, distinguished by HTTP status:
GATES_JSON=$(gh api "repos/${REPO}/branches/${FROM_BRANCH}/protection/required_status_checks" 2>/dev/null || echo '{}') #
# 200 — branch protection is configured. Honor the gates.
# 404 — branch is not protected. Legitimate "no gates";
# fall back to --ff-only as the sole safety net.
# 403 — caller's GITHUB_TOKEN can't read branch protection.
# FAIL LOUD. The previous behavior conflated this
# with 404 ("api inaccessible") and silently degraded
# to --ff-only-only — which is ancestry-only, not
# test-status. A green-but-flaky branch would
# ff-promote red commits to the target. The fix:
# require the caller to add `administration: read`
# to its permissions block, or explicitly accept the
# no-gates posture by removing branch protection on
# the source branch.
#
# `gh api` exit code is 0 only on 2xx; non-zero on anything
# else. We use --include to capture HTTP status to discriminate.
if PROTECTION_RESP=$(gh api -i "repos/${REPO}/branches/${FROM_BRANCH}/protection/required_status_checks" 2>&1); then
HTTP_STATUS=200
else
HTTP_STATUS=$(echo "$PROTECTION_RESP" | grep -oE '^HTTP/[12](\.[01])? [0-9]{3}' | awk '{print $2}' | head -1)
HTTP_STATUS=${HTTP_STATUS:-unknown}
fi
case "$HTTP_STATUS" in
200)
# Strip headers from gh -i output to get just the body.
GATES_JSON=$(echo "$PROTECTION_RESP" | awk 'p{print} /^[[:space:]]*$/ && !p {p=1}')
;;
404)
echo "::notice::No branch protection on '${FROM_BRANCH}' — relying on --ff-only safety."
echo "ok=true" >> "$GITHUB_OUTPUT"
exit 0
;;
403|401)
echo "::error::Cannot read branch protection on '${FROM_BRANCH}' (HTTP ${HTTP_STATUS})."
echo "::error::Caller's GITHUB_TOKEN lacks 'administration: read' permission."
echo "::error::Refusing to fast-forward without explicit gate enforcement —"
echo "::error::a silent fallback to --ff-only here would let green-but-flaky"
echo "::error::branches promote red commits."
echo "::error::"
echo "::error::Fix: add to the caller's workflow's permissions block:"
echo "::error:: permissions:"
echo "::error:: contents: write"
echo "::error:: statuses: read"
echo "::error:: administration: read"
echo "::error::"
echo "::error::Or, if you intentionally want no-gate enforcement, remove"
echo "::error::branch protection on '${FROM_BRANCH}' so the API returns 404."
exit 1
;;
*)
echo "::error::Unexpected HTTP status '${HTTP_STATUS}' from branch-protection API."
echo "::error::Response (first 5 lines):"
echo "$PROTECTION_RESP" | head -5 | sed 's/^/::error:: /'
exit 1
;;
esac
GATES=$(echo "${GATES_JSON}" | jq -r '.contexts[]?' 2>/dev/null || true) GATES=$(echo "${GATES_JSON}" | jq -r '.contexts[]?' 2>/dev/null || true)
if [ -z "$GATES" ]; then if [ -z "$GATES" ]; then
echo "No required gates configured on '${FROM_BRANCH}' (or API inaccessible). Relying on --ff-only safety." echo "::notice::Branch protection on '${FROM_BRANCH}' has zero required-status-checks contexts — relying on --ff-only safety."
echo "ok=true" >> "$GITHUB_OUTPUT" echo "ok=true" >> "$GITHUB_OUTPUT"
exit 0 exit 0
fi fi

View File

@ -16,8 +16,10 @@ on:
workflow_dispatch: workflow_dispatch:
permissions: permissions:
contents: write contents: write # push the fast-forward to main
statuses: read statuses: read # read commit status checks
administration: read # read branch protection (required by the
# reusable workflow — see its header for why)
jobs: jobs:
promote: promote: