[infra-lead-agent] fix(ci)(interim): exempt platform-build from all-required sentinel hard-fail (#664)
Some checks failed
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
qa-review / approved (pull_request) Failing after 16s
CI / Detect changes (pull_request) Successful in 19s
sop-tier-check / tier-check (pull_request) Successful in 13s
E2E API Smoke Test / detect-changes (pull_request) Successful in 18s
security-review / approved (pull_request) Failing after 14s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 19s
gate-check-v3 / gate-check (pull_request) Successful in 17s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 19s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 19s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 6s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s
CI / Python Lint & Test (pull_request) Successful in 6m58s
CI / Platform (Go) (pull_request) Failing after 8m50s
CI / Canvas (Next.js) (pull_request) Successful in 9m37s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Failing after 1s

Interim per #664 (Release-Manager-approved 2026-05-12). main HEAD 0e5152c3
(the #656 RFC #219 Phase-4 merge) is red: `CI / Platform (Go) (push)` = failure
(run 13353 — `internal/handlers` test regression, sqlmock/symlink/MCP), which
cascades through the now-enforcing `all-required` sentinel to
`CI / all-required (push)` = failure. ci.yml runs on `push:`, so the status-reaper
correctly does not compensate for it: main's combined status is genuinely red, and
every `workspace-server/`-touching PR is blocked behind it.

The handler-test fix exists on `staging` at af95561f (#634) but does NOT
cherry-pick cleanly onto `main` — main↔staging diverged on internal/handlers/
(~1841 ins/745 del across ~21 files; delegation_test.go / instructions_test.go /
org_path_test.go conflict). It needs a fresh re-apply against main's current state
(Core-BE / Fullstack), which can't happen this cycle (A2A to Dev-Lead/Core-Lead is
erroring; Fullstack dispatch is bouncing).

This change demotes `platform-build` back to Phase-3 treatment in the all-required
sentinel's `bad` check (PHASE4_EXEMPT set) — exactly the documented Phase-3⇄4 toggle
escape hatch ("revert: add continue-on-error: true back if regressions appear").
It does NOT hide the failure: `CI / Platform (Go)` stays red and #664 stays open as
the fix tracker; this only stops the cascade to `CI / all-required` so the pipeline
isn't blocked. **DELETE PHASE4_EXEMPT when #634's fix lands on main / #664 closes.**
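
For illustration, the effect of the exemption on the sentinel's `bad` list can be
sketched in isolation. This is a minimal standalone sketch, not the workflow step
itself: the `find_bad` helper, the sample payload, and the job names `canvas-build`
and `python-lint` are hypothetical; only `platform-build`, `PHASE4_EXEMPT`, and the
filter condition are taken from this change.

```python
import json

# Same name and content as the set introduced by this change.
PHASE4_EXEMPT = {"platform-build"}


def find_bad(needs, exempt=PHASE4_EXEMPT):
    """Return (job, result) pairs that would hard-fail the sentinel.

    A job counts as bad when its result is neither "success" nor None
    (None = Phase 3 suppressed / in-flight) and it is not exempted.
    """
    return [
        (job, info.get("result"))
        for job, info in needs.items()
        if info.get("result") not in ("success", None) and job not in exempt
    ]


# Made-up sample payload (hypothetical job names, not real run data).
sample = json.loads("""
{
  "platform-build": {"result": "failure"},
  "canvas-build":   {"result": "success"},
  "python-lint":    {"result": null}
}
""")

print(find_bad(sample))                # -> []
print(find_bad(sample, exempt=set()))  # -> [('platform-build', 'failure')]
```

With the exemption in place the failing `platform-build` entry no longer reaches
the `bad` list; passing an empty set shows what deleting PHASE4_EXEMPT would
restore.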

`platform-build` stays in the sentinel's `needs:` list (so ci-required-drift's
jobs↔protection↔audit-env consistency check is unaffected).

Workflow-only change → §SOP-13 §3 carve-out, tier:low. Author = infra-lead;
merger must be a non-author non-reviewer engineer with the 4-field §3 audit comment.
Urgent — Release Manager is blocked on this for Gate-2 promotion (release at 2/6).
Molecule AI · infra-lead 2026-05-12 04:46:34 +00:00 committed by Molecule AI Infra-SRE
parent d23bd286ce
commit 8789904baa

@@ -564,9 +564,21 @@ jobs:
 echo "$results" | python3 -c '
 import json, sys
 ns = json.load(sys.stdin)
-# Exclude null (Phase 3 suppressed / in-flight) from the bad list.
+# TEMPORARY (interim per #664, 2026-05-12, Release-Manager-approved): demote
+# `platform-build` back to Phase-3 treatment in this check. It has a pre-existing
+# `internal/handlers` test regression (sqlmock/symlink/MCP fixes are on `staging`
+# at af95561f / #634, not yet ported to `main` because main↔staging diverged on
+# internal/handlers/, ~1841 ins/745 del; a clean cherry-pick conflicts, so it needs
+# a fresh re-apply against main). Gitea ignores job-level `continue-on-error: true`
+# (quirk #10), so the `platform-build` result is `failure` not `null`; hence it
+# would otherwise hard-fail this sentinel. Exempting it here stops the cascade to
+# `CI / all-required` WITHOUT hiding the failure: `CI / Platform (Go)` stays red
+# and #664 stays open as the fix tracker. **DELETE PHASE4_EXEMPT (and this block)
+# when the #634 fix lands on main / #664 closes; that re-enforces RFC #219 Phase 4.**
+PHASE4_EXEMPT = {"platform-build"}
+# Exclude null (Phase 3 suppressed / in-flight) and PHASE4_EXEMPT from the bad list.
 bad = [(k, v.get("result")) for k, v in ns.items()
-       if v.get("result") not in ("success", None)]
+       if v.get("result") not in ("success", None) and k not in PHASE4_EXEMPT]
 if bad:
     print(f"FAIL: jobs not green:", file=sys.stderr)
     for k, r in bad: