fix(ci): all-required sentinel skips null-result Phase-3 jobs #581

Closed
infra-sre wants to merge 1 commit from sre/fix-all-required-null-result into main

1 Commits

Author SHA1 Message Date
5cd3ad07f5 fix(ci): all-required sentinel assertion skips Phase-3 null results
Some checks failed
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 14s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 16s
qa-review / approved (pull_request) Failing after 18s
CI / Detect changes (pull_request) Successful in 1m1s
gate-check-v3 / gate-check (pull_request) Successful in 36s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 58s
security-review / approved (pull_request) Failing after 20s
E2E API Smoke Test / detect-changes (pull_request) Successful in 1m0s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 55s
sop-tier-check / tier-check (pull_request) Successful in 20s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 46s
Block internal-flavored paths / Block forbidden paths (pull_request) Failing after 12m28s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 26s
audit-force-merge / audit (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 7m47s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9m21s
CI / Platform (Go) (pull_request) Failing after 11m16s
CI / Canvas (Next.js) (pull_request) Failing after 11m50s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Failing after 5s
Phase 3 (RFC #219 §1): the underlying build jobs use
`continue-on-error: true` to surface defects without blocking PRs. When
a Phase-3 job fails, its `needs.*.result` is null (not "failure"). The
original assertion `v.get("result") != "success"` treated null as a
failure, so the sentinel hard-failed on Phase-3 noise.

Fix (assertion only; `continue-on-error: true` is NOT added to the sentinel):
- The assertion is now `v.get("result") not in ("success", None)`, so a
  null result no longer fails the sentinel.
- A null result means the job ran with `continue-on-error: true` and
  failed. This is expected Phase-3 behavior, so it is skipped.
- `failure`, `skipped`, and `cancelled` still fail the sentinel. These
  are real problems that need human review.

NOTE: continue-on-error: true is intentionally NOT added to the
all-required job itself. With the assertion fix, null results are
already skipped so Phase-3 jobs don't hard-fail the sentinel.
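
A minimal sketch of the before/after assertion, assuming the sentinel
receives the `needs` context as a JSON object keyed by job name. The job
names, the sample payload, and the helper names `old_failing_jobs` /
`failing_jobs` are hypothetical and are not taken from the actual workflow:

```python
import json

# Results the sentinel tolerates after this fix. None is what a JSON null
# becomes after json.loads (assumed here to be a Phase-3 job that failed
# under continue-on-error: true).
ACCEPTED = ("success", None)


def old_failing_jobs(needs):
    """Original assertion: anything other than "success" is a failure."""
    return sorted(
        name for name, v in needs.items() if v.get("result") != "success"
    )


def failing_jobs(needs):
    """Updated assertion: null results are skipped, everything else fails."""
    return sorted(
        name for name, v in needs.items() if v.get("result") not in ACCEPTED
    )


# Hypothetical payload; a real one would come from the workflow's needs context.
needs = json.loads(
    """
    {
      "detect-changes": {"result": "success"},
      "platform-go":    {"result": null},
      "canvas-nextjs":  {"result": "failure"},
      "shellcheck":     {"result": "skipped"}
    }
    """
)

print("old assertion flags:", old_failing_jobs(needs))
print("new assertion flags:", failing_jobs(needs))
```

One caveat with this form of the check: `v.get("result")` also returns
`None` when the `result` key is missing entirely, so a malformed entry
would be skipped in the same way as a genuine null result.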

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 22:15:24 +00:00