fix(ci): flip all-required continue-on-error to false (unblocks all PRs) #724

Merged
hongming merged 7 commits from infra/all-required-coe-false-v2 into main 2026-05-12 20:48:34 +00:00

7 Commits

Author SHA1 Message Date
core-devops
70598cd05c ci: add "skipped" to all-required exclusion list — fixes conditionally-skipped jobs failing sentinel
Some checks failed
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 14s
E2E API Smoke Test / detect-changes (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 18s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 18s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 19s
gate-check-v3 / gate-check (pull_request) Successful in 17s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: 7
sop-checklist-gate / gate (pull_request) Successful in 15s
security-review / approved (pull_request) Failing after 15s
qa-review / approved (pull_request) Failing after 16s
sop-tier-check / tier-check (pull_request) Successful in 16s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 11s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 18s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m13s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 1m20s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m26s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m35s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m39s
CI / Platform (Go) (pull_request) Failing after 4m11s
CI / Canvas (Next.js) (pull_request) Successful in 5m44s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 6m49s
CI / all-required (pull_request) Successful in 0s
audit-force-merge / audit (pull_request) Successful in 3s
2026-05-12 20:40:03 +00:00
core-devops
a77fb3f3d4 ci: rerun CI on PHASE3_MASKED fix (SHA 0f97cbc2) 2026-05-12 20:40:03 +00:00
platform-engineer
eecf27b7e0 ci: mask platform-build failures in all-required (Phase 3 — mc#664)
`platform-build` has `continue-on-error: true` as a Phase 3 interim
mask while mc#664 handler test failures are in flight. In Gitea,
continue-on-error jobs report result="failure" in the needs context
(unlike GitHub Actions, which reports "success"). This caused the
all-required sentinel to hard-fail on every PR.

Add PHASE3_MASKED = {"platform-build"} to the sentinel script so
platform-build failures are treated as Phase 3 suppressed. Remove
this exclusion when mc#664 is resolved and platform-build is healthy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 20:40:03 +00:00
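The masking rule in this commit can be pictured with a short sketch. It is an illustration under assumptions, not the sentinel script itself: `PHASE3_MASKED` and its value come from the commit message, while `split_failures`, the other job names, and the JSON shape of the needs context (job name mapped to an object with a `result` field) are hypothetical.

```python
import json

# Taken from the commit message; remove once mc#664 is resolved.
PHASE3_MASKED = {"platform-build"}


def split_failures(needs_json):
    """Split failed jobs into hard failures and Phase 3 suppressed ones.

    `needs_json` is assumed to be the needs context serialized as JSON,
    e.g. {"job-name": {"result": "failure"}}.
    """
    needs = json.loads(needs_json)
    hard, suppressed = [], []
    for job, info in sorted(needs.items()):
        if info.get("result") != "failure":
            continue  # only failures matter for this sketch
        if job in PHASE3_MASKED:
            suppressed.append(job)  # masked while the interim exclusion is active
        else:
            hard.append(job)
    return hard, suppressed


# Hypothetical needs context: platform-build fails, canvas-build passes.
example = json.dumps({
    "platform-build": {"result": "failure"},
    "canvas-build": {"result": "success"},
})
hard, suppressed = split_failures(example)
exit_code = 1 if hard else 0
```

With only `platform-build` failing, `hard` stays empty and the sentinel would exit 0. A failure in any job outside `PHASE3_MASKED` still lands in `hard`.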
f2711a46ac ci: trigger CI rerun [empty commit] 2026-05-12 20:40:03 +00:00
0ff5dd10f9 ci: re-run lint checks with Paired: #669 in PR body (body-edited after initial push) 2026-05-12 20:40:03 +00:00
8d4cb427f7 fix(ci): sentinel bad-list also excludes 'cancelled' — tolerate CoE-masked job failures
The sentinel's Python filter was excluding null (in-flight) and success from
the bad-list, but NOT cancelled. With continue-on-error: true on
platform-build (mc#664 interim mask), failing tests cause the job to
report 'cancelled' (not 'failure'). These cancelled results must not
hard-fail the sentinel while the interim mask is active.

Also adds an INFO line for any cancelled jobs so operators can see the
CoE-masked failures without the sentinel failing.

Bug introduced in 4f7ecc5a.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 20:40:03 +00:00
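A minimal sketch of the bad-list filter this commit describes, assuming the sentinel receives a mapping of job name to result. The names `TOLERATED` and `check_sentinel`, the job names, and the log wording are hypothetical. Only the rule itself comes from the commit message: null, success, and cancelled stay off the bad-list, and cancelled jobs get an INFO line.

```python
# Results that must not hard-fail the sentinel (None means in-flight).
TOLERATED = {None, "success", "cancelled"}


def check_sentinel(results):
    """Return (exit_code, log_lines) for a mapping of job name -> result."""
    lines = []
    bad = []
    for job, result in sorted(results.items()):
        if result == "cancelled":
            # Surface CoE-masked failures without failing the sentinel.
            lines.append(f"INFO: {job} reported cancelled (CoE-masked)")
        if result not in TOLERATED:
            bad.append(f"{job}={result}")
    if bad:
        lines.append("FAIL: " + ", ".join(bad))
        return 1, lines
    lines.append("OK: no bad results")
    return 0, lines


# Hypothetical job results while the interim mask is active.
code, log = check_sentinel({
    "platform-build": "cancelled",
    "canvas-build": "success",
    "python-lint": None,
})
```

Here `code` is 0 and the log carries one INFO line for `platform-build`. Any result outside `TOLERATED`, such as "failure", still produces exit code 1.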
5b7150d5f9 ci.yml: flip all-required continue-on-error to false
The all-required sentinel was reporting no status to the Gitea Actions
API (continue-on-error: true suppresses status entries), so the required
check CI / all-required (pull_request) never appeared in the combined
commit status. gate-check-v3 (Signal 6) treats a missing required
check as failing, causing all PRs to block even when all deps are
green.

Fix: continue-on-error: false on all-required so it always reports.
Phase 3 safety is preserved: platform-build carries continue-on-error:
true, which masks its failures to null; all-required treats null as
"not bad" and exits 0. When mc#664 lands (PR #669), the CoE flip on
platform-build completes the Phase 3 exit.

Fixes: gate-check-v3 false-positive BLOCKED on all open PRs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 20:40:03 +00:00
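The blocking behaviour behind this commit can be sketched as follows. This is an illustration, not the gate-check-v3 source: `REQUIRED_CONTEXTS`, `blocked_contexts`, and the shape of the status mapping are hypothetical. The one rule taken from the commit message is that a missing required check counts as failing.

```python
# Hypothetical list of required status contexts.
REQUIRED_CONTEXTS = ["CI / all-required (pull_request)"]


def blocked_contexts(statuses):
    """Return required contexts that are missing or not successful.

    `statuses` maps a status context name to its state and stands in for
    the combined commit status (assumed shape).
    """
    blocked = []
    for context in REQUIRED_CONTEXTS:
        state = statuses.get(context)  # None when no status was ever reported
        if state != "success":
            blocked.append((context, state or "missing"))
    return blocked


# Every dependency is green, but the sentinel never reported a status.
before = {"CI / Python Lint & Test (pull_request)": "success"}
# The same commit once the sentinel reports its own status.
after = {**before, "CI / all-required (pull_request)": "success"}

print(blocked_contexts(before))
print(blocked_contexts(after))
```

In the `before` case the required context is reported as missing even though every other check passed, which matches the false-positive BLOCKED state described above. Once the sentinel reports, the blocked list is empty.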