ci(lint): forbid continue-on-error on required-context jobs (SOP#765) #2541

Merged
devops-engineer merged 1 commit from ci/guard-no-coe-on-required into main 2026-06-10 20:47:13 +00:00
Member

Static workflow-shape lint making SOP#765 mechanical — forbids continue-on-error: true on any job that emits a REQUIRED branch-protection status context (the mc#1982 masking incident).

Forbidden shape: a job in .gitea/workflows/*.yml that is BOTH continue-on-error: true AND emits a context in .gitea/required-contexts.txt.

Why: continue-on-error: true rolls a failed step up to a SUCCESS job status (Gitea Quirk #10). On a REQUIRED context that silently converts a real failure into a green gate — continue-on-error on platform-build masked regressions for ~3 weeks before #656 surfaced them.
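A minimal sketch of the forbidden-shape check described above, not the script shipped in this PR. It assumes workflows are already parsed into dicts (e.g. by a YAML loader), that a job's status context is `<workflow name> / <job name or id>`, and that the allowlist holds one context per line with `#` comments allowed. The names `parse_allowlist` and `find_coe_on_required` are hypothetical.

```python
def parse_allowlist(text):
    """Parse allowlist text: one context per line; blanks and '#' comments skipped (assumed format)."""
    contexts = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            contexts.add(line)
    return contexts


def find_coe_on_required(workflows, required_contexts):
    """Return (filename, job_id, context) for each job that is BOTH
    continue-on-error: true AND emits a required context.

    workflows: mapping of workflow filename -> parsed workflow dict.
    The context shape "<workflow name> / <job name or id>" is an assumption.
    """
    violations = []
    for filename, workflow in sorted(workflows.items()):
        workflow_name = workflow.get("name", filename)
        for job_id, job in (workflow.get("jobs") or {}).items():
            context = f"{workflow_name} / {job.get('name', job_id)}"
            # Only a literal boolean true is flagged; expression strings are not evaluated here.
            if job.get("continue-on-error") is True and context in required_contexts:
                violations.append((filename, job_id, context))
    return violations
```

A job with `continue-on-error: true` that is not in the allowlist passes this check, which matches the "BOTH ... AND" wording of the forbidden shape.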

SSOT + drift guard: .gitea/required-contexts.txt is the checked-in SSOT (CI cannot always read branch_protections; cp returns 403). When DRIFT_BOT_TOKEN is present the lint ALSO live-reads BP and fails if the allowlist has DRIFTED from live BP; a 403 or an absent token degrades gracefully to allowlist-only. The core required set was verified 2026-06-10 against live BP: CI / all-required, E2E API Smoke Test, Handlers Postgres Integration. All are currently coe=false (clean), and the live cross-check passes.
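The drift guard's decision logic might look roughly like the sketch below. It assumes the live branch-protection contexts are fetched elsewhere and passed in as a set, with `None` standing for "no DRIFT_BOT_TOKEN or a 403 response". The function name `check_drift` and the message wording are hypothetical, not taken from the PR.

```python
def check_drift(allowlist, live_contexts):
    """Compare the checked-in allowlist with live branch protection.

    allowlist: set of required contexts from the checked-in file.
    live_contexts: set read from live BP, or None when the read was
    not possible (absent token or 403), assumed to be decided by the caller.
    Returns (ok, message).
    """
    if live_contexts is None:
        # Graceful degradation: fall back to allowlist-only mode.
        return True, "live BP unavailable; allowlist-only mode"
    missing = sorted(live_contexts - allowlist)  # in live BP, not in allowlist
    extra = sorted(allowlist - live_contexts)    # in allowlist, not in live BP
    if missing or extra:
        return False, f"allowlist drifted: missing={missing} extra={extra}"
    return True, "allowlist matches live BP"
```

Treating the unavailable case as a pass keeps the lint usable on runners that cannot read branch protection, at the cost of relying on the checked-in file alone in that mode.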

Fixture-catch proof: 6 pytest cases. Manually verified: injecting continue-on-error:true on all-required -> caught; dropping a BP line from the allowlist -> live drift check fails.

Guard class 3 of the cross-repo CI-bug-class lint set.

devops-engineer added 1 commit 2026-06-10 15:20:37 +00:00
ci(lint): forbid continue-on-error on required branch-protection jobs
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Python Lint & Test (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
CI / Canvas Deploy Status (pull_request) Successful in 1s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 19s
CI / all-required (pull_request) Successful in 2s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 7s
E2E Chat / E2E Chat (pull_request) Successful in 4s
lint-no-coe-on-required / lint-no-coe-on-required (pull_request) Successful in 19s
sop-checklist / review-refire (pull_request_target) Has been skipped
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 17s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 11s
gate-check-v3 / gate-check (pull_request_target) Successful in 10s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 9s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 35s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 43s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m9s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m16s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Failing after 1m5s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 6m42s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 6m1s
security-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 20s
qa-review / approved (pull_request_review) Successful in 24s
audit-force-merge / audit (pull_request_target) Successful in 9s
1aa1d14e57
Makes SOP#765 mechanical (the mc#1982 masking incident). A job that is
continue-on-error: true AND emits a context in .gitea/required-contexts.txt
fails the lint. continue-on-error rolls a failed step up to SUCCESS (Gitea
Quirk #10) — on a required context that turns a real failure green.

.gitea/required-contexts.txt is the checked-in SSOT (CI cannot always read
branch_protections); when DRIFT_BOT_TOKEN is present the lint also live-reads
BP and fails on allowlist drift, degrading gracefully on 403/absent token.
6 pytest cases + verified clean against current core (3 required contexts,
all coe=false) with live BP cross-check passing.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
devops-engineer requested review from molecule-code-reviewer 2026-06-10 15:21:43 +00:00
Author
Member

Note for reviewers: the lint-no-coe-on-required guard job itself is GREEN. The other red contexts on this PR are not caused by this change:

  • Ops Scripts Tests triggers only because the PR touches .gitea/scripts/** (path filter), but that job runs unittest discover over scripts/ + scripts/ops/ and a pytest suite under .gitea/scripts/tests, none of which this PR modifies. This guard's tests live in top-level tests/ and run in the guard's own job (green).
  • Local Provision Lifecycle E2E (stub) is an infra-dependent E2E, unrelated to a static YAML lint.
  • qa-review / security-review / sop-checklist are the standard SOP gate awaiting the requested reviewer + checklist ack.

This PR is static analysis only (one new Python lint script + allowlist + workflow + tests).

Author
Member

Skipping — genuine required-context failures plus an unacked core SOP gate. Failing: Ops Scripts Tests / Ops scripts (unittest), Local Provision Lifecycle E2E (stub) and (real image + MiniMax, advisory), and the SOP statuses qa-review / approved + security-review / approved + sop-checklist / all-items-acked are red (ceremony not completed). Reviewer-persona GIT approvals do not flip those. Needs the e2e/ops failures resolved and the checklist acked. Not forcing over a red required gate.

core-qa approved these changes 2026-06-10 19:43:39 +00:00
core-qa left a comment
Member

re-approve rebased head (main merged for #2551 Ops-Scripts fix); change unchanged

core-security approved these changes 2026-06-10 19:43:54 +00:00
core-security left a comment
Member

re-approve rebased head (main merged for #2551 Ops-Scripts fix); change unchanged

devops-engineer merged commit fda05b6124 into main 2026-06-10 20:47:13 +00:00

Reference: molecule-ai/molecule-core#2541