harden(ci): remove expired sop-tier-check burn-in masks (internal#189 Phase 1) #2287
Remove expired sop-tier-check burn-in masks (internal#189 Phase 1)
The internal#189 Phase 1 burn-in window closed 2026-05-17 (18+ days ago). The header comment in `sop-tier-check.yml` already claimed `continue-on-error` had been removed from the tier-check job, but that comment was stale: three masking layers persisted and left the gate unable to honestly red CI on a real SOP-6 violation.

What was removed (gate vs diagnostic verdict per occurrence)

- `continue-on-error: true` on the `Install jq` step: diagnostic, but redundant. The step's final command (`jq --version ... || echo`) already exits 0 unconditionally, so it cannot fail the job on its own. The inline `mc#1982` comment directed removal. Removed.
- `continue-on-error: true` on the `Verify tier label + reviewer team membership` step: the GATE step; this is the expired burn-in mask. Removed.
- `|| true` after `bash .gitea/scripts/sop-tier-check.sh`: also masked the script's real `exit 1` (missing tier label / no approving review / unsatisfied AND-clause). It was part of the same burn-in masking of the gate step's ability to fail. Removed.

`SOP_FAIL_OPEN=1` is retained as sanctioned infra-resilience: per the guarded `exit 0` branches in `sop-tier-check.sh`, it fails open ONLY on infra faults (empty/invalid token, unreachable Gitea API, missing jq); it does not mask a real tier-gate verdict. The stale header comment was rewritten to reflect reality.

Safety (evidence-first)
Across the 50 open core PRs, the latest per-context sop-tier-check status is success/pending. The two PRs showing a `failure` context (#2285, #2132) are "Has been cancelled" supersede artifacts from `cancel-in-progress`; their real (`pull_request_review`) run is `success`, so the failure is not a gate verdict. No currently-green PR newly reds from this change.

Restores the gate's honest ability to fail per the no-non-gating-CI goal.
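The difference between the removed masks and the retained fail-open can be sketched in a few lines of shell. This is an illustration only: `fake_gate` and `tier_check` are hypothetical stand-ins, not the real `sop-tier-check.sh`, and the empty-token case is just one of the infra faults the description lists.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: fake_gate and tier_check are hypothetical
# stand-ins, not the real .gitea/scripts/sop-tier-check.sh.

# Stand-in for a gate script that finds a real violation and exits 1.
fake_gate() {
  return 1
}

# The removed pattern: '|| true' turns any failure into status 0.
run_masked() {
  fake_gate || true
}

# After the change: the gate's status reaches the caller unchanged.
run_unmasked() {
  fake_gate
}

if run_masked; then masked_status=0; else masked_status=$?; fi
if run_unmasked; then unmasked_status=0; else unmasked_status=$?; fi
echo "masked=$masked_status unmasked=$unmasked_status"
# prints: masked=0 unmasked=1

# Retained fail-open, as the description characterizes it: only an infra
# fault (here, an empty token) may exit 0; a real verdict still fails.
SOP_FAIL_OPEN=1

tier_check() {
  local token="$1" has_tier_label="$2"
  if [ -z "$token" ]; then
    # Infra fault: fail open only when SOP_FAIL_OPEN=1.
    if [ "${SOP_FAIL_OPEN:-0}" = "1" ]; then
      return 0
    fi
    return 1
  fi
  # Real gate verdict: never affected by SOP_FAIL_OPEN.
  if [ "$has_tier_label" != "yes" ]; then
    return 1
  fi
  return 0
}

if tier_check "" "no"; then infra_status=0; else infra_status=$?; fi
if tier_check "some-token" "no"; then verdict_status=0; else verdict_status=$?; fi
echo "infra_fault=$infra_status real_violation=$verdict_status"
# prints: infra_fault=0 real_violation=1
```

A step-level `continue-on-error: true` has the same effect on the job result as `run_masked` has here: the failure still happens, but nothing downstream is blocked by it.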
Verified: `sop-tier-check.yml` parses (PyYAML); no active `continue-on-error` remains.

The internal#189 Phase 1 burn-in window closed 2026-05-17 (18+ days ago). The header comment already claimed continue-on-error was removed from the tier-check job, but three masking layers persisted and made the gate unable to honestly fail CI on a real SOP-6 violation:

1. continue-on-error: true on the 'Install jq' setup step (redundant: the step's final command already exits 0 unconditionally; not a gate).
2. continue-on-error: true on the 'Verify tier label + reviewer team membership' step, the actual expired burn-in mask.
3. '|| true' after the sop-tier-check.sh invocation, which swallowed the script's real exit 1 (missing tier label / no approval / unsatisfied AND-clause).

All three removed.

SOP_FAIL_OPEN=1 is RETAINED: it fails open ONLY on infra faults (empty/invalid token, unreachable Gitea API, missing jq) via the guarded exit-0 branches in sop-tier-check.sh; it does NOT mask a real tier-gate verdict. Stale header comment updated to reflect reality.

Evidence it is safe: across the 50 open core PRs, the latest per-context sop-tier-check status is success/pending; the two PRs showing a 'failure' context (#2285, #2132) are 'Has been cancelled' supersede artifacts from cancel-in-progress, whose real (pull_request_review) run is success, so they are not gate verdicts. No currently-green PR newly reds from this change. Restores the gate's honest ability to fail per the no-non-gating-CI goal.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Reviewed: removes the EXPIRED sop-tier-check burn-in COE (gate step + the || true masking the script exit 1; SOP_FAIL_OPEN retained for infra-only). Restores the gate's honest ability to fail. Verified reds elsewhere are cancel-artifacts, not newly-blocked. Approve.
REQUEST_CHANGES: direct Gitea verification does not support approval at head d063ecd186.

Source-of-truth combined CI is failure across 30 contexts at the current head. I cannot post a counting approval while the PR is red/pending, even with an existing CEO Assistant approval. Please re-request CR2 review after CI is success on the current head; I will re-run the normal 5-axis review then.
APPROVED after re-review using branch-protection required contexts rather than combined status.
Required-context check: present required context(s) are green at head d063ecd18663; absent required contexts are path-filter absent for this PR. 5-axis review found no blocking issue.
Summary: Removes expired sop-tier-check burn-in masks and updates CI gate comments to current enforcement state.
Correctness/robustness: change adds targeted regression coverage or fail-closed behavior for the reported bug class. Security: no new secret exposure or auth broadening found. Performance: no concerning runtime cost. Readability: comments/tests are explicit about the incident class and gate semantics.
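The distinction the re-review draws between combined status and branch-protection required contexts can be sketched roughly as follows. Everything here is hypothetical: the context names, the sample states, and the helper functions are made up for illustration, and treating an absent required context as non-blocking reflects the reviewer's stated reading (path-filter absent for this PR), not a general Gitea rule.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Sample data: one "context state" pair per line.
# Context names and states are hypothetical, not taken from the real PR.
statuses='sop-tier-check success
unit-tests failure
lint success'

# Print the state recorded for context $1, or "absent" if it never reported.
state_of() {
  local wanted="$1" name state
  while read -r name state; do
    if [ "$name" = "$wanted" ]; then
      echo "$state"
      return 0
    fi
  done <<EOF
$statuses
EOF
  echo "absent"
}

# Combined view: any failing context makes the whole head "failure".
combined_verdict() {
  local name state
  while read -r name state; do
    if [ "$state" = "failure" ]; then
      echo "failure"
      return 0
    fi
  done <<EOF
$statuses
EOF
  echo "success"
}

# Required-context view: only the listed contexts count. A required context
# that is absent is assumed to be skipped by a path filter (assumption).
required_verdict() {
  local ctx state
  for ctx in "$@"; do
    state=$(state_of "$ctx")
    case "$state" in
      success|absent) ;;
      *) echo "blocked"; return 0 ;;
    esac
  done
  echo "green"
}

combined=$(combined_verdict)
required=$(required_verdict sop-tier-check docs-build)
echo "combined=$combined required=$required"
# prints: combined=failure required=green
```

With this sample data the two views disagree in the same way as the two reviews: a failing non-required context turns the combined status red while the required set stays green.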