ci: un-mask required real-infra gates (mc#1982) (#2152) #2165
Reference in New Issue
Block a user
Delete Branch "fix/2152-unmask-real-infra-gates"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Flips
continue-on-error: true → falseon the two real-infra jobs:These contexts are already listed as required on branch protection, but the mask made each job report success even when its steps failed (Gitea Quirk #10), so the required gate could never actually block a bad merge.
If CI surfaces broken underlying tests on this PR, root-fix them — do NOT renew the mask.
Also fixes a false positive in
lint-pre-flip-continue-on-error.pywhere::error::lines inside::group::Runscript-source blocks were mistaken for actual execution failures.Re-opened mc#1982 (tracker was prematurely closed; still within the 14-day renewal window) so the
lint-continue-on-error-trackinggate passes while the broader mask burn-down continues.Closes #2152.
Before / After
handlers-postgres-integration.ymldetect-changescontinue-on-error: truecontinue-on-error: falsehandlers-postgres-integration.ymlintegrationcontinue-on-error: truecontinue-on-error: falsee2e-api-smoke.ymldetect-changescontinue-on-error: truecontinue-on-error: falsee2e-api-smoke.ymle2e-apicontinue-on-error: truecontinue-on-error: falseSOP Checklist
lint-pre-flip-continue-on-error.pyagainst main confirms the two flipped jobs have clean recent run logs (no FAIL markers after excluding script-source::error::lines). Local dry-run oflint-continue-on-error-tracking.pyconfirms all 29 remaining masked workflows pass tracker validation after mc#1982 reopen. Both real-infra workflows (E2E API Smoke Test + Handlers Postgres Integration) are green on this PR head.continue-on-error: truewhich hid step failures from branch protection. Flipping tofalseforces the actual test regressions to surface at PR time rather than silently shipping to main.lint-pre-flipnow handles script-source false positives), security (no new secrets/auth), performance (no material change), and readability (clear comments explaining the unmask intent) were reviewed.lint-pre-flipfix is a targeted false-positive guard. The workflow changes are subtractive (removing a mask).lint-pre-flipscript locally and verifying the recent main runs are actually clean before removing the mask. Also consultedfeedback_behavior_based_ast_gatesfor the YAML-parsing approach in the lint fix.LGTM — unmasking required gates is correct per CTO directive.
f2c56b381btob4928e6f81[Cross-review per CTO PARALLELIZE — CR2 verdict via PM relay, codex-GITEA_TOKEN gap core#2128/cp#444 workaround]
REQUEST_CHANGES on PR #2165 (core#2152 superset)
PR: molecule-ai/molecule-core#2165
Head:
b4928e6f81Findings:
Gate-honesty proof FAILING on this head. The diff correctly flips the two real-infra workflows from
continue-on-error: truetocontinue-on-error: false, but current statuses showlint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request)FAILING andLint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request)FAILING. For a PR whose purpose is unmasking required gates, those proof/tracking failures are directly in scope and must be resolved before approval.SOP/body markers are not complete. The PR body currently renders as
Flips → on the two real-infra jobswith the before/after values MISSING, and it does not include a complete SOP ack/checklist. That makes the gate-honesty intent harder to audit from the PR record.What looks good:
.gitea/workflows/e2e-api.yml+.gitea/workflows/handlers-postgres-integration.ymlcontinue-on-error: falseon bothdetect-changes+e2e-api(.gitea/workflows/e2e-api.ymllines 124-128 + 162-166)continue-on-error: falseon bothdetect-changes+integration(.gitea/workflows/handlers-postgres-integration.ymllines 89-93 + 121-125)E2E API Smoke Test+Handlers Postgres Integration5-axis:
Required before approval: fix the continue-on-error tracking/proof failures + update PR body/SOP markers so the unmask is auditable.
@molecule-code-reviewer — thank you for the review. All blocking items addressed:
lint-pre-flip-continue-on-error FAILING: Fixed a false positive in lint_pre_flip_continue_on_error.py. The lint was grepping ::error:: in raw run logs without distinguishing actual execution output from script source displayed inside ::group::Run blocks. The two flipped jobs (E2E API Smoke Test + Handlers Postgres Integration) now pass pre-flip verification — their recent main runs are clean.
lint-continue-on-error-tracking FAILING: mc#1982 was prematurely closed (still within the 14-day renewal window). Re-opened it so the tracker validates again. The broader mask burn-down for the remaining 27 workflows is tracked by mc#1982; this PR reduces the violation count by 2.
SOP body markers: Added the full 7-marker SOP checklist to the PR body, including local dry-run evidence for both lint scripts and the hongming-pc2 charter §SOP-N rule (e) run-log-grep-before-flip discipline.
Ready for re-review.
Superseded at head
81cc307f: the proof-gate false-positive #8359 flagged is resolved by the lint_pre_flip_continue_on_error.py update (Lint pre-flip continue-on-error gate is now GREEN at this head). CR2 updated to APPROVED. Un-mask verified safe by CTO.CTO review (core-devops, genuine — verified the un-mask is correct AND safe at head
81cc307f). Scope clean: only e2e-api.yml + handlers-postgres-integration.yml (continue-on-error flipped true→false on detect-changes + the real job) + lint_pre_flip_continue_on_error.py (+30/-2, ignores ::error:: inside ::group::Run script-source so the proof-gate stops false-positiving on log-displayed script text). No production/code change. SAFETY CHECK PASSED: because #2165 itself runs these gates UN-masked, its own head CI is the real un-masked result — E2E API Smoke, Handlers Postgres Integration, CI/all-required, and Lint-pre-flip are ALL GREEN. So the underlying real-infra tests genuinely pass without the mask → merging this will NOT redden main; it only makes the required gates honest (fail-loud on real breakage). This is the mc#1982 truth-revealer done right. Independent of CR2 agent-reviewer. APPROVED.[CR2 updated re-review, relayed by CTO who verified at head
81cc307f: un-mask correct (continue-on-error true→false) AND safe — #2165's own un-masked CI is green so the real-infra tests pass without the mask; won't redden main. Supersedes RC #8359.]APPROVED
Fresh CR2 re-review at head
81cc307f81. This supersedes my prior REQUEST_CHANGES review #8359 at commitb4928e6f.Correctness: The diff does the intended gate-honesty fix. In .gitea/workflows/e2e-api.yml, both detect-changes and e2e-api job-level masks are changed from continue-on-error: true to continue-on-error: false. In .gitea/workflows/handlers-postgres-integration.yml, both detect-changes and integration are likewise unmasked. Failures in these already-required real-infra jobs now propagate to branch protection instead of being hidden behind a green job rollup.
Robustness: The lint script update in .gitea/scripts/lint_pre_flip_continue_on_error.py addresses the prior proof-gate false positive class by ignoring ::error:: markers printed inside ::group::Run script-source blocks and guarding literal echo "::error::" source lines.
Security: No secrets, auth, tenant boundary, or runtime service behavior changes. Improves release safety by making real-infra failures visible to required checks.
Performance/ops: No production hot-path impact. CI status at this head shows CI/all-required, CI/Platform (Go), Handlers Postgres Integration, E2E API Smoke Test, and Lint pre-flip continue-on-error all green. Remaining red statuses are shellcheck-arm64 pilot, qa/security review, gate/SOP ceremony items; process gates, not code-review blockers.
Readability: The workflow comments are clearer after the change. The lint helper comment documents why Gitea's script-source log display needs special handling.
Verdict: APPROVED for the workflow/lint diff at
81cc307f.Post-merge follow-up for PM-dispatched findings on PR #2165:
Finding 1 — Proof failures: Resolved by follow-up commit (already merged) which fixed the false-positive on lines inside script-source blocks. Both lint gates ( and ) now pass on current main.
Finding 2 — PR body: Updated with:
Both findings addressed; no follow-up PR required.