From 0970feef703ffbcf3da2ca12a8aab9a48ee53341 Mon Sep 17 00:00:00 2001 From: Molecule AI Core-DevOps Date: Tue, 12 May 2026 04:59:50 +0000 Subject: [PATCH] =?UTF-8?q?feat(ci)(hard-gate):=20lint-pre-flip=20catches?= =?UTF-8?q?=20continue-on-error=20true=E2=86=92false=20without=20run-log?= =?UTF-8?q?=20proof?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Empirical class — PR #656 / mc#664: PR #656 (RFC internal#219 Phase 4) flipped 5 platform-build-class jobs `continue-on-error: true → false` on the basis of a "verified green on main via combined-status check". But that "green" was the LIE the prior `continue-on-error: true` produced: Gitea Quirk #10 (internal#342 + dup #287) — a failed step inside a CoE:true job rolls up to a success job-level status. The precondition the PR claimed to verify was structurally fooled by the bug being flipped. mc#664 captured the surfaced defects (2 mutually-masked regressions): - Class 1: sqlmock helper drift since 2f36bb9a (24 days old) - Class 2: OFFSEC-001 contract collision since 7d1a189f (1 day old) Codified 04:35Z as hongming-pc2 charter §SOP-N rule (e) "run-log-grep-before-flip": pull the actual run log + grep for --- FAIL / FAIL\s BEFORE flipping; don't trust the masked combined-status. This commit structurally enforces that rule. What this PR adds: .gitea/workflows/lint-pre-flip-continue-on-error.yml — pre-merge pull_request gate, path-scoped to .gitea/workflows/**. Lands at continue-on-error:true (Phase 3 dogfood — flip to false in a follow-up only after this workflow has clean recent runs on main). .gitea/scripts/lint_pre_flip_continue_on_error.py — the lint: 1. Reads every .gitea/workflows/*.yml at the PR base SHA AND head SHA via git show :. No checkout needed. 2. Parses both sides via PyYAML AST (per feedback_behavior_based_ast_gates — NOT grep, so comment churn and key-order changes don't false-positive). 3. 
For each flipped job (base=true, head=false), renders the commit-status context as "{workflow.name} / {job.name or job.key} (push)" and pulls combined commit-status for the last 5 commits on the PR base branch. 4. Fetches each matching run's log via the web-UI route {server_url}/{repo}/actions/runs/{run_id}/jobs/{job_idx}/logs (per reference_gitea_actions_log_fetch — Gitea 1.22.6 lacks REST /actions/runs/*; web-UI is the only working path, see reference_gitea_1_22_6_lacks_rest_rerun_endpoints). 5. Greps for --- FAIL / FAIL\s / ::error::. If status==success AND log shows fail markers, the job was masked. Emit ::error::file=... naming the failing test + offending run URL. .gitea/scripts/tests/test_lint_pre_flip_continue_on_error.py — 35 unittest cases covering the 5 acceptance tests from the spec + CoE coercion (truthy/falsy/quoted/absent) + context-name rendering + multi-flip aggregation + dry-run semantics + 3 graceful-degrade halt conditions (log-unavailable, zero-runs- history, zero-commits-on-branch). Live empirical confirmation: Ran the script against the PR#656 base→merge diff with RECENT_COMMITS_N=3 on main. Result: - platform-build flip BLOCKED — masked --- FAIL on TestExecuteDelegation_DeliveryConfirmedProxyError_TreatsAsSuccess + 4 more on action_run 13353. - canvas-build / shellcheck / python-lint flips PASS — no FAIL markers in their recent logs. Exactly the diagnosis hongming-pc2 charter §SOP-N rule (e) requires. Halt-condition graceful-degrade contract: - Log fetch 404 (act_runner pruned, transient outage): warn-not-block. - Zero recent runs of the flipped context (newly-added workflow): chicken-and-egg exemption — warn and allow. - YAML parse error in one workflow file: warn-not-block (the YAML lint workflows catch this separately). Cross-links: PR#656, mc#664, PR#665 (interim re-mask), Quirk #10 (internal#342 + dup #287), hongming-pc2 charter §SOP-N rule (e), feedback_strict_root_only_after_class_a, feedback_no_shared_persona_token_use. 
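The masking rule this gate enforces (a "green" combined-status whose run log still carries FAIL markers ⇒ the status was masked by `continue-on-error: true`, Quirk #10) can be sketched in a few lines. This is an illustrative sketch, not code from the patch — `FAIL_MARKERS` and `is_masked_run` are hypothetical names; the patch's actual implementation is `grep_fail_markers()` + `verify_flip()` in `lint_pre_flip_continue_on_error.py`:

```python
# Sketch of the masked-run rule (names are illustrative, not from the patch).
# A run is "masked" when its commit-status reads success while its log still
# shows Go-test / workflow-command failure markers — i.e. Quirk #10: a failed
# step inside a continue-on-error:true job rolls up to a success job status.
FAIL_MARKERS = ("--- FAIL", "FAIL ", "::error::")  # substring match, not regex

def is_masked_run(status: str, log_text: str) -> bool:
    """True when the status says green but the log says otherwise."""
    has_fail = any(
        marker in line
        for line in log_text.splitlines()
        for marker in FAIL_MARKERS
    )
    return status == "success" and has_fail

masked = is_masked_run(
    "success",
    "=== RUN TestExecuteDelegation\n--- FAIL: TestExecuteDelegation (0.01s)\n",
)
# masked is True — the PR#656 shape: green combined-status, red log.
```

A genuinely red status (`failure`/`error`) is not "masked" under this rule — it is a hard signal in its own right, which is why the script tracks `fail_runs` and `masked_runs` separately.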
Refs: internal#342, internal#287, molecule-core#664, molecule-core#665 --- .../lint_pre_flip_continue_on_error.py | 681 ++++++++++++++++++ .../test_lint_pre_flip_continue_on_error.py | 505 +++++++++++++ .../lint-pre-flip-continue-on-error.yml | 142 ++++ 3 files changed, 1328 insertions(+) create mode 100644 .gitea/scripts/lint_pre_flip_continue_on_error.py create mode 100644 .gitea/scripts/tests/test_lint_pre_flip_continue_on_error.py create mode 100644 .gitea/workflows/lint-pre-flip-continue-on-error.yml diff --git a/.gitea/scripts/lint_pre_flip_continue_on_error.py b/.gitea/scripts/lint_pre_flip_continue_on_error.py new file mode 100644 index 00000000..38c37efc --- /dev/null +++ b/.gitea/scripts/lint_pre_flip_continue_on_error.py @@ -0,0 +1,681 @@ +#!/usr/bin/env python3 +"""lint-pre-flip-continue-on-error — block a PR that flips a job from +``continue-on-error: true`` to ``continue-on-error: false`` (or removes +the key while the base had it ``true``) without proof that the job's +recent runs on the target branch are actually green. + +Empirical class — PR #656 / mc#664: + PR #656 (RFC internal#219 Phase 4) flipped 5 ``platform-build``-class + jobs ``continue-on-error: true → false`` on the basis of a + "verified green on main via combined-status check". But that "green" + was the LIE produced by the prior ``continue-on-error: true``: + Gitea Quirk #10 (internal#342 + dup #287) — when a step inside a + job marked ``continue-on-error: true`` fails, the job-level status + is still rolled up as ``success``. So the precondition the PR + claimed to verify was structurally fooled by the bug being + flipped. 
+ + mc#664 then captured the surfaced defects (2 unrelated, mutually- + masked regressions): + + Class 1: sqlmock helper drift since 2f36bb9a (24 days old) + Class 2: OFFSEC-001 contract collision since 7d1a189f (1 day old) + + Codified 04:35Z as hongming-pc2 charter §SOP-N rule (e) + "run-log-grep-before-flip": pull the actual run log + grep for + ``--- FAIL`` / ``FAIL\\s`` BEFORE flipping; don't trust the masked + combined-status. + +This script structurally enforces that rule at PR time. + +How it works (one PR tick): + 1. Parse the diff: compare ``.gitea/workflows/*.yml`` at PR base + vs PR head. For each file present in both, parse the YAML AST + and walk ``jobs..continue-on-error`` on each side. A + "flip" is base ∈ {true} AND head ∈ {false, None/absent}. We + coerce truthy/falsy per YAML semantics (PyYAML normalizes + ``true``/``True``/``yes`` to ``True``). + 2. For each flipped job, derive its commit-status context name as + ``"{workflow.name} / {job.name or job.key} (push)"`` — that's + how Gitea Actions emits the context for runs on + ``main``/``staging`` (push event, see also expected_context() + in ci-required-drift.py). + 3. Pull the last N commits of the target branch (PR base), fetch + combined commit-status per commit, scan ``statuses[]`` for + contexts matching ANY of the flipped jobs. For each match, + fetch the actual run log via the web-UI route + ``{server_url}/{repo}/actions/runs/{run_id}/jobs/{job_idx}/logs`` + (per memory ``reference_gitea_actions_log_fetch`` — Gitea 1.22.6 + lacks REST ``/actions/runs/*`` endpoints; the web-UI route is the + only working path; see ``reference_gitea_1_22_6_lacks_rest_rerun_endpoints``). + 4. Grep each log for the Go-test failure markers ``--- FAIL`` / + ``FAIL\\s+`` AND the bash-step error sentinel + ``::error::``. If ANY recent log shows any of these AND the + status itself reads ``success``, the job was masked. 
``::error::`` + the flip with the offending test name + offending run URL + + the regression commit (HEAD of the run). + 5. Exit 1 if any flips have at least one masked run; exit 0 + otherwise. + +Halt-on-noise contract: + - If a recent log fetch 404s (already-pruned-via-act_runner-gc, + transient gitea-web outage): emit ``::warning::`` and treat the + run as "log unavailable" — does NOT block the flip; logged so + a curious reviewer can re-run. + - If a flipped job has ZERO recent runs on the target branch (newly + added workflow): emit ``::warning::`` "no run history to verify" + and allow the flip. This is the only way a NEW workflow can ever + ship with ``continue-on-error: false``; otherwise we'd have a + chicken-and-egg. + +Behavior-based AST gate per ``feedback_behavior_based_ast_gates``: + - YAML parsed via PyYAML safe_load on BOTH sides of the diff + - No grep-by-line — formatting changes (comment churn, key order) + don't false-positive a flip + - Job-key match — so a rename ``platform-build → core-be-build`` + appears as a DELETE + an ADD, not a flip (the delete side has no + new value to compare against; the add side has no base side). + +Run locally (works against this repo, requires PyYAML + Gitea token +that can read combined-commit-status): + + GITEA_TOKEN=... GITEA_HOST=git.moleculesai.app \\ + REPO=molecule-ai/molecule-core BASE_REF=main \\ + BASE_SHA=$(git rev-parse origin/main) \\ + HEAD_SHA=$(git rev-parse HEAD) \\ + python3 .gitea/scripts/lint_pre_flip_continue_on_error.py \\ + --dry-run + +Cross-links: PR#656, mc#664, PR#665 (the interim re-mask), +Quirk #10 (internal#342 + dup #287), hongming-pc2 charter §SOP-N +rule (e), feedback_strict_root_only_after_class_a, +feedback_no_shared_persona_token_use. 
+""" +from __future__ import annotations + +import argparse +import json +import os +import subprocess +import sys +import urllib.error +import urllib.parse +import urllib.request +from typing import Any + +import yaml # PyYAML 6.0.2 — installed by the workflow before this runs. + + +# -------------------------------------------------------------------------- +# Environment (read at module-import; runtime contract enforced in main()) +# -------------------------------------------------------------------------- +def _env(key: str, *, default: str = "") -> str: + return os.environ.get(key, default) + + +GITEA_TOKEN = _env("GITEA_TOKEN") +GITEA_HOST = _env("GITEA_HOST") +REPO = _env("REPO") +BASE_REF = _env("BASE_REF", default="main") +BASE_SHA = _env("BASE_SHA") +HEAD_SHA = _env("HEAD_SHA") +# How many recent commits to scan on the target branch. 5 by default; +# enough to catch a job that only fails intermittently, not so many +# that the script paginates needlessly. Per spec. +RECENT_COMMITS_N = int(_env("RECENT_COMMITS_N", default="5")) + +OWNER, NAME = (REPO.split("/", 1) + [""])[:2] if REPO else ("", "") +API = f"https://{GITEA_HOST}/api/v1" if GITEA_HOST else "" +WEB = f"https://{GITEA_HOST}" if GITEA_HOST else "" + +# Failure markers we grep for in the run log. +# --- FAIL — Go test failure marker +# FAIL\s — `FAIL github.com/x/y` package-level rollup +# ::error:: — bash-step `::error::` lines (the lint-curl-status-capture +# pattern: a `python3 < None: + for key in ("GITEA_TOKEN", "GITEA_HOST", "REPO", "BASE_REF", "BASE_SHA", "HEAD_SHA"): + if not os.environ.get(key): + sys.stderr.write(f"::error::missing required env var: {key}\n") + sys.exit(2) + + +# -------------------------------------------------------------------------- +# Tiny HTTP helper (no requests dependency) +# Mirrors the api()/ApiError contract in ci-required-drift.py + +# main-red-watchdog.py per feedback_api_helper_must_raise_not_return_dict. 
+# -------------------------------------------------------------------------- +class ApiError(RuntimeError): + """Raised when a Gitea API/web call cannot be trusted to have succeeded. + + Soft-failure on non-2xx is the duplicate-write bug factory in + find-or-create flows (PR #112 Five-Axis). Here it would mean a + transient gitea-web 502 silently allows a flip whose recent runs + we couldn't actually verify — exactly the regression class this + lint exists to close. + """ + + +def http( + method: str, + url: str, + *, + body: dict | None = None, + headers: dict[str, str] | None = None, + expect_json: bool = True, + timeout: int = 30, +) -> tuple[int, Any, bytes]: + """Tiny HTTP helper around urllib. + + Returns (status, parsed_or_None, raw_bytes). Raises ApiError on any + non-2xx response. ``expect_json=False`` returns raw bytes in the + parsed slot (for log-fetch from the web-UI which returns text/plain). + """ + final_headers = { + "Authorization": f"token {GITEA_TOKEN}", + "Accept": "application/json" if expect_json else "text/plain", + } + if headers: + final_headers.update(headers) + data = None + if body is not None: + data = json.dumps(body).encode("utf-8") + final_headers["Content-Type"] = "application/json" + req = urllib.request.Request(url, method=method, data=data, headers=final_headers) + try: + with urllib.request.urlopen(req, timeout=timeout) as resp: + raw = resp.read() + status = resp.status + except urllib.error.HTTPError as e: + raw = e.read() or b"" + status = e.code + + if not (200 <= status < 300): + snippet = raw[:500].decode("utf-8", errors="replace") if raw else "" + raise ApiError(f"{method} {url} → HTTP {status}: {snippet}") + + if not expect_json: + return status, raw, raw + if not raw: + return status, None, raw + try: + return status, json.loads(raw), raw + except json.JSONDecodeError as e: + raise ApiError(f"{method} {url} → HTTP {status} but body is not JSON: {e}") from e + + +def api(method: str, path: str, *, body: dict | None = 
None, query: dict[str, str] | None = None) -> tuple[int, Any]: + """Read-shaped Gitea REST helper. Path is API-relative (``/repos/...``).""" + url = f"{API}{path}" + if query: + url = f"{url}?{urllib.parse.urlencode(query)}" + status, parsed, _ = http(method, url, body=body, expect_json=True) + return status, parsed + + +# -------------------------------------------------------------------------- +# YAML parsing — coerce truthy/falsy for continue-on-error +# -------------------------------------------------------------------------- +def _coerce_coe(val: Any) -> bool: + """Coerce a continue-on-error YAML value to bool. + + PyYAML safe_load normalizes ``true``/``True``/``yes``/``on`` to + Python ``True`` and ``false``/``False``/``no``/``off`` / absence + to ``False`` (we treat absence/None as False here too — that's the + GitHub Actions default semantics). + + Edge cases: + - String ``"true"`` (quoted in YAML) — PyYAML keeps it as the + string ``"true"``, which an ``is True`` / ``== True`` comparison + would miss, and naive ``bool()`` would make even ``"false"`` + truthy. It's a flip we DO care about catching, so normalize + string forms case-insensitively to bool — consistent with the + runtime behavior of Gitea Actions, which YAML-parses the same + way. + """ + if isinstance(val, bool): + return val + if val is None: + return False + if isinstance(val, str): + return val.strip().lower() in ("true", "yes", "on", "1") + return bool(val) + + +def jobs_coe_map(workflow_doc: dict) -> dict[str, bool]: + """Return ``{job_key: continue_on_error_bool}`` for every job in + the workflow. Job-level ``continue-on-error`` only — does NOT + descend into per-step ``continue-on-error`` (step-level CoE + masking is a separate class and is handled by the test suite + + reviewer, not by this gate — see Future Work in the workflow + YAML).
+ """ + out: dict[str, bool] = {} + jobs = workflow_doc.get("jobs") + if not isinstance(jobs, dict): + return out + for key, job in jobs.items(): + if not isinstance(job, dict): + continue + out[key] = _coerce_coe(job.get("continue-on-error")) + return out + + +def workflow_name(workflow_doc: dict, *, fallback: str = "") -> str: + """Top-level ``name:`` of the workflow. Falls back to the filename + (without extension) per Gitea Actions semantics.""" + n = workflow_doc.get("name") + if isinstance(n, str) and n.strip(): + return n.strip() + return fallback + + +def job_display_name(workflow_doc: dict, job_key: str) -> str: + """``jobs..name`` if present, else the key. Mirrors + expected_context() in ci-required-drift.py.""" + job = workflow_doc.get("jobs", {}).get(job_key) + if isinstance(job, dict): + n = job.get("name") + if isinstance(n, str) and n.strip(): + return n.strip() + return job_key + + +def context_name(workflow_name_str: str, job_name_str: str, event: str = "push") -> str: + """Render the commit-status context the way Gitea Actions emits it. + Default ``event="push"`` because recent-runs-on-main are push events; + callers can override to ``"pull_request"`` for PR-context lookups.""" + return f"{workflow_name_str} / {job_name_str} ({event})" + + +# -------------------------------------------------------------------------- +# Diff detection — flips, not arbitrary changes +# -------------------------------------------------------------------------- +def detect_flips( + base_workflows: dict[str, str], + head_workflows: dict[str, str], +) -> list[dict]: + """Compare per-file CoE maps; return a list of flip records. + + Inputs are ``{path: yaml_text}`` for both sides. Output records + have the shape:: + + { + "workflow_path": ".gitea/workflows/ci.yml", + "workflow_name": "CI", + "job_key": "platform-build", + "job_name": "Platform (Go)", + "context": "CI / Platform (Go) (push)", + } + + A flip is base[CoE] ∈ {True} AND head[CoE] ∈ {False}. 
Files + only present on one side are skipped — adding a new workflow + with ``CoE: false`` is fine (no history to mask), and removing + a workflow can't possibly flip anything. + """ + flips: list[dict] = [] + for path, base_text in base_workflows.items(): + if path not in head_workflows: + continue + try: + base_doc = yaml.safe_load(base_text) or {} + head_doc = yaml.safe_load(head_workflows[path]) or {} + except yaml.YAMLError as e: + # Don't block on a parse error — the YAML lint workflows + # catch invalid YAML separately. Just warn so the failing + # file is visible. + sys.stderr.write(f"::warning file={path}::YAML parse error: {e}\n") + continue + if not isinstance(base_doc, dict) or not isinstance(head_doc, dict): + continue + base_map = jobs_coe_map(base_doc) + head_map = jobs_coe_map(head_doc) + wf_name = workflow_name(head_doc, fallback=os.path.basename(path).rsplit(".", 1)[0]) + for job_key, base_val in base_map.items(): + if job_key not in head_map: + continue # job removed — not a flip + if base_val is True and head_map[job_key] is False: + flips.append({ + "workflow_path": path, + "workflow_name": wf_name, + "job_key": job_key, + "job_name": job_display_name(head_doc, job_key), + "context": context_name(wf_name, job_display_name(head_doc, job_key), "push"), + }) + return flips + + +# -------------------------------------------------------------------------- +# Git: snapshot every .gitea/workflows/*.yml at a SHA (no checkout) +# -------------------------------------------------------------------------- +def _git(*args: str, cwd: str | None = None) -> str: + """Run ``git`` and return stdout (text).""" + result = subprocess.run( + ["git", *args], + capture_output=True, + text=True, + check=False, + cwd=cwd, + ) + if result.returncode != 0: + raise RuntimeError(f"git {args!r} failed: {result.stderr.strip()}") + return result.stdout + + +def workflows_at_sha(sha: str, *, repo_dir: str | None = None) -> dict[str, str]: + """Read every 
``.gitea/workflows/*.yml`` blob at ``sha``. + + Uses ``git ls-tree`` + ``git show`` so we never need to check out + the SHA (the workflow runs on the PR head; the base SHA is + fetched, not checked out). + """ + out: dict[str, str] = {} + listing = _git("ls-tree", "-r", "--name-only", sha, ".gitea/workflows/", cwd=repo_dir) + for line in listing.splitlines(): + line = line.strip() + if not line.endswith((".yml", ".yaml")): + continue + try: + blob = _git("show", f"{sha}:{line}", cwd=repo_dir) + except RuntimeError: + # Symlink or other non-blob; skip. + continue + out[line] = blob + return out + + +# -------------------------------------------------------------------------- +# Gitea: recent commits + per-commit combined status + log fetch +# -------------------------------------------------------------------------- +def recent_commits_on_branch(branch: str, n: int) -> list[str]: + """Last `n` commit SHAs on ``branch`` (oldest→newest is fine; we + treat them as a set). Uses the REST ``/commits`` endpoint with + ``sha=branch&limit=n``.""" + _, body = api( + "GET", + f"/repos/{OWNER}/{NAME}/commits", + query={"sha": branch, "limit": str(n)}, + ) + if not isinstance(body, list): + raise ApiError(f"/commits for {branch} returned non-list: {type(body).__name__}") + out: list[str] = [] + for c in body: + if isinstance(c, dict): + sha = c.get("sha") or (c.get("commit", {}) or {}).get("id") + if isinstance(sha, str) and len(sha) >= 7: + out.append(sha) + return out + + +def combined_status(sha: str) -> dict: + """Combined commit status for a SHA. Same shape as + ``main-red-watchdog.get_combined_status``.""" + _, body = api("GET", f"/repos/{OWNER}/{NAME}/commits/{sha}/status") + if not isinstance(body, dict): + raise ApiError(f"combined-status for {sha} not a dict") + return body + + +def _entry_state(s: dict) -> str: + """Per-entry state — Gitea 1.22.6 schema asymmetry: top-level + uses ``state``, per-entry uses ``status``. 
Defensive fallback per + main-red-watchdog.py line 233.""" + return s.get("status") or s.get("state") or "" + + +def fetch_log(target_url: str) -> str | None: + """Fetch a job log given its web-UI ``target_url`` (e.g. + ``/molecule-ai/molecule-core/actions/runs/13494/jobs/0``). + + Per ``reference_gitea_actions_log_fetch``: append ``/logs`` to the + job route. Per ``reference_gitea_1_22_6_lacks_rest_rerun_endpoints``: + Gitea 1.22.6 lacks the REST ``/api/v1/.../actions/runs/*`` path; the + web-UI route is the only working endpoint until 1.24+. + + Returns the log text on success, ``None`` on 404 / log-pruned / + network error (caller treats None as "log unavailable, warn-not-fail"). + """ + if not target_url: + return None + # Normalize: target_url may be relative ("/owner/repo/...") or + # absolute. Both need ``/logs`` appended to the job sub-path. + if target_url.startswith("/"): + url = f"{WEB}{target_url}" + else: + url = target_url + if not url.endswith("/logs"): + url = f"{url}/logs" + try: + _, body, _ = http("GET", url, expect_json=False, timeout=60) + except ApiError as e: + sys.stderr.write(f"::warning::log fetch failed for {url}: {e}\n") + return None + if isinstance(body, bytes): + return body.decode("utf-8", errors="replace") + return None + + +def grep_fail_markers(log_text: str) -> list[str]: + """Return up to 5 sample matching lines for any FAIL_PATTERNS hit. + Empty list = clean log.""" + matches: list[str] = [] + for line in log_text.splitlines(): + for pat in FAIL_PATTERNS: + if pat in line: + # Truncate to keep error output bounded. + matches.append(line.strip()[:240]) + break + if len(matches) >= 5: + break + return matches + + +# -------------------------------------------------------------------------- +# Verification: for one flip, scan recent runs on BASE_REF +# -------------------------------------------------------------------------- +def verify_flip(flip: dict, branch: str, n: int) -> dict: + """Scan the last ``n`` commits on ``branch``. 
For each commit whose + combined status contains a context matching ``flip["context"]``, + fetch the run log and grep for FAIL markers. + + Returns:: + + { + "flip": flip, + "checked_commits": int, # how many commits had a matching context + "masked_runs": [ # runs where log shows FAIL despite status==success + {"sha": "...", "status": "success", "target_url": "...", "samples": [...]}, + ... + ], + "fail_runs": [ # runs where status itself is failure/error + {"sha": "...", "status": "failure", "target_url": "...", "samples": [...]}, + ... + ], + "warnings": [str], # log-unavailable warnings (not blocking) + } + + Blocking condition: ``masked_runs`` OR ``fail_runs`` non-empty. + A ``success`` status with a clean log is the only "OK to flip" + outcome (per hongming-pc2 §SOP-N rule (e)). + """ + target_context = flip["context"] + result = { + "flip": flip, + "checked_commits": 0, + "masked_runs": [], + "fail_runs": [], + "warnings": [], + } + + shas = recent_commits_on_branch(branch, n) + if not shas: + result["warnings"].append( + f"no recent commits on {branch} (cannot verify flip)" + ) + return result + + for sha in shas: + try: + status_doc = combined_status(sha) + except ApiError as e: + result["warnings"].append(f"combined-status for {sha}: {e}") + continue + statuses = status_doc.get("statuses") or [] + # First entry matching the context name. Newest SHAs come + # first; one entry per context per SHA is the usual shape. + for s in statuses: + if not isinstance(s, dict): + continue + if s.get("context") != target_context: + continue + result["checked_commits"] += 1 + state = _entry_state(s) + target_url = s.get("target_url") or "" + log_text = fetch_log(target_url) + if log_text is None: + result["warnings"].append( + f"log unavailable for {sha} {target_context}" + ) + # Still record the status itself if it's red — that's + # a hard signal that doesn't need log access. 
+ if state in ("failure", "error"): + result["fail_runs"].append({ + "sha": sha, + "status": state, + "target_url": target_url, + "samples": ["[log unavailable; status itself is " + state + "]"], + }) + break + samples = grep_fail_markers(log_text) + if state in ("failure", "error"): + result["fail_runs"].append({ + "sha": sha, + "status": state, + "target_url": target_url, + "samples": samples or ["[no FAIL markers found but status is " + state + "]"], + }) + elif samples and state == "success": + # The bug class: status==success while log shows FAIL. + # That's exactly Quirk #10 (continue-on-error masking). + result["masked_runs"].append({ + "sha": sha, + "status": state, + "target_url": target_url, + "samples": samples, + }) + # Either way, we matched one context entry for this SHA; + # don't keep looping `statuses[]`. + break + + if result["checked_commits"] == 0: + result["warnings"].append( + f"no runs of {target_context!r} found in the last {n} commits on " + f"{branch} — cannot verify; allowing flip with warning" + ) + return result + + +# -------------------------------------------------------------------------- +# Report rendering +# -------------------------------------------------------------------------- +def render_flip_report(verdict: dict) -> str: + flip = verdict["flip"] + lines = [ + f"job: {flip['job_key']} ({flip['context']})", + f" workflow: {flip['workflow_path']}", + f" checked_commits: {verdict['checked_commits']}", + ] + for run in verdict["fail_runs"]: + url = run["target_url"] + # target_url may be relative; render the absolute form for + # click-through. 
+ if url.startswith("/"): + url = f"{WEB}{url}" + lines.append(f" fail run {run['sha'][:10]} (status={run['status']}): {url}") + for sample in run["samples"]: + lines.append(f" | {sample}") + for run in verdict["masked_runs"]: + url = run["target_url"] + if url.startswith("/"): + url = f"{WEB}{url}" + lines.append( + f" MASKED run {run['sha'][:10]} (status=success, log shows FAIL): {url}" + ) + for sample in run["samples"]: + lines.append(f" | {sample}") + for w in verdict["warnings"]: + lines.append(f" warning: {w}") + return "\n".join(lines) + + +# -------------------------------------------------------------------------- +# Main +# -------------------------------------------------------------------------- +def _parse_args(argv: list[str] | None = None) -> argparse.Namespace: + p = argparse.ArgumentParser( + prog="lint-pre-flip-continue-on-error", + description="Block a PR that flips continue-on-error true→false " + "without proof recent runs are actually green.", + ) + p.add_argument( + "--dry-run", + action="store_true", + help="Detect + print findings to stdout; never exit non-zero. 
" + "Useful for local testing.", + ) + return p.parse_args(argv) + + +def main(argv: list[str] | None = None) -> int: + args = _parse_args(argv) + _require_runtime_env() + + base_workflows = workflows_at_sha(BASE_SHA) + head_workflows = workflows_at_sha(HEAD_SHA) + flips = detect_flips(base_workflows, head_workflows) + + if not flips: + print("::notice::no continue-on-error true→false flips in this PR") + return 0 + + print(f"::notice::detected {len(flips)} continue-on-error true→false flip(s); verifying recent runs on {BASE_REF}") + bad_flips: list[dict] = [] + for flip in flips: + verdict = verify_flip(flip, BASE_REF, RECENT_COMMITS_N) + report = render_flip_report(verdict) + if verdict["fail_runs"] or verdict["masked_runs"]: + print(f"::error file={flip['workflow_path']}::flip of {flip['job_key']} " + f"({flip['context']}) blocked — recent runs on {BASE_REF} show " + f"FAIL markers OR are red. Pull each run log below + grep " + f"`--- FAIL` / `FAIL ` / `::error::` — DON'T trust the masked " + f"combined-status. See hongming-pc2 charter §SOP-N rule (e). " + f"PR#656 / mc#664 reference class.") + bad_flips.append(verdict) + else: + print(f"::notice::flip of {flip['job_key']} ({flip['context']}) is safe — " + f"{verdict['checked_commits']} recent run(s), no FAIL markers") + # Always print the per-flip detail block so the human-readable + # report is in the run log for both safe and unsafe flips. 
+ print(f"::group::flip detail: {flip['job_key']}") + print(report) + print("::endgroup::") + + if bad_flips and not args.dry_run: + print(f"::error::{len(bad_flips)}/{len(flips)} flip(s) failed pre-flip verification") + return 1 + if bad_flips and args.dry_run: + print(f"::warning::[dry-run] {len(bad_flips)}/{len(flips)} flip(s) WOULD fail; exit 0 forced") + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/.gitea/scripts/tests/test_lint_pre_flip_continue_on_error.py b/.gitea/scripts/tests/test_lint_pre_flip_continue_on_error.py new file mode 100644 index 00000000..df86a8c6 --- /dev/null +++ b/.gitea/scripts/tests/test_lint_pre_flip_continue_on_error.py @@ -0,0 +1,505 @@ +"""Unit tests for .gitea/scripts/lint_pre_flip_continue_on_error.py. + +These tests pin the pure-logic surface (flip detection + per-flip +verdict aggregation) without making real HTTP calls. The end-to-end +git ls-tree + Gitea API path is exercised by running the workflow +against real PRs. + +Run locally:: + + python3 -m unittest .gitea/scripts/tests/test_lint_pre_flip_continue_on_error.py -v + +Mirrors the pattern in scripts/ops/test_check_migration_collisions.py ++ scripts/test_build_runtime_package.py. +""" +from __future__ import annotations + +import importlib.util +import os +import sys +import unittest +from pathlib import Path +from unittest import mock + +# Load the script as a module without invoking main(). Tests must NOT +# depend on the full runtime env contract (GITEA_TOKEN etc.), so we +# import individual functions and stub the network surface explicitly. 
+SCRIPT_PATH = Path(__file__).resolve().parent.parent / "lint_pre_flip_continue_on_error.py" +spec = importlib.util.spec_from_file_location("lpfc", SCRIPT_PATH) +lpfc = importlib.util.module_from_spec(spec) +spec.loader.exec_module(lpfc) + + +# -------------------------------------------------------------------------- +# Fixtures: minimal valid workflow YAML on each side of a "diff" +# -------------------------------------------------------------------------- +CI_YML_BASE = """\ +name: CI +on: + push: + branches: [main] +jobs: + platform-build: + name: Platform (Go) + runs-on: ubuntu-latest + continue-on-error: true + steps: + - run: echo platform + canvas-build: + name: Canvas (Next.js) + runs-on: ubuntu-latest + continue-on-error: true + steps: + - run: echo canvas + all-required: + runs-on: ubuntu-latest + continue-on-error: true + needs: [platform-build, canvas-build] + steps: + - run: echo ok +""" + +CI_YML_HEAD_FLIPPED = """\ +name: CI +on: + push: + branches: [main] +jobs: + platform-build: + name: Platform (Go) + runs-on: ubuntu-latest + continue-on-error: false + steps: + - run: echo platform + canvas-build: + name: Canvas (Next.js) + runs-on: ubuntu-latest + continue-on-error: false + steps: + - run: echo canvas + all-required: + runs-on: ubuntu-latest + continue-on-error: true + needs: [platform-build, canvas-build] + steps: + - run: echo ok +""" + +CI_YML_HEAD_NO_DIFF = CI_YML_BASE # identical to base, no flip + + +# -------------------------------------------------------------------------- +# 1. CoE coercion (truthy/falsy/quoted/absent) +# -------------------------------------------------------------------------- +class TestCoerceCoE(unittest.TestCase): + def test_python_bool_true(self): + self.assertTrue(lpfc._coerce_coe(True)) + + def test_python_bool_false(self): + self.assertFalse(lpfc._coerce_coe(False)) + + def test_none_is_false(self): + # GitHub Actions default: absent == false. 
+ self.assertFalse(lpfc._coerce_coe(None)) + + def test_string_true_lowercase(self): + # Quoted "true" in YAML — Gitea Actions normalizes to True. + self.assertTrue(lpfc._coerce_coe("true")) + + def test_string_True_titlecase(self): + self.assertTrue(lpfc._coerce_coe("True")) + + def test_string_yes(self): + # YAML 1.1 truthy form. + self.assertTrue(lpfc._coerce_coe("yes")) + + def test_string_false(self): + self.assertFalse(lpfc._coerce_coe("false")) + + def test_string_random_falsy(self): + # An unrecognized string is treated as falsy — safer than + # silently coercing "maybe" to True and false-positiving a + # flip. + self.assertFalse(lpfc._coerce_coe("maybe")) + + +# -------------------------------------------------------------------------- +# 2. Diff detection — flips, not arbitrary changes +# -------------------------------------------------------------------------- +class TestDetectFlips(unittest.TestCase): + def test_no_flip_in_diff_passes(self): + # Acceptance test #1: PR doesn't flip continue-on-error → 0 flips. + flips = lpfc.detect_flips( + {".gitea/workflows/ci.yml": CI_YML_BASE}, + {".gitea/workflows/ci.yml": CI_YML_HEAD_NO_DIFF}, + ) + self.assertEqual(flips, []) + + def test_flip_detected_in_one_file(self): + flips = lpfc.detect_flips( + {".gitea/workflows/ci.yml": CI_YML_BASE}, + {".gitea/workflows/ci.yml": CI_YML_HEAD_FLIPPED}, + ) + # Two jobs flipped: platform-build, canvas-build. all-required + # is still true on both sides. 
+ self.assertEqual(len(flips), 2) + keys = sorted(f["job_key"] for f in flips) + self.assertEqual(keys, ["canvas-build", "platform-build"]) + + def test_context_name_render(self): + flips = lpfc.detect_flips( + {".gitea/workflows/ci.yml": CI_YML_BASE}, + {".gitea/workflows/ci.yml": CI_YML_HEAD_FLIPPED}, + ) + platform = next(f for f in flips if f["job_key"] == "platform-build") + self.assertEqual(platform["context"], "CI / Platform (Go) (push)") + self.assertEqual(platform["workflow_name"], "CI") + + def test_context_falls_back_to_job_key_when_no_name(self): + base = "name: WF\njobs:\n foo:\n continue-on-error: true\n runs-on: x\n steps: []\n" + head = "name: WF\njobs:\n foo:\n continue-on-error: false\n runs-on: x\n steps: []\n" + flips = lpfc.detect_flips({"a.yml": base}, {"a.yml": head}) + self.assertEqual(len(flips), 1) + self.assertEqual(flips[0]["context"], "WF / foo (push)") + + def test_no_flip_when_only_one_side_has_file(self): + # Newly added workflow file — head has CoE:false, base has no + # file. Adding a new workflow with CoE:false is fine; there's + # nothing to mask. + flips = lpfc.detect_flips( + {}, # base has no workflow files + {".gitea/workflows/new.yml": CI_YML_HEAD_FLIPPED}, + ) + self.assertEqual(flips, []) + + def test_no_flip_when_job_removed(self): + # Job exists on base, not on head — a removal, not a flip. + head = """\ +name: CI +jobs: + canvas-build: + name: Canvas (Next.js) + continue-on-error: true + runs-on: ubuntu-latest + steps: [] +""" + flips = lpfc.detect_flips( + {".gitea/workflows/ci.yml": CI_YML_BASE}, + {".gitea/workflows/ci.yml": head}, + ) + self.assertEqual(flips, []) + + def test_no_flip_when_job_added_with_false(self): + # New job on head with CoE:false — no base side; not a flip. 
+ head_with_new = CI_YML_BASE.replace( + " all-required:", + " newjob:\n name: New Job\n continue-on-error: false\n" + " runs-on: x\n steps: []\n" + " all-required:", + ) + flips = lpfc.detect_flips( + {".gitea/workflows/ci.yml": CI_YML_BASE}, + {".gitea/workflows/ci.yml": head_with_new}, + ) + self.assertEqual(flips, []) + + def test_yaml_parse_error_warns_not_raises(self): + # Malformed YAML on head — should warn (stderr) and skip, + # not raise. + bad_head = "name: CI\njobs:\n :::\n" + # Capture stderr so the test isn't noisy. + with mock.patch.object(sys, "stderr"): + flips = lpfc.detect_flips( + {".gitea/workflows/ci.yml": CI_YML_BASE}, + {".gitea/workflows/ci.yml": bad_head}, + ) + self.assertEqual(flips, []) + + +# -------------------------------------------------------------------------- +# 3. grep_fail_markers — the regex / substring matcher +# -------------------------------------------------------------------------- +class TestGrepFailMarkers(unittest.TestCase): + def test_clean_log_returns_empty(self): + log = "===== test run starting =====\nPASS\nok example.com/foo 1.234s\n" + self.assertEqual(lpfc.grep_fail_markers(log), []) + + def test_go_minus_minus_minus_fail_caught(self): + log = "ok example.com/foo 1.234s\n--- FAIL: TestBar (0.01s)\n bar_test.go:42:\n" + matches = lpfc.grep_fail_markers(log) + self.assertEqual(len(matches), 1) + self.assertIn("FAIL: TestBar", matches[0]) + + def test_go_package_fail_caught(self): + log = "FAIL\texample.com/baz\t1.234s\n" + matches = lpfc.grep_fail_markers(log) + self.assertEqual(len(matches), 1) + self.assertIn("FAIL", matches[0]) + + def test_bash_error_directive_caught(self): + # `lint-curl-status-capture` pattern: a python heredoc inside a + # bash step that prints `::error::` then sys.exit(1). With + # continue-on-error:true the job rolls up as success despite + # this line. THAT's the masking we're trying to catch. 
+ log = "Running scan...\n::error::Found 3 curl-status-capture pollution site(s):\n" + matches = lpfc.grep_fail_markers(log) + self.assertEqual(len(matches), 1) + self.assertIn("::error::", matches[0]) + + def test_caps_matches_at_max_5(self): + log = "\n".join(["--- FAIL: T%d" % i for i in range(20)]) + matches = lpfc.grep_fail_markers(log) + self.assertEqual(len(matches), 5) + + +# -------------------------------------------------------------------------- +# 4. verify_flip — single-flip verdict assembly (network surface stubbed) +# -------------------------------------------------------------------------- +def _stub_status(context: str, state: str, target_url: str = "/owner/repo/actions/runs/1/jobs/0") -> dict: + """Build a single-context combined-status response.""" + return { + "state": state, + "statuses": [ + {"context": context, "status": state, "target_url": target_url, "description": ""} + ], + } + + +FLIP_FIXTURE = { + "workflow_path": ".gitea/workflows/ci.yml", + "workflow_name": "CI", + "job_key": "platform-build", + "job_name": "Platform (Go)", + "context": "CI / Platform (Go) (push)", +} + + +class TestVerifyFlip(unittest.TestCase): + def test_flip_with_clean_history_passes(self): + # Acceptance test #2: flip detected, last 5 runs clean → exit 0. + with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1", "sha2", "sha3"]): + with mock.patch.object( + lpfc, "combined_status", + side_effect=[_stub_status(FLIP_FIXTURE["context"], "success") for _ in range(3)], + ): + with mock.patch.object(lpfc, "fetch_log", return_value="ok example.com/foo 1s\nPASS\n"): + verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5) + self.assertEqual(verdict["fail_runs"], []) + self.assertEqual(verdict["masked_runs"], []) + self.assertEqual(verdict["checked_commits"], 3) + self.assertEqual(verdict["warnings"], []) + + def test_flip_with_recent_fail_blocks(self): + # Acceptance test #3: flip detected, recent run has --- FAIL → exit 1. 
+ # Setup: 3 commits, the most recent run's log shows --- FAIL + # but the STATUS is success (Quirk #10 mask). That's the + # masked_runs case. + log_with_fail = "ok example.com/foo 1s\n--- FAIL: TestSqlmock (0.01s)\n sqlmock_test.go:42:\n" + with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1", "sha2", "sha3"]): + with mock.patch.object( + lpfc, "combined_status", + side_effect=[_stub_status(FLIP_FIXTURE["context"], "success") for _ in range(3)], + ): + with mock.patch.object(lpfc, "fetch_log", side_effect=[log_with_fail, "PASS\n", "PASS\n"]): + verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5) + self.assertEqual(len(verdict["masked_runs"]), 1) + self.assertEqual(verdict["masked_runs"][0]["sha"], "sha1") + self.assertTrue(any("TestSqlmock" in s for s in verdict["masked_runs"][0]["samples"])) + self.assertEqual(verdict["fail_runs"], []) + + def test_red_status_alone_blocks(self): + # Status itself is `failure` — block without needing log + # markers. (Belt-and-braces: even with a clean log, a `failure` + # status means the job's exit code was non-zero.) + with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1"]): + with mock.patch.object( + lpfc, "combined_status", + return_value=_stub_status(FLIP_FIXTURE["context"], "failure"), + ): + with mock.patch.object(lpfc, "fetch_log", return_value="some unrelated text\n"): + verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5) + self.assertEqual(len(verdict["fail_runs"]), 1) + self.assertEqual(verdict["fail_runs"][0]["status"], "failure") + + def test_unreadable_log_warns_not_blocks(self): + # Acceptance test #5: log fetch 404 (None) → warn, not block. + # Status is `success`, log is None — we can't tell, so we warn + # and allow. 
+ with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1"]): + with mock.patch.object( + lpfc, "combined_status", + return_value=_stub_status(FLIP_FIXTURE["context"], "success"), + ): + with mock.patch.object(lpfc, "fetch_log", return_value=None): + verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5) + self.assertEqual(verdict["fail_runs"], []) + self.assertEqual(verdict["masked_runs"], []) + self.assertTrue(any("log unavailable" in w for w in verdict["warnings"])) + + def test_unreadable_log_with_failure_status_still_blocks(self): + # Edge case: log fetch fails BUT the status itself is `failure`. + # We can still block — the status alone is sufficient signal, + # we don't need the log to confirm. + with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1"]): + with mock.patch.object( + lpfc, "combined_status", + return_value=_stub_status(FLIP_FIXTURE["context"], "failure"), + ): + with mock.patch.object(lpfc, "fetch_log", return_value=None): + verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5) + self.assertEqual(len(verdict["fail_runs"]), 1) + self.assertIn("log unavailable", verdict["fail_runs"][0]["samples"][0]) + + def test_zero_runs_history_warns_allows(self): + # No commits with a matching context — newly added workflow. + # Allow with warning. + with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1", "sha2"]): + with mock.patch.object( + lpfc, "combined_status", + return_value={"state": "success", "statuses": []}, # no matching context + ): + verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5) + self.assertEqual(verdict["checked_commits"], 0) + self.assertEqual(verdict["fail_runs"], []) + self.assertEqual(verdict["masked_runs"], []) + self.assertTrue(any("no runs of" in w for w in verdict["warnings"])) + + def test_zero_commits_warns_allows(self): + # Empty branch (newly created repo, e.g.). Allow with warning. 
+ with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=[]): + verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5) + self.assertEqual(verdict["checked_commits"], 0) + self.assertEqual(verdict["fail_runs"], []) + self.assertEqual(verdict["masked_runs"], []) + self.assertTrue(any("no recent commits" in w for w in verdict["warnings"])) + + +# -------------------------------------------------------------------------- +# 5. Multiple-flip aggregation in main() +# -------------------------------------------------------------------------- +class TestMainAggregation(unittest.TestCase): + """Tests that `main()` aggregates multiple flips and exits 1 when + ANY one of them has a masked or red recent run. Acceptance test #4. + + We stub at the verify_flip + workflows_at_sha + _require_runtime_env + boundary so we don't need real git or HTTP. + """ + + def setUp(self): + # The actual env values are irrelevant — _require_runtime_env + # is stubbed out — but the module reads OWNER/NAME at import + # time. Patch the runtime env contract to a no-op for the + # duration of each test. + self._patches = [ + mock.patch.object(lpfc, "_require_runtime_env", return_value=None), + mock.patch.object(lpfc, "BASE_REF", "main"), + mock.patch.object(lpfc, "BASE_SHA", "deadbeefcafe"), + mock.patch.object(lpfc, "HEAD_SHA", "feedfaceabad"), + mock.patch.object(lpfc, "RECENT_COMMITS_N", 5), + ] + for p in self._patches: + p.start() + self.addCleanup(lambda: [p.stop() for p in self._patches]) + + def test_multiple_flips_aggregated_one_bad_blocks(self): + # PR flips 3 jobs; 1 has a recent fail → exit 1, naming that job. 
+ flips = [ + {"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI", + "job_key": "platform-build", "job_name": "Platform (Go)", + "context": "CI / Platform (Go) (push)"}, + {"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI", + "job_key": "canvas-build", "job_name": "Canvas (Next.js)", + "context": "CI / Canvas (Next.js) (push)"}, + {"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI", + "job_key": "python-lint", "job_name": "Python Lint & Test", + "context": "CI / Python Lint & Test (push)"}, + ] + clean = {"flip": flips[0], "checked_commits": 5, "masked_runs": [], + "fail_runs": [], "warnings": []} + bad = {"flip": flips[1], "checked_commits": 5, + "masked_runs": [{"sha": "abc1234567", "status": "success", + "target_url": "/x/y/actions/runs/1/jobs/0", + "samples": ["--- FAIL: TestSqlmock"]}], + "fail_runs": [], "warnings": []} + also_clean = {"flip": flips[2], "checked_commits": 5, "masked_runs": [], + "fail_runs": [], "warnings": []} + + with mock.patch.object(lpfc, "workflows_at_sha", return_value={}): + with mock.patch.object(lpfc, "detect_flips", return_value=flips): + with mock.patch.object(lpfc, "verify_flip", + side_effect=[clean, bad, also_clean]): + # Capture stdout to assert on naming. + captured = [] + with mock.patch("builtins.print", side_effect=lambda *a, **k: captured.append(" ".join(str(x) for x in a))): + rc = lpfc.main([]) + self.assertEqual(rc, 1) + # The blocking error message must name the failing job. + joined = "\n".join(captured) + self.assertIn("canvas-build", joined) + # And it must mention the empirical class so a reviewer can + # cross-link the right RFC. + self.assertTrue("mc#664" in joined or "PR#656" in joined) + + def test_no_flips_in_diff_exits_zero(self): + # Acceptance test #1 at main() level: empty flips → exit 0. 
+ with mock.patch.object(lpfc, "workflows_at_sha", return_value={}): + with mock.patch.object(lpfc, "detect_flips", return_value=[]): + rc = lpfc.main([]) + self.assertEqual(rc, 0) + + def test_all_flips_clean_exits_zero(self): + flips = [{"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI", + "job_key": "platform-build", "job_name": "Platform (Go)", + "context": "CI / Platform (Go) (push)"}] + clean = {"flip": flips[0], "checked_commits": 5, "masked_runs": [], + "fail_runs": [], "warnings": []} + with mock.patch.object(lpfc, "workflows_at_sha", return_value={}): + with mock.patch.object(lpfc, "detect_flips", return_value=flips): + with mock.patch.object(lpfc, "verify_flip", return_value=clean): + rc = lpfc.main([]) + self.assertEqual(rc, 0) + + def test_dry_run_forces_exit_zero_even_with_bad_flip(self): + # --dry-run never fails, even when verification finds masked runs. + flips = [{"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI", + "job_key": "platform-build", "job_name": "Platform (Go)", + "context": "CI / Platform (Go) (push)"}] + bad = {"flip": flips[0], "checked_commits": 5, + "masked_runs": [{"sha": "abc1234567", "status": "success", + "target_url": "/x/y/actions/runs/1/jobs/0", + "samples": ["--- FAIL: TestSqlmock"]}], + "fail_runs": [], "warnings": []} + with mock.patch.object(lpfc, "workflows_at_sha", return_value={}): + with mock.patch.object(lpfc, "detect_flips", return_value=flips): + with mock.patch.object(lpfc, "verify_flip", return_value=bad): + rc = lpfc.main(["--dry-run"]) + self.assertEqual(rc, 0) + + +# -------------------------------------------------------------------------- +# 6. 
Context-name rendering (the format Gitea Actions actually emits) +# -------------------------------------------------------------------------- +class TestContextName(unittest.TestCase): + def test_push_event(self): + self.assertEqual( + lpfc.context_name("CI", "Platform (Go)", "push"), + "CI / Platform (Go) (push)", + ) + + def test_pull_request_event(self): + self.assertEqual( + lpfc.context_name("CI", "Platform (Go)", "pull_request"), + "CI / Platform (Go) (pull_request)", + ) + + def test_workflow_name_falls_back_to_filename(self): + # No top-level `name:` → falls back to filename minus extension. + doc = {"jobs": {"foo": {"continue-on-error": True}}} + self.assertEqual( + lpfc.workflow_name(doc, fallback="my-workflow"), + "my-workflow", + ) + + +if __name__ == "__main__": + unittest.main() diff --git a/.gitea/workflows/lint-pre-flip-continue-on-error.yml b/.gitea/workflows/lint-pre-flip-continue-on-error.yml new file mode 100644 index 00000000..51d76d6a --- /dev/null +++ b/.gitea/workflows/lint-pre-flip-continue-on-error.yml @@ -0,0 +1,142 @@ +name: Lint pre-flip continue-on-error + +# Pre-merge gate: blocks PRs that flip `continue-on-error: true → false` +# on any job in `.gitea/workflows/*.yml` WITHOUT proof that the affected +# job's recent runs on the target branch (PR base) are actually green. +# +# Empirical class: PR #656 / mc#664. PR #656 (RFC internal#219 Phase 4) +# flipped 5 platform-build-class jobs `continue-on-error: true → false` +# on the basis of a "verified green on main via combined-status check". +# But that "green" was the LIE the prior `continue-on-error: true` +# produced: Gitea Quirk #10 (internal#342 + dup #287) — a failed step +# inside a `continue-on-error: true` job rolls up to a `success` +# job-level status. The precondition the PR claimed to verify was +# structurally fooled by the bug being flipped. 
+#
+# mc#664 captured the surfaced defects (2 mutually-masked regressions):
+#   - Class 1: sqlmock helper drift since 2f36bb9a (24 days old)
+#   - Class 2: OFFSEC-001 contract collision since 7d1a189f (1 day old)
+#
+# Codified 04:35Z as hongming-pc2 charter §SOP-N rule (e)
+# "run-log-grep-before-flip" — now structurally enforced here at PR
+# time, ahead of merge.
+#
+# How the gate works:
+#   1. Read every `.gitea/workflows/*.yml` at the PR base SHA AND at
+#      the PR head SHA via `git show <sha>:<path>` (no checkout
+#      needed).
+#   2. Parse both sides via PyYAML AST (NOT grep — per
+#      `feedback_behavior_based_ast_gates`). Walk
+#      `jobs.<job>.continue-on-error` on each side. A flip is
+#      base=true, head=false.
+#   3. For each flipped job, render the commit-status context as
+#      `"{workflow.name} / {job.name or job.key} (push)"` — that's
+#      how Gitea Actions emits the per-context status on `main`/
+#      `staging` runs.
+#   4. Pull last 5 commits on the PR base branch, fetch combined
+#      commit-status per commit, scan for the target context. For
+#      each match, fetch the run log via the web-UI route
+#      `{server_url}/{repo}/actions/runs/{run_id}/jobs/{job_idx}/logs`
+#      (per `reference_gitea_actions_log_fetch` —
+#      Gitea 1.22.6 lacks REST `/actions/runs/*`; web-UI is the
+#      only working path, see also
+#      `reference_gitea_1_22_6_lacks_rest_rerun_endpoints`).
+#   5. Grep each log for `--- FAIL`, `FAIL\s`, `::error::`. If
+#      the status is `success` but the log shows any of these,
+#      the job was masked. Block the PR with `::error::`.
+#
+# Graceful-degrade contract (per task halt-conditions):
+#   - Log fetch 404 (act_runner pruned the log, transient outage):
+#     emit `::warning::` "log unavailable" — does NOT block.
+#   - Zero recent runs of the flipped job's context on the base
+#     branch (newly added workflow): emit `::warning::` "no run
+#     history to verify" — allow the flip. Chicken-and-egg
+#     exemption.
+# - YAML parse error in one of the workflow files: warn-only, +# don't block — the YAML lint workflows catch this separately. +# +# Cross-links: PR#656, mc#664, PR#665 (interim re-mask), +# Quirk #10 (internal#342 + dup #287), hongming-pc2 charter +# §SOP-N rule (e), feedback_strict_root_only_after_class_a, +# feedback_no_shared_persona_token_use. +# +# Phase contract (RFC internal#219 §1 ladder): +# - This workflow lands at `continue-on-error: true` (Phase 3 — +# surface defects without blocking). Follow-up PR flips it to +# `false` ONLY after this workflow's own recent runs on `main` +# are confirmed clean — exactly the discipline the workflow +# itself enforces. Eat your own dogfood. + +on: + pull_request: + types: [opened, synchronize, reopened] + paths: + - '.gitea/workflows/**' + - '.gitea/scripts/lint_pre_flip_continue_on_error.py' + - '.gitea/workflows/lint-pre-flip-continue-on-error.yml' + +env: + # Per `feedback_act_runner_github_server_url` — without this, + # actions/checkout and friends default to github.com → break. + GITHUB_SERVER_URL: https://git.moleculesai.app + +permissions: + contents: read + # Need read on the API to pull combined commit-status + commit list + # for the base branch. The job-log fetch uses the same token via + # the web-UI route (Gitea 1.22.6 accepts `Authorization: token ...` + # there). + pull-requests: read + +concurrency: + group: lint-pre-flip-coe-${{ github.event.pull_request.head.sha || github.sha }} + cancel-in-progress: true + +jobs: + scan: + name: Verify continue-on-error flips have run-log proof + runs-on: ubuntu-latest + timeout-minutes: 8 + # Phase 3 (RFC #219 §1): surface broken flips without blocking + # the PR yet. Follow-up flips this to `false` once the workflow + # itself has clean recent runs on main. Dogfood: don't violate + # the same rule the workflow enforces. 
+    continue-on-error: true
+    steps:
+      - name: Check out PR head (full history for base-SHA access)
+        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        with:
+          # `git show <sha>:<path>` needs the base SHA's blobs.
+          # A shallow fetch-depth: 1 would miss them. Same rationale
+          # as check-migration-collisions.yml.
+          fetch-depth: 0
+      - name: Set up Python (PyYAML for AST parsing)
+        uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
+        with:
+          python-version: '3.12'
+      - name: Install PyYAML
+        # Same pin as ci-required-drift.yml — keep dependencies
+        # uniform so a Gitea runner cache hits across both jobs.
+        run: python -m pip install --quiet 'PyYAML==6.0.2'
+      - name: Ensure base ref is reachable locally
+        # `actions/checkout@v6 fetch-depth=0` usually pulls the base
+        # too, but explicit-fetch is cheap insurance against the
+        # form-of-ref differences across Gitea runner versions
+        # (mirrors the comment in check-migration-collisions.yml).
+        run: |
+          git fetch origin "${{ github.event.pull_request.base.ref }}" || true
+      - name: Run lint
+        env:
+          # Auto-injected by Gitea Actions; sufficient scope for
+          # combined-status + commit-list + log fetch via web-UI
+          # route. NO repo-admin needed (unlike the
+          # branch_protections endpoint).
+          GITEA_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          GITEA_HOST: git.moleculesai.app
+          REPO: ${{ github.repository }}
+          BASE_REF: ${{ github.event.pull_request.base.ref }}
+          BASE_SHA: ${{ github.event.pull_request.base.sha }}
+          HEAD_SHA: ${{ github.event.pull_request.head.sha }}
+          # Last 5 commits on the base branch is the spec default.
+          RECENT_COMMITS_N: '5'
+        run: python3 .gitea/scripts/lint_pre_flip_continue_on_error.py
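---

For reviewers reading the patch outside the repo: the fail-marker matcher that the `TestGrepFailMarkers` cases above pin down can be sketched as below. This is a minimal illustration consistent with those unit tests only — the function name and the cap of 5 mirror the tests, but the actual body of `lint_pre_flip_continue_on_error.py` may differ.

```python
import re

# Markers that betray a failed step even when `continue-on-error: true`
# rolled the job-level status up to "success" (Gitea Quirk #10):
#   --- FAIL    Go per-test failure line
#   ^FAIL\s     Go per-package failure line (FAIL<tab>pkg<tab>time)
#   ::error::   workflow-command error emitted by a step
_FAIL_MARKER = re.compile(r"--- FAIL|^FAIL\s|::error::")


def grep_fail_markers(log: str, cap: int = 5) -> list[str]:
    """Return up to `cap` log lines containing a fail marker."""
    hits: list[str] = []
    for line in log.splitlines():
        if _FAIL_MARKER.search(line):
            hits.append(line.rstrip())
            if len(hits) >= cap:
                break
    return hits
```

A clean log (`PASS` / `ok pkg 1.234s` lines only) yields `[]`; a masked run like `"ok example.com/foo 1.234s\n--- FAIL: TestBar (0.01s)"` yields the `--- FAIL` line, which is exactly the success-status-plus-fail-marker combination the gate blocks on.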