Commit Graph

2 Commits

Author SHA1 Message Date
5674b0e067 fix(ci): status-reaper rev4 reads per-context "status" key not "state" (compensation was unreachable since rev1)
Some checks failed
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
qa-review / approved (pull_request) Failing after 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
security-review / approved (pull_request) Failing after 11s
sop-tier-check / tier-check (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 13s
gate-check-v3 / gate-check (pull_request) Successful in 12s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 14s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3s
CI / all-required (pull_request) Successful in 2s
Schema asymmetry in Gitea 1.22.6 combined-status response:
  - top-level `combined.state`         → uses key "state"
  - per-entry `combined.statuses[i].*` → uses key "status", NOT "state"

Pre-rev4 the per-entry loop in reap() (and the matching is_red() /
render_body() in main-red-watchdog) read `s.get("state")` only, which
returned None on every real Gitea response → state coerced to "" →
`"" != "failure"` guard preserved every entry → compensation path
unreachable since rev1.

Empirical proof (orchestrator probe 2026-05-12 03:42Z):
  GET /repos/molecule-ai/molecule-core/commits/210da3b1/status
  → 29 per-entry items, ALL have key "status", ZERO have key "state".
  status value distribution: {success:18, failure:8, pending:3}.
  rev3 production run 17516 reported preserved_non_failure=585=30*19.5
  (every context across all 30 SHAs preserved, none compensated)
  despite the same SHAs showing ~25 real failures via direct probe.

Fix is one line per call site:
  s.get("state") → s.get("status") or s.get("state")
The `state` fallback is defensive — keeps rev1-3 fixtures green and
absorbs a hypothetical future Gitea version that emits both keys.

Sibling-script audit:
  - main-red-watchdog.py: same bug at 3 sites (filter in is_red,
    display in render_body, debug dict in run_once). Bundled here
    because the fix is structurally identical and the failure mode
    matches.
  - ci-required-drift.py: no per-entry status iteration. Clean.

Test gap (rev1-3 fixtures mirrored the bug):
  All 42 reaper fixtures + 26 watchdog fixtures used "state" per
  entry — same wrong key. That's why rev1-3 tests stayed green while
  the production code was no-op. Logged under
  `feedback_smoke_test_vendor_truth_not_shape_match`.

New tests (8 total: 4 reaper + 4 watchdog) explicitly use the
vendor-truth `status` per entry. Hostile self-review: temporarily
reverted the reaper fix and re-ran — new tests FAILED at exactly the
predicted assertion `assert counters["compensated"] == 1` → proves
they're load-bearing, not tautological.

Cross-links:
  task #90 (orchestrator), task #46 (hongming-pc2 paired investigation)
  PR #618 (rev1), PR #633 (rev2), PR #650 (rev3 widened window)
2026-05-11 20:44:20 -07:00
2588b4ecbc feat(ci): main-red watchdog (Option C of main-never-red directive) — closes #420
All checks were successful
audit-force-merge / audit (pull_request) Successful in 18s
Adds a sentinel that detects post-merge CI red on `main` and files an
idempotent `[main-red] {repo}: {SHA[:10]}` issue. Auto-closes the issue
when main returns to green. Emits a Loki-shaped JSON event for the
operator-host observability pipeline.

Pattern source: CP `0adf2098` (ci-required-drift). Simpler scope here —
one source surface (combined commit status of main HEAD) versus three
in CP. Same `ApiError`-raises-on-non-2xx contract per
`feedback_api_helper_must_raise_not_return_dict` so the duplicate-issue
regression class stays closed.

Does NOT auto-revert. Option B is explicitly rejected per
`feedback_no_such_thing_as_flakes` + `feedback_fix_root_not_symptom`.
The watchdog files an alarm; humans fix forward.

Files:
  - .gitea/workflows/main-red-watchdog.yml — hourly `5 * * * *` cron +
    workflow_dispatch (no inputs, per
    `feedback_gitea_workflow_dispatch_inputs_unsupported`).
  - .gitea/scripts/main-red-watchdog.py — sidecar with `--dry-run`.
  - tests/test_main_red_watchdog.py — 26 pytest cases.

Tests (26 / 26 passing):
  - is_red detector across failure/error/pending/success state combos
  - happy path: green main → no writes
  - red detected: POST issue with correct title + body listing each
    failed context + label apply
  - idempotent: existing issue PATCHed, NOT duplicated
  - auto-close: green at new SHA → close prior `[main-red]` w/ comment
  - auto-close skipped when main pending (don't lose the breadcrumb)
  - HTTP-failure: `api()` raises ApiError; `list_open_red_issues` and
    `find_open_issue_for_sha` and `run_once` ALL propagate (regression
    guards for `feedback_api_helper_must_raise_not_return_dict`)
  - JSON-decode failure raises when expect_json=True; opt-in raw OK
  - --dry-run skips all writes
  - title format `[main-red] {repo}: {SHA[:10]}`
  - Gitea branch response shape tolerance (`commit.id` OR `commit.sha`)
  - Loki emitter survives `logger` not installed / subprocess failure
  - runtime env guard exits when required vars missing

Hostile self-review proven: 2 transient-error tests FAIL on a pre-fix
implementation (verified by injecting `try: ... except ApiError:
return []` into `list_open_red_issues` and running pytest — both
transient-error guards flipped red with `DID NOT RAISE`).

Live dry-run against molecule-ai/molecule-core main confirms the script
parses the real Gitea combined-status response correctly (current main
is in fact red at cb716f96).

Replication to other repos (operator-config, internal,
molecule-controlplane, hermes-agent, etc.) is out of scope for this
PR — molecule-core pilot only, per task brief.

Tracking: #420.
2026-05-11 00:36:20 -07:00