fix(ci): main-red-watchdog skips cancel-cascade entries (mc#1564) #1571

Merged
core-devops merged 2 commits from fix/main-red-watchdog-skip-cancel-cascade-mc1564 into main 2026-05-19 20:23:43 +00:00
Member

Summary

Closes #1564.

main-red-watchdog was filing phantom [main-red] issues because Gitea maps both action_run.status=2 (Failure) and status=3 (Cancelled) to commit-status string "failure". On a busy main with concurrency: cancel-in-progress: true, every merge burst cancels prior in-flight runs (status=3), inflating the watchdog's red%.

This PR implements option B from #1564 (description-string filter, no extra API call): is_red() excludes per-entry items whose description == "Has been cancelled" (the literal Gitea description for cancelled runs; real failures carry "Failing after Ns").

Verified empirically against mc#1562 body — that issue contains 3 real failures (Failing after 8s/1m49s/1m21s) + 1 cancel-cascade (Has been cancelled). With this filter, mc#1562 would have filed with only the 3 real failures (correct), and a pure-cancel-cascade SHA would not file at all.

Canonical Gitea 1.22.6 enum

Per models/actions/status.go:

1=Success, 2=Failure (real defect, file), 3=Cancelled (concurrency artifact, ignore),
4=Skipped, 5=Waiting, 6=Running, 7=Blocked

Source: operator memory reference_gitea_action_status_enum_corrected_2026_05_19 + reference_chronic_red_sweep_cancelled_vs_failed_filter (saved during mc#1529 triage).
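For readers who want the enum in code form, the mapping above could be captured like this. It is a sketch only: the class name `ActionRunStatus` and the helper `should_file` are hypothetical and are not taken from Gitea or from the watchdog. Only the integer values come from the list above.

```python
from enum import IntEnum


class ActionRunStatus(IntEnum):
    """Hypothetical mirror of the integer values listed in this PR."""
    SUCCESS = 1
    FAILURE = 2
    CANCELLED = 3
    SKIPPED = 4
    WAITING = 5
    RUNNING = 6
    BLOCKED = 7


def should_file(status: int) -> bool:
    """Only a real failure (2) warrants an issue; cancelled (3) does not."""
    return ActionRunStatus(status) is ActionRunStatus.FAILURE


print(should_file(2), should_file(3))  # True False
```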

Behaviour change

is_red() now drives red off the filtered failed list, not raw combined-state. Combined=failure with all-cancelled per-entry → not red. Combined=failure with empty statuses[] is preserved as red (the CI-emitter-direct edge case from render_body's existing fallback).

The match is exact (after strip), not substring, so a hypothetical real-failure description such as "Has been cancelled by ..." still counts as red.
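The behaviour described in this section can be sketched roughly as follows. This is an illustration under assumptions, not the code merged in this PR: the signature `is_red(combined_state, statuses)`, the `(red, failed)` return shape, and the entry keys `status` and `description` are hypothetical stand-ins for whatever the watchdog actually uses.

```python
CANCELLED_DESCRIPTION = "Has been cancelled"  # literal quoted in this PR


def is_red(combined_state, statuses):
    """Return (red, failed) for one commit (hypothetical shape)."""
    failed = [
        s for s in statuses
        if s.get("status") == "failure"
        # Exact compare after strip; a substring test would over-match.
        and (s.get("description") or "").strip() != CANCELLED_DESCRIPTION
    ]
    if failed:
        return True, failed
    # Fallback: combined failure with no per-entry detail stays red.
    return combined_state == "failure" and not statuses, failed


mixed = [
    {"status": "failure", "description": "Failing after 8s"},
    {"status": "failure", "description": "Has been cancelled"},
]
red, failed = is_red("failure", mixed)
print(red, [s["description"] for s in failed])  # True ['Failing after 8s']
```

With this shape, a commit whose only failure entries read "Has been cancelled" yields `(False, [])`, while `is_red("failure", [])` stays red through the fallback.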

Out of scope

  • Does NOT close phantom issues retroactively (per #1564 — separate cleanup).
  • The Platform (Go) sqlmock failures on ci.yml (18 in 7d, real status=2) remain to be filed separately once root-caused.

Test plan

  • 6 new unit tests in tests/test_main_red_watchdog.py covering: cancel-cascade alone (not red), real-failure alone (red), mixed (red, only real in body), all cancelled (not red), combined+empty fallback (still red), exact-match contract.
  • All 36 tests in tests/test_main_red_watchdog.py pass locally (Python 3.13).
  • Required CI contexts green on this PR.
  • First post-merge cron tick (:05) observed on real main: confirm that no new phantom [main-red] issue is filed when only cancel-cascade entries are present.

Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com

core-devops added 1 commit 2026-05-19 19:13:59 +00:00
fix(ci): main-red-watchdog skips cancel-cascade entries — closes #1564
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 18s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 19s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m8s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 6s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 6s
sop-tier-check / tier-check (pull_request) Successful in 7s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m22s
CI / Canvas (Next.js) (pull_request) Successful in 3m46s
CI / Platform (Go) (pull_request) Successful in 5m49s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Successful in 24s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 6m56s
CI / all-required (pull_request) Successful in 6m48s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
fcf08647c5
Gitea maps BOTH `action_run.status=2` (Failure) AND `status=3` (Cancelled)
to commit-status string `"failure"`. On a busy `main` with
`concurrency: cancel-in-progress: true`, every merge burst cancels prior
in-flight runs (status=3) — those bubble to the combined-status `failure`
rollup and inflate the watchdog's red%, generating phantom `[main-red]`
issues (mc#1562/#1552/#1540/#1532/#1527/#1526/#1522/#1503/#1487/#1484).

Per mc#1564 the cleanest filter at this layer is option B (description
string): cancelled-run entries carry description `"Has been cancelled"`,
real failures carry `"Failing after Ns"`. is_red() now excludes the
former from the failed[] list, and combined=failure alone (no per-entry
detail) only trips red when statuses[] is empty (the CI-emitter-direct
edge case from render_body's existing fallback).

Match is description == "Has been cancelled" exactly (after strip), not
substring, so a hypothetical real-failure log line containing that
phrase still counts as red.

Canonical Gitea 1.22.6 enum per `models/actions/status.go`:
  1=Success, 2=Failure, 3=Cancelled, 4=Skipped,
  5=Waiting, 6=Running, 7=Blocked
(reference: operator memory
 reference_gitea_action_status_enum_corrected_2026_05_19
 + reference_chronic_red_sweep_cancelled_vs_failed_filter)

Tests (6 new, all 36 in suite pass locally):
  - cancel-cascade entry alone → not red
  - real-failure entry alone → red (no over-filter)
  - mixed cancel + real → red, failed[] contains only real failures
  - all entries cancelled → not red (the phantom-issue case)
  - combined=failure + empty statuses[] → still red (preserve fallback)
  - exact-match contract (substring would over-match)

Refs:
  - mc#1564
  - mc#1529 (chronic-red triage that surfaced this)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
core-security approved these changes 2026-05-19 20:01:31 +00:00
core-security left a comment
Member

5-axis (core-security):

  1. Secrets: None handled. The watchdog reads Gitea status payloads; description-string match is a literal compare.

  2. Privilege: No new surface — same combined/statuses[] read shape. The cancel-cascade filter strictly NARROWS what counts as red (fewer issue-files), so least-privilege effect.

  3. Input handling: Description-string match uses exact compare after .strip() — explicitly rejects substring (the over-match test locks this contract: "Has been cancelled by the user unexpectedly" stays red). No regex eval surface.

  4. Trust boundary: Test coverage is comprehensive (6 new tests including the empty-statuses fallback edge — preserves the "combined=failure with no per-entry detail" breadcrumb path for the CI-emitter-direct case).

  5. Supply-chain: Pure Python stdlib. No new deps.

APPROVE.

core-qa approved these changes 2026-05-19 20:01:31 +00:00
core-qa left a comment
Member

5-axis (core-qa):

  1. Test-driven fix: 6 new tests covering the cancel-cascade filter — exact-match contract, mixed-cancel-and-real, all-cancel green, real-failure preservation, empty-statuses fallback, and the over-match guard. Each test cites the corresponding memory entry.

  2. Canonical alignment: Matches reference_chronic_red_sweep_cancelled_vs_failed_filter — filter must exclude status=3 from BOTH numerator and denominator. Verified description-string contract "Has been cancelled" (Gitea 1.22.6).

  3. Edge coverage:

    • Pure cancel-cascade → green (mc#1562/#1552/#1540 phantom issue prevention).
    • Mixed entries → red but failed[] list contains only real failures (clean issue body).
    • Combined=failure + empty statuses[] → still red (CI-emitter-direct breadcrumb preserved).
    • Substring over-match guard (locks down exact-match contract).
  4. Forward-compat: Header documents that if Gitea renames the description string in a future release, cancel-cascade entries simply leak back through (visible-not-silent). Option A (resolve action_run.status integer via target_url) documented as escalation path.

  5. Reversibility: Pure filter narrowing; if it over-filters, real failures are still seen via combined-status fallback path.

APPROVE.
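The numerator/denominator point in item 2 can be illustrated with a small worked calculation. This is a sketch only: the function name `red_percent` and the input (a flat list of integer run statuses) are hypothetical, and the integer values follow the enum quoted in the PR description.

```python
FAILURE, CANCELLED = 2, 3  # values from the enum quoted in this PR


def red_percent(run_statuses):
    """Share of real failures among runs, ignoring cancelled runs entirely."""
    counted = [s for s in run_statuses if s != CANCELLED]  # denominator
    if not counted:
        return 0.0
    failures = sum(1 for s in counted if s == FAILURE)  # numerator
    return 100.0 * failures / len(counted)


# Three successes, one real failure, three cancel-cascade runs.
runs = [1, 1, 2, 3, 3, 3, 1]
print(red_percent(runs))  # 25.0
```

Counting the three cancelled runs as failures would instead give 4 of 7 (about 57%), which is the inflation the filter is meant to remove.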

Author
Member

/qa-recheck

Author
Member

/security-recheck

core-devops added 1 commit 2026-05-19 20:07:22 +00:00
Merge branch 'main' into fix/main-red-watchdog-skip-cancel-cascade-mc1564
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 17s
E2E Chat / detect-changes (pull_request) Successful in 24s
E2E API Smoke Test / detect-changes (pull_request) Successful in 25s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Lint no tenant GITEA/GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m4s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Successful in 3s
security-review / approved (pull_request) Successful in 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Successful in 2m35s
sop-tier-check / tier-check (pull_request) Successful in 4s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 8s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 5m41s
CI / Python Lint & Test (pull_request) Successful in 7m13s
CI / all-required (pull_request) Successful in 7m14s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 8s
876ef122be
core-devops merged commit 7054b75650 into main 2026-05-19 20:23:43 +00:00
3 Participants
Reference: molecule-ai/molecule-core#1571