fix(harness): count __SKIP__/__XFAIL__ replays as skips, not passes #2872
False-green audit finding: the harness replay runner counted any replay that exited 0 as PASS, including `canary-smoke-a2a-pong.sh`, which exits 0 immediately with an `__XFAIL__` marker (blocked on #2863). That inflated the pass count while the replay exercised nothing.

Changes:
- The runner now counts replays that emit `__SKIP__`/`__XFAIL__` markers as skips, not passes.
- Switched `canary-smoke-a2a-pong.sh` to the `__SKIP__` marker so the runner classifies it correctly (the xfail reason and #2863 reference stay in the human-readable output).

Verification:
- `bash -n` passes for both modified scripts.

No replay semantics changed; the runner now honestly reports xfails as skips instead of false-greens.
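
For reviewers, the classification rule described above can be sketched roughly as follows. This is not the code from run-all-replays.sh: the function name `classify_replay`, its arguments, and the choice to match a marker anywhere in the replay output are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not the actual run-all-replays.sh logic.
# classify_replay EXIT_CODE OUTPUT -> prints PASS, SKIP, or FAIL.
classify_replay() {
  local exit_code="$1" output="$2"
  if [ "$exit_code" -ne 0 ]; then
    echo "FAIL"   # a non-zero exit is always a failure
  elif printf '%s\n' "$output" | grep -Eq '__(SKIP|XFAIL)__'; then
    echo "SKIP"   # exit 0 plus a marker: the replay exercised nothing
  else
    echo "PASS"   # exit 0 and no marker
  fi
}
```

Checking the exit code before the marker means a replay that prints a marker but exits non-zero is still reported as a failure, so a marker cannot hide a real failure in this sketch.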
Routing: 2-genuine review (CR2 + Researcher). Do not self-merge.
APPROVED on fff480c6. Verified against the actual Harness Replays CI job, not local lint: job 501817 reports canary-smoke-a2a-pong as `SKIP` after the `__SKIP__:#2863` marker, while genuinely passing replays still report `PASS`. The final summary is `7 passed, 0 failed, 1 skipped (of 8 total)`, so the old 8/8 false-pass inflation is gone. The runner still treats a non-zero replay exit as FAIL, records the failed name, and exits 1 when FAIL_COUNT > 0, so real failures still surface. Scope is limited to run-all-replays.sh and the a2a-pong marker; no conflict with the later #2863 un-xfail beyond the expected file overlap.
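
The tally and exit behaviour described in this review could look something like the sketch below. The function name `summarize` and its interface (one PASS/SKIP/FAIL word per replay) are hypothetical; only the summary format is taken from the CI output quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; names and interface are assumptions, not the runner's code.
# summarize RESULT... -> prints a summary line; returns non-zero if any result failed.
summarize() {
  local pass=0 fail=0 skip=0 result
  for result in "$@"; do
    case "$result" in
      PASS) pass=$((pass + 1)) ;;
      SKIP) skip=$((skip + 1)) ;;
      *)    fail=$((fail + 1)) ;;   # anything unrecognized counts as a failure
    esac
  done
  echo "$pass passed, $fail failed, $skip skipped (of $# total)"
  [ "$fail" -eq 0 ]                 # return status: 0 only when nothing failed
}
```

Called as `summarize PASS PASS PASS PASS PASS PASS PASS SKIP`, this prints `7 passed, 0 failed, 1 skipped (of 8 total)` and returns 0; a caller would exit 1 on a non-zero return.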