Two more changes in evaluate_merge_readiness + get_combined_status:
4. **Skip PR-level combined state check**: The combined state is also
polluted by non-blocking jobs (continue-on-error: true). The
queue-bot now checks only the explicitly required PR-level contexts
(CI/all-required, sop-checklist/all-items-acked) instead of the full
combined state. This unblocks PRs whose only failures are pr-validate
timeouts or qa/sec token issues.
5. **Best-effort status fetch with graceful fallback**: Fetching
/statuses?limit=200 can time out on commits with many statuses (e.g.
main, with 550+ entries). The fetch now catches
ApiError/URLError/TimeoutError/OSError and falls back to the
statuses[] already in the combined response (usually 30 entries,
enough for push-required contexts). Also lowered limit to 50 to
reduce transfer size.
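
A minimal sketch of the fallback in item 5, not the bot's actual code.
The function name statuses_with_fallback, the fetch_full callable, and
the local ApiError class are hypothetical stand-ins; only the list of
caught exception types comes from the change itself.

```python
from urllib.error import URLError


class ApiError(Exception):
    """Hypothetical stand-in for the bot's own API error type."""


def statuses_with_fallback(combined, fetch_full):
    """Return the full status list, or the combined response's list on failure.

    combined   -- parsed combined-status response (dict with a "statuses" list)
    fetch_full -- zero-argument callable that performs the /statuses request
    """
    try:
        return fetch_full()
    except (ApiError, URLError, TimeoutError, OSError):
        # Best effort: fall back to the (possibly truncated) list that the
        # combined response already carries instead of failing the tick.
        return combined.get("statuses") or []
```

URLError and TimeoutError are already subclasses of OSError in Python 3;
they are listed separately here only to mirror the commit text.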
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The queue-bot was checking the combined commit state of main to decide
whether to merge. Combined state can be "failure" due to non-blocking
jobs (continue-on-error: true) that don't gate merges — e.g. Platform
Go on main push fails due to mc#774 but that does not block PRs.
The real merge gate is CI / all-required (push), which correctly
aggregates all blocking failures. Switching to explicit context checks
also fixes two latent bugs:
1. latest_statuses_by_context() kept the FIRST (oldest) occurrence of
each context. Gitea's /status endpoint returns statuses in ascending
id order, so a stale result shadowed newer ones for the same context,
and required-context entries were often missing from the truncated
30-entry array. Fixed by iterating in reverse so the LAST (newest)
occurrence wins.
2. The /status endpoint caps statuses[] at 30 entries. Fixed by also
fetching /statuses?limit=200 to get the full list.
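
The newest-wins fix and the explicit context check can be sketched as
follows. This is an illustration under assumptions, not the real
implementation: it assumes each status is a dict with "context" and
"status" keys and that the input is in ascending id order, and
required_contexts_pass is a hypothetical helper name.

```python
def latest_statuses_by_context(statuses):
    """Map each context to its newest status, assuming ascending-id input."""
    latest = {}
    for status in reversed(statuses):
        # Walking backwards, the first entry seen for a context is its newest,
        # so setdefault keeps it and ignores the older ones that follow.
        latest.setdefault(status["context"], status)
    return latest


def required_contexts_pass(statuses, required):
    """True only if every required context's newest status is "success"."""
    latest = latest_statuses_by_context(statuses)
    # A missing context counts as not passing.
    return all(latest.get(ctx, {}).get("status") == "success" for ctx in required)
```

With this shape, a failing non-required context such as Platform Go does
not affect the result unless it is listed in required.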
Tests: a dry run now shows the queue processing PR #942 (skipped:
wrong base); PR #978 would be processed on the next tick.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
SRE action: push an empty commit to clear stale CI failures left over
from the runner exhaustion window. Platform Go and Handlers Postgres
push jobs ran successfully at 09:01 on PRs; the stale failures from
05:42 on main SHA 8026f020 are blocking the merge queue.