Phase 4 of the force-merge protection fix (internal#219 §2).
Changes:
- audit-force-merge.yml REQUIRED_CHECKS: add CI / all-required (pull_request)
— closes the audit gap; force-merge audit now checks ci/all-required.
- ci.yml: flip continue-on-error: false on stable jobs
(changes, platform-build, canvas-build, shellcheck, python-lint)
— confirmed green on main 2026-05-12 combined-status check.
The all-required sentinel (continue-on-error: true) will be flipped
once branch protection PATCH lands (Owner-tier, delegated separately).
NOT included in this PR (separate Owner-tier action required):
- Branch protection PATCH: add ci/all-required as required check on main.
Needed to make the sentinel actually block merges. Delegate to Core
Platform Lead.
Refs: molecule-core#622, molecule-core#623
Schema asymmetry in Gitea 1.22.6 combined-status response:
- top-level `combined.state` → uses key "state"
- per-entry `combined.statuses[i].*` → uses key "status", NOT "state"
Pre-rev4 the per-entry loop in reap() (and the matching is_red() /
render_body() in main-red-watchdog) read `s.get("state")` only, which
returned None on every real Gitea response → state coerced to "" →
`"" != "failure"` guard preserved every entry → compensation path
unreachable since rev1.
Empirical proof (orchestrator probe 2026-05-12 03:42Z):
GET /repos/molecule-ai/molecule-core/commits/210da3b1/status
→ 29 per-entry items, ALL have key "status", ZERO have key "state".
status value distribution: {success:18, failure:8, pending:3}.
rev3 production run 17516 reported preserved_non_failure=585=30*19.5
(every context across all 30 SHAs preserved, none compensated)
despite the same SHAs showing ~25 real failures via direct probe.
Fix is one line per call site:
s.get("state") → s.get("status") or s.get("state")
The `state` fallback is defensive — keeps rev1-3 fixtures green and
absorbs a hypothetical future Gitea version that emits both keys.
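A minimal sketch of the corrected read, for illustration only. The helper name `entry_state` and the sample entries are hypothetical; the real call sites inline the expression shown above.

```python
from collections import Counter


def entry_state(entry: dict) -> str:
    """Return the per-entry result of a combined-status item.

    Prefers "status" (the key Gitea 1.22.6 emits per entry) and falls
    back to "state" so older fixtures keep working.
    """
    return entry.get("status") or entry.get("state") or ""


# Hypothetical entries shaped like the per-entry items described above.
statuses = [
    {"context": "CI / build (push)", "status": "failure"},
    {"context": "CI / lint (push)", "status": "success"},
    {"context": "legacy fixture", "state": "pending"},
]

distribution = Counter(entry_state(s) for s in statuses)
failures = [s["context"] for s in statuses if entry_state(s) == "failure"]
```

Using `or` rather than a default argument to `get` means an empty or null `status` value also falls through to `state`.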
Sibling-script audit:
- main-red-watchdog.py: same bug at 3 sites (filter in is_red,
display in render_body, debug dict in run_once). Bundled here
because the fix is structurally identical and the failure mode
matches.
- ci-required-drift.py: no per-entry status iteration. Clean.
Test gap (rev1-3 fixtures mirrored the bug):
All 42 reaper fixtures + 26 watchdog fixtures used "state" per
entry — same wrong key. That's why rev1-3 tests stayed green while
the production code was a no-op. Logged under
`feedback_smoke_test_vendor_truth_not_shape_match`.
New tests (8 total: 4 reaper + 4 watchdog) explicitly use the
vendor-truth `status` per entry. Hostile self-review: temporarily
reverted the reaper fix and re-ran — new tests FAILED at exactly the
predicted assertion `assert counters["compensated"] == 1` → proves
they're load-bearing, not tautological.
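Why the old fixtures could not catch the bug, as a small sketch. All names here (`read_buggy`, `read_fixed`, `count_failures`, the two fixtures) are hypothetical and only mirror the shape of the problem, not the real test suite.

```python
def read_buggy(entry: dict) -> str:
    # Pre-rev4 behaviour: only looks at "state".
    return entry.get("state") or ""


def read_fixed(entry: dict) -> str:
    # Rev4 behaviour: vendor-truth key first, "state" as fallback.
    return entry.get("status") or entry.get("state") or ""


def count_failures(statuses, read) -> int:
    """Count entries the given reader classifies as failures."""
    return sum(1 for s in statuses if read(s) == "failure")


# A fixture that copies the code's assumption vs. one that copies Gitea.
shape_match_fixture = [{"context": "wf / job (push)", "state": "failure"}]
vendor_truth_fixture = [{"context": "wf / job (push)", "status": "failure"}]

# The shape-match fixture passes for both readers, so it proves nothing.
both_pass = (
    count_failures(shape_match_fixture, read_buggy),
    count_failures(shape_match_fixture, read_fixed),
)
# The vendor-truth fixture separates the buggy reader from the fixed one.
only_fixed_passes = (
    count_failures(vendor_truth_fixture, read_buggy),
    count_failures(vendor_truth_fixture, read_fixed),
)
```

A test is only load-bearing if reverting the fix makes it fail, which is what the vendor-truth fixture achieves here.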
Cross-links:
task #90 (orchestrator), task #46 (hongming-pc2 paired investigation)
PR #618 (rev1), PR #633 (rev2), PR #650 (rev3 widened window)
Phase 1+2 evidence (rev2 PR#633, merged 01:48Z): 6/6 ticks post-merge
with `compensated:0` despite ~25 known-stranded reds visible across
those same 10 SHAs on direct probe ~30min later. Reaper run 17057 at
02:46Z explicitly logged:
scanned 42 workflows; push-triggered=19, class-O candidates=23
status-reaper summary: {compensated:0, preserved_non_failure:185,
scanned_shas:10, limit:10}
Root cause: schedule workflows post `failure` to commit-status
RETROACTIVELY 5-15 min after their merge. By the time reaper's next
*/5 tick lands, the stranded red is on a SHA that has already fallen
OUTSIDE a 10-commit window during a burst-merge period. Reaper
algorithm is correct; the lookback window is too narrow vs. the
retroactive-failure-post lag.
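A back-of-envelope sketch of the window argument. The merge rate is an assumed illustrative figure, not a measurement; the 15 min lag and 5 min tick come from the description above, and the function names are hypothetical.

```python
import math


def worst_case_depth(post_lag_min: float, tick_interval_min: float,
                     merges_per_min: float) -> int:
    """Estimate how many commits deep a stranded red can be when seen.

    Exposure is the retroactive post lag plus one full cron interval;
    each merge during that time pushes the SHA one position deeper.
    """
    exposure_min = post_lag_min + tick_interval_min
    return math.ceil(exposure_min * merges_per_min)


def in_window(depth: int, limit: int) -> bool:
    """Depth 0 is HEAD; a sweep of `limit` commits covers 0..limit-1."""
    return depth < limit


# Assumption: a burst of roughly one merge per minute.
depth = worst_case_depth(post_lag_min=15, tick_interval_min=5,
                         merges_per_min=1.0)
seen_by_rev2 = in_window(depth, limit=10)
seen_by_rev3 = in_window(depth, limit=30)
```

Under that assumed burst rate the stranded SHA sits well past a 10-commit window but inside a 30-commit one.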
Three-in-one fix (atomic per hongming-pc2 GO 03:25Z):
1. `.gitea/scripts/status-reaper.py`
DEFAULT_SWEEP_LIMIT 10 -> 30. Widening the window is cheap, whereas
raising the cadence adds load; kept `*/5` cron unchanged (avoiding
`*/2`, which would double runner load).
2. `.gitea/workflows/status-reaper.yml`
Restore schedule cron block (revert mc#645 comment-out for THIS
workflow only). Cron stays `*/5 * * * *`.
3. `.gitea/workflows/main-red-watchdog.yml`
Restore schedule cron block (revert mc#645 comment-out) AND raise
job-level `timeout-minutes: 5 -> 15`. Original 5min cap was
producing cancels under runner-saturation latency, which fed the
very `[main-red]` issues this workflow files (self-poisoning).
4. `tests/test_status_reaper.py`
+ test_default_sweep_limit_is_30 (contract pin)
+ test_reap_widened_window_catches_retroactive_failure: mocks 30
SHAs, plants the failing context on SHA[20] (depth strictly past
rev2's window=10), asserts the compensation POST lands on that
SHA. Existing tests retain explicit `limit=10` overrides and
remain unchanged. Suite: 42/42 passed (was 40 + 2 new).
Verification plan (post-merge, 10-15 min after merge / 2-3 cron ticks):
- DB: SELECT id, status FROM action_run WHERE workflow_id=
'status-reaper.yml' ORDER BY id DESC LIMIT 5 -> all status=1
- Log via web UI:
/molecule-ai/molecule-core/actions/runs/<index>/jobs/0/logs ->
summary line should now show compensated > 0 with
compensated_per_sha populated
- Direct probe: pick a SHA in the last 30 main commits with class-O
fails, GET /repos/molecule-ai/molecule-core/commits/{sha}/status
-> compensated contexts now show state=success with description
starting 'Compensated by status-reaper'
If rev3 STILL shows compensated:0 after the window-widening, the
diagnosis is wrong and a DIFFERENT bug needs to be uncovered (per
hongming-pc2 caveat 03:25Z). Re-enabling the crons IS the diagnosis
verification.
Cross-links:
- PR#618 (rev1, drop-concurrency, merge 4db64bcb)
- PR#633 (rev2, sweep-recent-commits, merge e7965a0f)
- PR#645 (interim disable, merge 4c54b590) — re-enable being reverted
- task #90 (orch rev3 tracker) / task #46 (hongming-pc2 tracker)
- feedback_brief_hypothesis_vs_evidence (empirical evidence above)
- feedback_strict_root_only_after_class_a (3-in-one root fix vs.
longer patching chain)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Same root cause as sop-tier-check.sh (commit a1e8f46): when
GITEA_TOKEN is empty or returns a non-JSON error page, the jq
pipeline exits 1, triggering set -e and aborting before the
SOP_FAIL_OPEN fallback can run.
Added || true to all jq-piped variable assignments:
- MERGE_SHA, MERGED_BY, TITLE, BASE_BRANCH, HEAD_SHA extractions
(lines 52-56): guard against malformed/empty PR JSON
- process-substitution in the status-check while loop (line 78):
guard against empty/invalid STATUS response
- FAILED_JSON construction (line 100): guard against empty
FAILED_CHECKS array producing empty-pipeline jq failures
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
SOP_FAIL_OPEN=1 was not preventing CI failures because three API calls
with `set -euo pipefail` would abort the script before reaching the
SOP_FAIL_OPEN exit block:
1. `WHOAMI=$(curl ... | jq -r ...)` — jq exits 1 on empty input,
triggering set -e → script exits before SOP_FAIL_OPEN check.
2. `curl` for reviews — curl exits non-zero on 401 from empty token,
triggering set -e → same problem.
3. `curl` for org teams list — same issue.
Fix: add `|| true` to jq pipelines and `set +e` / `set -e` guards
around curl calls that may fail with empty token. When SOP_FAIL_OPEN=1
and the token is invalid, the script now exits 0 instead of 1,
preventing blocking CI failures on unconfigured runners.
Refs: sop-tier-check failure on PRs #617, #621, #587, #562
Root cause: DRIFT_BOT_TOKEN lacks repo-admin scope → Gitea 1.22.6's
`GET /repos/.../branch_protections/{branch}` returns 403/404 → ApiError
→ non-zero exit → workflow red. The token trail (internal#329) was never
completed for mc-drift-bot on molecule-core.
Fix (script): catch ApiError on the protection fetch; on 403/404 log a
clear ::error:: diagnostic explaining the token-scope gap and return
empty findings (skip this branch). The issue IS the alarm, not a red
workflow. 5xx is still propagated (transient outage).
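The error-handling split described above could look roughly like this. It is a sketch under stated assumptions: `ApiError` here is a stand-in with a `status` attribute, and `findings_for_branch`, `fetch_protection` and `compute_findings` are hypothetical names, not the script's real functions.

```python
class ApiError(Exception):
    """Hypothetical stand-in for the script's ApiError; carries the HTTP status."""

    def __init__(self, status: int, message: str = ""):
        super().__init__(f"HTTP {status} {message}".strip())
        self.status = status


def findings_for_branch(branch, fetch_protection, compute_findings, log=print):
    """Return drift findings for one branch.

    403/404 on the protection fetch is treated as a token-scope gap:
    log a diagnostic and skip the branch. Anything else (5xx included)
    is re-raised so a transient outage still fails the run.
    """
    try:
        protection = fetch_protection(branch)
    except ApiError as exc:
        if exc.status in (403, 404):
            log(
                f"::error::cannot read branch protection for {branch} "
                f"(HTTP {exc.status}); the token likely lacks repo-admin scope"
            )
            return []
        raise
    return compute_findings(protection)
```

Passing the fetcher and the logger in as arguments keeps the branch logic testable without a live Gitea instance.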
Fix (workflow): remove stale transitional comment that claimed the
all-required sentinel didn't exist yet (it landed in #553).
Fixes: infra/ci-required-drift red on main (210da3b1→4db64bcb).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
RFC#420 Option-C machinery has been down ~2.5h:
- status-reaper rev2 (PR#633, merged 01:48Z): 0 'Compensated by status-reaper'
status on the last 14 main commits. Schedule reds stranded on stale
commits despite the rev2 sweep-last-10 design.
- main-red-watchdog: 'Failing after 10m56s' with timeout-minutes:5 — runner
saturation queue-lag pushed it past its own timeout. No [main-red] issues
filed during the outage despite 5 reds on HEAD e7965a0f at the
high-water mark.
Both workflows were themselves contributing to the red pileup on main +
queuing the ubuntu-latest pool. Cheap-and-safe interim: comment out the
schedule: blocks. workflow_dispatch: stays so they can be triggered
manually for debugging.
Re-enable after:
1. rev3 lands (likely scan_workflows() should LOG-and-skip rather than
sys.exit on a malformed workflow; list_recent_commit_shas() should
degrade gracefully)
2. Dedicated status-ops runner-label (route status-reaper + watchdog +
ci-required-drift to it so they don't queue behind CI-merge-churn)
Per hongming-pc2 02:31Z directive: 'pick one: rev3+raise-timeout OR
temporarily disable the crons'. Choosing disable for safety while rev3
investigation proceeds.
Reviewed-by: hongming-pc2 (pre-APPROVE on sight 02:31Z)
Author: claude-ceo-assistant (orchestrator emergency; operator-host
unreachable 02:01-02:38Z blocked SSH-bridge to core-devops persona)
Cross-links: task #90 (rev2), task #75 (main-red sweep), RFC#420 Option-C
rev1 (PR #618, merged 4db64bcb) only inspected the CURRENT main HEAD per
tick. Schedule workflows post `failure` to whatever SHA was HEAD when the
run COMPLETED, which by the next */5 tick is usually a stale commit
because main has already moved forward via merges. Result: rev1 was
running successfully but with `compensated:0` on every tick across ~6
cycles (orchestrator + hongming-pc2 Phase 1+2 evidence 23:46Z / 23:59Z /
00:02Z); reds stranded on stale commits.
rev2 sweeps the last 10 main commits per tick:
- New `list_recent_commit_shas(branch, limit)` wraps
GET /repos/{o}/{r}/commits?sha={branch}&limit={limit}. Vendor-truth
probe 2026-05-11 confirms Gitea 1.22.6 returns a JSON list of commit
objects with `sha` keys (per
`feedback_smoke_test_vendor_truth_not_shape_match`).
- New `reap_branch()` orchestrates the sweep:
- For each SHA: GET combined status with PER-SHA ERROR ISOLATION
(refinement #7): an ApiError on one stale SHA logs `::warning::` and
continues to the next. Different from the single-HEAD pre-rev2 path
where fail-loud was correct; the sweep is best-effort across
historical commits.
- When `combined.state == "success"`: skip the per-context loop
entirely (refinement #2, cost optimization, common case).
- Otherwise delegate to the existing per-SHA `reap()` worker (logic
UNCHANGED — `_has_push_trigger` / `parse_push_context` /
`scan_workflows` not touched per refinement #6).
- Aggregated counters preserve all rev1 fields PLUS:
- `scanned_shas`: how many SHAs we actually iterated (always 10
in normal operation; fewer if the commits API returns fewer)
- `compensated_per_sha`: {<full_sha>: [<context>, ...]} for the
SHAs that actually got at least one compensation
- `reap()` now also returns `compensated_contexts` so `reap_branch()`
can build `compensated_per_sha` without re-deriving it from the POST
stream. Backwards-compatible — all existing test assertions check
specific counter keys, none enforce a closed dict shape.
- `main()` switches from `get_head_sha` + `get_combined_status` + `reap`
to a single `reap_branch()` call. Adds `--limit` CLI flag for
ops-driven sweep-width tuning (default 10).
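The sweep described above can be sketched as follows. This is not the real `reap_branch()`: the signature, the injected `get_combined_status` and `reap` callables, and the choice to count errored SHAs in `scanned_shas` are all assumptions made for illustration.

```python
class ApiError(Exception):
    """Hypothetical stand-in for the script's ApiError."""


def reap_branch(shas, get_combined_status, reap, log=print):
    """Sweep a list of SHAs, isolating errors per SHA.

    - ApiError on one SHA is logged and skipped (best-effort sweep).
    - A combined state of "success" short-circuits the per-context loop.
    - Everything else is delegated to the per-SHA `reap` worker.
    """
    counters = {"scanned_shas": 0, "compensated": 0, "compensated_per_sha": {}}
    for sha in shas:
        counters["scanned_shas"] += 1
        try:
            combined = get_combined_status(sha)
        except ApiError as exc:
            log(f"::warning::skipping {sha}: {exc}")
            continue
        if combined.get("state") == "success":
            continue  # common case: nothing to compensate
        result = reap(sha, combined)
        contexts = list(result.get("compensated_contexts", []))
        counters["compensated"] += len(contexts)
        if contexts:
            counters["compensated_per_sha"][sha] = contexts
    return counters
```

Only SHAs with at least one compensation appear in `compensated_per_sha`, which keeps the summary line short on quiet ticks.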
Design choices (refinements 1-4):
- N=10: covers the burst-merge window between */5 ticks; older reds
falling off is acceptable (the schedule run that posted them has long
since been overwritten by a real push trigger).
- Skip combined=success early: most commits in the window will be green;
short-circuit before the per-context loop saves work.
- No de-dup needed (refinement #4): each workflow run posts to exactly
one SHA, so two different SHAs in the sweep cannot have the same
(context) pair eligible for compensation.
Test suite: 37 + 3 = 40/40 cases pass.
- New: test_reap_sweeps_n_shas_smoke (mock 3 SHAs, verify each GET'd)
- New: test_reap_skips_combined_success_shas (verify the
combined=success short-circuit; only the 1 failure SHA is iterated)
- New: test_reap_continues_on_per_sha_apierror (per-SHA error isolation
contract — ApiError on SHA[0] logged + skipped + SHA[1] processes)
- All 37 existing rev1 tests pass unchanged (per-SHA worker logic + the
helpers it consumes are untouched).
Live dry-run smoke against git.moleculesai.app:
scanned 41 workflows; push-triggered=18, class-O candidates=23
summary: {"branch":"main","compensated":0,"compensated_per_sha":{},
"dry_run":true,"limit":10,"preserved_non_failure":196,
...,"scanned_shas":10}
Cross-link:
- internal#327 (sibling publish-runtime-bot)
- task #90 (orchestrator brief), task #46 (hongming-pc2 brief)
- PR #618 (parent rev1, merge 4db64bcb)
- `reference_post_suspension_pipeline`
- `feedback_no_shared_persona_token_use` (commit author = core-devops, not hongming-pc2)
- `feedback_strict_root_only_after_class_a` (root cause, not symptom)
- `feedback_brief_hypothesis_vs_evidence` (evidence: compensated:0 across 6 cycles)
Removal path: drop this workflow when Gitea >= 1.24 ships with a real
fix for the hardcoded-suffix bug. Audit issue (filed alongside rev1)
tracks the deletion as a follow-up sweep.
getSkills (DetailsTab): null/undefined/empty inputs, id+name priority,
description truthy-guard edge cases, id-name precedence, falsy coercion.
extractSkills (SkillsTab): same inputs plus tags/examples coercion,
"undefined" id vs "Unnamed skill" name distinction, mixed valid/invalid.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two fixes found during first CI run:
1. Workflow missing jq installation step — T12 jq-filter test needs jq
which is not in the Gitea Actions ubuntu-latest runner image.
Add the same install dance as sop-tier-check.yml (apt-get first,
GitHub binary download fallback, infra#241 belt-and-suspenders).
2. test_review_check.sh hardcodes /tmp/jq in T12. In CI jq gets
installed to /usr/bin/jq via apt-get. Fix: use `command -v jq` to
resolve from PATH first, fall back to /tmp/jq for local dev.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
New workflow .gitea/workflows/review-check-tests.yml triggers on
every PR + push that touches review-check.sh or its test fixtures.
Runs the existing 22-scenario regression suite (test_review_check.sh)
which covers all issue #540 acceptance criteria.
CONTRIBUTING.md updated with:
- review-check-tests row in the CI job table
- Local testing section with the smoke command
Note: tests are bash-based (not bats) per existing test_review_check.sh
design. Converting to bats would be refactoring rather than closing the gap.
Bats dependency was never added to the runner-base image.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Covers all render states: loading, fetch error, 402 exceeded banner,
budget loaded (with/without limit, over-limit cap), progress bar
visibility, save success, save error, saving-in-flight button state,
and the isApiError402 helper's regex branches.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
NotAvailablePanel: renders heading, runtime name in monospace, Chat hint,
SVG aria-hidden, flex layout.
FilesToolbar: directory selector options + aria-label, setRoot on change,
file count display, New/Upload/Clear visible only for /configs,
Export/Refresh always visible, aria-labels on all buttons,
onNewFile/onDownloadAll/onClearAll/onRefresh called on click,
focus-visible ring on all buttons.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
executeDelegation(sourceID, targetID) fires proxyA2ARequest which calls
registry.CanCommunicate(sourceID, targetID) when source != target. Both
IDs are different test fixtures (ws-source-159, ws-target-159), so the
lookup fires two separate getWorkspaceRef queries:
SELECT id, parent_id FROM workspaces WHERE id = $1 -- sourceID
SELECT id, parent_id FROM workspaces WHERE id = $1 -- targetID
expectExecuteDelegationBase only mocked the URL/status fallback query.
sqlmock would fail with "unexpected query" when the CanCommunicate
lookups fired — this was a silent failure because the tests never
verified ExpectationsWereMet on the CanCommunicate path.
Fix: add two ExpectQuery rows for both parent_id lookups (both NULL,
root-level siblings, allowed).
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Adds a continue-on-error step that runs ./internal/handlers/... and
./internal/pendinguploads/... with -v -timeout 60s, tee-ing output to
/tmp/ and emitting last-100-lines to step summary. Gitea Actions logs
API returns 404 (gitea/gitea#22168), making the run-page step summary
the only available signal when CI stalls. Step is stripped before merge.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
REVERT of #599 (infra/docker-runner-label) — urgent CI regression fix.
The `docker` label is NOT registered on any act_runner. With
runs-on: [ubuntu-latest, docker], publish-workflow jobs queue
indefinitely with zero eligible runners — strictly worse than the
pre-#599 coin-flip (50% success rate).
Restore runs-on: ubuntu-latest so publish-workflow jobs can run
again. The docker-label registration is the hard prerequisite that
must be satisfied before re-applying #599.
Fixes: publish-workspace-server-image + publish-canvas-image
stuck in "Waiting to run" since #599 merged ~23:24Z.
To re-apply: once `docker` label is registered on ≥2 runners,
re-apply the runs-on: [ubuntu-latest, docker] change from
#599 (branch infra/docker-runner-label).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Runs the full Platform-Go suite (build, vet, golangci-lint, tests with
coverage thresholds) every Monday at 04:17 UTC regardless of whether
workspace-server/ was touched by the last push.
Background: ci.yml's platform-build gates real work on
`needs.changes.outputs.platform == 'true'`. When no push touches
workspace-server/, the suite never executes on main, so latent vet
errors and test flakes can sit for weeks undetected.
This workflow surfaces those errors in advance so the next
workspace-server push doesn't trigger unexpected failures.
Closes molecule-core#567.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds DEFAULT_TIMEOUT=15 to gate_check.py and passes it to all urlopen()
calls (api_get, comment POST, comment PATCH).
Adds socket.setdefaulttimeout(15) to the inline Python in the workflow's
cron step, catching the PR-polling loop too.
Defence-in-depth: the real fix is provisioning SOP_TIER_CHECK_TOKEN
in Gitea; this caps worst-case wall-clock at ~15 s per call when the
token is missing or Gitea is unreachable.
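A sketch of the two timeout layers, assuming a urllib-based helper. `DEFAULT_TIMEOUT` and `api_get` are named in the description above, but the body, the `token` parameter and `install_global_timeout` are hypothetical.

```python
import socket
import urllib.request

DEFAULT_TIMEOUT = 15  # seconds; worst-case wall-clock per call


def api_get(url: str, token: str = "", timeout: float = DEFAULT_TIMEOUT) -> bytes:
    """GET a URL with an explicit per-call timeout."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    if token:
        req.add_header("Authorization", f"token {token}")
    # Without timeout=, urlopen uses the global socket default (None = block forever).
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def install_global_timeout(seconds: float = DEFAULT_TIMEOUT) -> None:
    """Backstop for code paths that do not pass timeout= explicitly."""
    socket.setdefaulttimeout(seconds)


install_global_timeout()
```

The explicit argument covers the helper's own calls; the socket default covers inline code that calls `urlopen()` directly, which is why the inline script needs `import socket`.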
Fixes issue #603. Note: PR #603 (da1487ad) has the same changes but
is missing `import socket` in the inline Python, so that version would
raise NameError at runtime. This branch carries the complete fix.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Root cause (verified via runs 14525 + 14526):
Gitea 1.22.6 emits commit-status context as
<workflow_name> / <job_name> (push)
for ANY workflow run on the default-branch HEAD, REGARDLESS of the
trigger event. Schedule- and workflow_dispatch-triggered runs
therefore paint main red via a fake-push status. No upstream fix
in 1.23-1.26.1 (sibling a6f20db1 research; internal#80 RFC).
Design — Option B (b2 cron-based compensating-status POST):
workflow_run is NOT supported on Gitea 1.22.6 (verified via
modules/actions/workflows.go enumeration); cron is the only
event-shaped option that fires reliably.
Every 5min, .gitea/workflows/status-reaper.yml runs a stdlib +
PyYAML scanner that:
1. Walks .gitea/workflows/*.yml. Resolves each workflow_id from
top-level 'name:' (else filename stem). Fails LOUD on
name-collision OR '/' in name (would break ' / ' context
parsing downstream). Classifies each by 'push:' trigger
presence (str / list / dict on: shapes all handled).
2. Reads main HEAD's combined commit status.
3. For each failure-state context ending ' (push)':
- parses '<workflow_name> / <job_name> (push)';
- skips if workflow not in scan map (conservative);
- preserves if workflow has push: trigger (real defect);
- else POSTs state=success with the same context to
/repos/{o}/{r}/statuses/{sha}, with a description that
documents the workaround.
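The classification steps above can be sketched as below. These are illustrative re-implementations, not the script's code: `has_push_trigger`, `decide` and the `push_triggered` map are hypothetical, and the per-entry read assumes the `status`-then-`state` lookup. The workflow dicts are assumed to be already parsed by PyYAML, which under YAML 1.1 turns a bare `on:` key into boolean `True`.

```python
PUSH_SUFFIX = " (push)"


def has_push_trigger(workflow: dict) -> bool:
    """Detect a push trigger across the str / list / dict shapes of `on:`."""
    on = workflow.get("on", workflow.get(True))  # PyYAML may key it as True
    if isinstance(on, str):
        return on == "push"
    if isinstance(on, (list, dict)):
        return "push" in on
    return False


def parse_push_context(context: str):
    """Split '<workflow_name> / <job_name> (push)' or return None."""
    if not context.endswith(PUSH_SUFFIX):
        return None
    workflow, sep, job = context[: -len(PUSH_SUFFIX)].partition(" / ")
    if not sep or not workflow or not job:
        return None
    return workflow, job


def decide(entry: dict, push_triggered: dict) -> str:
    """Return 'compensate', 'preserve' or 'skip' for one status entry."""
    state = entry.get("status") or entry.get("state") or ""
    if state != "failure":
        return "preserve"
    parsed = parse_push_context(entry.get("context", ""))
    if parsed is None:
        return "preserve"  # e.g. a (pull_request) context: never touched
    workflow, _job = parsed
    if workflow not in push_triggered:
        return "skip"  # unknown workflow: stay conservative
    if push_triggered[workflow]:
        return "preserve"  # real push-triggered failure
    return "compensate"
```

Splitting on the first ` / ` is only safe because workflow names containing `/` are rejected up front.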
Safety:
- Only failure-state contexts whose suffix is ' (push)' are
compensated. Branch_protections required checks on main (Secret
scan, sop-tier-check) have ' (pull_request)' suffix — UNREACHABLE
from this code path. Verified 2026-05-11 + test
test_reap_required_check_pull_request_suffix_never_touched.
- publish-workspace-server-image has a real push: trigger →
PRESERVED. mc#576's docker-socket failure stays visible as
intended. Explicit test fixture.
- api() raises ApiError on non-2xx + JSON-decode failure per
feedback_api_helper_must_raise_not_return_dict. Pre-fix
'soft-fail' would silently paint main green via omission.
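A sketch of the raise-don't-return contract for the API helper, assuming a urllib-based implementation. The `opener` parameter is a hypothetical injection point added here so the behaviour can be exercised without a network; the real `api()` signature is not shown in this description.

```python
import json
import urllib.error
import urllib.request


class ApiError(Exception):
    """Raised for any response the caller must not treat as data."""


def api(url: str, opener=urllib.request.urlopen, timeout: float = 15):
    """Fetch JSON, raising ApiError on non-2xx or undecodable bodies."""
    try:
        with opener(url, timeout=timeout) as resp:
            status = resp.status
            raw = resp.read()
    except urllib.error.HTTPError as exc:
        # urlopen raises HTTPError for 4xx/5xx responses.
        raise ApiError(f"HTTP {exc.code} from {url}") from exc
    if not 200 <= status < 300:
        raise ApiError(f"HTTP {status} from {url}")
    try:
        return json.loads(raw)
    except ValueError as exc:  # includes json.JSONDecodeError
        raise ApiError(f"non-JSON body from {url}") from exc
```

Returning an empty dict on failure would make a missing status list indistinguishable from "no failures", which is the soft-fail mode the contract rules out.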
Persona:
claude-status-reaper (Gitea uid 94, write:repository) — provisioned
2026-05-11 21:39Z by sub-agent aefaac1b. Token under
secrets.STATUS_REAPER_TOKEN (no other write surface touched).
Acceptance (post-merge verify, Step-5):
Trigger one class-O workflow via workflow_dispatch (e.g.
sweep-cf-tunnels). Observe reaper compensate the resulting
(push)-suffix failure on the next 5-min tick. Real
push-triggered failures (publish-workspace-server-image) MUST
still turn main red.
Removal path:
Drop this workflow + script + tests when Gitea is upgraded to
>= 1.24 with a fix for the hardcoded-suffix bug, OR when an
upstream patch lands (internal#80 RFC). Tracked in
post-merge audit issue.
Cross-links:
- sibling internal#327 (publish-runtime-bot)
- sibling internal#328 (mc-drift-bot)
- sibling internal#329 (Gitea dispatcher race)
- sibling internal#330 (disk-GC cron Gitea-class bug)
- upstream internal#80 (Gitea hardcoded-suffix RFC)
- mc#576 (preserved by design — real push-trigger failure)
- sub-agent aefaac1b (provisioning sibling)
- sub-agent a6f20db1 (Option A research — no upstream fix)
Tests: 37 pytest cases pass (incl. hongming-pc 22:08Z review's 3
design checks: name-collision fail-loud, '/' in name lint, name vs
filename fallback).