molecule-core/tests
core-devops fae62ac8c1
Some checks failed
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 19s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 23s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 22s
qa-review / approved (pull_request) Failing after 17s
gate-check-v3 / gate-check (pull_request) Successful in 24s
security-review / approved (pull_request) Failing after 13s
CI / Detect changes (pull_request) Successful in 29s
E2E API Smoke Test / detect-changes (pull_request) Successful in 32s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 31s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 33s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 33s
sop-tier-check / tier-check (pull_request) Successful in 14s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 8s
CI / Platform (Go) (pull_request) Successful in 7s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 7s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
CI / all-required (pull_request) Successful in 3s
audit-force-merge / audit (pull_request) Successful in 8s
fix(ci): status-reaper rev3 widens window 10->30 + raises watchdog timeout + re-enables both crons
Phase 1+2 evidence (rev2 PR#633, merged 01:48Z): 6/6 ticks post-merge
with `compensated:0` despite ~25 known-stranded reds visible across
those same 10 SHAs on direct probe ~30min later. Reaper run 17057 at
02:46Z explicitly logged:

    scanned 42 workflows; push-triggered=19, class-O candidates=23
    status-reaper summary: {compensated:0, preserved_non_failure:185,
      scanned_shas:10, limit:10}

Root cause: schedule workflows post `failure` to commit-status
RETROACTIVELY 5-15 min after their merge. By the time reaper's next
*/5 tick lands, the stranded red is on a SHA that has already fallen
OUTSIDE a 10-commit window during a burst-merge period. Reaper
algorithm is correct; the lookback window is too narrow vs. the
retroactive-failure-post lag.

Three-in-one fix (atomic per hongming-pc2 GO 03:25Z):

1. `.gitea/scripts/status-reaper.py`
   DEFAULT_SWEEP_LIMIT 10 -> 30. Trades window-width-cheap for
   cadence-loady; kept `*/5` cron unchanged (avoiding `*/2` which
   would double runner load).

2. `.gitea/workflows/status-reaper.yml`
   Restore schedule cron block (revert mc#645 comment-out for THIS
   workflow only). Cron stays `*/5 * * * *`.

3. `.gitea/workflows/main-red-watchdog.yml`
   Restore schedule cron block (revert mc#645 comment-out) AND raise
   job-level `timeout-minutes: 5 -> 15`. Original 5min cap was
   producing cancels under runner-saturation latency, which fed the
   very `[main-red]` issues this workflow files (self-poisoning).

4. `tests/test_status_reaper.py`
   + test_default_sweep_limit_is_30 (contract pin)
   + test_reap_widened_window_catches_retroactive_failure: mocks 30
     SHAs, plants the failing context on SHA[20] (depth strictly past
     rev2's window=10), asserts the compensation POST lands on that
     SHA. Existing tests retain explicit `limit=10` overrides and
     remain unchanged. Suite: 42/42 passed (was 40 + 2 new).

Verification plan (post-merge, 10-15 min after merge / 2-3 cron ticks):
  - DB: SELECT id, status FROM action_run WHERE workflow_id=
    'status-reaper.yml' ORDER BY id DESC LIMIT 5 -> all status=1
  - Log via web UI:
    /molecule-ai/molecule-core/actions/runs/<index>/jobs/0/logs ->
    summary line should now show compensated > 0 with
    compensated_per_sha populated
  - Direct probe: pick a SHA in the last 30 main commits with class-O
    fails, GET /repos/molecule-ai/molecule-core/commits/{sha}/status
    -> compensated contexts now show state=success with description
    starting 'Compensated by status-reaper'

If rev3 STILL shows compensated:0 after the window-widening, the
diagnosis is wrong and a DIFFERENT bug needs to be uncovered (per
hongming-pc2 caveat 03:25Z). Re-enabling the crons IS the diagnosis
verification.

Cross-links:
  - PR#618 (rev1, drop-concurrency, merge 4db64bcb)
  - PR#633 (rev2, sweep-recent-commits, merge e7965a0f)
  - PR#645 (interim disable, merge 4c54b590) — re-enable being reverted
  - task #90 (orch rev3 tracker) / task #46 (hongming-pc2 tracker)
  - feedback_brief_hypothesis_vs_evidence (empirical evidence above)
  - feedback_strict_root_only_after_class_a (3-in-one root fix vs.
    longer patching chain)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 20:29:06 -07:00
..
e2e fix(ci): canonicalize MOLECULE_STAGING_ADMIN_TOKEN -> CP_STAGING_ADMIN_API_TOKEN (post-#443 rebase) + drop staging-smoke continue-on-error 2026-05-11 04:33:56 -07:00
harness ci(docker): pin base image digests in all Dockerfiles 2026-05-09 23:56:39 +00:00
ops ops: add Railway SHA-pin drift audit script + regression test (#2001) 2026-04-27 05:01:23 -07:00
README.md chore: final open-source cleanup — binary, stale paths, private refs 2026-04-18 00:38:55 -07:00
test_ci_required_drift.py feat(internal#219 §4+§6): port ci-required-drift + audit-force-merge sidecar from CP 2026-05-11 00:35:25 -07:00
test_main_red_watchdog.py feat(ci): main-red watchdog (Option C of main-never-red directive) — closes #420 2026-05-11 00:36:20 -07:00
test_status_reaper.py fix(ci): status-reaper rev3 widens window 10->30 + raises watchdog timeout + re-enables both crons 2026-05-11 20:29:06 -07:00

Tests

This repo uses the standard monorepo testing convention: unit tests live with their package, cross-component E2E tests live here.

Where to find tests

Scope Location
Go unit + integration (platform, CLI, handlers) workspace-server/**/*_test.go — run with cd workspace-server && go test -race ./...
TypeScript unit (canvas components, hooks, store) canvas/src/**/__tests__/ — run with cd canvas && npm test -- --run
TypeScript unit (MCP server handlers) mcp-server/src/__tests__/ — run with cd mcp-server && npx jest
Python unit (workspace runtime, adapters) workspace/tests/ — run with cd workspace && python3 -m pytest
Python unit (SDK: plugin + remote agent) sdk/python/tests/ — run with cd sdk/python && python3 -m pytest
Cross-component E2E (spans platform + runtime + HTTP) tests/e2e/you are here

Why split this way

  • Go requires co-located _test.go files to access unexported symbols.
  • Per-package test commands keep the inner loop fast — changing canvas doesn't re-run Go tests.
  • tests/e2e/ covers scenarios that no single package owns: a full workspace lifecycle, A2A across two provisioned agents, delegation chains, bundle round-trips.

Running E2E

Every E2E script here assumes the platform is running at localhost:8080 and (where noted) provisioned agents are online. See the header comment of each .sh for specifics.

Cleaning up rogue test workspaces

If an E2E run aborts before its teardown runs (Ctrl-C, crash, CI timeout), the platform can be left with workspaces whose config volume is stale or empty — Docker's unless-stopped restart policy then spins those containers in a FileNotFoundError loop. The platform's pre-flight check (#17) marks such workspaces failed on the next restart, but a manual cleanup is useful:

bash scripts/cleanup-rogue-workspaces.sh               # deletes ws with id/name starting aaaaaaaa-, bbbbbbbb-, cccccccc-, test-ws-
MOLECULE_URL=http://host:8080 bash scripts/cleanup-rogue-workspaces.sh

The script DELETEs each matching workspace via the API and force-removes the ws-<id[:12]> container as a belt-and-suspenders fallback.