Some checks failed
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 19s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 23s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 22s
qa-review / approved (pull_request) Failing after 17s
gate-check-v3 / gate-check (pull_request) Successful in 24s
security-review / approved (pull_request) Failing after 13s
CI / Detect changes (pull_request) Successful in 29s
E2E API Smoke Test / detect-changes (pull_request) Successful in 32s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 31s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 33s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 33s
sop-tier-check / tier-check (pull_request) Successful in 14s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 8s
CI / Platform (Go) (pull_request) Successful in 7s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 7s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
CI / all-required (pull_request) Successful in 3s
audit-force-merge / audit (pull_request) Successful in 8s
Phase 1+2 evidence (rev2 PR#633, merged 01:48Z): 6/6 post-merge ticks
reported `compensated:0`, despite ~25 known-stranded reds visible across
those same 10 SHAs on a direct probe ~30 min later. Reaper run 17057 at
02:46Z explicitly logged:
    scanned 42 workflows; push-triggered=19, class-O candidates=23
    status-reaper summary: {compensated:0, preserved_non_failure:185, scanned_shas:10, limit:10}
Root cause: scheduled workflows post `failure` to commit-status
RETROACTIVELY, 5-15 min after the commit merges. By the time the
reaper's next `*/5` tick lands, the stranded red sits on a SHA that has
already fallen OUTSIDE the 10-commit window during a burst-merge period.
The reaper algorithm is correct; the lookback window is simply too
narrow relative to the retroactive-failure-post lag.
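A back-of-the-envelope sketch of the miss; the function and the rates below are illustrative, not taken from the reaper itself:

```python
# Illustrative only: at a burst-merge rate of `merges_per_tick` commits per
# 5-minute cron tick, a failure posted `lag_min` minutes after its commit
# merged sits this many commits deep by the time the next sweep runs.
def depth_at_next_sweep(merges_per_tick: int, lag_min: float, tick_min: int = 5) -> int:
    ticks_elapsed = lag_min / tick_min + 1   # +1: wait for the next tick to land
    return int(merges_per_tick * ticks_elapsed)

# ~5 merges per tick and the worst-case 15-minute retro-post lag put the
# stranded red ~20 commits deep: past the old limit of 10, inside the new 30.
assert depth_at_next_sweep(5, 15) == 20
```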
Three-in-one fix plus test coverage (atomic per hongming-pc2 GO 03:25Z):
1. `.gitea/scripts/status-reaper.py`
DEFAULT_SWEEP_LIMIT 10 -> 30. Widening the window is cheap; raising
the cadence is not, so the `*/5` cron stays unchanged (a `*/2` cron
would double runner load).
2. `.gitea/workflows/status-reaper.yml`
Restore schedule cron block (revert mc#645 comment-out for THIS
workflow only). Cron stays `*/5 * * * *`.
3. `.gitea/workflows/main-red-watchdog.yml`
Restore schedule cron block (revert mc#645 comment-out) AND raise
job-level `timeout-minutes: 5 -> 15`. Original 5min cap was
producing cancels under runner-saturation latency, which fed the
very `[main-red]` issues this workflow files (self-poisoning).
4. `tests/test_status_reaper.py`
+ test_default_sweep_limit_is_30 (contract pin)
+ test_reap_widened_window_catches_retroactive_failure: mocks 30
SHAs, plants the failing context on SHA[20] (strictly past rev2's
window of 10), and asserts the compensation POST lands on that SHA;
a hedged sketch follows this list. Existing tests keep their explicit
`limit=10` overrides and remain unchanged. Suite: 42/42 passed
(was 40, +2 new).
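A minimal, self-contained sketch of what the new window test asserts. The reaper's real interface isn't shown in this description, so the toy `reap(...)` and the context name are stand-ins:

```python
# Hedged sketch only -- the real test targets .gitea/scripts/status-reaper.py,
# whose interface isn't shown here. `reap` below is a toy reaper.
from unittest import mock

DEFAULT_SWEEP_LIMIT = 30  # mirrors the rev3 constant

def reap(fetch_statuses, post_status, shas, limit=DEFAULT_SWEEP_LIMIT):
    """Toy reaper: compensate every 'failure' context inside the window."""
    for sha in shas[:limit]:
        for ctx, state in fetch_statuses(sha).items():
            if state == "failure":
                post_status(sha, ctx, "success", "Compensated by status-reaper")

def test_reap_widened_window_catches_retroactive_failure():
    shas = [f"sha{i:02d}" for i in range(30)]
    statuses = {sha: {} for sha in shas}
    statuses["sha20"] = {"nightly-audit": "failure"}  # strictly past window=10
    post = mock.Mock()
    reap(lambda sha: statuses[sha], post, shas)
    post.assert_called_once_with(
        "sha20", "nightly-audit", "success", "Compensated by status-reaper")
```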
Verification plan (10-15 min / 2-3 cron ticks after merge):
- DB: `SELECT id, status FROM action_run WHERE workflow_id='status-reaper.yml' ORDER BY id DESC LIMIT 5` -> all status=1
- Log via web UI:
/molecule-ai/molecule-core/actions/runs/<index>/jobs/0/logs ->
the summary line should now show compensated > 0 with
compensated_per_sha populated
- Direct probe (sketch below): pick a SHA among the last 30 main
commits with class-O fails, GET
/repos/molecule-ai/molecule-core/commits/{sha}/status ->
compensated contexts now show state=success with a description
starting 'Compensated by status-reaper'
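A hedged probe sketch: `GITEA_TOKEN` and the example base URL are assumptions; only the endpoint path comes from the plan above.

```python
import os
import requests

def probe(sha: str, base: str = "https://git.example.com/api/v1") -> None:
    """Print compensated contexts on one SHA; expect state == 'success'."""
    r = requests.get(
        f"{base}/repos/molecule-ai/molecule-core/commits/{sha}/status",
        headers={"Authorization": f"token {os.environ['GITEA_TOKEN']}"},
        timeout=10,
    )
    r.raise_for_status()
    for st in r.json().get("statuses", []):
        if st["description"].startswith("Compensated by status-reaper"):
            print(sha, st["context"], st["state"])
```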
If rev3 STILL shows compensated:0 after the window-widening, the
diagnosis is wrong and a DIFFERENT bug needs to be uncovered (per
hongming-pc2 caveat 03:25Z). Re-enabling the crons IS the diagnosis
verification.
Cross-links:
- PR#618 (rev1, drop-concurrency, merge
Directory listing:
- e2e
- harness
- ops
- README.md
- test_ci_required_drift.py
- test_main_red_watchdog.py
- test_status_reaper.py
Tests
This repo uses the standard monorepo testing convention: unit tests live with their package, cross-component E2E tests live here.
Where to find tests
| Scope | Location |
|---|---|
| Go unit + integration (platform, CLI, handlers) | `workspace-server/**/*_test.go` — run with `cd workspace-server && go test -race ./...` |
| TypeScript unit (canvas components, hooks, store) | `canvas/src/**/__tests__/` — run with `cd canvas && npm test -- --run` |
| TypeScript unit (MCP server handlers) | `mcp-server/src/__tests__/` — run with `cd mcp-server && npx jest` |
| Python unit (workspace runtime, adapters) | `workspace/tests/` — run with `cd workspace && python3 -m pytest` |
| Python unit (SDK: plugin + remote agent) | `sdk/python/tests/` — run with `cd sdk/python && python3 -m pytest` |
| Cross-component E2E (spans platform + runtime + HTTP) | `tests/e2e/` ← you are here |
Why split this way
- Go requires co-located `_test.go` files to access unexported symbols.
- Per-package test commands keep the inner loop fast — changing canvas doesn't re-run Go tests.
- `tests/e2e/` covers scenarios that no single package owns: a full workspace lifecycle, A2A across two provisioned agents, delegation chains, bundle round-trips.
Running E2E
Every E2E script here assumes the platform is running at localhost:8080 and (where noted) provisioned agents are online. See the header comment of each .sh for specifics.
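Below is a minimal preflight sketch for that assumption. The `/health` path is illustrative only (this README doesn't name a health endpoint); `MOLECULE_URL` mirrors the override used by the cleanup script further down.

```python
import os
import sys
import urllib.request

base = os.environ.get("MOLECULE_URL", "http://localhost:8080")
try:
    # '/health' is an assumed endpoint -- substitute whatever the platform exposes.
    with urllib.request.urlopen(f"{base}/health", timeout=5) as resp:
        if resp.status != 200:
            sys.exit(f"platform at {base} answered {resp.status}; not safe to run E2E")
except OSError as exc:
    sys.exit(f"platform not reachable at {base}: {exc}")
print(f"platform up at {base}; tests/e2e scripts can run")
```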
Cleaning up rogue test workspaces
If an E2E run aborts before its teardown runs (Ctrl-C, crash, CI timeout),
the platform can be left with workspaces whose config volume is stale or
empty — Docker's `unless-stopped` restart policy then spins those
containers in a `FileNotFoundError` loop. The platform's pre-flight check
(#17) marks such workspaces failed on the next restart, but a manual
cleanup is useful:
    bash scripts/cleanup-rogue-workspaces.sh   # deletes ws with id/name starting aaaaaaaa-, bbbbbbbb-, cccccccc-, test-ws-
    MOLECULE_URL=http://host:8080 bash scripts/cleanup-rogue-workspaces.sh

The script DELETEs each matching workspace via the API and
force-removes the `ws-<id[:12]>` container as a belt-and-suspenders
fallback.
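For reference, a Python sketch of the behavior described above. The `/api/v1/workspaces` endpoints and response shape are assumptions; the real script is shell and its API paths aren't shown here.

```python
import os
import subprocess
import requests

BASE = os.environ.get("MOLECULE_URL", "http://localhost:8080")
PREFIXES = ("aaaaaaaa-", "bbbbbbbb-", "cccccccc-", "test-ws-")

for ws in requests.get(f"{BASE}/api/v1/workspaces", timeout=10).json():
    if not (ws["id"].startswith(PREFIXES) or ws["name"].startswith(PREFIXES)):
        continue
    # DELETE via the API first...
    requests.delete(f"{BASE}/api/v1/workspaces/{ws['id']}", timeout=10)
    # ...then force-remove the container as the belt-and-suspenders fallback.
    subprocess.run(["docker", "rm", "-f", f"ws-{ws['id'][:12]}"], check=False)
```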