Adds a sentinel that detects post-merge CI red on `main` and files an
idempotent `[main-red] {repo}: {SHA[:10]}` issue. Auto-closes the issue
when main returns to green. Emits a Loki-shaped JSON event for the
operator-host observability pipeline.
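The Loki-shaped event could look like the following minimal sketch (the `ts`/`event` field names and the stdout transport are assumptions; the real pipeline's schema isn't shown here):

```python
import json
import sys
import time

def loki_record(event, **fields):
    # Single flat JSON object per event; a nanosecond timestamp matches
    # Loki's push API convention, but the exact schema is an assumption.
    return {"ts": int(time.time() * 1e9), "event": event, **fields}

def emit_loki_event(event, **fields):
    # Observability must never crash the watchdog itself, hence the
    # broad except: a failed emit degrades to a stderr breadcrumb.
    try:
        print(json.dumps(loki_record(event, **fields)), flush=True)
    except Exception:
        print(f"loki-emit-failed event={event}", file=sys.stderr)
```

Usage would be something like `emit_loki_event("main_red_detected", repo="molecule-ai/molecule-core", sha="abc123")`.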
Pattern source: CP `0adf2098` (ci-required-drift). Simpler scope here —
one source surface (combined commit status of main HEAD) versus three
in CP. Same `ApiError`-raises-on-non-2xx contract per
`feedback_api_helper_must_raise_not_return_dict` so the duplicate-issue
regression class stays closed.
Does NOT auto-revert. Option B is explicitly rejected per
`feedback_no_such_thing_as_flakes` + `feedback_fix_root_not_symptom`.
The watchdog files an alarm; humans fix forward.
Files:
- .gitea/workflows/main-red-watchdog.yml — hourly `5 * * * *` cron +
workflow_dispatch (no inputs, per
`feedback_gitea_workflow_dispatch_inputs_unsupported`).
- .gitea/scripts/main-red-watchdog.py — sidecar with `--dry-run`.
- tests/test_main_red_watchdog.py — 26 pytest cases.
Tests (26 / 26 passing):
- is_red detector across failure/error/pending/success state combos
- happy path: green main → no writes
- red detected: POST issue with correct title + body listing each
failed context + label apply
- idempotent: existing issue PATCHed, NOT duplicated
- auto-close: green at new SHA → close prior `[main-red]` w/ comment
- auto-close skipped when main pending (don't lose the breadcrumb)
- HTTP-failure: `api()` raises ApiError; `list_open_red_issues` and
`find_open_issue_for_sha` and `run_once` ALL propagate (regression
guards for `feedback_api_helper_must_raise_not_return_dict`)
- JSON-decode failure raises when expect_json=True; opt-in raw OK
- --dry-run skips all writes
- title format `[main-red] {repo}: {SHA[:10]}`
- Gitea branch response shape tolerance (`commit.id` OR `commit.sha`)
- Loki emitter survives `logger` not installed / subprocess failure
- runtime env guard exits when required vars missing
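The pure pieces those tests exercise can be sketched as follows (the combined-status and branch response shapes are assumptions inferred from the test names, not confirmed API contracts):

```python
def is_red(combined):
    # Red only when at least one context failed or errored; a pending
    # state alone is not red -- the auto-close path likewise refuses to
    # act while main is pending, so the breadcrumb issue survives.
    return any(s.get("status") in ("failure", "error")
               for s in combined.get("statuses", []))

def head_sha(branch):
    # Tolerate both Gitea branch response shapes: commit.id OR commit.sha.
    commit = branch.get("commit", {})
    return commit.get("id") or commit.get("sha") or ""

def issue_title(repo, sha):
    # Exact title format the idempotency lookup keys on.
    return f"[main-red] {repo}: {sha[:10]}"
```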
Hostile self-review proven: both transient-error guard tests FAIL
against a pre-fix implementation (verified by injecting `try: ...
except ApiError: return []` into `list_open_red_issues` and re-running
pytest; both guards flipped red with `DID NOT RAISE`).
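The pre-fix vs fixed shapes reduce to a self-contained illustration (function names are stand-ins for the real module; no pytest dependency here):

```python
class ApiError(Exception):
    pass

def api_transient_failure(*args, **kwargs):
    # Stand-in for the shared helper while Gitea returns a 502.
    raise ApiError("GET /repos/.../issues -> HTTP 502")

def list_open_red_issues_pre_fix(api):
    # Pre-fix shape: swallows the transient error into an empty list,
    # so the next run_once pass would file a duplicate issue. Against
    # this shape, the guard tests flip red with DID NOT RAISE.
    try:
        return api("GET", "/issues?labels=main-red&state=open")
    except ApiError:
        return []

def list_open_red_issues(api):
    # Fixed shape: the error propagates up to run_once.
    return api("GET", "/issues?labels=main-red&state=open")
```

In the real suite the guard is a `pytest.raises(ApiError)` around the fixed function; the injected `except ApiError: return []` above is exactly the mutation described in the self-review.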
Live dry-run against molecule-ai/molecule-core main confirms the script
parses the real Gitea combined-status response correctly (current main
is in fact red at …).
# Tests
This repo uses the standard monorepo testing convention: unit tests live with their package, cross-component E2E tests live here.
## Where to find tests
| Scope | Location |
|---|---|
| Go unit + integration (platform, CLI, handlers) | `workspace-server/**/*_test.go` — run with `cd workspace-server && go test -race ./...` |
| TypeScript unit (canvas components, hooks, store) | `canvas/src/**/__tests__/` — run with `cd canvas && npm test -- --run` |
| TypeScript unit (MCP server handlers) | `mcp-server/src/__tests__/` — run with `cd mcp-server && npx jest` |
| Python unit (workspace runtime, adapters) | `workspace/tests/` — run with `cd workspace && python3 -m pytest` |
| Python unit (SDK: plugin + remote agent) | `sdk/python/tests/` — run with `cd sdk/python && python3 -m pytest` |
| Cross-component E2E (spans platform + runtime + HTTP) | `tests/e2e/` ← you are here |
## Why split this way

- Go requires co-located `_test.go` files to access unexported symbols.
- Per-package test commands keep the inner loop fast — changing canvas doesn't re-run Go tests.

`tests/e2e/` covers scenarios that no single package owns: a full workspace lifecycle, A2A across two provisioned agents, delegation chains, bundle round-trips.
## Running E2E

Every E2E script here assumes the platform is running at `localhost:8080` and (where noted) provisioned agents are online. See the header comment of each `.sh` for specifics.
## Cleaning up rogue test workspaces
If an E2E run aborts before its teardown runs (Ctrl-C, crash, CI timeout),
the platform can be left with workspaces whose config volume is stale or
empty — Docker's `unless-stopped` restart policy then spins those
containers in a `FileNotFoundError` loop. The platform's pre-flight check
(#17) marks such workspaces failed on the next restart, but a manual
cleanup is useful:
```sh
# deletes workspaces with id/name starting aaaaaaaa-, bbbbbbbb-, cccccccc-, test-ws-
bash scripts/cleanup-rogue-workspaces.sh

# point at a non-default platform host
MOLECULE_URL=http://host:8080 bash scripts/cleanup-rogue-workspaces.sh
```
The script DELETEs each matching workspace via the API and
force-removes the ws-<id[:12]> container as a belt-and-suspenders
fallback.