molecule-core/docs/development/testing-e2e.md
Hongming Wang eca9796a5b docs: sync documentation with 2026-04-13 merges (PRs #1-#8)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:46:28 -07:00


E2E Testing

End-to-end test scripts live under tests/e2e/ and exercise the platform against real Postgres and Redis instances. Every script is shellcheck-clean and shares helpers from tests/e2e/_lib.sh and tests/e2e/_extract_token.py.

Scripts

Script                      Checks  Prerequisites
test_api.sh                 62      platform running on :8080; no live agents required
test_comprehensive_e2e.sh   67      platform running; spins up its own workspaces
test_a2a_e2e.sh             22      platform + 2 provisioned agents (Echo + SEO) with OPENROUTER_API_KEY
test_activity_e2e.sh        25      platform + 1 online agent
test_claude_code_e2e.sh             platform + Claude Code runtime; exercises CLI adapter

Auth Prerequisites (Phase 30)

After Phase 30.1, the following routes require Authorization: Bearer <token> once a workspace has any live token on file (legacy workspaces are grandfathered):

  • POST /registry/heartbeat
  • POST /registry/update-card

After Phase 30.6, the following routes additionally require the caller to send X-Workspace-ID. The bearer token is validated against that workspace, and the check fails open if the database lookup errors:

  • GET /registry/discover/:id
  • GET /registry/:id/peers
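
As a rough illustration of the header requirements above, the sketch below wraps curl for the two Phase 30.6 routes. The function name registry_get, the BASE_URL variable, and its default value are assumptions made for this example; they are not helpers from tests/e2e/_lib.sh.

```shell
# Hypothetical helper, not part of tests/e2e/_lib.sh.
# BASE_URL is an assumed variable; :8080 matches the port the scripts expect.
BASE_URL="${BASE_URL:-http://localhost:8080}"

# registry_get PATH CALLER_WORKSPACE_ID TOKEN
# Sends a GET carrying both headers the Phase 30.6 routes expect:
# the caller's bearer token and the caller's workspace ID.
registry_get() {
  curl -sS \
    -H "Authorization: Bearer $3" \
    -H "X-Workspace-ID: $2" \
    "$BASE_URL$1"
}

# Usage (IDs and token are placeholders):
#   registry_get "/registry/discover/$target_id" "$caller_id" "$token"
#   registry_get "/registry/$caller_id/peers"    "$caller_id" "$token"
```

Note that X-Workspace-ID carries the caller's ID, which for discover is not the same as the :id in the path.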

The scripts handle this by:

  1. Creating a workspace. The platform does not return a token at this point.
  2. Calling POST /registry/register. The response body includes auth_token, which is issued once per workspace.
  3. Extracting the token via _extract_token.py (reads JSON from stdin).
  4. Passing it in subsequent heartbeat / discover / peers calls.
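
Steps 2 to 4 can be sketched roughly as follows. This is an illustration under assumptions, not the code in the scripts: the function names and BASE_URL are hypothetical, the request bodies are left to the caller because their fields are not documented here, and a small sed expression stands in for _extract_token.py so the example has no Python dependency.

```shell
# Sketch only: function names, BASE_URL, and the sed extractor are assumptions.
BASE_URL="${BASE_URL:-http://localhost:8080}"

# extract_token: read a JSON object on stdin, print the value of "auth_token".
# A dependency-free stand-in for _extract_token.py; it only handles a plain
# string value with no escaped quotes.
extract_token() {
  sed -n 's/.*"auth_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# register_workspace JSON_BODY -> prints the bearer token from the response.
# The body is supplied by the caller because its fields are not documented here.
register_workspace() {
  curl -sS -X POST "$BASE_URL/registry/register" \
    -H "Content-Type: application/json" \
    -d "$1" | extract_token
}

# heartbeat TOKEN JSON_BODY -> POST /registry/heartbeat with the bearer token.
heartbeat() {
  curl -sS -X POST "$BASE_URL/registry/heartbeat" \
    -H "Authorization: Bearer $1" \
    -H "Content-Type: application/json" \
    -d "$2"
}

# Usage (bodies are placeholders):
#   token=$(register_workspace "$register_body")
#   heartbeat "$token" "$heartbeat_body"
```

Because auth_token appears only once per workspace, a script that loses the token cannot read it back from a later response; capturing it at registration time is the point of the sequence.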

test_comprehensive_e2e.sh registers each workspace immediately after creation so the provisioner's auto-register doesn't race the test's explicit register. test_activity_e2e.sh re-registers an agent it finds already online in order to capture a fresh bearer token.

Running Locally

# Quickest check after any platform change:
cd platform && go build ./cmd/server && ./server &
bash tests/e2e/test_api.sh        # expect 62/62 pass

# Comprehensive sweep:
bash tests/e2e/test_comprehensive_e2e.sh   # expect 67/67 pass

Both scripts include a pre-test cleanup that deletes workspaces from previous runs so a stale DB won't cause spurious failures.
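
Since the quick check starts the server in the background, the first request can arrive before the server is listening, unless the script already waits for it. One possible guard is a small polling loop like the sketch below. The name wait_for_http and its defaults are hypothetical, and the root URL is used only because no health endpoint is documented here.

```shell
# wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until curl receives any HTTP response (any status code counts),
# or gives up after ATTEMPTS tries. Hypothetical helper, not from _lib.sh.
wait_for_http() {
  local url="$1" attempts="${2:-30}" delay="${3:-1}" i=1
  while [ "$i" -le "$attempts" ]; do
    # Without -f, curl exits 0 on any HTTP response and non-zero
    # when the connection itself fails.
    if curl -s -o /dev/null "$url"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "server at $url did not come up after $attempts attempts" >&2
  return 1
}

# Usage:
#   wait_for_http "http://localhost:8080/" && bash tests/e2e/test_api.sh
```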

What CI Runs

.github/workflows/ci.yml (added 2026-04-13):

  • e2e-api — spins up Postgres + Redis via service containers, applies migrations with docker exec, builds the platform binary, runs tests/e2e/test_api.sh. All 62 checks must pass.
  • shellcheck — runs the shellcheck marketplace action against every tests/e2e/*.sh.

The other E2E scripts are not yet in CI because they require provisioned agents and LLM credentials; run them locally before merging runtime-touching changes.

Adding a New E2E Check

  1. Source tests/e2e/_lib.sh for assert_* helpers, bearer-token extraction, and the cleanup preamble.
  2. When hitting an auth-gated route, always register the workspace first and thread the returned token through subsequent requests.
  3. Keep each check idempotent — the comprehensive script is expected to be re-runnable on the same DB.
  4. Run shellcheck tests/e2e/your_script.sh locally before pushing.
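
A minimal sketch of what a single check might look like is shown below. The assert_* helpers in tests/e2e/_lib.sh are not listed in this document, so check_eq, http_status, and summary are hypothetical stand-ins defined inline; a real script would source _lib.sh and use its helpers instead.

```shell
# Stand-ins for the assert_* helpers in tests/e2e/_lib.sh. The real helper
# names and signatures are not listed here, so these are hypothetical.
PASS=0
FAIL=0

# check_eq DESCRIPTION EXPECTED ACTUAL
check_eq() {
  if [ "$2" = "$3" ]; then
    PASS=$((PASS + 1))
    echo "PASS: $1"
  else
    FAIL=$((FAIL + 1))
    echo "FAIL: $1 (expected '$2', got '$3')"
  fi
}

# http_status URL [extra curl args...] -> prints only the response status code.
http_status() {
  local url="$1"
  shift
  curl -s -o /dev/null -w '%{http_code}' "$@" "$url"
}

# summary: print the tally; the return status is non-zero if any check failed.
summary() {
  echo "$PASS/$((PASS + FAIL)) pass"
  [ "$FAIL" -eq 0 ]
}

# Usage inside a new script (IDs, token, and expected status are placeholders):
#   code=$(http_status "http://localhost:8080/registry/$ws_id/peers" \
#            -H "Authorization: Bearer $token" -H "X-Workspace-ID: $ws_id")
#   check_eq "peers responds for a registered workspace" "$expected" "$code"
#   summary
```

Comparing only the status code keeps the check re-runnable: it does not depend on response fields that may differ between runs on the same DB.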