History

Hongming Wang c7478af99f feat(e2e): extend priority-runtimes test to cover all 8 templates Tonight's wire-real E2E sweep exposed 12+ root causes across the post- #87 template extraction. Most would have been caught by an actual provision-and-online test running on each template — but the test only covered claude-code + hermes. Extending it to cover all 8 ensures any future regression in any template fails the test, not production. What's added: - run_openai_runtime(runtime, label): generic provisioner for the 5 OpenAI-backed templates (langgraph, crewai, autogen, deepagents, openclaw). Same shape as run_hermes minus the HERMES_* config block that hermes-agent needs. - run_gemini_cli: separate function — gemini-cli wants a Google AI key (E2E_GEMINI_API_KEY), not OpenAI. - Each new runtime registered in the dispatch loop. New `all` keyword for E2E_RUNTIMES runs every covered runtime. claude-code + hermes keep their dedicated functions; both have unique provisioning quirks (claude-code OAuth + claude-code-specific volume mounts; hermes 15-min cold-boot) that don't generalize cleanly. Skip-if-no-key pattern matches the existing one — partially-keyed CI gets clean skips, not false-fails. Usage: E2E_OPENAI_API_KEY=... E2E_RUNTIMES=langgraph ./test_priority_runtimes_e2e.sh E2E_OPENAI_API_KEY=... E2E_RUNTIMES=all ./test_priority_runtimes_e2e.sh Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 05:57:59 -07:00
..
e2e	feat(e2e): extend priority-runtimes test to cover all 8 templates	2026-04-27 05:57:59 -07:00
README.md	chore: final open-source cleanup — binary, stale paths, private refs	2026-04-18 00:38:55 -07:00

Hongming Wang c7478af99f feat(e2e): extend priority-runtimes test to cover all 8 templates

Tonight's wire-real E2E sweep exposed 12+ root causes across the post-
#87 template extraction. Most would have been caught by an actual
provision-and-online test running on each template — but the test only
covered claude-code + hermes. Extending it to cover all 8 ensures any
future regression in any template fails the test, not production.

What's added:
- run_openai_runtime(runtime, label): generic provisioner for the 5
  OpenAI-backed templates (langgraph, crewai, autogen, deepagents,
  openclaw). Same shape as run_hermes minus the HERMES_* config block
  that hermes-agent needs.
- run_gemini_cli: separate function — gemini-cli wants a Google AI
  key (E2E_GEMINI_API_KEY), not OpenAI.
- Each new runtime registered in the dispatch loop. New `all` keyword
  for E2E_RUNTIMES runs every covered runtime.

claude-code + hermes keep their dedicated functions; both have unique
provisioning quirks (claude-code OAuth + claude-code-specific volume
mounts; hermes 15-min cold-boot) that don't generalize cleanly.

Skip-if-no-key pattern matches the existing one — partially-keyed CI
gets clean skips, not false-fails.

Usage:
  E2E_OPENAI_API_KEY=... E2E_RUNTIMES=langgraph     ./test_priority_runtimes_e2e.sh
  E2E_OPENAI_API_KEY=... E2E_RUNTIMES=all           ./test_priority_runtimes_e2e.sh

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-27 05:57:59 -07:00

e2e

feat(e2e): extend priority-runtimes test to cover all 8 templates

2026-04-27 05:57:59 -07:00

README.md

chore: final open-source cleanup — binary, stale paths, private refs

2026-04-18 00:38:55 -07:00

README.md

Tests

This repo uses the standard monorepo testing convention: unit tests live with their package, cross-component E2E tests live here.

Where to find tests

Scope	Location
Go unit + integration (platform, CLI, handlers)	`workspace-server/*/_test.go` — run with `cd workspace-server && go test -race ./...`
TypeScript unit (canvas components, hooks, store)	`canvas/src/**/__tests__/` — run with `cd canvas && npm test -- --run`
TypeScript unit (MCP server handlers)	`mcp-server/src/__tests__/` — run with `cd mcp-server && npx jest`
Python unit (workspace runtime, adapters)	`workspace/tests/` — run with `cd workspace && python3 -m pytest`
Python unit (SDK: plugin + remote agent)	`sdk/python/tests/` — run with `cd sdk/python && python3 -m pytest`
Cross-component E2E (spans platform + runtime + HTTP)	`tests/e2e/` ← you are here

Why split this way

Go requires co-located _test.go files to access unexported symbols.
Per-package test commands keep the inner loop fast — changing canvas doesn't re-run Go tests.
tests/e2e/ covers scenarios that no single package owns: a full workspace lifecycle, A2A across two provisioned agents, delegation chains, bundle round-trips.

Running E2E

Every E2E script here assumes the platform is running at localhost:8080 and (where noted) provisioned agents are online. See the header comment of each .sh for specifics.

Cleaning up rogue test workspaces

If an E2E run aborts before its teardown runs (Ctrl-C, crash, CI timeout), the platform can be left with workspaces whose config volume is stale or empty — Docker's unless-stopped restart policy then spins those containers in a FileNotFoundError loop. The platform's pre-flight check (#17) marks such workspaces failed on the next restart, but a manual cleanup is useful:

bash scripts/cleanup-rogue-workspaces.sh               # deletes ws with id/name starting aaaaaaaa-, bbbbbbbb-, cccccccc-, test-ws-
MOLECULE_URL=http://host:8080 bash scripts/cleanup-rogue-workspaces.sh

The script DELETEs each matching workspace via the API and force-removes the ws-<id[:12]> container as a belt-and-suspenders fallback.