Five additional breakages surfaced while testing the restored stack end-to-end (spin up Hermes template → click node → open side panel → configure secrets → send chat). Each fix is narrowly scoped and has matching unit or e2e tests so they don't regress. ### 1. SSRF defence blocked loopback A2A on self-hosted Docker handlers/ssrf.go was rejecting `http://127.0.0.1:<port>` workspace URLs as loopback, so POST /workspaces/:id/a2a returned 502 on every Canvas chat send in local-dev. The provisioner on self-hosted Docker publishes each container's A2A port on 127.0.0.1:<ephemeral> — that's the only reachable address for the platform-on-host path. Added `devModeAllowsLoopback()` — allows loopback only when MOLECULE_ENV ∈ {development, dev}. SaaS (MOLECULE_ENV=production) continues to block loopback; every other blocked range (metadata 169.254/16, TEST-NET, CGNAT, link-local) stays blocked in dev mode. Tests: 5 new tests in ssrf_test.go covering dev-mode loopback, dev-mode short-alias ("dev"), production still blocks loopback, dev-mode still blocks every other range, and a 9-case table test of the predicate with case/whitespace/typo variants. ### 2. canvas/src/lib/api.ts: 401 → login redirect broke localhost Every 401 called `redirectToLogin()` which navigates to `/cp/auth/login`. That route exists only on SaaS (mounted by the cp_proxy when CP_UPSTREAM_URL is set). On localhost it 404s — users landed on a blank "404 page not found" instead of seeing the actual error they should fix. Gated the redirect on the SaaS-tenant slug check: on <slug>.moleculesai.app, redirect unchanged; on any non-SaaS host (localhost, LAN IP, reserved subdomains like app.moleculesai.app), throw a real error so the calling component can render a retry affordance. Tests: 4 new vitest cases in a dedicated api-401.test.ts (needs jsdom for window.location.hostname) — SaaS redirects, localhost throws, LAN hostname throws, reserved apex throws. ### 3. SecretsSection rendered a hardcoded key list config/secrets-section.tsx shipped a fixed COMMON_KEYS list (Anthropic / OpenAI / Google / SERP / Model Override) regardless of what the workspace's template actually needed. A Hermes workspace declaring MINIMAX_API_KEY in required_env got five irrelevant slots and nothing for the key it actually needed. Made the slot list template-driven via a new `requiredEnv?: string[]` prop passed down from ConfigTab. Added `KNOWN_LABELS` for well-known names and `humanizeKeyName` to turn arbitrary SCREAMING_SNAKE_CASE into a readable label (e.g. MINIMAX_API_KEY → "Minimax API Key"). Acronyms (API, URL, ID, SDK, MCP, LLM, AI) stay uppercase. Legacy fallback preserved when required_env is empty. Tests: 8 new vitest cases covering known-label lookup, humanise fallback, acronym preservation, deduplication, and both fallback paths. ### 4. Confusing placeholder in Required Env Vars field The TagList in ConfigTab labelled "Required Env Vars (from template)" is a DECLARATION field — stores variable names. The placeholder "e.g. CLAUDE_CODE_OAUTH_TOKEN" suggested that, but users naturally typed the value of their API key into the field instead. The actual values go in the Secrets section further down the tab. Relabelled to "Required Env Var Names (from template)", changed the placeholder to "variable NAME (e.g. ANTHROPIC_API_KEY) — not the value", and added a one-line helper below pointing to Secrets. ### 5. Agent chat replies rendered 2-3 times Three delivery paths can fire for a single agent reply — HTTP response to POST /a2a, A2A_RESPONSE WS event, and a send_message_to_user WS push. Paths 2↔3 were already guarded by `sendingFromAPIRef`; path 1 had no guard. Hermes emits both the reply body AND a send_message_to_user with the same text, which manifested as duplicate bubbles with identical timestamps. Added `appendMessageDeduped(prev, msg, windowMs = 3000)` in chat/types.ts — dedupes on (role, content) within a 3s window. Threaded into all three setMessages call sites. The window is short enough that legitimate repeat messages ("hi", "hi") from a real user/agent a few seconds apart still render. Tests: 8 new vitest cases covering empty history, different content, duplicate within window, different roles, window elapsed, stale match, malformed timestamps, and custom window. ### 6. New end-to-end regression test tests/e2e/test_dev_mode.sh — 7 HTTP assertions that run against a live platform with MOLECULE_ENV=development and catch regressions on all the dev-mode escape hatches in a single pass: AdminAuth (empty DB + after-token), WorkspaceAuth (/activity, /delegations), AdminAuth on /approvals/pending, and the populated /org/templates response. Shellcheck-clean. ### Test sweep - `go test -race ./internal/handlers/ ./internal/middleware/ ./internal/provisioner/` — all pass - `npx vitest run` in canvas — 922/922 pass (up from 902) - `shellcheck --severity=warning infra/scripts/setup.sh tests/e2e/test_dev_mode.sh` — clean - `bash tests/e2e/test_dev_mode.sh` — 7/7 pass against a live platform + populated template registry ### SaaS parity Every relaxation remains conditional on MOLECULE_ENV=development. Production tenants run MOLECULE_ENV=production (enforced by the secrets-encryption strict-init path) and always set ADMIN_TOKEN, so none of these code paths fire on hosted SaaS. Behaviour on real tenants is byte-for-byte unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| e2e | ||
| README.md | ||
Tests
This repo uses the standard monorepo testing convention: unit tests live with their package, cross-component E2E tests live here.
Where to find tests
| Scope | Location |
|---|---|
| Go unit + integration (platform, CLI, handlers) | workspace-server/**/*_test.go — run with cd workspace-server && go test -race ./... |
| TypeScript unit (canvas components, hooks, store) | canvas/src/**/__tests__/ — run with cd canvas && npm test -- --run |
| TypeScript unit (MCP server handlers) | mcp-server/src/__tests__/ — run with cd mcp-server && npx jest |
| Python unit (workspace runtime, adapters) | workspace/tests/ — run with cd workspace && python3 -m pytest |
| Python unit (SDK: plugin + remote agent) | sdk/python/tests/ — run with cd sdk/python && python3 -m pytest |
| Cross-component E2E (spans platform + runtime + HTTP) | tests/e2e/ ← you are here |
Why split this way
- Go requires co-located
_test.gofiles to access unexported symbols. - Per-package test commands keep the inner loop fast — changing canvas doesn't re-run Go tests.
tests/e2e/covers scenarios that no single package owns: a full workspace lifecycle, A2A across two provisioned agents, delegation chains, bundle round-trips.
Running E2E
Every E2E script here assumes the platform is running at localhost:8080 and (where noted) provisioned agents are online. See the header comment of each .sh for specifics.
Cleaning up rogue test workspaces
If an E2E run aborts before its teardown runs (Ctrl-C, crash, CI timeout),
the platform can be left with workspaces whose config volume is stale or
empty — Docker's unless-stopped restart policy then spins those
containers in a FileNotFoundError loop. The platform's pre-flight check
(#17) marks such workspaces failed on the next restart, but a manual
cleanup is useful:
bash scripts/cleanup-rogue-workspaces.sh # deletes ws with id/name starting aaaaaaaa-, bbbbbbbb-, cccccccc-, test-ws-
MOLECULE_URL=http://host:8080 bash scripts/cleanup-rogue-workspaces.sh
The script DELETEs each matching workspace via the API and
force-removes the ws-<id[:12]> container as a belt-and-suspenders
fallback.