The harness brings up the SaaS tenant topology on localhost using the SAME workspace-server/Dockerfile.tenant image that ships to production. Tests run against http://harness-tenant.localhost:8080 and exercise the same code path a real tenant takes: client → cf-proxy (nginx; CF tunnel + LB header rewrites) → tenant (Dockerfile.tenant — combined platform + canvas) → cp-stub (minimal Go CP stand-in for /cp/* paths) → postgres + redis Why this exists: bugs that survive `go run ./cmd/server` and ship to prod almost always live in env-gated middleware (TenantGuard, /cp/* proxy, canvas proxy), header rewrites, or the strict-auth / live-token mode. The harness activates ALL of them locally so #2395 + #2397-class bugs can be reproduced before deploy. Phase 1 surface: - cp-stub/main.go: minimal CP stand-in. /cp/auth/me, redeploy-fleet, /__stub/{peers,mode,state} for replay scripts. Catch-all returns 501 with a clear message when a new CP route appears. - cf-proxy/nginx.conf: rewrites Host to <slug>.localhost, injects X-Forwarded-*, disables buffering to mirror CF tunnel streaming semantics. - compose.yml: one service per topology layer; tenant builds from the actual production Dockerfile.tenant. - up.sh / down.sh / seed.sh: lifecycle scripts. - replays/peer-discovery-404.sh: reproduces #2397 + asserts the diagnostic helper from PR #2399 surfaces "404" + "registered". - replays/buildinfo-stale-image.sh: reproduces #2395 + asserts /buildinfo wire shape + GIT_SHA injection from PR #2398. - README.md: topology, quickstart, what the harness does NOT cover. Phases 2-3 (separate PRs): - Phase 2: convert tests/e2e/test_api.sh to target the harness URL instead of localhost; make harness-based replays a required CI gate. - Phase 3: config-coherence lint that diffs harness env list against production CP's env list, fails CI on drift. Verification: - cp-stub builds (go build ./...). - cp-stub responds to all stubbed endpoints (smoke-tested locally). - compose.yml passes `docker compose config --quiet`. - All shell scripts pass `bash -n` syntax check. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| cf-proxy | ||
| cp-stub | ||
| replays | ||
| compose.yml | ||
| down.sh | ||
| README.md | ||
| seed.sh | ||
| up.sh | ||
Production-shape local harness
The harness brings up the SaaS tenant topology on localhost using the
same Dockerfile.tenant image that ships to production. Tests run
against http://harness-tenant.localhost:8080 and exercise the
SAME code path a real tenant takes — including TenantGuard middleware,
the /cp/* reverse proxy, the canvas reverse proxy, and a
Cloudflare-tunnel-shape header rewrite layer.
Why this exists
Local go run ./cmd/server skips:
TenantGuardmiddleware (noMOLECULE_ORG_IDenv)/cp/*reverse proxy mount (noCP_UPSTREAM_URLenv)CANVAS_PROXY_URL(canvas runs separately on:3000)- Header rewrites that production's CF tunnel + LB perform
- Strict-auth mode (no live
ADMIN_TOKEN)
Bugs that survive go run and ship to production almost always live
in one of those layers. The harness activates ALL of them.
Topology
client
↓
cf-proxy nginx, mirrors CF tunnel header rewrites
↓ (Host:harness-tenant.localhost, X-Forwarded-*)
tenant workspace-server/Dockerfile.tenant — same image as prod
↓ (CP_UPSTREAM_URL=http://cp-stub:9090, /cp/* proxied)
cp-stub minimal Go service, mocks CP wire surface
postgres same version as production
redis same version as production
Quickstart
cd tests/harness
./up.sh # builds + starts all services
./seed.sh # mints admin token, registers two sample workspaces
./replays/peer-discovery-404.sh
./replays/buildinfo-stale-image.sh
./down.sh # tear down + remove volumes
First-time setup needs an /etc/hosts entry so harness-tenant.localhost
resolves to the local cf-proxy:
echo "127.0.0.1 harness-tenant.localhost" | sudo tee -a /etc/hosts
(macOS resolves *.localhost automatically in some setups; Linux
typically does not.)
Replay scripts
Each replay script reproduces a real bug class against the harness so fixes can be verified locally before deploy. The bar for adding a replay is "this bug shipped to production despite local E2E being green" — the script becomes the regression gate that closes that gap.
| Replay | Closes | What it proves |
|---|---|---|
peer-discovery-404.sh |
#2397 | tool_list_peers surfaces the actual reason instead of "may be isolated" |
buildinfo-stale-image.sh |
#2395 | GIT_SHA reaches the binary; verify-step comparison logic works |
To add a new replay:
- Drop a script under
replays/named after the issue. - The script's purpose: reproduce the production failure mode against the harness, then assert the fix is present. PASS criterion is the post-fix behavior.
- Wire it into the
tests/harness/run-all-replays.shrunner (TODO, Phase 2).
Extending the cp-stub
cp-stub/main.go serves the minimum surface for the existing replays
plus a catch-all that returns 501 + a clear message when the tenant
asks for a route the stub doesn't implement. To add a new CP route:
- Add a
mux.HandleFuncincp-stub/main.gofor the path. - Return the same wire shape the real CP returns. The contract is "wire compatibility with the staging CP at the time of writing" — document it with a comment pointing at the real CP handler.
- Add a replay script that exercises the path.
What the harness does NOT cover
- Real TLS / cert handling (CF terminates TLS in production; harness is HTTP-only).
- Cloudflare API edge cases (rate limits, DNS propagation timing).
- Real EC2 / SSM / EBS behavior (image-cache replay simulates the outcome but not the AWS API surface).
- Cross-region or multi-AZ topology.
- Real production data scale.
These are intentional Phase 1 limits. If a bug class hits one of these gaps, escalate to staging E2E rather than expanding the harness past its mandate of "exercise the tenant binary in production-shape topology."
Roadmap
- Phase 1 (this PR): harness + cp-stub + cf-proxy + 2 replays.
- Phase 2: convert
tests/e2e/test_api.shto run against the harness instead of localhost. Make harness-based E2E a required CI check. - Phase 3: config-coherence lint that diffs harness env list against production CP's env list, fails CI on drift.