molecule-core/tests/harness/compose.yml
Hongming Wang f13d2b2b7b feat(tests): add production-shape local harness (Phase 1)
The harness brings up the SaaS tenant topology on localhost using the
SAME workspace-server/Dockerfile.tenant image that ships to production.
Tests run against http://harness-tenant.localhost:8080 and exercise the
same code path a real tenant takes:

  client
    → cf-proxy   (nginx; CF tunnel + LB header rewrites)
    → tenant     (Dockerfile.tenant — combined platform + canvas)
    → cp-stub    (minimal Go CP stand-in for /cp/* paths)
    → postgres + redis

Why this exists: bugs that survive `go run ./cmd/server` and ship to
prod almost always live in env-gated middleware (TenantGuard, /cp/*
proxy, canvas proxy), header rewrites, or the strict-auth / live-token
mode. The harness activates ALL of them locally so #2395 + #2397-class
bugs can be reproduced before deploy.

Phase 1 surface:
  - cp-stub/main.go: minimal CP stand-in. /cp/auth/me, redeploy-fleet,
    /__stub/{peers,mode,state} for replay scripts. Catch-all returns
    501 with a clear message when a new CP route appears.
  - cf-proxy/nginx.conf: rewrites Host to <slug>.localhost, injects
    X-Forwarded-*, disables buffering to mirror CF tunnel streaming
    semantics.
  - compose.yml: one service per topology layer; tenant builds from
    the actual production Dockerfile.tenant.
  - up.sh / down.sh / seed.sh: lifecycle scripts.
  - replays/peer-discovery-404.sh: reproduces #2397 + asserts the
    diagnostic helper from PR #2399 surfaces "404" + "registered".
  - replays/buildinfo-stale-image.sh: reproduces #2395 + asserts
    /buildinfo wire shape + GIT_SHA injection from PR #2398.
  - README.md: topology, quickstart, what the harness does NOT cover.
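
The catch-all rule above can be modelled in a few lines. This is only an illustrative sketch, not the stub's code: the real stub is Go (cp-stub/main.go), the function name, the 200 status for known paths, and the message text are all assumptions, and the redeploy-fleet route is left out because its full path is not given here.

```python
# Hypothetical model of the cp-stub catch-all rule; the real stub is written in Go.

# Paths named in the commit message. The redeploy-fleet route is omitted
# because its full path is not stated there.
STUBBED_PATHS = {
    "/cp/auth/me",
    "/__stub/peers",
    "/__stub/mode",
    "/__stub/state",
}


def stub_response(path: str) -> tuple[int, str]:
    """Return (status, body) for a request path under the catch-all rule."""
    if path in STUBBED_PATHS:
        # Assumption: a stubbed path answers 200 when no failure mode is set.
        return 200, "ok"
    # Unknown routes fail loudly so a newly added CP route is noticed
    # instead of being mistaken for an ordinary 404.
    return 501, f"cp-stub: {path} is not implemented; add it to the stub"
```

The point of the 501 is diagnostic: a test that hits a route the stub has never seen fails with a message naming that route.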

Phases 2-3 (separate PRs):
  - Phase 2: convert tests/e2e/test_api.sh to target the harness URL
    instead of localhost; make harness-based replays a required CI gate.
  - Phase 3: config-coherence lint that diffs harness env list against
    production CP's env list, fails CI on drift.
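
A rough sketch of what the Phase 3 drift check could look like, assuming both env lists are already available as plain lists of variable names. The function names and the report keys are hypothetical; nothing here is taken from an existing lint.

```python
# Hypothetical sketch of a config-coherence check: compare the set of env
# variable names the harness sets with the set production sets.


def env_drift(harness_keys, production_keys):
    """Return the names present on only one side, sorted for stable output."""
    harness, production = set(harness_keys), set(production_keys)
    return {
        "missing_in_harness": sorted(production - harness),
        "extra_in_harness": sorted(harness - production),
    }


def has_drift(report) -> bool:
    """True when either side carries a name the other lacks."""
    return any(report.values())
```

A CI step could exit non-zero whenever `has_drift(...)` is true, printing the report so the missing or extra names are visible in the job log.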

Verification:
  - cp-stub builds (go build ./...).
  - cp-stub responds to all stubbed endpoints (smoke-tested locally).
  - compose.yml passes `docker compose config --quiet`.
  - All shell scripts pass `bash -n` syntax check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 11:22:46 -07:00

129 lines
4.4 KiB
YAML

# Production-shape harness for local E2E.
#
# Reproduces the SaaS tenant topology on localhost using the SAME
# images that ship to production:
#
#   client → cf-proxy (nginx, mimics CF tunnel headers)
#          → tenant   (workspace-server/Dockerfile.tenant, combined platform + canvas)
#          → cp-stub  (control-plane stand-in) for /cp/* and CP-callback paths
#          → postgres + redis (same versions as production)
#
# Why this matters: the workspace-server binary IS identical between
# local and production. The bugs that survive local E2E are topology
# bugs: env-gated middleware (TenantGuard, CP proxy, Canvas proxy),
# auth state, header rewrites, real production image. This harness
# activates ALL of them.
#
# Quickstart:
#   cd tests/harness && ./up.sh
#   ./seed.sh
#   ./replays/peer-discovery-404.sh   # reproduces issue #2397
#
# Env config:
#   GIT_SHA: passed to the tenant build for /buildinfo verification.
#       Defaults to "harness" so /buildinfo distinguishes the
#       harness build from any cached image.
#   CP_STUB_PEERS_MODE: peers failure mode for replay scripts.
#       "" / "404" / "401" / "500" / "timeout".

services:
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: harness
      POSTGRES_PASSWORD: harness
      POSTGRES_DB: molecule
    networks: [harness-net]
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U harness"]
      interval: 2s
      timeout: 5s
      retries: 10

  redis:
    image: redis:7-alpine
    networks: [harness-net]
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 2s
      timeout: 5s
      retries: 10

  cp-stub:
    build:
      context: ./cp-stub
    environment:
      PORT: "9090"
      CP_STUB_PEERS_MODE: "${CP_STUB_PEERS_MODE:-}"
    networks: [harness-net]
    healthcheck:
      test: ["CMD-SHELL", "wget -q -O- http://localhost:9090/healthz || exit 1"]
      interval: 2s
      timeout: 5s
      retries: 10

  # The actual production tenant image: same Dockerfile.tenant CI publishes.
  # This is the load-bearing part of the harness: every bug class that hides
  # behind "but it works locally" is reproducible HERE, against this image,
  # not against `go run ./cmd/server`.
  tenant:
    build:
      context: ../..
      dockerfile: workspace-server/Dockerfile.tenant
      args:
        GIT_SHA: "${GIT_SHA:-harness}"
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
      cp-stub:
        condition: service_healthy
    environment:
      DATABASE_URL: "postgres://harness:harness@postgres:5432/molecule?sslmode=disable"
      REDIS_URL: "redis://redis:6379"
      PORT: "8080"
      PLATFORM_URL: "http://tenant:8080"
      MOLECULE_ENV: "production"
      # ADMIN_TOKEN flips the platform into strict-auth mode (matches
      # production's CP-minted token configuration). Seeded value lets
      # E2E scripts authenticate without going through CP.
      ADMIN_TOKEN: "harness-admin-token"
      # MOLECULE_ORG_ID: activates TenantGuard middleware. Every request
      # must carry X-Molecule-Org-Id matching this value. Replays bugs
      # that only fire in SaaS mode.
      MOLECULE_ORG_ID: "harness-org"
      # CP_UPSTREAM_URL: activates the /cp/* reverse proxy mount in
      # router.go. Without this set, /cp/* would 404 and the canvas
      # bootstrap would silently drift from production behavior.
      CP_UPSTREAM_URL: "http://cp-stub:9090"
      RATE_LIMIT: "1000"
      # Canvas auto-proxy: entrypoint-tenant.sh exports CANVAS_PROXY_URL
      # by default; keeping it explicit here makes the topology readable.
      CANVAS_PROXY_URL: "http://localhost:3000"
    networks: [harness-net]
    healthcheck:
      test: ["CMD-SHELL", "wget -q -O- http://localhost:8080/health || exit 1"]
      interval: 5s
      timeout: 5s
      retries: 20

  # Cloudflare-tunnel-shape proxy: strips the :8080 suffix, rewrites
  # Host to the tenant subdomain, injects X-Forwarded-*. Tests target
  # http://harness-tenant.localhost:8080 and exercise the production
  # routing layer.
  cf-proxy:
    image: nginx:1.27-alpine
    depends_on:
      tenant:
        condition: service_healthy
    volumes:
      - ./cf-proxy/nginx.conf:/etc/nginx/nginx.conf:ro
    ports:
      - "8080:8080"
    networks: [harness-net]

networks:
  harness-net:
    name: molecule-harness-net
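
A test client talking to this stack has to send what TenantGuard and strict-auth mode expect. The sketch below builds such a request using the seeded values from the compose file. The `X-Molecule-Org-Id` header comes from the comments above; presenting the admin token as an `Authorization: Bearer` header is an assumption, as are the helper names and the use of `/health` as a path reachable through cf-proxy.

```python
import urllib.request

# Seeded values copied from the compose file above.
HARNESS_URL = "http://harness-tenant.localhost:8080"
ADMIN_TOKEN = "harness-admin-token"
ORG_ID = "harness-org"


def harness_headers() -> dict[str, str]:
    """Headers a harness client sends on every request."""
    return {
        # Required by TenantGuard when MOLECULE_ORG_ID is set.
        "X-Molecule-Org-Id": ORG_ID,
        # Assumption: the admin token is presented as a bearer token.
        "Authorization": f"Bearer {ADMIN_TOKEN}",
    }


def harness_request(path: str) -> urllib.request.Request:
    """Build (but do not send) a request that enters through cf-proxy."""
    return urllib.request.Request(HARNESS_URL + path, headers=harness_headers())
```

With the harness up, `urllib.request.urlopen(harness_request("/health"))` would send the request, provided the local resolver maps `harness-tenant.localhost` to the loopback address, which not every system does by default.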