molecule-core/tests/harness/cf-proxy/nginx.conf
Hongming Wang f13d2b2b7b feat(tests): add production-shape local harness (Phase 1)
The harness brings up the SaaS tenant topology on localhost using the
SAME workspace-server/Dockerfile.tenant image that ships to production.
Tests run against http://harness-tenant.localhost:8080 and exercise the
same code path a real tenant takes:

  client
    → cf-proxy   (nginx; CF tunnel + LB header rewrites)
    → tenant     (Dockerfile.tenant — combined platform + canvas)
    → cp-stub    (minimal Go CP stand-in for /cp/* paths)
    → postgres + redis

Why this exists: bugs that survive `go run ./cmd/server` and ship to
prod almost always live in env-gated middleware (TenantGuard, /cp/*
proxy, canvas proxy), header rewrites, or the strict-auth / live-token
mode. The harness activates ALL of them locally so #2395 + #2397-class
bugs can be reproduced before deploy.

Phase 1 surface:
  - cp-stub/main.go: minimal CP stand-in. /cp/auth/me, redeploy-fleet,
    /__stub/{peers,mode,state} for replay scripts. Catch-all returns
    501 with a clear message when a new CP route appears.
  - cf-proxy/nginx.conf: rewrites Host to <slug>.localhost, injects
    X-Forwarded-*, disables buffering to mirror CF tunnel streaming
    semantics.
  - compose.yml: one service per topology layer; tenant builds from
    the actual production Dockerfile.tenant.
  - up.sh / down.sh / seed.sh: lifecycle scripts.
  - replays/peer-discovery-404.sh: reproduces #2397 + asserts the
    diagnostic helper from PR #2399 surfaces "404" + "registered".
  - replays/buildinfo-stale-image.sh: reproduces #2395 + asserts
    /buildinfo wire shape + GIT_SHA injection from PR #2398.
  - README.md: topology, quickstart, what the harness does NOT cover.

Phases 2-3 (separate PRs):
  - Phase 2: convert tests/e2e/test_api.sh to target the harness URL
    instead of localhost; make harness-based replays a required CI gate.
  - Phase 3: config-coherence lint that diffs harness env list against
    production CP's env list, fails CI on drift.

Verification:
  - cp-stub builds (go build ./...).
  - cp-stub responds to all stubbed endpoints (smoke-tested locally).
  - compose.yml passes `docker compose config --quiet`.
  - All shell scripts pass `bash -n` syntax check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 11:22:46 -07:00

69 lines
3.1 KiB
Nginx Configuration File

# cf-proxy — Cloudflare-tunnel-shape reverse proxy for the local harness.
#
# Production path: agent → CF tunnel → AWS LB → tenant container.
# This config replays the same header rewrites the CF tunnel does so
# the tenant sees the same Host + X-Forwarded-* it would in production.
#
# The tenant's TenantGuard middleware activates on MOLECULE_ORG_ID; the
# canvas's same-origin fetches use the Host header for cookie scoping.
# Both behave correctly in production because CF rewrites Host to the
# tenant subdomain — this proxy reproduces that locally.
#
# How tests reach it:
# curl --resolve 'harness-tenant.localhost:8443:127.0.0.1' \
# https://harness-tenant.localhost:8443/health
# or via /etc/hosts (added automatically by ./up.sh on first boot).
worker_processes 1;
events { worker_connections 256; }
http {
# Map the wildcard <slug>.localhost to the tenant container. The
# tenant container itself doesn't care which slug routed to it —
# what matters is that the Host header it sees matches what
# production's CF tunnel sets, so cookie/CORS/TenantGuard logic
# exercises the same code path.
server {
listen 8080;
server_name *.localhost localhost;
# Cap upload at 50MB to mirror the staging tenant nginx limit;
# chat upload tests will fail closed if the platform handler
# ever silently expands its limit (catches the failure mode
# opposite of the chat-files lazy-heal incident).
client_max_body_size 50m;
location / {
proxy_pass http://tenant:8080;
# Header parity with CF tunnel + AWS LB. Production CF sets
# X-Forwarded-Proto=https; we keep http here because TLS
# termination in compose is unnecessary for testing the
# tenant logic — TLS is a CF concern, not a tenant bug
# surface. If TLS-specific bugs ever bite, add cert-manager
# + listen 8443 ssl here.
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Host $host;
proxy_set_header X-Forwarded-Proto $scheme;
# Streamable HTTP / SSE / WebSocket — the tenant exposes /ws
# and /events/stream + MCP /mcp/stream. Disabling buffering
# reproduces CF tunnel's pass-through streaming semantics
# (CF tunnel = no buffering by default; nginx default IS
# buffering, which would mask issue #2397-class streaming
# bugs by accumulating output until the client disconnects).
proxy_buffering off;
proxy_request_buffering off;
proxy_http_version 1.1;
proxy_set_header Connection "";
# Read timeout — CF tunnel default is 100s. Setting this to
# the same value catches "long agent run finishes after the
# proxy already closed the upstream" failure mode.
proxy_read_timeout 100s;
}
}
}