Commit Graph

2 Commits

Author SHA1 Message Date
Hongming Wang
9dae0503ee fix(harness): generate SECRETS_ENCRYPTION_KEY per-run instead of hardcoding
Replaces the hardcoded base64 sentinel (630dd0da) with a per-run
generation in up.sh, exported into compose's interpolation environment.

Why:
- Hardcoding a 32-byte base64 string in the repo, even one labelled
  "test-only", sets a bad muscle-memory pattern. The next agent or
  contributor copies the shape into another harness — or worse, into a
  staging .env — and the test-only sentinel turns into something
  someone treats as a real key.
- Secret scanners flag key-shaped values regardless of the surrounding
  comment claiming intent. Avoiding the literal entirely sidesteps the
  false positive.
- A fresh key per harness lifetime more closely mimics prod's
  per-tenant isolation, exercising the same code paths without any
  pretense of stable encrypted-data fixtures (which the harness wipes
  on every ./down.sh anyway).

Implementation:
- up.sh: runs `openssl rand -base64 32` if SECRETS_ENCRYPTION_KEY isn't
  already set in the caller's env. Honoring a pre-set value lets a
  debug session pin a key for reproducibility (e.g. when investigating
  encrypted-row corruption).
- compose.yml: `${SECRETS_ENCRYPTION_KEY:?…}` makes a misuse loud —
  running `docker compose up` directly, bypassing up.sh, fails fast
  with a clear error pointing at the right entry point, rather than a
  100s unhealthy-tenant timeout.
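
A minimal sketch of the up.sh side of this, for illustration only. The
function name `ensure_secrets_key` is hypothetical, and treating an empty
value the same as an unset one is an assumption; the real script may be
structured differently.

```shell
#!/usr/bin/env bash
# Hypothetical excerpt: generate the key once per run unless the caller
# has already pinned one in the environment.
ensure_secrets_key() {
  # Assumption: an empty value is treated like an unset one.
  if [ -z "${SECRETS_ENCRYPTION_KEY:-}" ]; then
    # 32 random bytes, base64-encoded (44 characters with padding).
    SECRETS_ENCRYPTION_KEY="$(openssl rand -base64 32)"
  fi
  # Export so docker compose can interpolate it from the environment.
  export SECRETS_ENCRYPTION_KEY
}

ensure_secrets_key
```

Pinning a key for a debug session would then look like
`SECRETS_ENCRYPTION_KEY=<value> ./up.sh`, which skips generation.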

Both paths verified via `docker compose config`:
- with key exported: value interpolates cleanly
- without it: "required variable SECRETS_ENCRYPTION_KEY is missing a
  value: must be set — run via tests/harness/up.sh, which generates
  one per run"
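
Compose's `${VAR:?message}` form has the same shape as the shell's
parameter expansion, so the fail-fast behavior can be illustrated in
plain shell without Docker. This is a sketch of the mechanism, not the
compose.yml contents; `check_key` is a hypothetical name and the message
is shortened.

```shell
#!/usr/bin/env bash
# The function body runs in a subshell (parentheses instead of braces),
# so a failed expansion ends the subshell, not the calling script.
check_key() (
  # ":" is a no-op; the expansion alone triggers the check. If the
  # variable is unset or empty, the shell prints the message to stderr
  # and exits non-zero.
  : "${SECRETS_ENCRYPTION_KEY:?must be set; run via tests/harness/up.sh}"
)
```

Calling `check_key` with the variable exported succeeds silently; calling
it without the variable fails and prints the message.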
2026-04-30 13:30:14 -07:00
Hongming Wang
f13d2b2b7b feat(tests): add production-shape local harness (Phase 1)
The harness brings up the SaaS tenant topology on localhost using the
SAME workspace-server/Dockerfile.tenant image that ships to production.
Tests run against http://harness-tenant.localhost:8080 and exercise the
same code path a real tenant takes:

  client
    → cf-proxy   (nginx; CF tunnel + LB header rewrites)
    → tenant     (Dockerfile.tenant — combined platform + canvas)
    → cp-stub    (minimal Go CP stand-in for /cp/* paths)
    → postgres + redis

Why this exists: bugs that survive `go run ./cmd/server` and ship to
prod almost always live in env-gated middleware (TenantGuard, /cp/*
proxy, canvas proxy), header rewrites, or the strict-auth / live-token
mode. The harness activates ALL of them locally so #2395 + #2397-class
bugs can be reproduced before deploy.

Phase 1 surface:
  - cp-stub/main.go: minimal CP stand-in. /cp/auth/me, redeploy-fleet,
    /__stub/{peers,mode,state} for replay scripts. Catch-all returns
    501 with a clear message when a new CP route appears.
  - cf-proxy/nginx.conf: rewrites Host to <slug>.localhost, injects
    X-Forwarded-*, disables buffering to mirror CF tunnel streaming
    semantics.
  - compose.yml: one service per topology layer; tenant builds from
    the actual production Dockerfile.tenant.
  - up.sh / down.sh / seed.sh: lifecycle scripts.
  - replays/peer-discovery-404.sh: reproduces #2397 + asserts the
    diagnostic helper from PR #2399 surfaces "404" + "registered".
  - replays/buildinfo-stale-image.sh: reproduces #2395 + asserts
    /buildinfo wire shape + GIT_SHA injection from PR #2398.
  - README.md: topology, quickstart, what the harness does NOT cover.
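
The replay scripts assert that a response or diagnostic contains
specific substrings ("404" and "registered" for the peer-discovery
case). A minimal sketch of such an assertion helper follows; the name
`assert_contains` is hypothetical and the actual replay scripts may
check this differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when $1 contains the substring $2,
# otherwise report what was missing on stderr and fail.
assert_contains() {
  case "$1" in
    *"$2"*) return 0 ;;
    *)
      echo "expected to find '$2' in: $1" >&2
      return 1
      ;;
  esac
}
```

A replay would capture the diagnostic output into a variable and call
the helper once per expected substring, exiting non-zero on the first
miss.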

Phases 2-3 (separate PRs):
  - Phase 2: convert tests/e2e/test_api.sh to target the harness URL
    instead of localhost; make harness-based replays a required CI gate.
  - Phase 3: config-coherence lint that diffs harness env list against
    production CP's env list, fails CI on drift.
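
One possible shape for the Phase 3 lint, sketched under assumptions:
both env lists are available as plain-text files with one variable name
per line, and `env_drift` is a hypothetical name. The eventual
implementation may obtain and compare the lists differently.

```shell
#!/usr/bin/env bash
# Hypothetical drift check: print names that appear in only one of the
# two lists and return non-zero when any such name exists.
env_drift() {
  local left right drift
  left=$(mktemp)
  right=$(mktemp)
  # comm requires sorted input; -u also drops duplicate names.
  sort -u "$1" > "$left"
  sort -u "$2" > "$right"
  # comm -3 suppresses lines common to both inputs. Names found only
  # in the second file are printed with a leading tab.
  drift=$(comm -3 "$left" "$right")
  rm -f "$left" "$right"
  if [ -n "$drift" ]; then
    printf '%s\n' "$drift"
    return 1
  fi
  return 0
}
```

In CI, a non-zero return from `env_drift` would fail the job, with the
printed names showing which side is missing each variable.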

Verification:
  - cp-stub builds (go build ./...).
  - cp-stub responds to all stubbed endpoints (smoke-tested locally).
  - compose.yml passes `docker compose config --quiet`.
  - All shell scripts pass `bash -n` syntax check.
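
The `bash -n` check can be run over a directory of scripts with a small
loop. This is a sketch; `check_syntax` is a hypothetical name and the
directory layout is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: parse every *.sh file in a directory with
# `bash -n` (syntax check only, nothing is executed) and return
# non-zero if any file fails to parse.
check_syntax() {
  local dir="$1" status=0 script
  for script in "$dir"/*.sh; do
    # Skip the unexpanded pattern when the directory has no scripts.
    [ -e "$script" ] || continue
    if ! bash -n "$script"; then
      echo "syntax error: $script" >&2
      status=1
    fi
  done
  return "$status"
}
```

Running `check_syntax tests/harness` would cover the lifecycle scripts;
the replays directory would need its own call.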

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 11:22:46 -07:00