Merge pull request #2493 from Molecule-AI/harness/phase-2-multi-tenant

harness(phase-2): multi-tenant compose + cross-tenant isolation replays
This commit is contained in:
Hongming Wang 2026-05-02 04:39:09 +00:00 committed by GitHub
commit 3ca2f40e16
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
11 changed files with 785 additions and 226 deletions

View File

@ -106,16 +106,6 @@ jobs:
path: molecule-ai-plugin-github-app-auth
token: ${{ secrets.PLUGIN_REPO_PAT || secrets.GITHUB_TOKEN }}
- name: Add /etc/hosts entry for harness-tenant.localhost
# ubuntu-latest doesn't auto-resolve *.localhost the way macOS
# sometimes does. seed.sh + replay scripts curl
# http://harness-tenant.localhost:8080 — without the entry
# they'd fail with getaddrinfo ENOTFOUND.
if: needs.detect-changes.outputs.run == 'true'
run: |
echo "127.0.0.1 harness-tenant.localhost" | sudo tee -a /etc/hosts >/dev/null
getent hosts harness-tenant.localhost
- name: Install Python deps for replays
# peer-discovery-404 (and future replays) eval Python against the
# running tenant — importing workspace/a2a_client.py pulls in
@ -144,19 +134,32 @@ jobs:
run: ./run-all-replays.sh
- name: Dump compose logs on failure
# SECRETS_ENCRYPTION_KEY: docker compose validates the entire compose
# file even for read-only `logs` calls. up.sh generates a per-run key
# and exports it to its OWN shell — this step runs in a fresh shell
# that wouldn't see it, so without a placeholder the validate step
# errors before logs print (verified against PR #2492's first run:
# "required variable SECRETS_ENCRYPTION_KEY is missing a value").
# A placeholder is fine — we're only reading log streams, not booting.
if: failure() && needs.detect-changes.outputs.run == 'true'
working-directory: tests/harness
env:
SECRETS_ENCRYPTION_KEY: dump-logs-placeholder
run: |
echo "=== docker compose ps ==="
docker compose -f compose.yml ps || true
echo "=== tenant logs ==="
docker compose -f compose.yml logs tenant || true
echo "=== tenant-alpha logs ==="
docker compose -f compose.yml logs tenant-alpha || true
echo "=== tenant-beta logs ==="
docker compose -f compose.yml logs tenant-beta || true
echo "=== cp-stub logs ==="
docker compose -f compose.yml logs cp-stub || true
echo "=== cf-proxy logs ==="
docker compose -f compose.yml logs cf-proxy || true
echo "=== postgres logs (last 100) ==="
docker compose -f compose.yml logs --tail 100 postgres || true
echo "=== postgres-alpha logs (last 100) ==="
docker compose -f compose.yml logs --tail 100 postgres-alpha || true
echo "=== postgres-beta logs (last 100) ==="
docker compose -f compose.yml logs --tail 100 postgres-beta || true
- name: Force teardown
# We pass KEEP_UP=1 to run-all-replays.sh so the dump step

View File

@ -3,17 +3,27 @@
The harness brings up the SaaS tenant topology on localhost using the
same `Dockerfile.tenant` image that ships to production. Tests target
the cf-proxy on `http://localhost:8080` and pass the tenant identity
via a `Host: harness-tenant.localhost` header — exactly the way
production CF tunnel routes by Host header. The cf-proxy nginx then
rewrites headers and proxies to the tenant container, exercising the
SAME code path a real tenant takes including TenantGuard middleware,
the `/cp/*` reverse proxy, the canvas reverse proxy, and a
Cloudflare-tunnel-shape header rewrite layer.
via a `Host:` header — exactly the way production CF tunnel routes by
Host header. The cf-proxy nginx then rewrites headers and proxies to
the right tenant container, exercising the SAME code path a real tenant
takes including TenantGuard middleware, the `/cp/*` reverse proxy, the
canvas reverse proxy, and a Cloudflare-tunnel-shape header rewrite
layer.
`tests/harness/_curl.sh` is the helper sourced by every replay —
provides `curl_anon`, `curl_admin`, `curl_workspace`, and `psql_exec`
wrappers that set the right Host + auth headers automatically. New
replays should source it rather than rolling their own curl.
Since Phase 2 the harness runs **two tenants in parallel** (alpha and
beta), each with its own Postgres instance and a distinct
`MOLECULE_ORG_ID` — same shape as production, where each tenant gets
its own EC2 + DB. This is what cross-tenant isolation replays need to
prove TenantGuard actually 404s a misrouted request.
`tests/harness/_curl.sh` is the helper sourced by every replay. Per
tenant: `curl_alpha_anon` / `curl_alpha_admin` / `curl_beta_anon` /
`curl_beta_admin` / `psql_exec_alpha` / `psql_exec_beta`. Plus
deliberately-wrong cross-tenant negative-test helpers for isolation
replays: `curl_alpha_creds_at_beta` / `curl_beta_creds_at_alpha`.
Legacy single-tenant aliases (`curl_anon`, `curl_admin`, `psql_exec`)
default to alpha so pre-Phase-2 replays continue to work. New replays
should source `_curl.sh` rather than rolling their own curl.
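
The alias defaulting described above is plain Bash parameter expansion. A simplified sketch (illustrative only, not the actual `_curl.sh` contents):

```shell
# Legacy names fall back to the alpha values unless the caller has
# already exported them; `:=` assigns only when the variable is unset.
ALPHA_HOST="${ALPHA_HOST:-harness-tenant-alpha.localhost}"
: "${TENANT_HOST:=$ALPHA_HOST}"   # legacy alias defaults to alpha
echo "$TENANT_HOST"               # harness-tenant-alpha.localhost
```

Exporting `TENANT_HOST` before sourcing the helper overrides the alpha default, which is how pre-Phase-2 replays keep working unchanged.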
## Why this exists
@ -30,25 +40,37 @@ in one of those layers. The harness activates ALL of them.
## Topology
```
client
cf-proxy nginx, mirrors CF tunnel header rewrites
↓ (Host:harness-tenant.localhost, X-Forwarded-*)
tenant workspace-server/Dockerfile.tenant — same image as prod
↓ (CP_UPSTREAM_URL=http://cp-stub:9090, /cp/* proxied)
cp-stub minimal Go service, mocks CP wire surface
postgres same version as production
redis same version as production
client
cf-proxy nginx, mirrors CF tunnel header rewrites
↓ (routes by Host header)
┌─────────────────────────┴─────────────────────────┐
↓ ↓
tenant-alpha tenant-beta
Host: harness-tenant-alpha.localhost Host: harness-tenant-beta.localhost
MOLECULE_ORG_ID=harness-org-alpha MOLECULE_ORG_ID=harness-org-beta
↓ ↓
postgres-alpha postgres-beta
↓ ↓
└─────────────────────────┬─────────────────────────┘
cp-stub + redis (shared)
```
Each tenant runs the production `Dockerfile.tenant` image with its own
admin token, org id, and Postgres instance — identical isolation
boundaries to production where each tenant gets a dedicated EC2 + DB.
cp-stub and redis are shared because they model the per-region
multi-tenant CP and a single Redis cluster.
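
The cf-proxy's routing decision reduces to a Host-to-container lookup. A hypothetical shell mirror of it (the real routing lives in the cf-proxy nginx config; `route_by_host` is an illustration, not harness code):

```shell
# Map a Host header to the tenant container it should reach.
route_by_host() {
  case "$1" in
    harness-tenant-beta.localhost) echo "tenant-beta" ;;
    # the legacy suffix-less host and the default both resolve to alpha
    *) echo "tenant-alpha" ;;
  esac
}

route_by_host harness-tenant-beta.localhost   # tenant-beta
route_by_host harness-tenant.localhost        # tenant-alpha
```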
## Quickstart
```bash
cd tests/harness
./up.sh # builds + starts all services
./seed.sh # mints admin token, registers two sample workspaces
./replays/peer-discovery-404.sh
./replays/buildinfo-stale-image.sh
./up.sh # builds + starts all services (both tenants)
./seed.sh # registers parent+child workspaces in BOTH tenants
./replays/tenant-isolation.sh
./replays/per-tenant-independence.sh
./down.sh # tear down + remove volumes
```
@ -62,17 +84,19 @@ REBUILD=1 ./run-all-replays.sh # rebuild images before booting
```
No `/etc/hosts` edit required — replays use the cf-proxy's loopback
port and pass `Host: harness-tenant.localhost` as a header (`_curl.sh`
handles this automatically). This matches how production CF tunnel
routes: the URL is the public CF endpoint, the Host header carries the
per-tenant identity. Quick check:
port and pass the per-tenant `Host:` header (`_curl.sh` handles this
automatically). This matches how production CF tunnel routes: the URL
is the public CF endpoint, the Host header carries the per-tenant
identity. Quick check:
```bash
curl -H "Host: harness-tenant.localhost" http://localhost:8080/health
curl -H "Host: harness-tenant-alpha.localhost" http://localhost:8080/health
curl -H "Host: harness-tenant-beta.localhost" http://localhost:8080/health
```
(If you have a legacy `/etc/hosts` entry from older docs, it still
works — `BASE` and `TENANT_HOST` both honor env-var overrides.)
works — `BASE`, `ALPHA_HOST`, `BETA_HOST` all honor env-var overrides.
The legacy `harness-tenant.localhost` host alias maps to alpha.)
## Replay scripts
@ -87,6 +111,8 @@ green" — the script becomes the regression gate that closes that gap.
| `buildinfo-stale-image.sh` | #2395 | GIT_SHA reaches the binary; verify-step comparison logic works |
| `chat-history.sh` | #2472 + #2474 + #2476 | `peer_id` filter (incl. OR over source/target) + `before_ts` paging + UUID/RFC3339 trust boundary on the activity route |
| `channel-envelope-trust-boundary.sh` | #2471 + #2481 | published wheel scrubs malformed `peer_id` from the channel envelope and from `agent_card_url` (path-traversal + XML-attr injection) |
| `tenant-isolation.sh` | Phase 2 | TenantGuard 404s any request whose `X-Molecule-Org-Id` doesn't match the container's `MOLECULE_ORG_ID` (covers cross-tenant routing bug + allowlist drift); per-tenant `/workspaces` listings stay partitioned |
| `per-tenant-independence.sh` | Phase 2 | parallel A2A workflows in both tenants don't bleed into each other's `activity_logs` / `workspaces`, including under a concurrent INSERT race (catches lib/pq prepared-statement cache collision + shared-pool poisoning) |
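
The invariant `tenant-isolation.sh` asserts reduces to a single comparison: the request's org header against the container's `MOLECULE_ORG_ID`. A hypothetical shell model of that check (the real middleware lives in the workspace-server; `tenant_guard` is illustrative only):

```shell
# Return the status TenantGuard would produce for a given request.
tenant_guard() {
  local org_header="$1" container_org="$2"
  if [ "$org_header" = "$container_org" ]; then
    echo 200   # org matches: request proceeds
  else
    echo 404   # mismatch: rejected
  fi
}

tenant_guard harness-org-alpha harness-org-alpha   # 200
tenant_guard harness-org-alpha harness-org-beta    # 404
```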
To add a new replay:
1. Drop a script under `replays/` named after the issue.
@ -125,6 +151,6 @@ its mandate of "exercise the tenant binary in production-shape topology."
## Roadmap
- **Phase 1 (shipped):** harness + cp-stub + cf-proxy + 4 replays + `run-all-replays.sh` runner. No-sudo `Host`-header path via `_curl.sh`. Per-replay psql seeding for tests that need DB-side fixtures.
- **Phase 2 (in flight):** multi-tenant — second `tenant-beta` service in compose, second Postgres database, replays for cross-tenant A2A + TenantGuard isolation. Convert `tests/e2e/test_api.sh` to target the harness instead of localhost. Make harness-based E2E a required CI check (a workflow that invokes `run-all-replays.sh` on every PR via the self-hosted Mac runner).
- **Phase 3:** replace `cp-stub/` with the real `molecule-controlplane` Docker build. Add a config-coherence lint that diffs harness env list against production CP's env list and fails CI on drift.
- **Phase 2 (shipped):** multi-tenant — `tenant-alpha` + `tenant-beta` with their own Postgres instances and distinct `MOLECULE_ORG_ID`s; cf-proxy nginx routes by Host header (prod CF tunnel parity); `seed.sh` registers parent+child workspaces in both tenants; `_curl.sh` exposes per-tenant + cross-tenant-negative helpers; new replays cover TenantGuard isolation (`tenant-isolation.sh`) and per-tenant independence under concurrent load (`per-tenant-independence.sh`). `harness-replays.yml` runs `run-all-replays.sh` as a required check on every PR touching `workspace-server/**`, `canvas/**`, `tests/harness/**`, or the workflow itself.
- **Phase 3:** replace `cp-stub/` with the real `molecule-controlplane` Docker build. Add a config-coherence lint that diffs harness env list against production CP's env list and fails CI on drift. Convert `tests/e2e/test_api.sh` to target the harness instead of localhost.
- **Phase 4 (long-term):** Miniflare in front of cf-proxy for real CF emulation (WAF, BotID, rate-limit, cf-tunnel headers). LocalStack for the EC2 provisioner. Anonymized prod-traffic recording/replay for SaaS-scale regression detection.

View File

@ -5,55 +5,122 @@
# URL is to a public CF endpoint and the Host header carries the
# per-tenant identity. We replay the same shape locally:
#
# curl -H "Host: harness-tenant.localhost" http://localhost:8080/health
# curl -H "Host: harness-tenant-alpha.localhost" http://localhost:8080/health
#
# This matches what cf-proxy/nginx.conf already routes (`server_name
# *.localhost localhost`) and avoids the macOS /etc/hosts requirement
# that previously gated the harness behind a sudo step.
# *.localhost` + `map $host $tenant_upstream`) and avoids the macOS
# /etc/hosts requirement that previously gated the harness behind a
# sudo step.
#
# Backwards-compatible: if /etc/hosts resolves harness-tenant.localhost
# (the legacy path), the bare URL still works because the helper falls
# back to that. New scripts SHOULD use the helper functions.
# Multi-tenant since Phase 2: alpha and beta tenants run in parallel.
# `curl_alpha_admin` and `curl_beta_admin` target each tenant's URL
# with that tenant's ADMIN_TOKEN + MOLECULE_ORG_ID. The legacy
# `curl_admin` is aliased to alpha for backwards compat with the
# pre-Phase-2 single-tenant replays.
#
# Usage:
# HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# source "$HERE/../_curl.sh" # from replays/<name>.sh
# curl_admin "$BASE/health"
# curl_anon "$BASE/health"
# curl_alpha_admin "$BASE/health"
# curl_beta_admin "$BASE/health"
# Target the cf-proxy's loopback port — the proxy front-doors every
# tenant and routes by Host header, exactly like production's CF tunnel.
: "${BASE:=http://localhost:8080}"
: "${TENANT_HOST:=harness-tenant.localhost}"
: "${ADMIN_TOKEN:=harness-admin-token}"
: "${ORG_ID:=harness-org}"
# Anonymous request — only Host header (no auth). Use for /health,
# /buildinfo, and any other route that's intentionally public.
# Per-tenant identity. Each pair must match the corresponding tenant
# container's environment in compose.yml or auth/TenantGuard will fail
# in non-obvious ways (401 vs 403 vs silent route to wrong tenant).
: "${ALPHA_HOST:=harness-tenant-alpha.localhost}"
: "${ALPHA_ADMIN_TOKEN:=harness-admin-token-alpha}"
: "${ALPHA_ORG_ID:=harness-org-alpha}"
: "${BETA_HOST:=harness-tenant-beta.localhost}"
: "${BETA_ADMIN_TOKEN:=harness-admin-token-beta}"
: "${BETA_ORG_ID:=harness-org-beta}"
# Legacy single-tenant aliases — pre-Phase-2 replays use these without
# knowing the topology grew. They map to alpha. New replays should use
# the explicit alpha/beta variants for clarity.
: "${TENANT_HOST:=$ALPHA_HOST}"
: "${ADMIN_TOKEN:=$ALPHA_ADMIN_TOKEN}"
: "${ORG_ID:=$ALPHA_ORG_ID}"
# ─── Anonymous (no auth) ──────────────────────────────────────────────
# Anonymous request to alpha. Use for /health, /buildinfo, etc.
curl_alpha_anon() {
curl -sS -H "Host: ${ALPHA_HOST}" "$@"
}
# Anonymous request to beta.
curl_beta_anon() {
curl -sS -H "Host: ${BETA_HOST}" "$@"
}
# Legacy alias for single-tenant replays.
curl_anon() {
curl -sS -H "Host: ${TENANT_HOST}" "$@"
}
# Admin-token request — full SaaS auth shape. Sets the bearer token,
# tenant org header (activates TenantGuard middleware), and a default
# JSON Content-Type. Replays admin paths exactly the way CP does in
# production, so any TenantGuard / strict-auth bug surfaces locally.
curl_admin() {
# ─── Admin-token requests ─────────────────────────────────────────────
# Admin-token request to alpha tenant. SaaS-shape auth: bearer token,
# tenant org header (TenantGuard activates), JSON content type.
curl_alpha_admin() {
curl -sS \
-H "Host: ${TENANT_HOST}" \
-H "Authorization: Bearer ${ADMIN_TOKEN}" \
-H "X-Molecule-Org-Id: ${ORG_ID}" \
-H "Host: ${ALPHA_HOST}" \
-H "Authorization: Bearer ${ALPHA_ADMIN_TOKEN}" \
-H "X-Molecule-Org-Id: ${ALPHA_ORG_ID}" \
-H "Content-Type: application/json" \
"$@"
}
# Workspace-scoped request — uses a per-workspace bearer minted from
# /admin/workspaces/:id/test-token. The platform's auth.go middleware
# accepts this bearer for the workspace's own routes, so this is the
# right shape for replays that exercise an in-workspace tool calling
# back to the platform (chat_history, list_peers, etc).
#
# Caller must export WORKSPACE_TOKEN before invoking.
# Admin-token request to beta tenant.
curl_beta_admin() {
curl -sS \
-H "Host: ${BETA_HOST}" \
-H "Authorization: Bearer ${BETA_ADMIN_TOKEN}" \
-H "X-Molecule-Org-Id: ${BETA_ORG_ID}" \
-H "Content-Type: application/json" \
"$@"
}
# Legacy alias.
curl_admin() {
curl_alpha_admin "$@"
}
# ─── Cross-tenant negative-test helpers ───────────────────────────────
# These exist to MAKE WRONG calls — replays use them to assert
# TenantGuard rejects them. Names spell out what's mismatched.
# alpha bearer + alpha org, but talking to beta's URL. TenantGuard
# should reject because the org header doesn't match beta's MOLECULE_ORG_ID.
curl_alpha_creds_at_beta() {
curl -sS \
-H "Host: ${BETA_HOST}" \
-H "Authorization: Bearer ${ALPHA_ADMIN_TOKEN}" \
-H "X-Molecule-Org-Id: ${ALPHA_ORG_ID}" \
-H "Content-Type: application/json" \
"$@"
}
# beta bearer + beta org, but talking to alpha's URL.
curl_beta_creds_at_alpha() {
curl -sS \
-H "Host: ${ALPHA_HOST}" \
-H "Authorization: Bearer ${BETA_ADMIN_TOKEN}" \
-H "X-Molecule-Org-Id: ${BETA_ORG_ID}" \
-H "Content-Type: application/json" \
"$@"
}
# ─── Workspace-scoped (per-workspace bearer) ──────────────────────────
# Workspace-scoped request to alpha — uses a per-workspace bearer
# minted from /admin/workspaces/:id/test-token. Caller must export
# WORKSPACE_TOKEN.
curl_workspace() {
: "${WORKSPACE_TOKEN:?WORKSPACE_TOKEN must be set — mint via /admin/workspaces/:id/test-token}"
curl -sS \
@ -64,19 +131,29 @@ curl_workspace() {
"$@"
}
# ─── Postgres exec (per-tenant) ───────────────────────────────────────
# Direct postgres exec — for replays that need to seed activity_logs
# rows or read DB state that has no public HTTP route. Wraps the
# `docker compose exec` pattern so replays can stay shell-only.
# rows or read DB state that has no public HTTP route.
#
# SECRETS_ENCRYPTION_KEY is set to a placeholder so compose's `:?must
# be set` interpolation guard (which gates running the harness without
# up.sh) doesn't trip on `exec` — exec only reaches an already-running
# service so the env var is irrelevant, but compose still validates
# the file. The placeholder is never written anywhere or used by any
# service.
psql_exec() {
# SECRETS_ENCRYPTION_KEY placeholder lets compose validate without
# requiring up.sh's per-run key (exec doesn't actually use it but
# compose validates the file).
psql_exec_alpha() {
SECRETS_ENCRYPTION_KEY="${SECRETS_ENCRYPTION_KEY:-exec-placeholder}" \
docker compose -f "${HARNESS_COMPOSE:-$(dirname "${BASH_SOURCE[0]}")/compose.yml}" \
exec -T postgres \
exec -T postgres-alpha \
psql -U harness -d molecule -At "$@"
}
psql_exec_beta() {
SECRETS_ENCRYPTION_KEY="${SECRETS_ENCRYPTION_KEY:-exec-placeholder}" \
docker compose -f "${HARNESS_COMPOSE:-$(dirname "${BASH_SOURCE[0]}")/compose.yml}" \
exec -T postgres-beta \
psql -U harness -d molecule -At "$@"
}
# Legacy alias — single-tenant replays default to alpha's DB.
psql_exec() {
psql_exec_alpha "$@"
}

View File

@ -4,28 +4,54 @@
# This config replays the same header rewrites the CF tunnel does so
# the tenant sees the same Host + X-Forwarded-* it would in production.
#
# The tenant's TenantGuard middleware activates on MOLECULE_ORG_ID; the
# canvas's same-origin fetches use the Host header for cookie scoping.
# Both behave correctly in production because CF rewrites Host to the
# tenant subdomain; this proxy reproduces that locally.
# Multi-tenant: nginx routes by Host header to the right tenant
# container exactly the same way the production CF tunnel does
# (URL is the public CF endpoint, Host carries the tenant identity).
#
# How tests reach it:
# curl --resolve 'harness-tenant.localhost:8443:127.0.0.1' \
# https://harness-tenant.localhost:8443/health
# or via /etc/hosts (added automatically by ./up.sh on first boot).
# How tests reach it (no /etc/hosts required):
# curl -H 'Host: harness-tenant-alpha.localhost' http://localhost:8080/health
# curl -H 'Host: harness-tenant-beta.localhost' http://localhost:8080/health
#
# Backwards-compat: harness-tenant.localhost (no -alpha/-beta suffix) maps
# to alpha for legacy single-tenant replays.
worker_processes 1;
events { worker_connections 256; }
http {
# Map the wildcard <slug>.localhost to the tenant container. The
# tenant container itself doesn't care which slug routed to it
# what matters is that the Host header it sees matches what
# production's CF tunnel sets, so cookie/CORS/TenantGuard logic
# exercises the same code path.
# Use Docker's embedded DNS at 127.0.0.11. Required because the
# `proxy_pass http://$tenant_upstream:8080` below uses a variable;
# nginx needs an explicit resolver to do per-request DNS lookups
# (literal hostnames are resolved once at startup, variables are
# resolved per-request). Without this, nginx fails closed with
# "no resolver defined" + 502.
#
# `valid=30s` caps cache life so a tenant container restart picks
# up a new IP within 30 seconds. ipv6=off skips AAAA lookups that
# Docker DNS doesn't always serve cleanly.
resolver 127.0.0.11 valid=30s ipv6=off;
# Reusable proxy block so each tenant server only carries the
# upstream-pointer + its identity-specific tweaks. Keeping the
# header rewrites + buffering settings centralised prevents drift
# between alpha and beta as the harness grows.
map $host $tenant_upstream {
default tenant-alpha;
harness-tenant.localhost tenant-alpha;
harness-tenant-alpha.localhost tenant-alpha;
harness-tenant-beta.localhost tenant-beta;
}
server {
listen 8080;
server_name *.localhost localhost;
listen 8080 default_server;
# Reject Host headers we don't recognise; without this, an
# unknown Host would silently route to the default tenant and
# mask cross-tenant routing bugs in test output.
server_name harness-tenant.localhost
harness-tenant-alpha.localhost
harness-tenant-beta.localhost
localhost;
# Cap upload at 50MB to mirror the staging tenant nginx limit;
# chat upload tests will fail closed if the platform handler
@ -34,7 +60,10 @@ http {
client_max_body_size 50m;
location / {
proxy_pass http://tenant:8080;
# The map above resolves $tenant_upstream to the right
# container based on the Host header; production CF tunnel
# behavior in one line.
proxy_pass http://$tenant_upstream:8080;
# Header parity with CF tunnel + AWS LB. Production CF sets
# X-Forwarded-Proto=https; we keep http here because TLS

View File

@ -1,45 +1,38 @@
# Production-shape harness for local E2E.
# Production-shape harness for local E2E. Multi-tenant.
#
# Reproduces the SaaS tenant topology on localhost using the SAME
# images that ship to production:
#
# client → cf-proxy (nginx, mimics CF tunnel headers)
# → tenant (workspace-server/Dockerfile.tenant — combined platform + canvas)
# → cp-stub (control-plane stand-in) for /cp/* and CP-callback paths
# → postgres + redis (same versions as production)
# client → cf-proxy (nginx, mimics CF tunnel headers, routes by Host)
# ├─ Host: harness-tenant-alpha.localhost → tenant-alpha
# │ ↓ (CP_UPSTREAM_URL=http://cp-stub:9090)
# │ tenant-alpha (workspace-server/Dockerfile.tenant)
# │ ↓
# │ postgres-alpha (per-tenant DB, matches prod)
# ├─ Host: harness-tenant-beta.localhost → tenant-beta
# │ ↓
# │ tenant-beta + postgres-beta
# └─ cp-stub + redis (shared infra; CP is Railway-singleton in prod,
# redis is shared cluster)
#
# Why this matters: the workspace-server binary IS identical between
# local and production. The bugs that survive local E2E are topology
# bugs — env-gated middleware (TenantGuard, CP proxy, Canvas proxy),
# auth state, header rewrites, real production image. This harness
# activates ALL of them.
# The two-tenant topology catches:
# - TenantGuard cross-tenant escape (alpha-org token shouldn't see
# beta-tenant data even with a valid bearer)
# - cf-proxy Host-header routing correctness
# - Per-tenant DB isolation (workspaces table, activity_logs)
# - Concurrent multi-tenant operation (no shared mutable state)
#
# Quickstart:
# cd tests/harness && ./up.sh
# ./seed.sh
# ./replays/peer-discovery-404.sh # reproduces issue #2397
# Quickstart (no /etc/hosts edits — see README):
# cd tests/harness && ./up.sh && ./seed.sh
# ./replays/peer-discovery-404.sh
# ./run-all-replays.sh
#
# Env config:
# GIT_SHA — passed to the tenant build for /buildinfo verification.
# Defaults to "harness" so /buildinfo distinguishes the
# harness build from any cached image.
# GIT_SHA — passed to BOTH tenant builds for /buildinfo verification.
# CP_STUB_PEERS_MODE — peers failure mode for replay scripts.
# "" / "404" / "401" / "500" / "timeout".
services:
postgres:
image: postgres:16-alpine
environment:
POSTGRES_USER: harness
POSTGRES_PASSWORD: harness
POSTGRES_DB: molecule
networks: [harness-net]
healthcheck:
test: ["CMD-SHELL", "pg_isready -U harness"]
interval: 2s
timeout: 5s
retries: 10
# ─── Shared infra (matches prod: CP is Railway-singleton, redis shared) ───
redis:
image: redis:7-alpine
networks: [harness-net]
@ -62,52 +55,44 @@ services:
timeout: 5s
retries: 10
# The actual production tenant image — same Dockerfile.tenant CI publishes.
# This is the load-bearing part of the harness: every bug class that hides
# behind "but it works locally" is reproducible HERE, against this image,
# not against `go run ./cmd/server`.
tenant:
# ─── Tenant alpha: postgres + workspace-server ────────────────────────
postgres-alpha:
image: postgres:16-alpine
environment:
POSTGRES_USER: harness
POSTGRES_PASSWORD: harness
POSTGRES_DB: molecule
networks: [harness-net]
healthcheck:
test: ["CMD-SHELL", "pg_isready -U harness"]
interval: 2s
timeout: 5s
retries: 10
tenant-alpha:
build:
context: ../..
dockerfile: workspace-server/Dockerfile.tenant
args:
GIT_SHA: "${GIT_SHA:-harness}"
depends_on:
postgres:
postgres-alpha:
condition: service_healthy
redis:
condition: service_healthy
cp-stub:
condition: service_healthy
environment:
DATABASE_URL: "postgres://harness:harness@postgres:5432/molecule?sslmode=disable"
DATABASE_URL: "postgres://harness:harness@postgres-alpha:5432/molecule?sslmode=disable"
REDIS_URL: "redis://redis:6379"
PORT: "8080"
PLATFORM_URL: "http://tenant:8080"
PLATFORM_URL: "http://tenant-alpha:8080"
MOLECULE_ENV: "production"
# SECRETS_ENCRYPTION_KEY is required when MOLECULE_ENV=production —
# crypto.InitStrict() refuses to boot without it. up.sh generates a
# fresh 32-byte key per harness lifetime via `openssl rand -base64 32`
# and exports it into this compose file's interpolation environment.
# The :? sentinel makes the misuse loud — running `docker compose up`
# directly without going through up.sh fails fast with a clear error
# rather than getting a confusing tenant-unhealthy timeout.
SECRETS_ENCRYPTION_KEY: "${SECRETS_ENCRYPTION_KEY:?must be set — run via tests/harness/up.sh, which generates one per run}"
# ADMIN_TOKEN flips the platform into strict-auth mode (matches
# production's CP-minted token configuration). Seeded value lets
# E2E scripts authenticate without going through CP.
ADMIN_TOKEN: "harness-admin-token"
# MOLECULE_ORG_ID — activates TenantGuard middleware. Every request
# must carry X-Molecule-Org-Id matching this value. Replays bugs
# that only fire in SaaS mode.
MOLECULE_ORG_ID: "harness-org"
# CP_UPSTREAM_URL — activates the /cp/* reverse proxy mount in
# router.go. Without this set, /cp/* would 404 and the canvas
# bootstrap would silently drift from production behavior.
ADMIN_TOKEN: "harness-admin-token-alpha"
MOLECULE_ORG_ID: "harness-org-alpha"
CP_UPSTREAM_URL: "http://cp-stub:9090"
RATE_LIMIT: "1000"
# Canvas auto-proxy — entrypoint-tenant.sh exports CANVAS_PROXY_URL
# by default; keeping it explicit here makes the topology readable.
CANVAS_PROXY_URL: "http://localhost:3000"
networks: [harness-net]
healthcheck:
@ -116,21 +101,69 @@ services:
timeout: 5s
retries: 20
# Cloudflare-tunnel-shape proxy — strips the :8080 suffix, rewrites
# Host to the tenant subdomain, injects X-Forwarded-*. Tests target
# http://harness-tenant.localhost:8080 and exercise the production
# routing layer.
# ─── Tenant beta: postgres + workspace-server (parallel to alpha) ─────
postgres-beta:
image: postgres:16-alpine
environment:
POSTGRES_USER: harness
POSTGRES_PASSWORD: harness
POSTGRES_DB: molecule
networks: [harness-net]
healthcheck:
test: ["CMD-SHELL", "pg_isready -U harness"]
interval: 2s
timeout: 5s
retries: 10
tenant-beta:
build:
context: ../..
dockerfile: workspace-server/Dockerfile.tenant
args:
GIT_SHA: "${GIT_SHA:-harness}"
depends_on:
postgres-beta:
condition: service_healthy
redis:
condition: service_healthy
cp-stub:
condition: service_healthy
environment:
DATABASE_URL: "postgres://harness:harness@postgres-beta:5432/molecule?sslmode=disable"
REDIS_URL: "redis://redis:6379"
PORT: "8080"
PLATFORM_URL: "http://tenant-beta:8080"
MOLECULE_ENV: "production"
SECRETS_ENCRYPTION_KEY: "${SECRETS_ENCRYPTION_KEY:?must be set — run via tests/harness/up.sh, which generates one per run}"
# Distinct ADMIN_TOKEN — replays use this to verify TenantGuard
# blocks alpha-token presented at beta's URL.
ADMIN_TOKEN: "harness-admin-token-beta"
MOLECULE_ORG_ID: "harness-org-beta"
CP_UPSTREAM_URL: "http://cp-stub:9090"
RATE_LIMIT: "1000"
CANVAS_PROXY_URL: "http://localhost:3000"
networks: [harness-net]
healthcheck:
test: ["CMD-SHELL", "wget -q -O- http://localhost:8080/health || exit 1"]
interval: 5s
timeout: 5s
retries: 20
# ─── cf-proxy: routes by Host to the right tenant container ───────────
# Production shape: same single CF tunnel front-doors every tenant
# subdomain — the Host header carries the tenant identity, not the
# routing destination. Local cf-proxy mirrors this exactly.
cf-proxy:
image: nginx:1.27-alpine
depends_on:
tenant:
tenant-alpha:
condition: service_healthy
tenant-beta:
condition: service_healthy
volumes:
- ./cf-proxy/nginx.conf:/etc/nginx/nginx.conf:ro
# Bind to 127.0.0.1 only — the harness uses a hardcoded ADMIN_TOKEN
# ("harness-admin-token") so binding 0.0.0.0 (compose's default)
# would expose admin access to anyone on the local network or VPN.
# Loopback-only is safe for E2E and prevents a known-token leak.
# Bind to 127.0.0.1 only — hardcoded ADMIN_TOKENs make 0.0.0.0
# exposure unsafe even on a local network.
ports:
- "127.0.0.1:8080:8080"
networks: [harness-net]

View File

@ -1,6 +1,17 @@
#!/usr/bin/env bash
# Tear down the harness and wipe per-tenant volumes.
#
# SECRETS_ENCRYPTION_KEY placeholder: docker compose validates the entire
# compose file even for `down -v` (a destructive operation that never
# reads the env). up.sh generates a per-run key into its own
# shell — this script runs in a fresh shell that wouldn't see it. Without
# the placeholder, `compose down` exits non-zero before removing volumes,
# silently leaking workspaces+activity_logs into the next ./up.sh + seed.sh
# (verified 2026-05-02: tenant-isolation.sh F1/F2 saw 3× duplicate
# alpha-parent + alpha-child rows accumulated across three prior boots).
set -euo pipefail
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$HERE"
docker compose -f compose.yml down -v --remove-orphans
SECRETS_ENCRYPTION_KEY="${SECRETS_ENCRYPTION_KEY:-down-placeholder}" \
docker compose -f compose.yml down -v --remove-orphans
echo "[harness] down + volumes removed."

View File

@ -0,0 +1,178 @@
#!/usr/bin/env bash
# Replay for per-tenant independence — each tenant runs the same
# workflow concurrently with no cross-bleed in workspaces table or
# activity_logs.
#
# What this proves that tenant-isolation.sh doesn't:
# tenant-isolation.sh proves that REQUESTS get rejected at the
# middleware layer when they target the wrong tenant. THIS replay
# proves that even when both tenants are doing legitimate work
# simultaneously, the back-end state stays partitioned: no row in
# alpha's activity_logs ever shows up in beta's, no FK-resolution
# ever crosses tenants, etc.
#
# Test shape: seed activity_logs in BOTH tenants in parallel using
# distinct row counts (3 vs 5) so we can distinguish them. Then
# fetch each tenant's history and assert the count + content match
# the seed exactly — proves no leak in either direction.
#
# Phases:
# A. Seed alpha tenant: 3 a2a_receive rows (parent ← child).
# B. Seed beta tenant: 5 a2a_receive rows (parent ← child).
# C. GET alpha history → exactly 3 rows, all alpha-summary.
# D. GET beta history → exactly 5 rows, all beta-summary.
# E. Direct DB sanity — alpha PG has only alpha rows, beta PG only beta.
# F. Concurrent write race — both tenants INSERT simultaneously;
# each tenant's count after the race matches what
# it INSERTed. Catches "shared cache poison" / "shared connection
# pool" failure modes that don't show up in single-tenant tests.
set -euo pipefail
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
HARNESS_ROOT="$(dirname "$HERE")"
cd "$HARNESS_ROOT"
if [ ! -f .seed.env ]; then
echo "[replay] no .seed.env — running ./seed.sh first..."
./seed.sh
fi
# shellcheck source=/dev/null
source .seed.env
# shellcheck source=../_curl.sh
source "$HARNESS_ROOT/_curl.sh"
PASS=0
FAIL=0
assert() {
local desc="$1" expected="$2" actual="$3"
if [ "$expected" = "$actual" ]; then
printf " PASS %s\n" "$desc"
PASS=$((PASS + 1))
else
printf " FAIL %s\n expected: %s\n got : %s\n" "$desc" "$expected" "$actual" >&2
FAIL=$((FAIL + 1))
fi
}
# ─── Cleanup (idempotent) ──────────────────────────────────────────────
psql_exec_alpha >/dev/null <<SQL
DELETE FROM activity_logs WHERE workspace_id = '$ALPHA_PARENT_ID';
SQL
psql_exec_beta >/dev/null <<SQL
DELETE FROM activity_logs WHERE workspace_id = '$BETA_PARENT_ID';
SQL
# ─── Phase A: seed alpha (3 rows) ──────────────────────────────────────
echo "[replay] A. seeding alpha tenant: 3 a2a_receive rows for alpha-parent ← alpha-child"
psql_exec_alpha >/dev/null <<SQL
INSERT INTO activity_logs (workspace_id, activity_type, source_id, target_id, method, summary, created_at)
VALUES
('$ALPHA_PARENT_ID', 'a2a_receive', '$ALPHA_CHILD_ID', '$ALPHA_PARENT_ID', 'message/send', 'alpha-msg-1', NOW() - INTERVAL '3 hours'),
('$ALPHA_PARENT_ID', 'a2a_receive', '$ALPHA_CHILD_ID', '$ALPHA_PARENT_ID', 'message/send', 'alpha-msg-2', NOW() - INTERVAL '2 hours'),
('$ALPHA_PARENT_ID', 'a2a_receive', '$ALPHA_CHILD_ID', '$ALPHA_PARENT_ID', 'message/send', 'alpha-msg-3', NOW() - INTERVAL '1 hour');
SQL
# ─── Phase B: seed beta (5 rows — distinct count) ──────────────────────
echo "[replay] B. seeding beta tenant: 5 a2a_receive rows for beta-parent ← beta-child"
psql_exec_beta >/dev/null <<SQL
INSERT INTO activity_logs (workspace_id, activity_type, source_id, target_id, method, summary, created_at)
VALUES
('$BETA_PARENT_ID', 'a2a_receive', '$BETA_CHILD_ID', '$BETA_PARENT_ID', 'message/send', 'beta-msg-1', NOW() - INTERVAL '5 hours'),
('$BETA_PARENT_ID', 'a2a_receive', '$BETA_CHILD_ID', '$BETA_PARENT_ID', 'message/send', 'beta-msg-2', NOW() - INTERVAL '4 hours'),
('$BETA_PARENT_ID', 'a2a_receive', '$BETA_CHILD_ID', '$BETA_PARENT_ID', 'message/send', 'beta-msg-3', NOW() - INTERVAL '3 hours'),
('$BETA_PARENT_ID', 'a2a_receive', '$BETA_CHILD_ID', '$BETA_PARENT_ID', 'message/send', 'beta-msg-4', NOW() - INTERVAL '2 hours'),
('$BETA_PARENT_ID', 'a2a_receive', '$BETA_CHILD_ID', '$BETA_PARENT_ID', 'message/send', 'beta-msg-5', NOW() - INTERVAL '1 hour');
SQL
# ─── Phase C: alpha tenant sees only its 3 rows ────────────────────────
echo ""
echo "[replay] C. alpha history via /activity ..."
ALPHA_RESP=$(curl_alpha_admin "$BASE/workspaces/$ALPHA_PARENT_ID/activity?type=a2a_receive&peer_id=$ALPHA_CHILD_ID&limit=20")
assert "C1: alpha row count = 3" "3" "$(echo "$ALPHA_RESP" | jq 'length')"
# Every summary must start with "alpha-msg-" — beta leak would manifest
# as a beta-msg-* string in this list.
ALPHA_NON_ALPHA=$(echo "$ALPHA_RESP" | jq -r '[.[].summary | select(startswith("alpha-msg-") | not)] | length')
assert "C2: zero non-alpha summaries leaked into alpha" "0" "$ALPHA_NON_ALPHA"
# ─── Phase D: beta tenant sees only its 5 rows ─────────────────────────
echo ""
echo "[replay] D. beta history via /activity ..."
BETA_RESP=$(curl_beta_admin "$BASE/workspaces/$BETA_PARENT_ID/activity?type=a2a_receive&peer_id=$BETA_CHILD_ID&limit=20")
assert "D1: beta row count = 5" "5" "$(echo "$BETA_RESP" | jq 'length')"
BETA_NON_BETA=$(echo "$BETA_RESP" | jq -r '[.[].summary | select(startswith("beta-msg-") | not)] | length')
assert "D2: zero non-beta summaries leaked into beta" "0" "$BETA_NON_BETA"
# ─── Phase E: direct DB-side sanity ────────────────────────────────────
echo ""
echo "[replay] E. direct DB-side counts ..."
ALPHA_DB=$(psql_exec_alpha -c "SELECT COUNT(*) FROM activity_logs WHERE workspace_id = '$ALPHA_PARENT_ID';")
BETA_DB=$(psql_exec_beta -c "SELECT COUNT(*) FROM activity_logs WHERE workspace_id = '$BETA_PARENT_ID';")
assert "E1: postgres-alpha has exactly 3 alpha rows" "3" "$ALPHA_DB"
assert "E2: postgres-beta has exactly 5 beta rows" "5" "$BETA_DB"
# Cross-DB sanity: alpha PG has zero beta-named workspaces, vice versa.
ALPHA_HAS_BETA=$(psql_exec_alpha -c "SELECT COUNT(*) FROM workspaces WHERE name LIKE 'beta-%';")
BETA_HAS_ALPHA=$(psql_exec_beta -c "SELECT COUNT(*) FROM workspaces WHERE name LIKE 'alpha-%';")
assert "E3: postgres-alpha has zero beta-named workspaces" "0" "$ALPHA_HAS_BETA"
assert "E4: postgres-beta has zero alpha-named workspaces" "0" "$BETA_HAS_ALPHA"
# ─── Phase F: concurrent INSERT race ───────────────────────────────────
# Both tenants insert 10 rows concurrently. The race shape catches:
# shared-connection-pool corruption, lib/pq prepared-statement cache
# collision (a known org-wide hazard), redis cross-keyspace bleed.
# Each side must end with EXACTLY +10 rows from its own writes.
echo ""
echo "[replay] F. concurrent insert race — 10 rows per tenant in parallel"
(
for i in $(seq 1 10); do
psql_exec_alpha >/dev/null <<SQL
INSERT INTO activity_logs (workspace_id, activity_type, source_id, target_id, method, summary)
VALUES ('$ALPHA_PARENT_ID', 'a2a_receive', '$ALPHA_CHILD_ID', '$ALPHA_PARENT_ID', 'message/send', 'alpha-race-$i');
SQL
done
) &
ALPHA_PID=$!
(
for i in $(seq 1 10); do
psql_exec_beta >/dev/null <<SQL
INSERT INTO activity_logs (workspace_id, activity_type, source_id, target_id, method, summary)
VALUES ('$BETA_PARENT_ID', 'a2a_receive', '$BETA_CHILD_ID', '$BETA_PARENT_ID', 'message/send', 'beta-race-$i');
SQL
done
) &
BETA_PID=$!
# Wait on each PID separately — `wait pid1 pid2` only reports the
# status of the LAST pid, so an alpha-side failure would be masked.
wait "$ALPHA_PID"
wait "$BETA_PID"
ALPHA_AFTER=$(psql_exec_alpha -c "SELECT COUNT(*) FROM activity_logs WHERE workspace_id = '$ALPHA_PARENT_ID';")
BETA_AFTER=$(psql_exec_beta -c "SELECT COUNT(*) FROM activity_logs WHERE workspace_id = '$BETA_PARENT_ID';")
assert "F1: alpha has 13 rows after race (3 + 10)" "13" "$ALPHA_AFTER"
assert "F2: beta has 15 rows after race (5 + 10)" "15" "$BETA_AFTER"
# Concurrency leak check: alpha's "race" rows must all be alpha-race-*,
# beta's must all be beta-race-*. A pool/cache cross-bleed would surface
# as some tenant getting the other's writes.
ALPHA_RACE_NAMES=$(psql_exec_alpha -c "SELECT COUNT(*) FROM activity_logs WHERE workspace_id = '$ALPHA_PARENT_ID' AND summary LIKE 'beta-race-%';")
BETA_RACE_NAMES=$(psql_exec_beta -c "SELECT COUNT(*) FROM activity_logs WHERE workspace_id = '$BETA_PARENT_ID' AND summary LIKE 'alpha-race-%';")
assert "F3: zero beta-race rows leaked into alpha PG" "0" "$ALPHA_RACE_NAMES"
assert "F4: zero alpha-race rows leaked into beta PG" "0" "$BETA_RACE_NAMES"
# ─── Cleanup ───────────────────────────────────────────────────────────
psql_exec_alpha >/dev/null <<SQL
DELETE FROM activity_logs WHERE workspace_id = '$ALPHA_PARENT_ID';
SQL
psql_exec_beta >/dev/null <<SQL
DELETE FROM activity_logs WHERE workspace_id = '$BETA_PARENT_ID';
SQL
echo ""
if [ "$FAIL" -gt 0 ]; then
echo "[replay] FAIL: $PASS pass, $FAIL fail"
exit 1
fi
echo "[replay] PASS: $PASS/$PASS — per-tenant independence holds (DB partition + concurrent race)"
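The bare-number comparisons above assume the `psql_exec_*` helpers strip psql's headers and alignment. A minimal sketch of that shape — the container name, user, and database here are assumptions, not the harness's actual values:

```shell
# Hypothetical shape of psql_exec_alpha (the real helper ships with the
# harness): -tA (tuples-only, unaligned) makes `SELECT COUNT(*)` print
# a bare number for assert() to compare, and ON_ERROR_STOP=1 turns SQL
# errors into nonzero exits so `set -e` aborts the replay.
psql_exec_alpha() {
  docker compose -f compose.yml exec -T postgres-alpha \
    psql -tA -v ON_ERROR_STOP=1 -U harness -d harness "$@"
}
```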


@@ -0,0 +1,172 @@
#!/usr/bin/env bash
# Replay for cross-tenant isolation — TenantGuard middleware MUST 404
# any request whose X-Molecule-Org-Id (or Fly-Replay state, or
# same-origin Canvas trust) doesn't match the tenant container's
# configured MOLECULE_ORG_ID.
#
# Why this matters in production:
# - One Cloudflare tunnel front-doors every tenant subdomain.
# - DNS/routing layer can mis-direct a request (CF cache poisoning,
# misconfigured CNAME, internal traffic mirror).
# - TenantGuard is the last-line defense — it 404s any request whose
# declared org doesn't match what the tenant binary was provisioned
# with. Returning 404 (not 403) is intentional: the existence of a
# tenant on this machine must not be discoverable by an outsider.
#
# What this replay catches:
# - A regression where TenantGuard accidentally allows requests with
# a different org id (e.g. someone removes the strict equality check).
# - cf-proxy routing-by-Host bug that sends alpha's request to beta's
# container (the negative test would suddenly succeed).
# - Allowlist drift — if /workspaces is added to tenantGuardAllowlist
# it would silently be cross-tenant readable.
#
# Phases:
# A. Positive controls — each tenant accepts its own valid creds.
# B. Org-header mismatch — alpha-org header at beta's URL → 404.
# C. Reverse — beta-org header at alpha's URL → 404.
# D. Right URL, wrong org header (typo) → 404.
# E. Bearer present but no org header → 404 (TenantGuard rejects).
# F. Per-tenant DB isolation — alpha's /workspaces enumerates only
# alpha workspaces; beta's only beta. Confirms cf-proxy + TenantGuard
# really did partition the request to the right backing DB.
# G. Allowlisted /health stays public on both tenants (sanity check —
# a regression that put /health behind the guard would 404 too).
set -euo pipefail
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
HARNESS_ROOT="$(dirname "$HERE")"
cd "$HARNESS_ROOT"
if [ ! -f .seed.env ]; then
echo "[replay] no .seed.env — running ./seed.sh first..."
./seed.sh
fi
# shellcheck source=/dev/null
source .seed.env
# shellcheck source=../_curl.sh
source "$HARNESS_ROOT/_curl.sh"
PASS=0
FAIL=0
assert_status() {
local desc="$1" expected="$2" actual="$3"
if [ "$expected" = "$actual" ]; then
printf " PASS %s (HTTP %s)\n" "$desc" "$actual"
PASS=$((PASS + 1))
else
printf " FAIL %s\n expected HTTP %s, got HTTP %s\n" "$desc" "$expected" "$actual" >&2
FAIL=$((FAIL + 1))
fi
}
# ─── Phase A: positive controls ────────────────────────────────────────
echo "[replay] A. positive controls — each tenant accepts its own valid creds"
ALPHA_OWN=$(curl_alpha_admin -o /dev/null -w '%{http_code}' "$BASE/workspaces")
assert_status "A1: alpha creds at alpha returns 200" "200" "$ALPHA_OWN"
BETA_OWN=$(curl_beta_admin -o /dev/null -w '%{http_code}' "$BASE/workspaces")
assert_status "A2: beta creds at beta returns 200" "200" "$BETA_OWN"
# ─── Phase B: alpha creds at beta's URL → 404 ──────────────────────────
echo ""
echo "[replay] B. alpha-org header at beta's URL — TenantGuard must 404"
CROSS_AB=$(curl_alpha_creds_at_beta -o /tmp/iso-ab.json -w '%{http_code}' "$BASE/workspaces")
assert_status "B1: alpha-org header at beta URL → 404" "404" "$CROSS_AB"
# Body must be a generic 404 — never reveal that beta exists or that
# the org check fired (TenantGuard is intentionally indistinguishable
# from "no such route" to an outside scanner).
B_BODY=$(cat /tmp/iso-ab.json)
if echo "$B_BODY" | grep -qiE "tenant|org|forbidden|denied"; then
printf " FAIL B2: 404 body leaks tenant/org/auth keywords (info disclosure)\n body: %s\n" "$B_BODY" >&2
FAIL=$((FAIL + 1))
else
printf " PASS B2: 404 body has no tenant/org leak\n"
PASS=$((PASS + 1))
fi
# ─── Phase C: beta creds at alpha's URL → 404 ──────────────────────────
echo ""
echo "[replay] C. beta-org header at alpha's URL — TenantGuard must 404"
CROSS_BA=$(curl_beta_creds_at_alpha -o /tmp/iso-ba.json -w '%{http_code}' "$BASE/workspaces")
assert_status "C1: beta-org header at alpha URL → 404" "404" "$CROSS_BA"
# ─── Phase D: right URL, garbage org header ────────────────────────────
echo ""
echo "[replay] D. right URL, garbage org header → 404"
GARBAGE=$(curl -sS -o /dev/null -w '%{http_code}' \
-H "Host: ${ALPHA_HOST}" \
-H "Authorization: Bearer ${ALPHA_ADMIN_TOKEN}" \
-H "X-Molecule-Org-Id: not-the-right-org" \
"$BASE/workspaces")
assert_status "D1: garbage org id at alpha URL → 404" "404" "$GARBAGE"
# ─── Phase E: bearer present but no org header at all → 404 ────────────
echo ""
echo "[replay] E. valid bearer but missing X-Molecule-Org-Id → 404"
NO_ORG=$(curl -sS -o /dev/null -w '%{http_code}' \
-H "Host: ${ALPHA_HOST}" \
-H "Authorization: Bearer ${ALPHA_ADMIN_TOKEN}" \
"$BASE/workspaces")
assert_status "E1: missing X-Molecule-Org-Id → 404" "404" "$NO_ORG"
# ─── Phase F: per-tenant DB isolation via list_workspaces ──────────────
echo ""
echo "[replay] F. per-tenant DB isolation via /workspaces listing"
ALPHA_LIST=$(curl_alpha_admin "$BASE/workspaces")
ALPHA_NAMES=$(echo "$ALPHA_LIST" | jq -r '.[].name' | sort | tr '\n' ',' | sed 's/,$//')
echo "[replay] alpha tenant sees: $ALPHA_NAMES"
if [ "$ALPHA_NAMES" = "alpha-child,alpha-parent" ]; then
printf " PASS F1: alpha enumerates only alpha workspaces\n"
PASS=$((PASS + 1))
else
printf " FAIL F1: alpha enumerated unexpected workspaces\n expected: alpha-child,alpha-parent\n got : %s\n" "$ALPHA_NAMES" >&2
FAIL=$((FAIL + 1))
fi
BETA_LIST=$(curl_beta_admin "$BASE/workspaces")
BETA_NAMES=$(echo "$BETA_LIST" | jq -r '.[].name' | sort | tr '\n' ',' | sed 's/,$//')
echo "[replay] beta tenant sees: $BETA_NAMES"
if [ "$BETA_NAMES" = "beta-child,beta-parent" ]; then
printf " PASS F2: beta enumerates only beta workspaces\n"
PASS=$((PASS + 1))
else
printf " FAIL F2: beta enumerated unexpected workspaces\n expected: beta-child,beta-parent\n got : %s\n" "$BETA_NAMES" >&2
FAIL=$((FAIL + 1))
fi
# Cross-check: neither tenant's list contains the other's workspace ids.
LEAKED_INTO_ALPHA=$(echo "$ALPHA_LIST" | jq -r --arg b1 "$BETA_PARENT_ID" --arg b2 "$BETA_CHILD_ID" \
'[.[] | select(.id == $b1 or .id == $b2)] | length')
if [ "$LEAKED_INTO_ALPHA" = "0" ]; then
printf " PASS F3: alpha list contains zero beta workspace ids\n"
PASS=$((PASS + 1))
else
printf " FAIL F3: alpha list leaked %s beta workspace ids\n" "$LEAKED_INTO_ALPHA" >&2
FAIL=$((FAIL + 1))
fi
LEAKED_INTO_BETA=$(echo "$BETA_LIST" | jq -r --arg a1 "$ALPHA_PARENT_ID" --arg a2 "$ALPHA_CHILD_ID" \
'[.[] | select(.id == $a1 or .id == $a2)] | length')
if [ "$LEAKED_INTO_BETA" = "0" ]; then
printf " PASS F4: beta list contains zero alpha workspace ids\n"
PASS=$((PASS + 1))
else
printf " FAIL F4: beta list leaked %s alpha workspace ids\n" "$LEAKED_INTO_BETA" >&2
FAIL=$((FAIL + 1))
fi
# ─── Phase G: /health is allowlisted (sanity) ──────────────────────────
echo ""
echo "[replay] G. /health stays public on both tenants (TenantGuard allowlist sanity)"
ALPHA_HEALTH=$(curl -sS -o /dev/null -w '%{http_code}' -H "Host: ${ALPHA_HOST}" "$BASE/health")
assert_status "G1: alpha /health public → 200" "200" "$ALPHA_HEALTH"
BETA_HEALTH=$(curl -sS -o /dev/null -w '%{http_code}' -H "Host: ${BETA_HOST}" "$BASE/health")
assert_status "G2: beta /health public → 200" "200" "$BETA_HEALTH"
echo ""
if [ "$FAIL" -gt 0 ]; then
echo "[replay] FAIL: $PASS pass, $FAIL fail"
exit 1
fi
echo "[replay] PASS: $PASS/$PASS — TenantGuard isolation + per-tenant DB partitioning hold"


@@ -12,3 +12,9 @@
# when a new replay introduces a new Python import.
httpx>=0.28.1
# channel-envelope-trust-boundary.sh imports from `molecule_runtime.*` (the
# wheel-rewritten path) so it catches the failure mode where the wheel
# build silently strips a fix that unit tests on local source still pass.
# >= 0.1.78 ships PR #2481's peer_id trust-boundary guard.
molecule-ai-workspace-runtime>=0.1.78
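A stale wheel would otherwise surface only as an ImportError mid-replay; a fail-fast preflight can be sketched like this (hypothetical helper, not part of the harness — the module name is taken from the comment above):

```shell
#!/usr/bin/env bash
# Hypothetical preflight: probe the import spec instead of importing,
# so a missing/stale wheel fails fast without executing package code.
check_wheel_path() {
  python3 - "${1:-molecule_runtime}" <<'PY'
import importlib.util, sys
print("present" if importlib.util.find_spec(sys.argv[1]) else "absent")
PY
}
check_wheel_path json   # stdlib module, so this prints "present"
```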


@@ -1,13 +1,20 @@
#!/usr/bin/env bash
# Seed BOTH tenants with parent + child workspaces so peer-discovery
# and cross-tenant replays have something to discover.
#
# Tenant alpha:
# - alpha-parent (tier 0)
# - alpha-child (tier 1, child of alpha-parent)
# Tenant beta:
# - beta-parent (tier 0)
# - beta-child (tier 1, child of beta-parent)
#
# IDs are server-generated (POST /workspaces ignores body.id) — we
# capture the returned id rather than minting client-side. Older
# versions silently desynced from the workspaces table, breaking
# FK-dependent replays.
#
# All four IDs persist to .seed.env so replays can target any of them.
set -euo pipefail
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
@@ -16,51 +23,67 @@ cd "$HERE"
# shellcheck source=_curl.sh
source "$HERE/_curl.sh"
create_workspace() {
local tenant="$1" name="$2" tier="$3" parent="${4:-}"
local body
if [ -n "$parent" ]; then
body="{\"name\":\"$name\",\"tier\":$tier,\"parent_id\":\"$parent\",\"runtime\":\"langgraph\"}"
else
body="{\"name\":\"$name\",\"tier\":$tier,\"runtime\":\"langgraph\"}"
fi
local id
if [ "$tenant" = "alpha" ]; then
id=$(curl_alpha_admin -X POST "$BASE/workspaces" -d "$body" | jq -r '.id')
else
id=$(curl_beta_admin -X POST "$BASE/workspaces" -d "$body" | jq -r '.id')
fi
if [ -z "$id" ] || [ "$id" = "null" ]; then
echo "[seed] FAIL: $tenant/$name workspace creation returned no id" >&2
return 1
fi
echo "$id"
}
echo "[seed] confirming both tenants reachable..."
ALPHA_HEALTH=$(curl_alpha_anon "$BASE/health" || echo "")
BETA_HEALTH=$(curl_beta_anon "$BASE/health" || echo "")
if [ -z "$ALPHA_HEALTH" ] || [ -z "$BETA_HEALTH" ]; then
echo "[seed] FAIL: tenant unreachable. alpha='$ALPHA_HEALTH' beta='$BETA_HEALTH'"
echo " Did ./up.sh complete cleanly?"
exit 1
fi
echo "[seed] alpha: $ALPHA_HEALTH"
echo "[seed] beta : $BETA_HEALTH"
echo "[seed] confirming /buildinfo returns the harness GIT_SHA..."
BUILD=$(curl_anon "$BASE/buildinfo" || echo "")
echo "[seed] $BUILD"
echo ""
echo "[seed] tenant alpha — creating alpha-parent + alpha-child ..."
ALPHA_PARENT_ID=$(create_workspace alpha alpha-parent 0)
echo "[seed] alpha-parent id=$ALPHA_PARENT_ID"
ALPHA_CHILD_ID=$(create_workspace alpha alpha-child 1 "$ALPHA_PARENT_ID")
echo "[seed] alpha-child id=$ALPHA_CHILD_ID"
echo ""
echo "[seed] tenant beta — creating beta-parent + beta-child ..."
BETA_PARENT_ID=$(create_workspace beta beta-parent 0)
echo "[seed] beta-parent id=$BETA_PARENT_ID"
BETA_CHILD_ID=$(create_workspace beta beta-child 1 "$BETA_PARENT_ID")
echo "[seed] beta-child id=$BETA_CHILD_ID"
# Stash IDs for replay scripts.
#
# Backwards-compat: ALPHA_ID + BETA_ID aliases keep pre-Phase-2 replays
# working (they used these names for the alpha tenant's parent + child).
{
echo "ALPHA_PARENT_ID=$ALPHA_PARENT_ID"
echo "ALPHA_CHILD_ID=$ALPHA_CHILD_ID"
echo "BETA_PARENT_ID=$BETA_PARENT_ID"
echo "BETA_CHILD_ID=$BETA_CHILD_ID"
echo "# legacy aliases — pre-Phase-2 replays expect these names"
echo "ALPHA_ID=$ALPHA_PARENT_ID"
echo "BETA_ID=$ALPHA_CHILD_ID"
} > "$HERE/.seed.env"
echo ""
echo "[seed] done. IDs persisted to tests/harness/.seed.env"
echo "[seed] alpha: parent=$ALPHA_PARENT_ID child=$ALPHA_CHILD_ID"
echo "[seed] beta : parent=$BETA_PARENT_ID child=$BETA_CHILD_ID"


@@ -38,21 +38,22 @@ if [ "$REBUILD" = true ]; then
docker compose -f compose.yml build --no-cache tenant cp-stub
fi
echo "[harness] starting redis + cp-stub + tenant-alpha + tenant-beta + cf-proxy ..."
docker compose -f compose.yml up -d --wait
# Sudo-free reachability: cf-proxy/nginx routes by Host header to the
# right tenant container (matches production CF tunnel: same URL,
# different Host = different tenant). Replays target loopback :8080
# with a per-tenant Host header. _curl.sh centralises the helper
# functions (curl_alpha_admin, curl_beta_admin, etc.).
echo ""
echo "[harness] up. Multi-tenant topology:"
echo " tenant-alpha: Host: harness-tenant-alpha.localhost"
echo " tenant-beta: Host: harness-tenant-beta.localhost"
echo " legacy alias: Host: harness-tenant.localhost → alpha"
echo ""
echo " Quick check (no /etc/hosts needed):"
echo " curl -H 'Host: harness-tenant-alpha.localhost' http://localhost:8080/health"
echo " curl -H 'Host: harness-tenant-beta.localhost' http://localhost:8080/health"
echo ""
echo "Next: ./seed.sh # register parent+child workspaces in BOTH tenants"
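The `curl_alpha_admin` / `curl_beta_admin` helpers used throughout live in _curl.sh; their shape can be sketched as follows (header names come from the replays above; the `ALPHA_ADMIN_TOKEN` and `ALPHA_ORG_ID` variables are assumed to be minted by seed.sh — treat this as an illustration, not the real helper):

```shell
# Hypothetical _curl.sh-style helper: every tenant shares one loopback
# URL — the Host header picks the container behind cf-proxy, and the
# org header satisfies TenantGuard's strict equality check.
BASE="${BASE:-http://localhost:8080}"
ALPHA_HOST="${ALPHA_HOST:-harness-tenant-alpha.localhost}"
curl_alpha_admin() {
  curl -sS \
    -H "Host: ${ALPHA_HOST}" \
    -H "Authorization: Bearer ${ALPHA_ADMIN_TOKEN:?}" \
    -H "X-Molecule-Org-Id: ${ALPHA_ORG_ID:?}" \
    -H "Content-Type: application/json" \
    "$@"
}
```

The `:?` expansions make a missing token or org id fail loudly at call time instead of sending an unauthenticated request.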