forked from molecule-ai/molecule-core
Brings the local harness from "single tenant covering the request path"
to "two tenants covering both the request path AND the per-tenant
isolation boundary", the same shape production runs (one EC2 + one
Postgres + one MOLECULE_ORG_ID per tenant).

Why this matters: the four prior replays exercise the SaaS request path
against one tenant. They cannot prove that TenantGuard rejects a
misrouted request (production CF tunnel + AWS LB are the failure
surface), nor that two tenants doing legitimate work in parallel keep
their `activity_logs` / `workspaces` / connection-pool state
partitioned. Both are real bug classes: TenantGuard allowlist drift
shipped in #2398, and lib/pq prepared-statement cache collision is
documented as an org-wide hazard.

What changed:

1. compose.yml: split into two tenants. tenant-alpha + postgres-alpha +
   tenant-beta + postgres-beta + the shared cp-stub, redis, cf-proxy.
   Each tenant gets a distinct ADMIN_TOKEN + MOLECULE_ORG_ID and its own
   Postgres database. cf-proxy depends on both tenants becoming healthy.

2. cf-proxy/nginx.conf: Host-header → tenant routing.
   `map $host $tenant_upstream` resolves the right backend per request.
   Required `resolver 127.0.0.11 valid=30s ipv6=off;` because nginx
   needs an explicit DNS resolver to use a variable in `proxy_pass`
   (literal hostnames resolve once at startup; variables resolve per
   request; without the resolver nginx fails closed with 502).
   `server_name` lists both tenants + the legacy alias so unknown Host
   headers don't silently route to a default and mask routing bugs.

3. _curl.sh: per-tenant + cross-tenant-negative helpers.
   `curl_alpha_admin` / `curl_beta_admin` set the right Host +
   Authorization + X-Molecule-Org-Id triple. `curl_alpha_creds_at_beta`
   / `curl_beta_creds_at_alpha` exist precisely to make WRONG requests
   (replays use them to assert TenantGuard rejects).
   `psql_exec_alpha` / `psql_exec_beta` shell out to
   `docker compose exec` against each tenant's Postgres. Legacy aliases
   (`curl_admin`, `psql_exec`) keep the four pre-Phase-2 replays working
   without edits.

4. seed.sh: registers parent+child workspaces in BOTH tenants. Captures
   server-generated IDs via `jq -r '.id'` (POST /workspaces ignores
   body.id, so the older client-side mint silently desynced from the
   workspaces table and broke FK-dependent replays). Stashes
   `ALPHA_PARENT_ID` / `ALPHA_CHILD_ID` / `BETA_PARENT_ID` /
   `BETA_CHILD_ID` to .seed.env, plus legacy `ALPHA_ID` / `BETA_ID`
   aliases for backwards compat with chat-history / channel-envelope.

5. New replays.
   * tenant-isolation.sh (13 assertions): TenantGuard 404s any request
     whose X-Molecule-Org-Id doesn't match the container's
     MOLECULE_ORG_ID. Asserts the 404 body has zero
     tenant/org/forbidden/denied keywords (the existence of a tenant
     must not be probeable from the outside). Covers cross-tenant
     routing misconfiguration + allowlist drift + missing-org-header.
   * per-tenant-independence.sh (12 assertions): both tenants seed
     activity_logs in parallel with distinct row counts (3 vs 5) and
     confirm each tenant's history endpoint returns exactly its own
     count. Then a concurrent INSERT race (10 rows per tenant in
     parallel via `&` + wait) catches shared-pool corruption +
     prepared-statement cache poisoning + redis cross-keyspace bleed.

6. Bug fix: down.sh + dump-logs SECRETS_ENCRYPTION_KEY validation.
   `docker compose down -v` validates the entire compose file even
   though it doesn't read the env. up.sh generates a per-run key in its
   own shell, and down.sh runs in a fresh shell that never sees it, so
   without a placeholder `compose down` exited non-zero before removing
   volumes. Workspaces silently leaked into the next ./up.sh + seed.sh
   boot. Caught when tenant-isolation.sh F1/F2 saw 3× duplicate
   alpha-parent rows accumulated across three prior runs. Same fix
   applied to the workflow's dump-logs step.

7. requirements.txt: pin molecule-ai-workspace-runtime>=0.1.78.
   channel-envelope-trust-boundary.sh imports from `molecule_runtime.*`
   (the wheel-rewritten path) so it catches the failure mode where the
   wheel build silently strips a fix while unit tests on local source
   still pass. CI was failing this replay because the wheel wasn't
   installed; caught in the staging push run from #2492.

8. .github/workflows/harness-replays.yml: Phase 2 plumbing.
   * Removed the /etc/hosts step (the Host-header path eliminated the
     need; scripts already source _curl.sh).
   * Updated dump-logs to reference the new service names (tenant-alpha
     + tenant-beta + postgres-alpha + postgres-beta).
   * Added a SECRETS_ENCRYPTION_KEY placeholder env on the dump step.

Verified: ./run-all-replays.sh from a clean state, 6/6 passed
(buildinfo-stale-image, channel-envelope-trust-boundary, chat-history,
peer-discovery-404, per-tenant-independence, tenant-isolation).

Roadmap section updated: Phase 2 marked shipped. Phase 3 promoted to
"replace cp-stub with real molecule-controlplane Docker build + env
coherence lint."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
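The opaque-404 assertion described for tenant-isolation.sh could look roughly like the sketch below. It is an illustration under assumptions, not the replay's actual code: the function name `assert_opaque_404` is hypothetical, and the keyword list is taken from the description above (tenant/org/forbidden/denied).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not copied from tenant-isolation.sh.
# Passes only when the status is 404 AND the body contains none of the
# keywords that would reveal a tenant exists behind the proxy.
assert_opaque_404() {
  local status="$1" body="$2"
  if [ "$status" != "404" ]; then
    echo "FAIL: expected 404, got ${status}" >&2
    return 1
  fi
  # Case-insensitive match on the keyword list from the commit message.
  if printf '%s' "$body" | grep -Eiq 'tenant|org|forbidden|denied'; then
    echo "FAIL: 404 body leaks a tenant-revealing keyword" >&2
    return 1
  fi
  echo "PASS: opaque 404"
}

# Possible use inside a replay (assumes _curl.sh is sourced and the
# harness is up; the temp-file path is illustrative):
#   status=$(curl_alpha_creds_at_beta -o /tmp/body.txt -w '%{http_code}' "$BASE/workspaces")
#   assert_opaque_404 "$status" "$(cat /tmp/body.txt)"
```

Checking the body as well as the status is the point of the design: a 404 whose body says "org mismatch" would still confirm to an outsider that a tenant exists.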
160 lines
6.0 KiB
Bash
# Sourceable helper for harness replays. Centralises the
# curl-against-cf-proxy pattern so scripts don't depend on /etc/hosts.
#
# Production CF tunnel routes by Host header, not by DNS: the request
# URL is to a public CF endpoint and the Host header carries the
# per-tenant identity. We replay the same shape locally:
#
#   curl -H "Host: harness-tenant-alpha.localhost" http://localhost:8080/health
#
# This matches what cf-proxy/nginx.conf already routes (an explicit
# `server_name` list + `map $host $tenant_upstream`) and avoids the
# macOS /etc/hosts requirement that previously gated the harness behind
# a sudo step.
#
# Multi-tenant since Phase 2: alpha and beta tenants run in parallel.
# `curl_alpha_admin` and `curl_beta_admin` send each tenant's Host
# header with that tenant's ADMIN_TOKEN + MOLECULE_ORG_ID. The legacy
# `curl_admin` is aliased to alpha for backwards compat with the
# pre-Phase-2 single-tenant replays.
#
# Usage:
#   HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
#   source "$HERE/../_curl.sh"   # from replays/<name>.sh
#   curl_alpha_admin "$BASE/health"
#   curl_beta_admin "$BASE/health"

# Target the cf-proxy's loopback port. The proxy front-doors every
# tenant and routes by Host header, exactly like production's CF tunnel.
: "${BASE:=http://localhost:8080}"

# Per-tenant identity. Each host/token/org set must match the
# corresponding tenant container's environment in compose.yml, or
# auth/TenantGuard will fail in non-obvious ways (a 401 from auth, an
# opaque 404 from TenantGuard, or a silent route to the wrong tenant).
: "${ALPHA_HOST:=harness-tenant-alpha.localhost}"
: "${ALPHA_ADMIN_TOKEN:=harness-admin-token-alpha}"
: "${ALPHA_ORG_ID:=harness-org-alpha}"

: "${BETA_HOST:=harness-tenant-beta.localhost}"
: "${BETA_ADMIN_TOKEN:=harness-admin-token-beta}"
: "${BETA_ORG_ID:=harness-org-beta}"

# Legacy single-tenant aliases. Pre-Phase-2 replays use these without
# knowing the topology grew. They map to alpha. New replays should use
# the explicit alpha/beta variants for clarity.
: "${TENANT_HOST:=$ALPHA_HOST}"
: "${ADMIN_TOKEN:=$ALPHA_ADMIN_TOKEN}"
: "${ORG_ID:=$ALPHA_ORG_ID}"

# ─── Anonymous (no auth) ──────────────────────────────────────────────

# Anonymous request to alpha. Use for /health, /buildinfo, etc.
curl_alpha_anon() {
  curl -sS -H "Host: ${ALPHA_HOST}" "$@"
}

# Anonymous request to beta.
curl_beta_anon() {
  curl -sS -H "Host: ${BETA_HOST}" "$@"
}

# Legacy alias for single-tenant replays.
curl_anon() {
  curl -sS -H "Host: ${TENANT_HOST}" "$@"
}

# ─── Admin-token requests ─────────────────────────────────────────────

# Admin-token request to alpha tenant. SaaS-shape auth: bearer token,
# tenant org header (TenantGuard activates), JSON content type.
curl_alpha_admin() {
  curl -sS \
    -H "Host: ${ALPHA_HOST}" \
    -H "Authorization: Bearer ${ALPHA_ADMIN_TOKEN}" \
    -H "X-Molecule-Org-Id: ${ALPHA_ORG_ID}" \
    -H "Content-Type: application/json" \
    "$@"
}

# Admin-token request to beta tenant.
curl_beta_admin() {
  curl -sS \
    -H "Host: ${BETA_HOST}" \
    -H "Authorization: Bearer ${BETA_ADMIN_TOKEN}" \
    -H "X-Molecule-Org-Id: ${BETA_ORG_ID}" \
    -H "Content-Type: application/json" \
    "$@"
}

# Legacy alias.
curl_admin() {
  curl_alpha_admin "$@"
}

# ─── Cross-tenant negative-test helpers ───────────────────────────────
# These exist to MAKE WRONG calls: replays use them to assert
# TenantGuard rejects them. Names spell out what's mismatched.

# alpha bearer + alpha org, but talking to beta's URL. TenantGuard
# should reject because the org header doesn't match beta's MOLECULE_ORG_ID.
curl_alpha_creds_at_beta() {
  curl -sS \
    -H "Host: ${BETA_HOST}" \
    -H "Authorization: Bearer ${ALPHA_ADMIN_TOKEN}" \
    -H "X-Molecule-Org-Id: ${ALPHA_ORG_ID}" \
    -H "Content-Type: application/json" \
    "$@"
}

# beta bearer + beta org, but talking to alpha's URL.
curl_beta_creds_at_alpha() {
  curl -sS \
    -H "Host: ${ALPHA_HOST}" \
    -H "Authorization: Bearer ${BETA_ADMIN_TOKEN}" \
    -H "X-Molecule-Org-Id: ${BETA_ORG_ID}" \
    -H "Content-Type: application/json" \
    "$@"
}

# ─── Workspace-scoped (per-workspace bearer) ──────────────────────────

# Workspace-scoped request to the legacy tenant (TENANT_HOST / ORG_ID,
# which default to alpha). Uses a per-workspace bearer minted from
# /admin/workspaces/:id/test-token. Caller must export WORKSPACE_TOKEN.
curl_workspace() {
  : "${WORKSPACE_TOKEN:?WORKSPACE_TOKEN must be set: mint via /admin/workspaces/:id/test-token}"
  curl -sS \
    -H "Host: ${TENANT_HOST}" \
    -H "Authorization: Bearer ${WORKSPACE_TOKEN}" \
    -H "X-Molecule-Org-Id: ${ORG_ID}" \
    -H "Content-Type: application/json" \
    "$@"
}

# ─── Postgres exec (per-tenant) ───────────────────────────────────────

# Direct postgres exec, for replays that need to seed activity_logs
# rows or read DB state that has no public HTTP route.
#
# SECRETS_ENCRYPTION_KEY placeholder lets compose validate without
# requiring up.sh's per-run key (exec doesn't actually use it but
# compose validates the file).
#
# `exec -T` skips pseudo-TTY allocation; `psql -At` prints unaligned,
# tuples-only output so results are easy to compare in assertions.
psql_exec_alpha() {
  SECRETS_ENCRYPTION_KEY="${SECRETS_ENCRYPTION_KEY:-exec-placeholder}" \
    docker compose -f "${HARNESS_COMPOSE:-$(dirname "${BASH_SOURCE[0]}")/compose.yml}" \
    exec -T postgres-alpha \
    psql -U harness -d molecule -At "$@"
}

psql_exec_beta() {
  SECRETS_ENCRYPTION_KEY="${SECRETS_ENCRYPTION_KEY:-exec-placeholder}" \
    docker compose -f "${HARNESS_COMPOSE:-$(dirname "${BASH_SOURCE[0]}")/compose.yml}" \
    exec -T postgres-beta \
    psql -U harness -d molecule -At "$@"
}

# Legacy alias: single-tenant replays default to alpha's DB.
psql_exec() {
  psql_exec_alpha "$@"
}
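The per-tenant exec helpers are intended to be driven side by side, as in the `&` + wait race the commit message describes. A minimal sketch of that pattern follows. The helper name `run_both` and the seeder functions in the usage comment are hypothetical and do not appear in the harness; the sketch only shows how to background two commands and surface a failure from either one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run two commands (or shell functions) concurrently
# and return non-zero if either of them fails.
run_both() {
  local cmd_a="$1" cmd_b="$2" pid_a pid_b rc=0
  "$cmd_a" & pid_a=$!
  "$cmd_b" & pid_b=$!
  # `wait <pid>` returns that job's exit status, so each side is checked
  # individually rather than relying on a bare `wait`, which returns 0.
  wait "$pid_a" || rc=1
  wait "$pid_b" || rc=1
  return "$rc"
}

# Possible use inside a replay (assumes _curl.sh is sourced and the
# harness is up; the SQL is a placeholder, not the real seed statement):
#   seed_alpha() { psql_exec_alpha -c "SELECT 1"; }
#   seed_beta()  { psql_exec_beta  -c "SELECT 1"; }
#   run_both seed_alpha seed_beta || echo "FAIL: a tenant seeder failed" >&2
```

Waiting on each PID separately matters here: a plain `wait` with no arguments would hide a failed INSERT on one tenant behind a zero exit status.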