molecule-core/tests/harness/compose.yml
claude-ceo-assistant (Claude Opus 4.7 on Hongming's MacBook) 87b971a292
Some checks failed
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 9s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 9s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
branch-protection drift check / Branch protection drift (pull_request) Successful in 11s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 10s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 11s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Platform (Go) (pull_request) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 4s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Harness Replays / Harness Replays (pull_request) Failing after 46s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 49s
fix(ci): close 3 chronic Gitea-Actions workflow flakes (closes #88)
Three workflows have been failing on every push to this Gitea repo for
GitHub-shaped reasons that don't translate to act_runner. Surfaced
while landing #84; bundled per `feedback_gitea_actions_migration_audit_pattern`
("bundle per-repo, not per-finding") instead of three separate PRs.

1) handlers-postgres-integration: localhost → 127.0.0.1
   - lib/pq tries to dial localhost → ::1 first; the postgres service
     container only listens on IPv4 → ECONNREFUSED → all
     TestIntegration_* fail. Pin IPv4 to make the job deterministic.

2) pr-guards / disable-auto-merge-on-push: Gitea no-op
   - The previous reusable-workflow caller invoked `gh pr merge
     --disable-auto`, which calls GitHub's GraphQL API. Gitea returns
     HTTP 405 on /api/graphql → step always fails. Inline the step so
     it can detect Gitea (GITEA_ACTIONS=true OR repo url under
     moleculesai.app) and no-op with a notice. Auto-merge gating is
     moot on Gitea anyway: there's no `--auto` primitive being
     touched. Job stays ALWAYS-RUN so branch protection's required
     check still lands SUCCESS (avoids the SKIPPED-in-set trap from
     `feedback_branch_protection_check_name_parity`).

3) Harness Replays: cf-proxy nginx.conf via docker `configs:` (not bind)
   - act_runner runs the workflow inside a runner container; runc in
     the docker daemon below resolves bind-mount source paths on the
     OUTER host, not inside the runner. The path
     `/workspace/.../cf-proxy/nginx.conf` is invisible there → "not a
     directory" runc error. Switching to compose `configs:` packages
     the file as content rather than a host bind, sidestepping the
     DinD path-translation gap.

Local validation:
  - YAML parsed clean for all 3 files.
  - cf-proxy nginx.conf: standalone `docker compose run cf-proxy
    nginx -T` reproduced the configs: mount end-to-end and dumped the
    config correctly. The full harness compose still renders via
    `docker compose config`.

Real-CI verification will land on this branch's first push.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:06:09 -07:00

209 lines
7.6 KiB
YAML

# Production-shape harness for local E2E. Multi-tenant.
#
# Reproduces the SaaS tenant topology on localhost using the SAME
# images that ship to production:
#
# client → cf-proxy (nginx, mimics CF tunnel headers, routes by Host)
# ├─ Host: harness-tenant-alpha.localhost → tenant-alpha
# │ ↓ (CP_UPSTREAM_URL=http://cp-stub:9090)
# │ tenant-alpha (workspace-server/Dockerfile.tenant)
# │ ↓
# │ postgres-alpha (per-tenant DB, matches prod)
# ├─ Host: harness-tenant-beta.localhost → tenant-beta
# │ ↓
# │ tenant-beta + postgres-beta
# └─ cp-stub + redis (shared infra; CP is Railway-singleton in prod,
# redis is shared cluster)
#
# The two-tenant topology catches:
# - TenantGuard cross-tenant escape (alpha-org token shouldn't see
# beta-tenant data even with a valid bearer)
# - cf-proxy Host-header routing correctness
# - Per-tenant DB isolation (workspaces table, activity_logs)
# - Concurrent multi-tenant operation (no shared mutable state)
#
# Quickstart (no /etc/hosts edits — see README):
# cd tests/harness && ./up.sh && ./seed.sh
# ./replays/peer-discovery-404.sh
# ./run-all-replays.sh
#
# Env config:
# GIT_SHA — passed to BOTH tenant builds for /buildinfo verification.
# CP_STUB_PEERS_MODE — peers failure mode for replay scripts.
services:
# ─── Shared infra (matches prod: CP is Railway-singleton, redis shared) ───
redis:
image: redis:7-alpine
networks: [harness-net]
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 2s
timeout: 5s
retries: 10
cp-stub:
build:
context: ./cp-stub
environment:
PORT: "9090"
CP_STUB_PEERS_MODE: "${CP_STUB_PEERS_MODE:-}"
networks: [harness-net]
healthcheck:
test: ["CMD-SHELL", "wget -q -O- http://localhost:9090/healthz || exit 1"]
interval: 2s
timeout: 5s
retries: 10
# ─── Tenant alpha: postgres + workspace-server ────────────────────────
postgres-alpha:
image: postgres:16-alpine
environment:
POSTGRES_USER: harness
POSTGRES_PASSWORD: harness
POSTGRES_DB: molecule
networks: [harness-net]
healthcheck:
test: ["CMD-SHELL", "pg_isready -U harness"]
interval: 2s
timeout: 5s
retries: 10
tenant-alpha:
build:
context: ../..
dockerfile: workspace-server/Dockerfile.tenant
args:
GIT_SHA: "${GIT_SHA:-harness}"
depends_on:
postgres-alpha:
condition: service_healthy
redis:
condition: service_healthy
cp-stub:
condition: service_healthy
environment:
DATABASE_URL: "postgres://harness:harness@postgres-alpha:5432/molecule?sslmode=disable"
REDIS_URL: "redis://redis:6379"
PORT: "8080"
PLATFORM_URL: "http://tenant-alpha:8080"
MOLECULE_ENV: "production"
SECRETS_ENCRYPTION_KEY: "${SECRETS_ENCRYPTION_KEY:?must be set — run via tests/harness/up.sh, which generates one per run}"
ADMIN_TOKEN: "harness-admin-token-alpha"
MOLECULE_ORG_ID: "harness-org-alpha"
CP_UPSTREAM_URL: "http://cp-stub:9090"
RATE_LIMIT: "1000"
CANVAS_PROXY_URL: "http://localhost:3000"
# Memory v2 sidecar (PR #2906) bundles the plugin into the
# tenant image and starts it before the main server. The plugin
# runs `CREATE EXTENSION vector` on first boot, which fails on
# the harness's plain postgres:15-alpine (no pgvector). The
# harness doesn't exercise memory features, so disable the
# sidecar via the entrypoint's documented escape hatch.
MEMORY_PLUGIN_DISABLE: "1"
networks: [harness-net]
healthcheck:
test: ["CMD-SHELL", "wget -q -O- http://localhost:8080/health || exit 1"]
interval: 5s
timeout: 5s
retries: 20
# ─── Tenant beta: postgres + workspace-server (parallel to alpha) ─────
postgres-beta:
image: postgres:16-alpine
environment:
POSTGRES_USER: harness
POSTGRES_PASSWORD: harness
POSTGRES_DB: molecule
networks: [harness-net]
healthcheck:
test: ["CMD-SHELL", "pg_isready -U harness"]
interval: 2s
timeout: 5s
retries: 10
tenant-beta:
build:
context: ../..
dockerfile: workspace-server/Dockerfile.tenant
args:
GIT_SHA: "${GIT_SHA:-harness}"
depends_on:
postgres-beta:
condition: service_healthy
redis:
condition: service_healthy
cp-stub:
condition: service_healthy
environment:
DATABASE_URL: "postgres://harness:harness@postgres-beta:5432/molecule?sslmode=disable"
REDIS_URL: "redis://redis:6379"
PORT: "8080"
PLATFORM_URL: "http://tenant-beta:8080"
MOLECULE_ENV: "production"
SECRETS_ENCRYPTION_KEY: "${SECRETS_ENCRYPTION_KEY:?must be set — run via tests/harness/up.sh, which generates one per run}"
# Distinct ADMIN_TOKEN — replays use this to verify TenantGuard
# blocks alpha-token presented at beta's URL.
ADMIN_TOKEN: "harness-admin-token-beta"
MOLECULE_ORG_ID: "harness-org-beta"
CP_UPSTREAM_URL: "http://cp-stub:9090"
RATE_LIMIT: "1000"
CANVAS_PROXY_URL: "http://localhost:3000"
# Memory v2 sidecar (PR #2906) bundles the plugin into the
# tenant image and starts it before the main server. The plugin
# runs `CREATE EXTENSION vector` on first boot, which fails on
# the harness's plain postgres:15-alpine (no pgvector). The
# harness doesn't exercise memory features, so disable the
# sidecar via the entrypoint's documented escape hatch.
MEMORY_PLUGIN_DISABLE: "1"
networks: [harness-net]
healthcheck:
test: ["CMD-SHELL", "wget -q -O- http://localhost:8080/health || exit 1"]
interval: 5s
timeout: 5s
retries: 20
# ─── cf-proxy: routes by Host to the right tenant container ───────────
# Production shape: same single CF tunnel front-doors every tenant
# subdomain — the Host header carries the tenant identity, not the
# routing destination. Local cf-proxy mirrors this exactly.
#
# nginx.conf delivery: docker compose `configs:` block (not a bind
# mount) so the file ships as content packaged by compose, not a
# host-path bind that has to be visible to the docker daemon's runc.
# Bind mounts break under Gitea's act_runner DinD because runc
# resolves the source path on the OUTER docker host (the runner's
# host filesystem), not inside the runner container — the path
# `/workspace/.../tests/harness/cf-proxy/nginx.conf` is only visible
# to the runner, not to the daemon below it. The `configs:` form
# uploads the file to the daemon as part of the service definition
# and is bind-mount-equivalent at the container level. See issue #88
# item 2.
cf-proxy:
image: nginx:1.27-alpine
depends_on:
tenant-alpha:
condition: service_healthy
tenant-beta:
condition: service_healthy
configs:
- source: cf-proxy-nginx-conf
target: /etc/nginx/nginx.conf
mode: 0444
# Bind to 127.0.0.1 only — hardcoded ADMIN_TOKENs make 0.0.0.0
# exposure unsafe even on a local network.
ports:
- "127.0.0.1:8080:8080"
networks: [harness-net]
configs:
# Defined once at compose level so any future service (e.g. a second
# nginx variant for an external-connect smoke test) can reuse the
# same source file.
cf-proxy-nginx-conf:
file: ./cf-proxy/nginx.conf
networks:
harness-net:
name: molecule-harness-net