harness(phase-0): sudo-free Host-header path + chat_history + envelope replays

Three changes that bring the local harness from "covers what staging
covers minus the SaaS topology" to "exercises every surface we shipped
this session against the prod-shape Dockerfile.tenant image."

1. Drop the /etc/hosts requirement.

   Replays previously needed `127.0.0.1 harness-tenant.localhost` in
   /etc/hosts to resolve the cf-proxy. That gated the harness behind a
   sudo step on every fresh dev box and CI runner. The cf-proxy nginx
   already routes by Host header (matches production CF tunnel: URL is
   public, Host carries tenant identity), so the no-sudo path is to
   target loopback :8080 with `Host: harness-tenant.localhost` set as
   a header.

   New `tests/harness/_curl.sh` centralises this — curl_anon /
   curl_admin / curl_workspace / psql_exec wrappers all set the Host
   + auth headers automatically. seed.sh, peer-discovery-404.sh,
   buildinfo-stale-image.sh updated to source it. Legacy /etc/hosts
   users still work via env-var override.
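
   The no-sudo shape, sketched minimally (BASE / TENANT_HOST defaults
   match _curl.sh in this PR; the sketch only prints the command, since
   the live call needs the harness up):

   ```shell
   # Sketch of the sudo-free call shape. Printing rather than executing
   # keeps it runnable without a live harness.
   BASE="${BASE:-http://localhost:8080}"
   TENANT_HOST="${TENANT_HOST:-harness-tenant.localhost}"
   cmd="curl -sS -H 'Host: ${TENANT_HOST}' ${BASE}/health"
   echo "$cmd"
   ```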

2. Fix the seed.sh FK regression that blocked DB-side replays.

   POST /workspaces ignores any `id` in the request body and generates
   one server-side. seed.sh was minting client-side UUIDs that never
   reached the workspaces table, so any replay that INSERTed into
   activity_logs (FK-constrained on workspace_id) failed with the
   workspace-not-found error. Capture the returned id from the
   response instead.
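
   The fix, in miniature (the response body here is a stand-in; the real
   id comes back from `POST /workspaces`):

   ```shell
   # Stand-in for the POST /workspaces response — the handler generates
   # the id server-side, so the client must read it back rather than
   # mint one locally.
   resp='{"id":"0a1b2c3d-0000-4000-8000-000000000000","name":"alpha"}'
   ALPHA_ID=$(printf '%s' "$resp" \
     | python3 -c 'import json,sys; print(json.load(sys.stdin).get("id",""))')
   if [ -z "$ALPHA_ID" ] || [ "$ALPHA_ID" = "null" ]; then
     echo "no id returned" >&2; exit 1
   fi
   echo "$ALPHA_ID"
   ```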

3. Two new replays cover the surfaces shipped this session.

   chat-history.sh — exercises the full SaaS-shape wire that PR #2472
   (peer_id filter), #2474 (chat_history client tool), and #2476
   (before_ts paging) ride on. 8 phases / 16 assertions: peer_id filter,
   limit cap, before_ts paging, OR-clause covering both source_id and
   target_id, malformed peer_id 400, malformed before_ts 400, URL-encoded
   SQLi-shape rejection. Verified PASS against the live harness.
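
   One wire detail the paging phase depends on: Postgres renders
   created_at with a `+00:00` offset, and the `+` has to be
   percent-encoded before it goes back out as `before_ts`. A sketch:

   ```shell
   # Percent-encode an RFC3339 timestamp for the before_ts query param.
   # safe="" forces ':' and '+' to be encoded as well.
   TS='2026-05-01T12:00:00+00:00'
   ENC=$(printf '%s' "$TS" \
     | python3 -c 'import sys,urllib.parse; print(urllib.parse.quote(sys.stdin.read().strip(), safe=""))')
   echo "$ENC"   # → 2026-05-01T12%3A00%3A00%2B00%3A00
   ```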

   channel-envelope-trust-boundary.sh — exercises PR #2471 + #2481 by
   importing from `molecule_runtime.*` (the wheel-rewritten path), so it
   catches the case where the wheel build drops a fix that unit tests on
   local source still pass.
   5 phases / 11 assertions: malicious peer_id scrubbed from envelope,
   agent_card_url omitted on validation failure, XML-injection bytes
   scrubbed, valid UUID preserved, _agent_card_url_for direct gate.
   Verified PASS against published wheel 0.1.79.
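
   The gate the replay asserts on is, at its core, strict UUID
   validation. An illustrative stand-in (not the wheel's actual code):

   ```shell
   # Illustrative UUID gate — mimics the trust-boundary check the replay
   # asserts on; the wheel's real implementation may differ.
   is_uuid() {
     python3 -c 'import sys, uuid
   try:
       uuid.UUID(sys.argv[1])
       print("yes")
   except ValueError:
       print("no")' "$1"
   }
   is_uuid "../../foo"                              # → no
   is_uuid "11111111-2222-3333-4444-555555555555"   # → yes
   ```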

run-all-replays.sh auto-discovers — no registration needed. Full
lifecycle (boot → seed → 4 replays → teardown) runs clean.
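
The discovery contract is just a glob over `replays/` (assumed shape;
the real runner may add ordering or filtering):

```shell
# Hypothetical sketch of glob-based replay discovery.
tmp=$(mktemp -d)
touch "$tmp/chat-history.sh" "$tmp/peer-discovery-404.sh"
count=0
for replay in "$tmp"/*.sh; do
  count=$((count + 1))
done
echo "$count replays discovered"   # → 2 replays discovered
rm -rf "$tmp"
```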

Roadmap section updated to reflect Phase 1 (this PR) → Phase 2
(multi-tenant + CI gate) → Phase 3 (real CP) → Phase 4 (Miniflare +
LocalStack + traffic replay).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hongming Wang 2026-05-01 20:12:49 -07:00
parent f6a48d593e
commit 5cca462843
10 changed files with 513 additions and 60 deletions

.gitignore vendored

@@ -146,3 +146,4 @@ backups/
*-temp.txt
/test-pmm-*.txt
/tick-reflections-*.md
tests/harness/cp-stub/cp-stub

tests/harness/.gitignore vendored Normal file

@@ -0,0 +1,2 @@
# Harness ephemeral state. Re-generated by ./seed.sh on every boot.
.seed.env


@@ -1,12 +1,20 @@
# Production-shape local harness
The harness brings up the SaaS tenant topology on localhost using the
same `Dockerfile.tenant` image that ships to production. Tests run
against `http://harness-tenant.localhost:8080` and exercise the
SAME code path a real tenant takes — including TenantGuard middleware,
same `Dockerfile.tenant` image that ships to production. Tests target
the cf-proxy on `http://localhost:8080` and pass the tenant identity
via a `Host: harness-tenant.localhost` header — exactly the way
production CF tunnel routes by Host header. The cf-proxy nginx then
rewrites headers and proxies to the tenant container, exercising the
SAME code path a real tenant takes including TenantGuard middleware,
the `/cp/*` reverse proxy, the canvas reverse proxy, and a
Cloudflare-tunnel-shape header rewrite layer.
`tests/harness/_curl.sh` is the helper sourced by every replay —
provides `curl_anon`, `curl_admin`, `curl_workspace`, and `psql_exec`
wrappers that set the right Host + auth headers automatically. New
replays should source it rather than rolling their own curl.
## Why this exists
Local `go run ./cmd/server` skips:
@@ -53,15 +61,18 @@ KEEP_UP=1 ./run-all-replays.sh # leave harness up for debugging
REBUILD=1 ./run-all-replays.sh # rebuild images before booting
```
First-time setup needs an `/etc/hosts` entry so `harness-tenant.localhost`
resolves to the local cf-proxy:
No `/etc/hosts` edit required — replays use the cf-proxy's loopback
port and pass `Host: harness-tenant.localhost` as a header (`_curl.sh`
handles this automatically). This matches how production CF tunnel
routes: the URL is the public CF endpoint, the Host header carries the
per-tenant identity. Quick check:
```bash
echo "127.0.0.1 harness-tenant.localhost" | sudo tee -a /etc/hosts
curl -H "Host: harness-tenant.localhost" http://localhost:8080/health
```
(macOS resolves `*.localhost` automatically in some setups; Linux
typically does not.)
(If you have a legacy `/etc/hosts` entry from older docs, it still
works — `BASE` and `TENANT_HOST` both honor env-var overrides.)
## Replay scripts
@@ -74,6 +85,8 @@ green" — the script becomes the regression gate that closes that gap.
|--------|--------|----------------|
| `peer-discovery-404.sh` | #2397 | tool_list_peers surfaces the actual reason instead of "may be isolated" |
| `buildinfo-stale-image.sh` | #2395 | GIT_SHA reaches the binary; verify-step comparison logic works |
| `chat-history.sh` | #2472 + #2474 + #2476 | `peer_id` filter (incl. OR over source/target) + `before_ts` paging + UUID/RFC3339 trust boundary on the activity route |
| `channel-envelope-trust-boundary.sh` | #2471 + #2481 | published wheel scrubs malformed `peer_id` from the channel envelope and from `agent_card_url` (path-traversal + XML-attr injection) |
To add a new replay:
1. Drop a script under `replays/` named after the issue.
@@ -111,9 +124,7 @@ its mandate of "exercise the tenant binary in production-shape topology."
## Roadmap
- **Phase 1 (shipped):** harness + cp-stub + cf-proxy + 2 replays + `run-all-replays.sh` runner.
- **Phase 2:** convert `tests/e2e/test_api.sh` to run against the
harness instead of localhost. Make harness-based E2E a required CI
check (a workflow that invokes `run-all-replays.sh` on every PR).
- **Phase 3:** config-coherence lint that diffs harness env list
against production CP's env list, fails CI on drift.
- **Phase 1 (shipped):** harness + cp-stub + cf-proxy + 4 replays + `run-all-replays.sh` runner. No-sudo `Host`-header path via `_curl.sh`. Per-replay psql seeding for tests that need DB-side fixtures.
- **Phase 2 (in flight):** multi-tenant — second `tenant-beta` service in compose, second Postgres database, replays for cross-tenant A2A + TenantGuard isolation. Convert `tests/e2e/test_api.sh` to target the harness instead of localhost. Make harness-based E2E a required CI check (a workflow that invokes `run-all-replays.sh` on every PR via the self-hosted Mac runner).
- **Phase 3:** replace `cp-stub/` with the real `molecule-controlplane` Docker build. Add a config-coherence lint that diffs harness env list against production CP's env list and fails CI on drift.
- **Phase 4 (long-term):** Miniflare in front of cf-proxy for real CF emulation (WAF, BotID, rate-limit, cf-tunnel headers). LocalStack for the EC2 provisioner. Anonymized prod-traffic recording/replay for SaaS-scale regression detection.

tests/harness/_curl.sh Normal file

@@ -0,0 +1,82 @@
# Sourceable helper for harness replays. Centralises the
# curl-against-cf-proxy pattern so scripts don't depend on /etc/hosts.
#
# Production CF tunnel routes by Host header, not by DNS — the request
# URL is to a public CF endpoint and the Host header carries the
# per-tenant identity. We replay the same shape locally:
#
# curl -H "Host: harness-tenant.localhost" http://localhost:8080/health
#
# This matches what cf-proxy/nginx.conf already routes (`server_name
# *.localhost localhost`) and avoids the macOS /etc/hosts requirement
# that previously gated the harness behind a sudo step.
#
# Backwards-compatible: if /etc/hosts resolves harness-tenant.localhost
# (the legacy path), point BASE at it via the env-var override — the
# explicit Host header is a no-op there. New scripts SHOULD use the
# helper functions.
#
# Usage:
# HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# source "$HERE/../_curl.sh" # from replays/<name>.sh
# curl_admin "$BASE/health"
# curl_anon "$BASE/health"
# Target the cf-proxy's loopback port — the proxy front-doors every
# tenant and routes by Host header, exactly like production's CF tunnel.
: "${BASE:=http://localhost:8080}"
: "${TENANT_HOST:=harness-tenant.localhost}"
: "${ADMIN_TOKEN:=harness-admin-token}"
: "${ORG_ID:=harness-org}"
# Anonymous request — only Host header (no auth). Use for /health,
# /buildinfo, and any other route that's intentionally public.
curl_anon() {
curl -sS -H "Host: ${TENANT_HOST}" "$@"
}
# Admin-token request — full SaaS auth shape. Sets the bearer token,
# tenant org header (activates TenantGuard middleware), and a default
# JSON Content-Type. Replays admin paths exactly the way CP does in
# production, so any TenantGuard / strict-auth bug surfaces locally.
curl_admin() {
curl -sS \
-H "Host: ${TENANT_HOST}" \
-H "Authorization: Bearer ${ADMIN_TOKEN}" \
-H "X-Molecule-Org-Id: ${ORG_ID}" \
-H "Content-Type: application/json" \
"$@"
}
# Workspace-scoped request — uses a per-workspace bearer minted from
# /admin/workspaces/:id/test-token. The platform's auth.go middleware
# accepts this bearer for the workspace's own routes, so this is the
# right shape for replays that exercise an in-workspace tool calling
# back to the platform (chat_history, list_peers, etc).
#
# Caller must export WORKSPACE_TOKEN before invoking.
curl_workspace() {
: "${WORKSPACE_TOKEN:?WORKSPACE_TOKEN must be set — mint via /admin/workspaces/:id/test-token}"
curl -sS \
-H "Host: ${TENANT_HOST}" \
-H "Authorization: Bearer ${WORKSPACE_TOKEN}" \
-H "X-Molecule-Org-Id: ${ORG_ID}" \
-H "Content-Type: application/json" \
"$@"
}
# Direct postgres exec — for replays that need to seed activity_logs
# rows or read DB state that has no public HTTP route. Wraps the
# `docker compose exec` pattern so replays can stay shell-only.
#
# SECRETS_ENCRYPTION_KEY is set to a placeholder so compose's `:?must
# be set` interpolation guard (which gates running the harness without
# up.sh) doesn't trip on `exec` — exec only reaches an already-running
# service so the env var is irrelevant, but compose still validates
# the file. The placeholder is never written anywhere or used by any
# service.
psql_exec() {
SECRETS_ENCRYPTION_KEY="${SECRETS_ENCRYPTION_KEY:-exec-placeholder}" \
docker compose -f "${HARNESS_COMPOSE:-$(dirname "${BASH_SOURCE[0]}")/compose.yml}" \
exec -T postgres \
psql -U harness -d molecule -At "$@"
}


@@ -22,12 +22,12 @@
set -euo pipefail
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
HARNESS_ROOT="$(dirname "$HERE")"
BASE="${BASE:-http://harness-tenant.localhost:8080}"
# shellcheck source=../_curl.sh
source "$HARNESS_ROOT/_curl.sh"
# 1. Confirm /buildinfo wire shape — same shape the workflow's jq lookup expects.
echo "[replay] curl $BASE/buildinfo ..."
BUILD_JSON=$(curl -sS "$BASE/buildinfo")
BUILD_JSON=$(curl_anon "$BASE/buildinfo")
echo "[replay] $BUILD_JSON"
ACTUAL_SHA=$(echo "$BUILD_JSON" | jq -r '.git_sha // ""')


@@ -0,0 +1,182 @@
#!/usr/bin/env bash
# Replay for the channel envelope peer_id trust-boundary fix
# (PR #2481, follow-up to PR #2471). Verifies that the PUBLISHED wheel
# installed on this machine — not local source — gates malformed peer_id
# at both the envelope builder and the agent_card_url builder.
#
# Why this matters:
# - Unit tests in workspace/tests/ run against local source. They
# prove the fix works in source. They DO NOT prove the published
# wheel contains the fix.
# - The wheel rewriter (scripts/build_runtime_package.py) renames
# symbols + paths. Any rewrite drift could silently strip the
# guard from the shipped artifact.
# - This replay imports from `molecule_runtime.a2a_mcp_server` (the
# wheel-rewritten path), exercises the actual published code, and
# asserts the envelope shape. If the wheel build ever ships without
# the guard, this fails — even if unit tests on local source pass.
#
# Phases:
# A. Confirm an installed molecule-runtime version that contains the
# #2481 fix (>= 0.1.78).
# B. Call `_build_channel_notification` with peer_id="../../foo" and
# assert (1) meta["peer_id"] == "", (2) no agent_card_url field,
# (3) no peer_name/peer_role.
# C. Symmetric case: peer_id with embedded XML-attribute injection
# bytes — assert the same scrubbing.
# D. Happy path: a valid UUID peer_id is preserved (proves we didn't
# regress legitimate enrichment).
# E. Direct check on the URL builder — `_agent_card_url_for("../../foo")`
# must return "" and never an unsanitised URL.
set -euo pipefail
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
HARNESS_ROOT="$(dirname "$HERE")"
cd "$HARNESS_ROOT"
# shellcheck source=../_curl.sh
source "$HARNESS_ROOT/_curl.sh"
PASS=0
FAIL=0
assert() {
local desc="$1" expected="$2" actual="$3"
if [ "$expected" = "$actual" ]; then
printf " PASS %s\n" "$desc"
PASS=$((PASS + 1))
else
printf " FAIL %s\n expected: %s\n got : %s\n" "$desc" "$expected" "$actual" >&2
FAIL=$((FAIL + 1))
fi
}
# ─── Phase A: wheel version contains the fix ───────────────────────────
echo "[replay] A. confirming installed molecule-ai-workspace-runtime contains #2481..."
INSTALLED=$(pip3 show molecule-ai-workspace-runtime 2>/dev/null | awk -F': ' '/^Version:/ {print $2}')
if [ -z "$INSTALLED" ]; then
echo "[replay] FAIL A: molecule-ai-workspace-runtime not installed."
echo " Install: pip3 install molecule-ai-workspace-runtime"
exit 2
fi
echo "[replay] installed version: $INSTALLED"
# 0.1.78 is the first published version after #2481 merged to staging.
# Compare via packaging.version (PEP 440 semantics), which handles
# patch bumps without sed-fragility.
HAS_FIX=$(python3 -c "
from packaging.version import parse
print('yes' if parse('$INSTALLED') >= parse('0.1.78') else 'no')
" 2>/dev/null || echo "unknown")
if [ "$HAS_FIX" != "yes" ]; then
echo "[replay] FAIL A: installed $INSTALLED < 0.1.78 (the version that shipped the #2481 fix)."
echo " Upgrade: pip3 install --upgrade molecule-ai-workspace-runtime"
exit 2
fi
echo "[replay] ✓ contains #2481 trust-boundary fix"
# ─── Phase B-E: in-process assertions against the installed wheel ──────
# We don't need WORKSPACE_ID/PLATFORM_URL/MOLECULE_WORKSPACE_TOKEN to
# import the module — the env validation only fires at console-script
# entry. We use molecule_runtime.* (the wheel-rewritten import path)
# rather than workspace.a2a_mcp_server (local source) so this exercises
# the SHIPPED code.
echo ""
echo "[replay] B-E. exercising _build_channel_notification + _agent_card_url_for from the installed wheel..."
OUT=$(WORKSPACE_ID=00000000-0000-0000-0000-000000000000 \
PLATFORM_URL=http://localhost:8080 \
MOLECULE_WORKSPACE_TOKEN=stub \
MOLECULE_MCP_DISABLE_HEARTBEAT=1 \
python3 - <<'PYEOF'
import json
import sys
from molecule_runtime.a2a_mcp_server import _build_channel_notification
from molecule_runtime.a2a_client import _agent_card_url_for
results = []
def emit(name, value):
results.append({"name": name, "value": value})
# ── B: path-traversal peer_id stripped from envelope ──
payload = _build_channel_notification({
"peer_id": "../../foo",
"kind": "peer_agent",
"text": "redirect-attempt",
"activity_id": "act-1",
"method": "message/send",
"created_at": "2026-05-01T00:00:00Z",
})
meta = payload["params"]["meta"]
emit("B1_peer_id_scrubbed", meta.get("peer_id", "<missing>"))
emit("B2_agent_card_url_absent", "absent" if "agent_card_url" not in meta else meta["agent_card_url"])
emit("B3_peer_name_absent", "absent" if "peer_name" not in meta else meta["peer_name"])
emit("B4_peer_role_absent", "absent" if "peer_role" not in meta else meta["peer_role"])
# ── C: XML-attribute-injection-shape peer_id ──
payload = _build_channel_notification({
"peer_id": 'aaa" onclick="alert(1)',
"kind": "peer_agent",
"text": "xss",
})
meta = payload["params"]["meta"]
emit("C1_peer_id_scrubbed", meta.get("peer_id", "<missing>"))
emit("C2_agent_card_url_absent", "absent" if "agent_card_url" not in meta else "leaked")
# ── D: legitimate UUID is preserved ──
valid_uuid = "11111111-2222-3333-4444-555555555555"
payload = _build_channel_notification({
"peer_id": valid_uuid,
"kind": "peer_agent",
"text": "legit",
})
meta = payload["params"]["meta"]
emit("D1_peer_id_preserved", meta.get("peer_id", "<missing>"))
# agent_card_url IS present (we don't gate the URL itself on whether the registry is reachable)
emit("D2_agent_card_url_present", "yes" if meta.get("agent_card_url", "").endswith(valid_uuid) else "no")
# ── E: direct URL builder gate ──
emit("E1_url_builder_strips_traversal", _agent_card_url_for("../../foo"))
emit("E2_url_builder_strips_xml", _agent_card_url_for('a" onclick="x'))
emit("E3_url_builder_accepts_uuid_endswith", "yes" if _agent_card_url_for(valid_uuid).endswith(valid_uuid) else "no")
print(json.dumps(results))
PYEOF
)
# Parse and assert each result.
echo "$OUT" | python3 -c "
import json, sys
results = json.loads(sys.stdin.read())
for r in results:
print(f\"{r['name']}={r['value']}\")
" > /tmp/cha-envelope-results.txt
while IFS='=' read -r key value; do
case "$key" in
B1_peer_id_scrubbed) assert "B1: malicious peer_id scrubbed to \"\"" "" "$value" ;;
B2_agent_card_url_absent) assert "B2: agent_card_url not emitted" "absent" "$value" ;;
B3_peer_name_absent) assert "B3: peer_name not enriched" "absent" "$value" ;;
B4_peer_role_absent) assert "B4: peer_role not enriched" "absent" "$value" ;;
C1_peer_id_scrubbed) assert "C1: XML-injection peer_id scrubbed" "" "$value" ;;
C2_agent_card_url_absent) assert "C2: XML-injection URL not emitted" "absent" "$value" ;;
D1_peer_id_preserved) assert "D1: valid UUID peer_id preserved" "11111111-2222-3333-4444-555555555555" "$value" ;;
D2_agent_card_url_present) assert "D2: agent_card_url present for valid id" "yes" "$value" ;;
E1_url_builder_strips_traversal) assert "E1: _agent_card_url_for(\"../../foo\") returns \"\"" "" "$value" ;;
E2_url_builder_strips_xml) assert "E2: _agent_card_url_for(XML-injection) returns \"\"" "" "$value" ;;
E3_url_builder_accepts_uuid_endswith) assert "E3: _agent_card_url_for(valid uuid) builds canonical URL" "yes" "$value" ;;
esac
done < /tmp/cha-envelope-results.txt
echo ""
if [ "$FAIL" -gt 0 ]; then
echo "[replay] FAIL: $PASS pass, $FAIL fail"
echo ""
echo "[replay] If B/C/E failed: the published wheel does NOT contain the #2481 fix."
echo "[replay] Likely causes:"
echo " - Wheel rewriter dropped _validate_peer_id from molecule_runtime.a2a_client"
echo " - publish-runtime.yml regressed to a SHA before #2481 (check pip install version)"
exit 1
fi
echo "[replay] PASS: $PASS/$PASS — channel envelope peer_id trust boundary holds in published wheel $INSTALLED"


@@ -0,0 +1,175 @@
#!/usr/bin/env bash
# Replay for the chat_history MCP tool — exercises the full SaaS-shape
# wire that PRs #2472 (peer_id filter), #2474 (chat_history client), and
# #2476 (before_ts paging) ride on. Runs against the prod-shape tenant
# image, not unit-mock'd handlers, so any drift between the Go handler
# and the Python tool's expectations surfaces here.
#
# What this catches that unit tests don't:
# - Real Postgres planner behaviour on the (source_id = $X OR target_id = $X)
# OR clause (issue #2478 — both indexes missing).
# - cf-proxy header rewrites + TenantGuard middleware in the path.
# - lib/pq + Postgres driver type binding for time.Time parameters.
# - JSON encoding of created_at across the wire (timezone, precision).
#
# Phases:
# A. Seed three a2a_receive rows for alpha with peer_id=beta, spread
# across distinct timestamps.
# B. Basic peer_id filter: GET ?type=a2a_receive&peer_id=beta&limit=10
# → assert 3 rows DESC.
# C. Limit cap: limit=2 → assert 2 newest rows.
# D. before_ts paging: take the newest row's created_at, GET with
#    before_ts=that → assert the 2 strictly-older rows.
# E. OR clause (target side): seed an a2a_send row where source=alpha,
# target=beta. GET with type unset, peer_id=beta → assert that row
# surfaces too (target_id match, not just source_id).
# F. Trust-boundary: peer_id="not-a-uuid" → 400 + "peer_id must be a UUID".
# G. Trust-boundary: before_ts="garbage" → 400 + RFC3339 example.
# H. URL-encoded SQL-injection-shape peer_id → 400 (matches activity_test.go's
# malicious-peer-id panel).
set -euo pipefail
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
HARNESS_ROOT="$(dirname "$HERE")"
cd "$HARNESS_ROOT"
if [ ! -f .seed.env ]; then
echo "[replay] no .seed.env — running ./seed.sh first..."
./seed.sh
fi
# shellcheck source=/dev/null
source .seed.env
# shellcheck source=../_curl.sh
source "$HARNESS_ROOT/_curl.sh"
PASS=0
FAIL=0
assert() {
local desc="$1" expected="$2" actual="$3"
if [ "$expected" = "$actual" ]; then
printf " PASS %s\n" "$desc"
PASS=$((PASS + 1))
else
printf " FAIL %s\n expected: %s\n got : %s\n" "$desc" "$expected" "$actual" >&2
FAIL=$((FAIL + 1))
fi
}
assert_contains() {
local desc="$1" needle="$2" haystack="$3"
if echo "$haystack" | grep -qF "$needle"; then
printf " PASS %s\n" "$desc"
PASS=$((PASS + 1))
else
printf " FAIL %s\n expected to contain: %s\n got: %s\n" "$desc" "$needle" "$haystack" >&2
FAIL=$((FAIL + 1))
fi
}
echo "[replay] alpha=$ALPHA_ID beta=$BETA_ID"
# ─── Phase A: seed the activity_logs table ─────────────────────────────
# Inserted via psql so the seed is independent of the platform's HTTP
# notify path — that path itself flows through the same handler chain
# we want to test, and seeding through it would conflate setup and
# assertion.
echo ""
echo "[replay] A. seeding 3 a2a_receive rows for alpha←beta at distinct timestamps..."
psql_exec >/dev/null <<SQL
DELETE FROM activity_logs WHERE workspace_id = '$ALPHA_ID';
INSERT INTO activity_logs (workspace_id, activity_type, source_id, target_id, method, summary, created_at)
VALUES
('$ALPHA_ID', 'a2a_receive', '$BETA_ID', '$ALPHA_ID', 'message/send', 'oldest from beta', NOW() - INTERVAL '4 hours'),
('$ALPHA_ID', 'a2a_receive', '$BETA_ID', '$ALPHA_ID', 'message/send', 'middle from beta', NOW() - INTERVAL '2 hours'),
('$ALPHA_ID', 'a2a_receive', '$BETA_ID', '$ALPHA_ID', 'message/send', 'newest from beta', NOW() - INTERVAL '1 hour');
SQL
echo "[replay] inserted 3 rows"
# ─── Phase B: basic peer_id filter ─────────────────────────────────────
echo ""
echo "[replay] B. GET ?type=a2a_receive&peer_id=beta&limit=10 ..."
RESP=$(curl_admin "$BASE/workspaces/$ALPHA_ID/activity?type=a2a_receive&peer_id=$BETA_ID&limit=10")
COUNT=$(echo "$RESP" | jq 'length')
assert "B1: returns 3 rows" "3" "$COUNT"
# DESC order — newest first
NEWEST_SUMMARY=$(echo "$RESP" | jq -r '.[0].summary')
assert "B2: newest first (DESC ordering)" "newest from beta" "$NEWEST_SUMMARY"
OLDEST_SUMMARY=$(echo "$RESP" | jq -r '.[2].summary')
assert "B3: oldest last" "oldest from beta" "$OLDEST_SUMMARY"
# ─── Phase C: limit cap ────────────────────────────────────────────────
echo ""
echo "[replay] C. limit=2 (expecting 2 newest) ..."
RESP=$(curl_admin "$BASE/workspaces/$ALPHA_ID/activity?type=a2a_receive&peer_id=$BETA_ID&limit=2")
assert "C1: limit clamps to 2" "2" "$(echo "$RESP" | jq 'length')"
assert "C2: kept newest" "newest from beta" "$(echo "$RESP" | jq -r '.[0].summary')"
assert "C3: kept middle" "middle from beta" "$(echo "$RESP" | jq -r '.[1].summary')"
# ─── Phase D: before_ts paging ─────────────────────────────────────────
echo ""
echo "[replay] D. before_ts paging — walk backwards from newest row's created_at ..."
# Take the newest row's created_at, page from there.
NEWEST_TS=$(curl_admin "$BASE/workspaces/$ALPHA_ID/activity?type=a2a_receive&peer_id=$BETA_ID&limit=1" \
| jq -r '.[0].created_at')
# RFC3339 with timezone — Go's time.Parse(RFC3339) handles `2026-...Z` AND
# `2026-...+00:00`. Postgres returns the latter; URL-encode the +.
NEWEST_TS_ENCODED=$(echo "$NEWEST_TS" | python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.stdin.read().strip(), safe=""))')
RESP=$(curl_admin "$BASE/workspaces/$ALPHA_ID/activity?type=a2a_receive&peer_id=$BETA_ID&before_ts=$NEWEST_TS_ENCODED&limit=10")
assert "D1: 2 rows older than newest" "2" "$(echo "$RESP" | jq 'length')"
assert "D2: middle is now newest in the slice" "middle from beta" "$(echo "$RESP" | jq -r '.[0].summary')"
# Strict less-than — the row at exactly NEWEST_TS must NOT come back.
NOT_INCLUDED=$(echo "$RESP" | jq -r '[.[].summary] | index("newest from beta") // "absent"')
assert "D3: strictly older — newest excluded" "absent" "$NOT_INCLUDED"
# ─── Phase E: OR clause covers target_id direction ─────────────────────
echo ""
echo "[replay] E. OR clause: seed an a2a_send row (alpha→beta) and confirm it surfaces ..."
psql_exec >/dev/null <<SQL
INSERT INTO activity_logs (workspace_id, activity_type, source_id, target_id, method, summary, created_at)
VALUES ('$ALPHA_ID', 'a2a_send', '$ALPHA_ID', '$BETA_ID', 'message/send', 'sent to beta', NOW());
SQL
# No type filter — we want both a2a_receive AND a2a_send rows back.
RESP=$(curl_admin "$BASE/workspaces/$ALPHA_ID/activity?peer_id=$BETA_ID&limit=10")
HAS_SENT=$(echo "$RESP" | jq '[.[].summary] | any(. == "sent to beta")')
assert "E1: a2a_send (alpha→beta) returned via target_id match" "true" "$HAS_SENT"
TOTAL=$(echo "$RESP" | jq 'length')
assert "E2: total = 4 (3 receives + 1 send)" "4" "$TOTAL"
# ─── Phase F: malformed peer_id → 400 ──────────────────────────────────
echo ""
echo "[replay] F. malformed peer_id → 400 ..."
HTTP_CODE=$(curl_admin -o /tmp/cha-bad-peer.json -w '%{http_code}' \
"$BASE/workspaces/$ALPHA_ID/activity?type=a2a_receive&peer_id=not-a-uuid")
assert "F1: HTTP 400" "400" "$HTTP_CODE"
assert_contains "F2: error names the param" "peer_id must be a UUID" "$(cat /tmp/cha-bad-peer.json)"
# ─── Phase G: malformed before_ts → 400 ────────────────────────────────
echo ""
echo "[replay] G. malformed before_ts → 400 ..."
HTTP_CODE=$(curl_admin -o /tmp/cha-bad-ts.json -w '%{http_code}' \
"$BASE/workspaces/$ALPHA_ID/activity?type=a2a_receive&before_ts=garbage")
assert "G1: HTTP 400" "400" "$HTTP_CODE"
assert_contains "G2: error mentions RFC3339" "RFC3339" "$(cat /tmp/cha-bad-ts.json)"
# ─── Phase H: SQL-injection-shape peer_id is rejected ──────────────────
echo ""
echo "[replay] H. URL-encoded SQLi-shape peer_id → 400 ..."
SQLI_ENCODED="%27%20OR%201%3D1%20--" # ' OR 1=1 --
HTTP_CODE=$(curl_admin -o /tmp/cha-sqli.json -w '%{http_code}' \
"$BASE/workspaces/$ALPHA_ID/activity?type=a2a_receive&peer_id=$SQLI_ENCODED")
assert "H1: HTTP 400 (UUID validation rejects before SQL builder sees it)" "400" "$HTTP_CODE"
# ─── Cleanup: tear down seeded rows so subsequent runs don't accumulate ─
psql_exec >/dev/null <<SQL
DELETE FROM activity_logs WHERE workspace_id = '$ALPHA_ID';
SQL
echo ""
if [ "$FAIL" -gt 0 ]; then
echo "[replay] FAIL: $PASS pass, $FAIL fail"
exit 1
fi
echo "[replay] PASS: $PASS/$PASS — chat_history wire (peer_id filter + before_ts paging + trust boundary + OR clause)"


@@ -36,17 +36,13 @@ if [ ! -f .seed.env ]; then
fi
# shellcheck source=/dev/null
source .seed.env
BASE="${BASE:-http://harness-tenant.localhost:8080}"
ADMIN="harness-admin-token"
ORG="harness-org"
# shellcheck source=../_curl.sh
source "$HARNESS_ROOT/_curl.sh"
# ─── (a) WIRE: tenant returns 404 for an unregistered workspace ────────
ROGUE_ID="$(uuidgen | tr '[:upper:]' '[:lower:]')"
echo "[replay] (a) WIRE: querying /registry/$ROGUE_ID/peers (unregistered workspace)..."
HTTP_CODE=$(curl -sS -o /tmp/peer-replay.json -w '%{http_code}' \
-H "Authorization: Bearer $ADMIN" \
-H "X-Molecule-Org-Id: $ORG" \
HTTP_CODE=$(curl_admin -o /tmp/peer-replay.json -w '%{http_code}' \
-H "X-Workspace-ID: $ROGUE_ID" \
"$BASE/registry/$ROGUE_ID/peers")


@@ -5,52 +5,53 @@
# - "alpha" parent (tier 0)
# - "beta" child of alpha (tier 1)
#
# Both register via the platform's /registry/register endpoint, which
# is what real workspaces do at boot. The platform then has them in its
# DB; tool_list_peers from inside alpha can resolve beta as a peer.
# Both register via the platform's /workspaces endpoint, which is what
# CP does at provision time. The platform then has them in its DB;
# tool_list_peers from inside alpha can resolve beta as a peer.
set -euo pipefail
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$HERE"
BASE="${BASE:-http://harness-tenant.localhost:8080}"
ADMIN="harness-admin-token"
ORG="harness-org"
curl_admin() {
curl -sS -H "Authorization: Bearer $ADMIN" \
-H "X-Molecule-Org-Id: $ORG" \
-H "Content-Type: application/json" "$@"
}
# shellcheck source=_curl.sh
source "$HERE/_curl.sh"
echo "[seed] confirming tenant is reachable via cf-proxy..."
HEALTH=$(curl -sS "$BASE/health" || echo "")
HEALTH=$(curl_anon "$BASE/health" || echo "")
if [ -z "$HEALTH" ]; then
echo "[seed] FAILED: $BASE/health unreachable. Did ./up.sh complete? Did you add"
echo " 127.0.0.1 harness-tenant.localhost to /etc/hosts?"
echo "[seed] FAILED: $BASE/health unreachable. Did ./up.sh complete?"
exit 1
fi
echo "[seed] $HEALTH"
echo "[seed] confirming /buildinfo returns the harness GIT_SHA..."
BUILD=$(curl -sS "$BASE/buildinfo" || echo "")
BUILD=$(curl_anon "$BASE/buildinfo" || echo "")
echo "[seed] $BUILD"
# Mint a fresh admin-call workspace ID for the parent. Platform's
# /admin/workspaces/:id/test-token mints a per-workspace bearer; the
# replay scripts use it to call the workspace-scoped routes.
# Create alpha (parent) and beta (child of alpha). The handler always
# generates the workspace id server-side and ignores any id in the
# request body, so we capture the returned id rather than minting one
# locally — older versions of this script minted client-side and would
# silently desync from the workspaces table, breaking FK-dependent
# replays (chat-history seeds activity_logs which has a FK to workspaces).
echo "[seed] creating workspace 'alpha' (parent)..."
ALPHA_ID=$(uuidgen | tr '[:upper:]' '[:lower:]')
curl_admin -X POST "$BASE/workspaces" \
-d "{\"id\":\"$ALPHA_ID\",\"name\":\"alpha\",\"tier\":0,\"runtime\":\"langgraph\"}" \
>/dev/null
ALPHA_ID=$(curl_admin -X POST "$BASE/workspaces" \
-d '{"name":"alpha","tier":0,"runtime":"langgraph"}' \
| jq -r '.id')
if [ -z "$ALPHA_ID" ] || [ "$ALPHA_ID" = "null" ]; then
echo "[seed] FAIL: alpha workspace creation returned no id"
exit 1
fi
echo "[seed] alpha id=$ALPHA_ID"
echo "[seed] creating workspace 'beta' (child of alpha)..."
BETA_ID=$(uuidgen | tr '[:upper:]' '[:lower:]')
curl_admin -X POST "$BASE/workspaces" \
-d "{\"id\":\"$BETA_ID\",\"name\":\"beta\",\"tier\":1,\"parent_id\":\"$ALPHA_ID\",\"runtime\":\"langgraph\"}" \
>/dev/null
BETA_ID=$(curl_admin -X POST "$BASE/workspaces" \
-d "{\"name\":\"beta\",\"tier\":1,\"parent_id\":\"$ALPHA_ID\",\"runtime\":\"langgraph\"}" \
| jq -r '.id')
if [ -z "$BETA_ID" ] || [ "$BETA_ID" = "null" ]; then
echo "[seed] FAIL: beta workspace creation returned no id"
exit 1
fi
echo "[seed] beta id=$BETA_ID"
# Stash IDs so replay scripts pick them up.


@@ -41,15 +41,18 @@ fi
echo "[harness] starting cp-stub + postgres + redis + tenant + cf-proxy ..."
docker compose -f compose.yml up -d --wait
echo "[harness] /etc/hosts entry for harness-tenant.localhost..."
if ! grep -q '^127\.0\.0\.1[[:space:]]\+harness-tenant\.localhost' /etc/hosts; then
echo " (skip — your /etc/hosts may not resolve *.localhost. If tests fail with"
echo " 'getaddrinfo' errors, add: 127.0.0.1 harness-tenant.localhost)"
fi
# Sudo-free reachability: cf-proxy/nginx routes by Host header (matches
# production CF tunnel), so replays target loopback :8080 with a Host
# header rather than depending on /etc/hosts resolution. _curl.sh
# centralises this. Legacy /etc/hosts users still work — the BASE env
# var override accepts either shape.
echo ""
echo "[harness] up. Tenant: http://harness-tenant.localhost:8080/health"
echo " http://harness-tenant.localhost:8080/buildinfo"
echo " cp-stub: http://localhost (internal-only via compose net)"
echo "[harness] up."
echo " Tenant via cf-proxy: http://localhost:8080/health"
echo " (Host: harness-tenant.localhost)"
echo " cp-stub: internal-only via compose net"
echo ""
echo " Quick check:"
echo " curl -H 'Host: harness-tenant.localhost' http://localhost:8080/health"
echo ""
echo "Next: ./seed.sh # mint admin token + register sample workspaces"