test(e2e): forbid dev token path in staging peer visibility #1650

Merged
hongming merged 1 commits from fix/staging-token-diagnostic into main 2026-05-21 20:26:34 +00:00

What

Removes the dev-only /admin/workspaces/:id/test-token fallback from staging peer-visibility E2E and adds a focused staging token diagnostic for hermes vs claude-code.

Why

Staging/production E2E must prove production behavior. /admin/workspaces/:id/test-token is intentionally disabled in production-like envs, so staging tests must use the production-safe admin route POST /admin/workspaces/:id/tokens or fail.
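The route rule above can be sketched as a small bash helper. This is an illustrative sketch, not the code in the PR: the function names `token_mint_url`, `assert_production_safe`, and `mint_token` are hypothetical, and the `Authorization: Bearer` header scheme is an assumption about how the admin route is authenticated.

```shell
# Hypothetical helpers (not from the PR): build the production-safe
# token-mint URL and refuse the dev-only test-token route.

# token_mint_url BASE_URL WORKSPACE_ID -> prints the admin tokens URL.
token_mint_url() {
  local base="${1%/}" ws_id="$2"   # drop one trailing slash from the base
  printf '%s/admin/workspaces/%s/tokens\n' "$base" "$ws_id"
}

# assert_production_safe URL -> non-zero if the URL targets test-token.
assert_production_safe() {
  case "$1" in
    */test-token|*/test-token\?*)
      echo "refusing dev-only route: $1" >&2
      return 1 ;;
  esac
  return 0
}

# mint_token BASE_URL WORKSPACE_ID ADMIN_TOKEN -> POSTs to the admin route.
# Assumption: the route accepts a bearer token in the Authorization header.
mint_token() {
  local url
  url="$(token_mint_url "$1" "$2")"
  assert_production_safe "$url" || return 1
  # --fail turns an HTTP error such as 404 into a non-zero exit status,
  # so there is no silent fallback to another route.
  curl --fail --silent --show-error -X POST \
    -H "Authorization: Bearer $3" "$url"
}
```

With this shape, a 404 from a stale image surfaces as a hard failure rather than triggering a fallback to the dev-only route.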

The diagnostic run also proved the original failure was not Hermes-specific: both Hermes and Claude Code failed on the stale staging tenant image, then both passed after staging CP was pointed at platform-tenant:staging-latest.

Brief-falsification log

[H1] Hermes-specific token issue.
Verification: Ran PV_RUNTIMES='hermes claude-code' bash tests/e2e/test_peer_visibility_token_mint_staging.sh against staging.
Result: falsified. Both Hermes and Claude Code failed when staging pulled stale build a93c4ce.

[H2] Shared auth route missing because tenant image was stale.
Verification: Fresh tenant buildinfo showed a93c4ce, which predates 07457ad fix(core): add admin workspace token mint route; staging ECR latest pointed to staging-a93c4ce while staging-latest pointed to current c3806cd.
Result: supported. After switching Railway staging TENANT_IMAGE/STAGING_TENANT_IMAGE to :staging-latest and redeploying CP, fresh tenant buildinfo showed c3806cd and token diagnostic passed for both Hermes and Claude Code.
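The H2 check compares the commit reported by tenant buildinfo against the commit that introduced the route. A minimal sketch of that comparison follows; the helper names `same_commit` and `image_has_fix` are hypothetical, and how the buildinfo SHA is fetched is left out because the source does not show it.

```shell
# Hypothetical helpers (not from the PR) for reasoning about image drift.

# same_commit A B -> succeeds when one SHA is a prefix of the other,
# so a short SHA such as c3806cd matches its full-length form.
same_commit() {
  local a="$1" b="$2"
  [ -n "$a" ] && [ -n "$b" ] || return 1
  case "$a" in "$b"*) return 0 ;; esac
  case "$b" in "$a"*) return 0 ;; esac
  return 1
}

# image_has_fix FIX_SHA BUILD_SHA -> succeeds when FIX_SHA is an ancestor
# of BUILD_SHA. Must be run inside a checkout that contains both commits.
image_has_fix() {
  git merge-base --is-ancestor "$1" "$2"
}
```

Run inside the repository, `image_has_fix 07457ad a93c4ce` would be expected to fail and `image_has_fix 07457ad c3806cd` to succeed, given the history described above.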

Comprehensive testing performed

  • bash -n tests/e2e/test_peer_visibility_mcp_staging.sh && bash -n tests/e2e/test_peer_visibility_token_mint_staging.sh && bash -n tests/e2e/test_peer_visibility_mcp_local.sh
  • if rg -n '/admin/workspaces/.*/test-token|test-token' tests/e2e/test_*staging*.sh; then exit 1; fi -> no matches
  • Live staging diagnostic before env fix: tenant buildinfo a93c4ce, both hermes and claude-code failed POST /admin/workspaces/:id/tokens with Next.js 404
  • Live staging diagnostic after env fix + CP redeploy: tenant buildinfo c3806cd, token diagnostic passed for hermes claude-code
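The `rg` guard in the second bullet can be sketched as a reusable function. This version swaps in plain `grep` so it runs where ripgrep is not installed; the function name `no_dev_token_in_staging` is hypothetical, and the directory argument stands in for `tests/e2e`.

```shell
# Hypothetical guard (not from the PR): fail when any staging E2E script
# under DIR mentions the dev-only test-token route.
no_dev_token_in_staging() {
  local dir="$1" f found=0
  for f in "$dir"/test_*staging*.sh; do
    [ -e "$f" ] || continue            # the glob matched nothing
    # -H prints the file name, -n the line number of each match.
    if grep -Hn -- 'test-token' "$f"; then
      found=1
    fi
  done
  [ "$found" -eq 0 ]
}
```

A call such as `no_dev_token_in_staging tests/e2e || exit 1` prints every offending line before failing, which makes a regression easy to locate in CI output.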

Local-postgres E2E run

N/A for this PR: it changes only the staging E2E harness and adds a live staging diagnostic. Local peer-visibility script syntax was checked; no local platform behavior changed.

Staging-smoke verified or pending

Verified with PV_RUNTIMES='hermes claude-code' bash tests/e2e/test_peer_visibility_token_mint_staging.sh against https://staging-api.moleculesai.app. Result: passed after staging CP env moved to platform-tenant:staging-latest.

Root-cause not symptom

Root cause was staging tenant image tag drift: fresh staging tenants used stale ECR tag latest (a93c4ce) instead of SSOT-published staging-latest (c3806cd), so the production-safe admin token route did not exist in the tenant image.

Five-Axis review walked

  • Correctness: staging tests now use only production-safe token route and classify all requested runtimes before failing.
  • Readability: diagnostic wrapper is intentionally thin.
  • Architecture: keeps the full MCP assertion shared; only adds a stop-after-token mode.
  • Security: removes dev-only test-token use from staging.
  • Performance: diagnostic mode exits before workspace online/MCP checks.
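"Classify all requested runtimes before failing" can be sketched as a loop that records each runtime's result and only then decides the exit status. This is an assumed shape, not the script's actual code: `classify_runtimes` and the mint callback it takes are hypothetical names.

```shell
# Hypothetical sketch (not from the PR): try every runtime, report each
# one, and return non-zero only after all of them have been classified.
# Usage: classify_runtimes MINT_FN RUNTIME...
classify_runtimes() {
  local mint_fn="$1"; shift
  local rt failed=""
  for rt in "$@"; do
    if "$mint_fn" "$rt" >/dev/null 2>&1; then
      echo "ok   $rt"
    else
      echo "FAIL $rt"
      failed="$failed $rt"      # remember the failure, keep going
    fi
  done
  [ -z "$failed" ]              # succeed only if nothing failed
}
```

Because the loop does not stop at the first failure, a route missing for every runtime shows up as a failure on each line rather than being attributed to whichever runtime happened to run first.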

No backwards-compat shim / dead code added

Yes. No compatibility shim; the dev-only fallback was removed from staging tests.

Memory/saved-feedback consulted

No task-specific saved memory was used for this change. Applied current AGENTS/SOP and live Gitea/Railway/ECR evidence.

Verification

  • bash -n tests/e2e/test_peer_visibility_mcp_staging.sh && bash -n tests/e2e/test_peer_visibility_token_mint_staging.sh && bash -n tests/e2e/test_peer_visibility_mcp_local.sh -> staging-no-dev-token-ok
  • Staging live token diagnostic after env fix -> ✅ token diagnostic passed for runtimes: hermes claude-code

Coverage ledger

| file/function | branch/condition | test name | red/green evidence | drift caught |
|---|---|---|---|---|
| tests/e2e/test_peer_visibility_mcp_staging.sh | managed runtime create returns no auth_token; admin route must mint | test_peer_visibility_token_mint_staging.sh | Red: both Hermes and Claude Code got Next.js 404 on stale image; Green: both minted tokens on c3806cd | Stale tenant image missing production-safe admin token route |
| .gitea/workflows/e2e-peer-visibility.yml::pr-validate | staging script references dev-only test-token | PR validation grep | Green: rg found no test-token in test_*staging*.sh | Dev-only token fallback drifting back into staging E2E |
| tests/e2e/test_peer_visibility_token_mint_staging.sh | runtime classification before full MCP wait | live staging diagnostic | Green: reports Hermes and Claude Code token acquisition independently | False attribution of shared auth route failures to Hermes |

Idempotency notes

The diagnostic uses the same scoped throwaway org teardown as the full staging peer-visibility gate. No cluster-wide cleanup.

Loki query

N/A — route/image drift was proven through tenant buildinfo, Gitea commit history, ECR tags, and live E2E output.

Tier

tier:low

hongming added 1 commit 2026-05-21 20:22:21 +00:00
test(e2e): forbid dev token path in staging peer visibility
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 14s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 12s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 10s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 52s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m16s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m16s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m21s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / all-items-acked (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 2s
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 10s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m15s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
CI / all-required (pull_request) Successful in 2m21s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m11s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 58s
audit-force-merge / audit (pull_request) Successful in 6s
119743d0de
hongming merged commit ff2557d899 into main 2026-05-21 20:26:34 +00:00

Reference: molecule-ai/molecule-core#1650