Staging peer-visibility E2E cannot mint MCP bearer for managed runtimes #1644

Open
opened 2026-05-21 18:47:03 +00:00 by hongming · 1 comment
Owner


Observed on molecule-core main merge commit da4b86a1593d555ecd6282d39db6be29ee047a61, run 91206 / task 142513. Local peer visibility E2E passed, but staging peer visibility failed while provisioning Hermes: the workspace create response returned no auth_token, and both token fallbacks failed. POST /admin/workspaces/:id/tokens returned an HTML 404, and GET /admin/workspaces/:id/test-token returned {"error":"not found"}. This looks separate from the MCP delegate platform-path fix; the tenant route/image or the staging admin-token surface is drifting from the bearer-mint path the test expects. We need to determine whether the staging tenant image lacks the admin token route, whether the route is intentionally disabled in production-shaped tenants, or whether the E2E should use a different SSOT bearer-mint surface.
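
For reference, the fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the logic in the actual E2E script: the injected `fetch` callable, the `MintError` type, and the assumption that the two token endpoints answer with JSON carrying an `auth_token` field are all hypothetical.

```python
import json
from typing import Callable, Dict, List, Optional, Tuple

# (status_code, content_type, body): hypothetical response shape for this sketch.
Response = Tuple[int, str, str]
Fetch = Callable[[str, str], Response]


class MintError(Exception):
    """Raised when no bearer-mint path yields a token; keeps per-path diagnostics."""

    def __init__(self, attempts: List[str]) -> None:
        super().__init__("; ".join(attempts))
        self.attempts = attempts


def _token_from(resp: Response) -> Optional[str]:
    """Extract a token from a fallback response, or None if it is unusable."""
    status, content_type, body = resp
    if status != 200 or "json" not in content_type:
        return None  # e.g. the HTML 404 seen on staging
    try:
        data = json.loads(body)
    except ValueError:
        return None
    if not isinstance(data, dict):
        return None
    # ASSUMPTION: the field name mirrors the create response's auth_token.
    return data.get("auth_token") or None


def mint_bearer(workspace_id: str, create_response: Dict, fetch: Fetch) -> str:
    """Try the create response first, then the two admin fallbacks in order."""
    token = create_response.get("auth_token")
    if token:
        return token
    attempts = ["create response: no auth_token"]
    for method, path in (
        ("POST", f"/admin/workspaces/{workspace_id}/tokens"),
        ("GET", f"/admin/workspaces/{workspace_id}/test-token"),
    ):
        resp = fetch(method, path)
        token = _token_from(resp)
        if token:
            return token
        attempts.append(f"{method} {path}: {resp[0]} ({resp[1]})")
    raise MintError(attempts)
```

With a `fetch` stub that reproduces the staging responses (HTML 404 on the POST, JSON 404 on the GET), `mint_bearer` raises `MintError` with one diagnostic per attempted path, which is the kind of per-path detail that helps tell a missing route apart from a disabled one.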
Member

MECHANISM: Current main still has the peer-visibility staging failure class tracked here, now isolated from the earlier Platform-Go/Handlers-Postgres main break. On main `406d73ff` (`fix(templates): revert templates.go change from #1781`), `CI / Platform (Go)` and `Handlers Postgres Integration` are green, but `E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push)` is red. The failing path is the staging job in `.gitea/workflows/e2e-peer-visibility.yml:298-345`: it uses staging CP, verifies admin/LLM keys, then runs `tests/e2e/test_peer_visibility_mcp_staging.sh` to provision runtimes and call literal MCP `list_peers`.

EVIDENCE: Gitea status API for main `406d73ff` reports `CI / Platform (Go)` success, `CI / all-required` success, and `Handlers Postgres Integration` success, while `E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push)` fails after 2m33s at `/molecule-ai/molecule-core/actions/runs/84418/jobs/2`. This matches the existing #1644 symptom class: local peer visibility can pass, but staging fails before/around managed-runtime bearer minting and literal MCP visibility. Workflow comments at `.gitea/workflows/e2e-peer-visibility.yml:28-35` confirm this is intentionally not a registry/proxy check; it exercises the real `/workspaces/:id/mcp` `list_peers` call.

RECOMMENDED FIX SHAPE: Responsible repo/files are `molecule-ai/molecule-core/.gitea/workflows/e2e-peer-visibility.yml` and `molecule-ai/molecule-core/tests/e2e/test_peer_visibility_mcp_staging.sh`, plus the staging CP admin token/bearer-mint route they call. Infra should inspect run 84418 job 2 logs and confirm whether the failure is still missing/404 token minting versus a later MCP `list_peers` authorization failure, then align the staging test with the production-safe bearer mint surface rather than weakening the gate.
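
The triage split suggested above (token minting versus a later `list_peers` authorization failure) could be captured in a small helper like the one below. It is only a sketch under assumptions: the `FailureStage` names are hypothetical, and it assumes an authorization failure on `/workspaces/:id/mcp` surfaces as HTTP 401 or 403, which the run logs would need to confirm.

```python
from enum import Enum
from typing import Optional


class FailureStage(Enum):
    TOKEN_MINT = "token_mint"  # no bearer was obtained at all
    MCP_AUTHZ = "mcp_authz"    # bearer obtained, list_peers rejected it
    OTHER = "other"            # bearer obtained, some other failure
    NONE = "none"              # list_peers succeeded


def classify_failure(bearer: Optional[str],
                     list_peers_status: Optional[int]) -> FailureStage:
    """Map what a staging run observed onto a coarse failure stage."""
    if not bearer:
        # The MCP call is never reached without a bearer.
        return FailureStage.TOKEN_MINT
    # ASSUMPTION: authorization failures are reported as 401/403.
    if list_peers_status in (401, 403):
        return FailureStage.MCP_AUTHZ
    if list_peers_status == 200:
        return FailureStage.NONE
    return FailureStage.OTHER
```

For the symptom in the issue body (no `auth_token` and both fallbacks failing), this returns `FailureStage.TOKEN_MINT` regardless of the MCP status, since the test never gets far enough to call `list_peers`.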

Reference: molecule-ai/molecule-core#1644