e2e-api: wire admin auth so the mock arm validates under REQUIRE-LIVE
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Python Lint & Test (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 19s
CI / Detect changes (pull_request) Successful in 31s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 29s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 3s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 8s
CI / Platform (Go) (pull_request) Successful in 2s
E2E API Smoke Test / detect-changes (pull_request) Successful in 41s
E2E Chat / detect-changes (pull_request) Successful in 39s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 18s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 16s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 18s
gate-check-v3 / gate-check (pull_request_target) Successful in 19s
CI / all-required (pull_request) Successful in 5s
qa-review / approved (pull_request_target) Failing after 14s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 15s
sop-checklist / review-refire (pull_request_target) Has been skipped
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 1m19s
E2E Chat / E2E Chat (pull_request) Successful in 3s
sop-tier-check / tier-check (pull_request_target) Failing after 12s
CI / Canvas Deploy Status (pull_request) Has been skipped
security-review / approved (pull_request_target) Failing after 36s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 36s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m22s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m18s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
sop-tier-check / tier-check (pull_request_review) Failing after 9s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m27s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 48s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 2m1s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 2m20s

The REQUIRED `E2E API Smoke Test` gate did not honestly validate any
runtime: the priority-runtimes mock arm's POST /org/import returned
401 {"error":"admin auth required"} because the e2e-api CI platform
runs with no admin token configured and the test sent no admin bearer.
So E2E_REQUIRE_LIVE was left OFF and the gate proved nothing about a
runtime (CR2's review). Root cause confirmed from CI log of head
74fd0814 (task 273465 line 562).

AdminAuth (workspace-server/internal/middleware/wsauth_middleware.go:164)
reads ADMIN_TOKEN; setting it also closes isDevModeFailOpen
(devmode.go:50). POST /org/import (router.go:778) and POST
/admin/workspaces/:id/tokens (router.go:427) are both AdminAuth-gated.

Fix:
- e2e-api.yml: set a deterministic ADMIN_TOKEN on the platform-server
  process and export the matching MOLECULE_ADMIN_TOKEN (the var the
  e2e scripts send as the bearer) so platform-checks == test-sends.
- test_priority_runtimes_e2e.sh run_mock: send the admin bearer on the
  /org/import curl (mirrors e2e_mint_workspace_token), and parse the
  workspace id from the real response key ("workspaces", org.go:898-901
  — the old "results" key never existed; it was masked by the 401).
  A missing id is now a hard fail() (real break → RED), not bestfail().
- _lib.sh e2e_delete_workspace: guard "${curl_args[@]}" with the
  ${arr[@]+"…"} idiom so the EXIT-trap cleanup (empty array) doesn't
  abort non-zero under set -u and turn a validated run RED.
- Re-enable the honest gate: E2E_REQUIRE_LIVE='1' in e2e-api.yml.

Proven locally (PG+Redis+platform-server): without admin auth
/org/import → 401; with it the mock arm validates end-to-end
(create → online → canned A2A "On it, boss." → activity_logs row →
1 validated → exit 0). RED direction proven (admin auth absent →
hard FAIL → exit 1). Gate-logic unit test 7/7 green. MiniMax stays
best-effort. Updated stale comments. No new credentials.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
core-devops
2026-06-05 00:15:20 -07:00
parent 74fd08144d
commit 8fb5dbed59
3 changed files with 94 additions and 62 deletions
+41 -29
View File
@@ -272,6 +272,24 @@ jobs:
echo "::error::Redis did not become ready in 15s"
docker logs "$REDIS_CONTAINER" || true
exit 1
- name: Set deterministic admin token for the e2e platform
if: needs.detect-changes.outputs.api == 'true'
run: |
# AdminAuth (workspace-server/internal/middleware/wsauth_middleware.go:164)
# reads ADMIN_TOKEN. Setting it (a) closes isDevModeFailOpen (devmode.go:50
# returns false when ADMIN_TOKEN is non-empty), so admin routes require a
# bearer, and (b) makes Tier-2b accept a bearer that constant-time-equals
# ADMIN_TOKEN. The platform process inherits ADMIN_TOKEN from $GITHUB_ENV.
#
# MOLECULE_ADMIN_TOKEN is the var the e2e scripts send as the bearer
# (tests/e2e/_lib.sh:33 e2e_mint_workspace_token, and the run_mock
# org-import curl). Set BOTH to the SAME value so the bearer the test
# sends == the secret the platform checks. Deterministic test value;
# this platform is ephemeral, single-run, and never reachable off-host.
E2E_ADMIN_TOKEN="e2e-api-admin-${{ github.run_id }}-${{ github.run_attempt }}"
echo "ADMIN_TOKEN=${E2E_ADMIN_TOKEN}" >> "$GITHUB_ENV"
echo "MOLECULE_ADMIN_TOKEN=${E2E_ADMIN_TOKEN}" >> "$GITHUB_ENV"
echo "Admin token configured for the e2e platform (ADMIN_TOKEN + MOLECULE_ADMIN_TOKEN)."
- name: Build platform
if: needs.detect-changes.outputs.api == 'true'
working-directory: workspace-server
@@ -397,38 +415,32 @@ jobs:
- name: Run notify-with-attachments E2E
if: needs.detect-changes.outputs.api == 'true'
run: bash tests/e2e/test_notify_attachments_e2e.sh
- name: "Run priority-runtimes E2E (live arms opportunistic; gate logic unit-tested separately)"
# DELIBERATELY NOT forcing E2E_REQUIRE_LIVE here.
- name: "Run priority-runtimes E2E (REQUIRE-LIVE: mock validates the runtime plumbing end-to-end)"
# E2E_REQUIRE_LIVE=1 is ON: the run MUST validate >=1 runtime end-to-end
# or it exits NON-zero (RED). This is now SAFE because the `mock` arm can
# actually provision in CI: the only blocker was that POST /org/import and
# POST /admin/workspaces/:id/tokens are AdminAuth-gated
# (router.go:778 + :427) and this job previously configured NO admin token,
# so every admin call 401'd ("admin auth required"). The "Set deterministic
# admin token" step above now sets ADMIN_TOKEN on the platform AND exports
# the matching MOLECULE_ADMIN_TOKEN the e2e scripts send as the bearer, so
# the mock arm can org-import → online → mint token → canned A2A reply →
# validated(). That guarantees VALIDATED>=1 on a healthy platform, so the
# REQUIRED `E2E API Smoke Test` gate now HONESTLY validates a runtime
# end-to-end; if the mock plumbing (DB insert, status flip, A2A proxy,
# activity logging, or the admin-auth wiring) genuinely breaks, the gate
# goes RED instead of false-green. The zero-validated→RED decision is also
# regression-gated WITHOUT provisioning by the bash unit test
# tests/e2e/test_require_live_priority_gate_unit.sh (wired into ci.yml's
# "Run E2E bash unit tests" job), so a revert of that logic still fails CI.
#
# ESTABLISHED FACT (2 CI runs, PR #2286): this CI substrate cannot
# provision ANY runtime end-to-end — MiniMax create returns 422
# UNREGISTERED_MODEL_FOR_RUNTIME, the `mock` org-import create FAILS,
# and claude-code needs an LLM key CI doesn't have. So with
# E2E_REQUIRE_LIVE=1 every arm SKIPs/MISSes, VALIDATED stays 0, and the
# script exits NON-zero — which makes the REQUIRED `E2E API Smoke Test`
# gate permanently RED FOR EVERYONE. We must not ship a gate that's
# red-for-all, so this job stays GREEN by validating only what CI can
# actually validate (DB + migrations + platform health + the API/echo
# arms run by the earlier steps), exactly as it did before #2286.
#
# The false-green this PR fixes — test_priority_runtimes_e2e.sh exiting
# 0 when zero runtimes validated under REQUIRE-LIVE — is regression-
# gated WITHOUT provisioning by the bash unit test
# tests/e2e/test_require_live_priority_gate_unit.sh, wired into ci.yml's
# "Run E2E bash unit tests" job. That unit test drives the REAL
# evaluate_require_live_gate() decision and runs on every PR, so a
# revert of the zero-validated→RED logic fails CI. No provisioning
# needed to gate the logic.
#
# The MiniMax key stays wired as an OPPORTUNISTIC best-effort arm: if a
# future CI substrate can actually provision it, it validates as a bonus
# real-LLM check; today it reports a best-effort MISS and never reds the
# gate (REQUIRE_LIVE unset → all-skip is a LOUD skip + exit 0). ZERO new
# credentials. Wiring a CI-provisionable live-completion arm + then
# turning REQUIRE-LIVE on here is the deferred FOLLOW-UP (tracked
# separately), NOT this PR.
# MiniMax stays an OPPORTUNISTIC best-effort arm: create is registry-fragile
# in CI (422 UNREGISTERED_MODEL_FOR_RUNTIME), so a miss is reported via
# bestfail() and never reds the gate — mock carries the required validation,
# MiniMax is a bonus real-LLM check when it comes up. ZERO new credentials.
if: needs.detect-changes.outputs.api == 'true'
env:
E2E_REQUIRE_LIVE: '1'
E2E_MINIMAX_API_KEY: ${{ secrets.MOLECULE_STAGING_MINIMAX_API_KEY }}
run: bash tests/e2e/test_priority_runtimes_e2e.sh
- name: Install standalone runtime parser from Gitea registry
+8 -2
View File
@@ -53,15 +53,21 @@ e2e_delete_workspace() {
if [ -z "$wid" ]; then
return 0
fi
# ${curl_args[@]+"…"} guard: under `set -u` an empty array expands to an
# "unbound variable" error on bash <4.4 (macOS 3.2, some Linux). This form
# expands to nothing when the array is empty. Callers from the priority-
# runtimes EXIT trap pass no extra curl args, so the array IS empty there —
# without the guard the trap aborts non-zero AFTER the gate already passed,
# turning a validated run RED. (Same idiom already used for CREATED_WSIDS.)
if [ -z "$name" ]; then
name=$(curl -s "$BASE/workspaces/$wid" "${curl_args[@]}" | python3 -c "import json,sys
name=$(curl -s "$BASE/workspaces/$wid" ${curl_args[@]+"${curl_args[@]}"} | python3 -c "import json,sys
try:
print(json.load(sys.stdin).get('name',''))
except Exception:
pass" 2>/dev/null || true)
fi
curl -s -X DELETE "$BASE/workspaces/$wid?confirm=true" \
-H "X-Confirm-Name: $name" "${curl_args[@]}" > /dev/null || true
-H "X-Confirm-Name: $name" ${curl_args[@]+"${curl_args[@]}"} > /dev/null || true
}
e2e_cleanup_all_workspaces() {
+45 -31
View File
@@ -47,33 +47,34 @@
# asserts the gate's exit code — no platform, no provisioning, no network.
# So the false-green can't silently come back: a revert of the guard fails CI.
#
# CI POSTURE (deferred live arm — see .gitea/workflows/e2e-api.yml):
# The live e2e-api job does NOT set E2E_REQUIRE_LIVE, because CI cannot
# currently provision ANY runtime end-to-end (MiniMax create → 422
# UNREGISTERED_MODEL_FOR_RUNTIME; the mock org-import arm fails create in CI;
# claude-code needs an LLM key CI lacks). Forcing E2E_REQUIRE_LIVE=1 there
# would make the REQUIRED `E2E API Smoke Test` gate permanently RED for
# everyone. So in CI this script still runs its DB/migration/platform-health
# arms green, and the zero-validated→RED logic is gated by the bash unit test
# above instead. Wiring a real live-completion arm into CI (a runtime that
# actually provisions without a secret CI can't supply) is tracked as a
# FOLLOW-UP, not this PR.
# CI POSTURE (REQUIRE-LIVE ON — see .gitea/workflows/e2e-api.yml):
# The live e2e-api job SETS E2E_REQUIRE_LIVE=1. The `mock` arm is the
# CI-provisionable live-completion arm: it org-imports a mock workspace
# (→online→canned A2A reply) with NO external secret. The only thing that
# previously blocked it in CI was admin auth — POST /org/import and POST
# /admin/workspaces/:id/tokens are AdminAuth-gated, and the job set no admin
# token, so every admin call 401'd ("admin auth required"). The job now sets
# ADMIN_TOKEN on the platform AND exports the matching MOLECULE_ADMIN_TOKEN
# the scripts send, so mock validates end-to-end and VALIDATED>=1 holds on a
# healthy platform — the REQUIRED `E2E API Smoke Test` gate now HONESTLY
# validates a runtime. If the mock plumbing or the admin-auth wiring breaks,
# the gate goes RED (not false-green). The zero-validated→RED decision is also
# regression-gated WITHOUT provisioning by the bash unit test above, so a
# revert of that logic still fails CI.
#
# LIVE ARMS (run when their prerequisite is present; opportunistic):
# - `mock` (run_mock) is the no-key arm: a virtual workspace (no
# container, no EC2, no provider) whose org-import path is INTENDED to
# short-circuit to status='online' with a canned A2A reply. NOTE: in the
# current CI substrate the mock org-import `create` step FAILS, so mock
# does NOT validate in CI today — it is a local/dev arm and a candidate
# for the deferred live-completion follow-up, not a CI backbone.
# - `mock` (run_mock) is the no-key REQUIRE-LIVE backbone: a virtual
# workspace (no container, no EC2, no provider) whose org-import path
# short-circuits to status='online' with a canned A2A reply. It validates
# in CI now that the e2e-api job wires an admin token (org-import + token
# mint are AdminAuth-gated), so it is the guaranteed >=1 validation.
# - MiniMax (E2E_MINIMAX_API_KEY, from MOLECULE_STAGING_MINIMAX_API_KEY) is
# an OPPORTUNISTIC best-effort real-LLM arm: registry-fragile in CI (422
# UNREGISTERED_MODEL_FOR_RUNTIME — see run_minimax header), so a miss is
# a best-effort MISS via bestfail() and does NOT red the gate.
# Because NO arm can currently provision end-to-end in CI, the CI e2e-api job
# does NOT force E2E_REQUIRE_LIVE (it would red the REQUIRED gate forever).
# The zero-validated→RED logic is instead regression-gated by the bash unit
# test (see above). A CI-provisionable live arm is the deferred follow-up.
# The CI e2e-api job sets E2E_REQUIRE_LIVE=1: mock guarantees a validation, so
# the REQUIRED gate is honest (RED if the mock plumbing/admin-auth breaks). The
# zero-validated→RED logic is also regression-gated by the bash unit test above.
#
# Usage:
# # Enforce REQUIRE-LIVE locally (need >=1 arm to actually validate):
@@ -544,8 +545,17 @@ run_mock() {
# mock never USES the model, so any non-empty value satisfies the
# contract. The org-import path does not run the Create handler's
# registry model-validation, so "mock" is accepted as-is.
# POST /org/import is AdminAuth-gated (router.go:778). When the platform has
# ADMIN_TOKEN set (as the e2e-api CI job now does), an unauthenticated import
# 401s with {"error":"admin auth required"}. Send the same admin bearer the
# mint helper uses (MOLECULE_ADMIN_TOKEN, ADMIN_TOKEN fallback) — guarded so a
# bootstrap/dev platform with no admin token (fail-open) still works.
local admin_bearer="${MOLECULE_ADMIN_TOKEN:-${ADMIN_TOKEN:-}}"
local admin_auth=()
[ -n "$admin_bearer" ] && admin_auth=(-H "Authorization: Bearer $admin_bearer")
local import_resp wsid
import_resp=$(curl -s -X POST "$BASE/org/import" -H "Content-Type: application/json" \
${admin_auth[@]+"${admin_auth[@]}"} \
-d '{
"template": {
"name": "Priority E2E Mock Org",
@@ -555,26 +565,30 @@ run_mock() {
]
}
}')
# org-import returns {"results":[{"id":...,"name":...}, ...]} (plus
# reconcile counters). Pull the id of the single workspace we declared.
# org-import returns {"org":..., "count":N, "workspaces":[{"id":...,
# "name":...,"tier":...}, ...]} (handlers/org.go:898-901). Pull the id of
# the single workspace we declared. (Older "results" key fallback kept for
# forward/back compat in case the response shape is ever versioned.)
wsid=$(echo "$import_resp" | python3 -c '
import json, sys
try:
d = json.load(sys.stdin)
except Exception:
sys.exit(0)
for r in (d.get("results") or []):
for r in (d.get("workspaces") or d.get("results") or []):
if r.get("name") == "Priority E2E (mock)" and r.get("id"):
print(r["id"]); break
') || true
if [ -z "$wsid" ]; then
# CI's e2e-api platform cannot org-import a mock workspace (observed in
# CI: create returns no id). Treat as a best-effort MISS, not a hard FAIL,
# so it never reds the required gate — the false-green LOGIC is gated by
# tests/e2e/test_require_live_priority_gate_unit.sh, not by a live arm CI
# can't run. Where the platform CAN create a mock (local / future CI), the
# online/token/reply checks below still hard-fail on a real mock break.
bestfail "create mock workspace (org-import; CI cannot create mock — best-effort)" "$import_resp"
# mock org-import is the REQUIRE-LIVE backbone and is EXPECTED to succeed in
# CI now that the e2e-api job wires an admin token (ADMIN_TOKEN on the
# platform + MOLECULE_ADMIN_TOKEN sent above). A missing id here is a REAL
# break (admin-auth wiring, org-import create, or the mock short-circuit) and
# MUST red the gate — so this is a hard fail(), not a best-effort miss. Under
# E2E_REQUIRE_LIVE=1 a FAIL also forces a non-zero exit via
# evaluate_require_live_gate. Surface the response so the break is visible
# (e.g. {"error":"admin auth required"} would mean the token wiring regressed).
fail "create mock workspace (org-import)" "$import_resp"
return 0
fi
CREATED_WSIDS+=("$wsid")