fix(e2e): canvas-tabs staging setup waits for RENDERABLE, not online (#2199) #2202
Red:
E2E Staging Canvas (Playwright) / Canvas tabs E2E = FAILURE on main HEAD b9d2f023 (issue #2199).
Actual failure (runner-6, task 258160: container logs; on-disk Gitea logs are stale post-1.26.2)
The failure is in the Playwright globalSetup, not in any spec assertion:
Root-cause verdict: NOT a canvas/test regression, NOT timing fragility
It is a deterministic consequence of workspace-server #2162 (fix(provision): platform-managed workspace must fail-closed when CP proxy env absent, merged 2026-06-03; a correct production safety fix). The canvas E2E creates a bare hermes/gpt-4o workspace, which defaults closed to platform_managed (workspace_provision.go:~1009). On a staging tenant without MOLECULE_LLM_BASE_URL / MOLECULE_LLM_USAGE_TOKEN, the agent now aborts at boot with MISSING_PLATFORM_PROXY, surfacing as the pre-start credential-abort shape (status:"failed", uptime_seconds:0, no last_sample_error). Pre-#2162 the same workspace booted credential-less (the bug #2162 fixed), so the old harness happened to pass.
This is environmental / test-design, not a product regression: #2162 should stay; nothing in canvas broke.
Why the fix belongs in the harness
staging-tabs.spec.ts only opens the 13 side-panel tabs and asserts no hard crash / no "Failed to load" toast. It makes zero LLM calls and even mocks /cp/auth/me + 401→200. All it needs is a workspace row so the node + tabs render; a fully-online agent is not required.
The fix (single file: canvas/e2e/staging-setup.ts step 6)
Wait for RENDERABLE instead of strictly online:
- online -> happy path (staging with proxy env)
- failed + uptime_seconds==0 + no last_sample_error -> pre-start credential-abort: agent never ran, row still renders -> proceed, with a loud console.warn
- failed (last_sample_error present, OR uptime_seconds>0 = agent started then crashed) -> still hard-throws (no masking)
Genuine infra provision failure remains loud one step earlier at the org level (instance_status === "failed", unchanged).
Verification
- tsc clean for canvas/e2e/staging-* (pre-existing tsc errors are all in unrelated __tests__ files).
- npx playwright test --config=playwright.staging.config.ts --list resolves globalSetup + the single spec.
Closes #2199.
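For readers without the diff open, the RENDERABLE decision described under "The fix" can be sketched roughly as follows. This is an illustration, not the code in canvas/e2e/staging-setup.ts: the WorkspaceStatus interface, the classifyWorkspace and waitForRenderable names, the injected fetchStatus callback, and the timeout values are all assumptions. Only the field names (status, uptime_seconds, last_sample_error) come from the description.

```typescript
// Hypothetical shape of the polled workspace row; field names follow the PR text.
interface WorkspaceStatus {
  status: string;
  uptime_seconds?: number;
  last_sample_error?: string | null;
}

type Verdict = "renderable" | "pending" | "fatal";

// Classify one poll result into the three outcomes listed in the PR.
function classifyWorkspace(ws: WorkspaceStatus): { verdict: Verdict; reason: string } {
  if (ws.status === "online") {
    return { verdict: "renderable", reason: "online" };
  }
  if (ws.status === "failed") {
    // Pre-start credential abort: the agent never ran and reported no error.
    // A missing uptime_seconds is deliberately NOT treated as zero, so an
    // incomplete row is never mistaken for the benign shape.
    const neverRan = ws.uptime_seconds === 0;
    const hasError = Boolean(ws.last_sample_error);
    if (neverRan && !hasError) {
      return { verdict: "renderable", reason: "pre-start credential abort" };
    }
    return {
      verdict: "fatal",
      reason: hasError ? `last_sample_error: ${ws.last_sample_error}` : "agent started then crashed",
    };
  }
  // Any other status (for example, still provisioning) keeps the poll going.
  return { verdict: "pending", reason: `status=${ws.status}` };
}

// Poll until the workspace is renderable, hard-throwing on genuine failures.
async function waitForRenderable(
  fetchStatus: () => Promise<WorkspaceStatus>,
  timeoutMs = 120_000,
  intervalMs = 2_000,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const { verdict, reason } = classifyWorkspace(await fetchStatus());
    if (verdict === "renderable") {
      if (reason !== "online") {
        console.warn(`[staging-setup] proceeding without a booted agent: ${reason}`);
      }
      return;
    }
    if (verdict === "fatal") {
      throw new Error(`workspace failed: ${reason}`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`workspace not renderable after ${timeoutMs} ms`);
}
```

Keeping the classification in a pure function separate from the poll loop makes the "no masking" rule easy to unit-test without a staging tenant.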
🤖 Generated with Claude Code
Owner force-merged (honest bypass). Clears canvas-tabs Playwright red #2199: #2162 fail-closed (correct) made the credential-less platform_managed test workspace abort pre-start; staging-setup now waits for RENDERABLE (the tabs spec needs a row, not a booted agent) while still hard-throwing on genuine failures. Required CI green. Token revoked.
New recurrence/evidence on molecule-core main 793d376a1a30: E2E Staging Canvas / Canvas tabs E2E job 275093 now fails earlier than the prior last_sample_error path. canvas/e2e/staging-setup.ts:237-245 still creates runtime:"hermes" with model:"gpt-4o"; staging rejects the workspace create with UNREGISTERED_MODEL_FOR_RUNTIME (model "gpt-4o" is not a registered model for runtime "hermes"). This confirms the #2202 fix direction should update the harness to a registered/renderable runtime-model pair (or omit model if the API supplies a valid default), not just wait longer after creation.
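One possible shape for that harness change, as a sketch only: build the create body so that model is sent only when the harness is explicitly given one, and recognise the rejection code so the setup can fail with a clear message. The WorkspaceCreateBody interface and both function names are hypothetical. Whether the API actually supplies a default model when the field is omitted is not established here and would need checking against staging.

```typescript
// Hypothetical request body; only `runtime` and `model` appear in the source.
interface WorkspaceCreateBody {
  runtime: string;
  model?: string;
}

// Include `model` only when a non-empty value is supplied, so an unset
// override leaves the choice to the server (assuming it has a default).
function buildWorkspaceCreateBody(runtime: string, model?: string): WorkspaceCreateBody {
  const trimmed = model?.trim();
  return trimmed ? { runtime, model: trimmed } : { runtime };
}

// Detect the rejection quoted above so globalSetup can report it distinctly
// from a provisioning timeout.
function isUnregisteredModelError(responseText: string): boolean {
  return responseText.includes("UNREGISTERED_MODEL_FOR_RUNTIME");
}
```

The model argument could be fed from an environment variable chosen by the harness owner, which would let staging pick a registered pair without another code change.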