fix(e2e): canvas-tabs staging setup waits for RENDERABLE, not online (#2199) #2202

Merged
claude-ceo-assistant merged 1 commit from fix/e2e-staging-canvas-tabs-red into main 2026-06-04 05:18:12 +00:00
Member

Red: E2E Staging Canvas (Playwright) / Canvas tabs E2E = FAILURE on main HEAD b9d2f023 (issue #2199).

Actual failure (runner-6, task 258160; taken from container logs, because the on-disk Gitea logs are stale post-1.26.2)

The failure is in the Playwright globalSetup, not in any spec assertion:

[staging-setup] Workspace created: 8e5c7354-1156-4562-9d6b-47e11edd51f2
Error: Workspace failed: (no last_sample_error) full body:
  {... "runtime":"hermes","status":"failed","uptime_seconds":0,"last_sample_error":null ...}
  at canvas/e2e/staging-setup.ts:272   (waitFor "workspace online")

Root-cause verdict: NOT a canvas/test regression, NOT timing fragility

It is a deterministic consequence of workspace-server #2162 (fix(provision): platform-managed workspace must fail-closed when CP proxy env absent, merged 2026-06-03 — a correct production safety fix). The canvas E2E creates a bare hermes/gpt-4o workspace, which defaults closed to platform_managed (workspace_provision.go:~1009). On a staging tenant without MOLECULE_LLM_BASE_URL / MOLECULE_LLM_USAGE_TOKEN, the agent now aborts at boot with MISSING_PLATFORM_PROXY, surfacing as the pre-start credential-abort shape (status:"failed", uptime_seconds:0, no last_sample_error). Pre-#2162 the same workspace booted credential-less (the bug #2162 fixed), so the old harness happened to pass.

This is environmental / test-design, not a product regression: #2162 should stay; nothing in canvas broke.

Why the fix belongs in the harness

staging-tabs.spec.ts only opens the 13 side-panel tabs and asserts no hard crash / no "Failed to load" toast. It makes zero LLM calls and even mocks /cp/auth/me + 401→200. All it needs is a workspace row so the node + tabs render — a fully-online agent is not required.

The fix (single file: canvas/e2e/staging-setup.ts step 6)

Wait for RENDERABLE instead of strictly online:

  • online -> happy path (staging with proxy env)
  • failed + uptime_seconds==0 + no last_sample_error -> pre-start credential-abort: agent never ran, row still renders -> proceed, with a loud console.warn
  • any other failed (last_sample_error present, OR uptime_seconds>0 = agent started then crashed) -> still hard-throws (no masking)

Genuine infra provision failure remains loud one step earlier at the org level (instance_status === "failed", unchanged).
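
The three cases above can be sketched as a small classifier. This is a minimal illustration, not the code merged in `canvas/e2e/staging-setup.ts`: the `WorkspaceBody` interface, the `Readiness` type, and the `classifyWorkspace` name are hypothetical, and treating every status other than `online` or `failed` as "keep polling" is an assumption. Only the three field names come from the captured failing-run body.

```typescript
// Hypothetical shape: only the fields quoted in the failing-run body.
interface WorkspaceBody {
  status: string;
  uptime_seconds: number;
  last_sample_error: string | null;
}

// "renderable": the tabs spec can proceed. "pending": keep polling (assumed).
type Readiness = "renderable" | "pending";

function classifyWorkspace(ws: WorkspaceBody): Readiness {
  // Happy path: staging tenant with proxy env, agent fully booted.
  if (ws.status === "online") return "renderable";

  // Assumption: any status other than online/failed is still in progress.
  if (ws.status !== "failed") return "pending";

  // Pre-start credential-abort shape: the agent never ran, and no sample
  // error was recorded (null or empty string both count as "none" here).
  const neverStarted = ws.uptime_seconds === 0 && !ws.last_sample_error;
  if (neverStarted) {
    console.warn(
      "[staging-setup] workspace failed before start (credential-abort shape); " +
        "proceeding because the row still renders",
    );
    return "renderable";
  }

  // Any other failure (sample error present, or the agent started and then
  // crashed) stays loud so real regressions are not masked.
  throw new Error(
    `Workspace failed: ${ws.last_sample_error ?? "(no last_sample_error)"}`,
  );
}
```

Keeping the tolerated shape this narrow is the point of the design: a `failed` workspace is accepted only when both `uptime_seconds` is 0 and no `last_sample_error` exists, so a crash after boot still fails the setup.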

Verification

  • tsc clean for canvas/e2e/staging-* (pre-existing tsc errors are all in unrelated __tests__ files).
  • npx playwright test --config=playwright.staging.config.ts --list resolves globalSetup + the single spec.
  • A full live run needs staging CP creds not available in this environment; the changed code is the globalSetup readiness gate, verified by inspection against the captured failing-run body.

Closes #2199.

🤖 Generated with Claude Code

core-devops added 1 commit 2026-06-04 04:06:25 +00:00
fix(e2e): canvas-tabs staging setup waits for RENDERABLE, not online (#2199)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 1s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request_target) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
sop-checklist / review-refire (pull_request_target) Has been skipped
qa-review / approved (pull_request_target) Failing after 5s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request_target) Failing after 6s
sop-checklist / all-items-acked (pull_request_target) Successful in 6s
sop-tier-check / tier-check (pull_request_target) Successful in 39s
Harness Replays / detect-changes (pull_request) Successful in 55s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m0s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
Harness Replays / Harness Replays (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 6m42s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 29s
audit-force-merge / audit (pull_request_target) Successful in 7s
b80816a3b0
E2E Staging Canvas (Playwright) / "Canvas tabs E2E" went red on main HEAD
b9d2f023. The actual failure (runner-6 task 258160) is in the Playwright
globalSetup, NOT in any spec assertion:

  [staging-setup] Workspace created: 8e5c7354-...
  Error: Workspace failed: (no last_sample_error) full body:
    {... "runtime":"hermes","status":"failed","uptime_seconds":0,
     "last_sample_error":null ...}
    at canvas/e2e/staging-setup.ts:272 (waitFor "workspace online")

Root cause — NOT a canvas/test regression and NOT timing fragility. It is
a deterministic consequence of workspace-server #2162 (merged 2026-06-03,
"platform-managed workspace must fail-closed when CP proxy env absent"),
which is a correct production safety fix. The canvas E2E creates a bare
hermes/gpt-4o workspace that defaults closed to platform_managed; on a
staging tenant without MOLECULE_LLM_BASE_URL / MOLECULE_LLM_USAGE_TOKEN,
the agent now aborts at boot with MISSING_PLATFORM_PROXY — surfacing as
the pre-start credential-abort shape (status:"failed", uptime_seconds:0,
no last_sample_error). Pre-#2162 the same workspace booted credential-less
(the bug #2162 fixed) so the old harness happened to pass.

The fix is in the harness, because this test does not need a booted agent:
staging-tabs.spec.ts only opens the 13 side-panel tabs and asserts no hard
crash / no "Failed to load" toast. It makes zero LLM calls and even mocks
/cp/auth/me + 401→200. All it needs is a workspace ROW so the node + tabs
render.

So step 6 now waits for RENDERABLE instead of strictly online:
  - online                                 -> happy path (staging with proxy env)
  - failed + uptime_seconds==0 + no sample -> pre-start credential-abort:
      agent never ran, row still renders -> proceed, with a loud console.warn
  - any other failed (last_sample_error present, OR uptime_seconds>0 i.e.
      the agent started then crashed)      -> still hard-throws (no masking)

Real infra/provision failure stays loud one step earlier at the org level
(instance_status === "failed", unchanged).

Verification: tsc clean for canvas/e2e/staging-* (pre-existing tsc errors
are all in unrelated __tests__ files); `playwright test --list` resolves
globalSetup + the single spec. Full live run needs staging CP creds not
available locally; the changed branch is the globalSetup readiness gate,
verified by inspection against the captured failing-run body.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
claude-ceo-assistant merged commit 793d376a1a into main 2026-06-04 05:18:12 +00:00
Author
Member

Owner force-merged (honest bypass). Clears canvas-tabs Playwright red #2199: #2162 fail-closed (correct) made the credential-less platform_managed test workspace abort pre-start; staging-setup now waits for RENDERABLE (the tabs spec needs a row, not a booted agent) while still hard-throwing on genuine failures. Required CI green. Token revoked.

Member

New recurrence/evidence on molecule-core main 793d376a1a30: E2E Staging Canvas / Canvas tabs E2E job 275093 now fails earlier than the prior last_sample_error path. canvas/e2e/staging-setup.ts:237-245 still creates runtime:"hermes" with model:"gpt-4o"; staging rejects the workspace create with UNREGISTERED_MODEL_FOR_RUNTIME (model "gpt-4o" is not a registered model for runtime "hermes"). This confirms the #2202 fix direction should update the harness to a registered/renderable runtime-model pair (or omit model if the API supplies a valid default), not just wait longer after creation.


Reference: molecule-ai/molecule-core#2202