fix(e2e): canvas-tabs staging setup waits for RENDERABLE, not online (#2199) #2202

2026-06-04T04:06:24Z

core-devops commented

2026-06-04 04:06:24 +00:00

Red: E2E Staging Canvas (Playwright) / Canvas tabs E2E = FAILURE on main HEAD b9d2f023 (issue #2199).

Actual failure (runner-6, task 258160; taken from container logs, since on-disk Gitea logs are stale post-1.26.2)

The failure is in the Playwright globalSetup, not in any spec assertion:

[staging-setup] Workspace created: 8e5c7354-1156-4562-9d6b-47e11edd51f2
Error: Workspace failed: (no last_sample_error) full body:
  {... "runtime":"hermes","status":"failed","uptime_seconds":0,"last_sample_error":null ...}
  at canvas/e2e/staging-setup.ts:272   (waitFor "workspace online")

Root-cause verdict: NOT a canvas/test regression, NOT timing fragility

It is a deterministic consequence of workspace-server #2162 (fix(provision): platform-managed workspace must fail-closed when CP proxy env absent, merged 2026-06-03 — a correct production safety fix). The canvas E2E creates a bare hermes/gpt-4o workspace, which defaults closed to platform_managed (workspace_provision.go:~1009). On a staging tenant without MOLECULE_LLM_BASE_URL / MOLECULE_LLM_USAGE_TOKEN, the agent now aborts at boot with MISSING_PLATFORM_PROXY, surfacing as the pre-start credential-abort shape (status:"failed", uptime_seconds:0, no last_sample_error). Pre-#2162 the same workspace booted credential-less (the bug #2162 fixed), so the old harness happened to pass.

This is environmental / test-design, not a product regression: #2162 should stay; nothing in canvas broke.

Why the fix belongs in the harness

staging-tabs.spec.ts only opens the 13 side-panel tabs and asserts no hard crash / no "Failed to load" toast. It makes zero LLM calls and even mocks /cp/auth/me + 401→200. All it needs is a workspace row so the node + tabs render — a fully-online agent is not required.

The fix (single file: `canvas/e2e/staging-setup.ts` step 6)

Wait for RENDERABLE instead of strictly online:

- online -> happy path (staging with proxy env)
- failed + uptime_seconds==0 + no last_sample_error -> pre-start credential-abort: agent never ran, row still renders -> proceed, with a loud console.warn
- any other failed (last_sample_error present, OR uptime_seconds>0 = agent started then crashed) -> still hard-throws (no masking)

Genuine infra provision failure remains loud one step earlier at the org level (instance_status === "failed", unchanged).
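
The three cases above can be sketched as a small classifier. This is a minimal illustration, not the code in `canvas/e2e/staging-setup.ts`: the `WorkspaceBody` interface, the `classifyWorkspace` name, and the `Readiness` labels are hypothetical, and only the three fields quoted in the failing-run body are assumed to exist.

```typescript
// Hypothetical shape: only the fields quoted in the captured failing-run body.
interface WorkspaceBody {
  status: string;
  uptime_seconds?: number | null;
  last_sample_error?: string | null;
}

// "pending" means "keep polling"; the two "ready" variants both let setup proceed.
type Readiness = "ready" | "ready-degraded" | "pending";

function classifyWorkspace(ws: WorkspaceBody): Readiness {
  // Happy path: staging tenant has the proxy env and the agent booted.
  if (ws.status === "online") return "ready";

  // Any non-terminal status: the caller should poll again.
  if (ws.status !== "failed") return "pending";

  // Pre-start credential-abort shape: the agent never ran, but the row exists.
  // The uptime check is strict, so a missing uptime_seconds is NOT treated as 0.
  const neverStarted = ws.uptime_seconds === 0;
  const noSampleError = !ws.last_sample_error;
  if (neverStarted && noSampleError) {
    console.warn(
      "[staging-setup] workspace failed before start (credential-abort shape); " +
        "proceeding because the row still renders",
    );
    return "ready-degraded";
  }

  // Every other failure (error present, or the agent started then crashed) stays loud.
  throw new Error(
    `Workspace failed: ${ws.last_sample_error ?? "(no last_sample_error)"} ` +
      `full body: ${JSON.stringify(ws)}`,
  );
}
```

Keeping the uptime comparison strict is a deliberate choice in this sketch: a body with no `uptime_seconds` at all falls through to the throw rather than being waved through as renderable.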

Verification

- tsc clean for canvas/e2e/staging-* (pre-existing tsc errors are all in unrelated __tests__ files).
- npx playwright test --config=playwright.staging.config.ts --list resolves globalSetup + the single spec.
- A full live run needs staging CP creds not available in this environment; the changed code is the globalSetup readiness gate, verified by inspection against the captured failing-run body.

Closes #2199.

🤖 Generated with Claude Code


core-devops added 1 commit 2026-06-04 04:06:25 +00:00

fix(e2e): canvas-tabs staging setup waits for RENDERABLE, not online (#2199)

ci-arm64-advisory / fast-checks (pull_request) Waiting to run

Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 1s

Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s

CI / Python Lint & Test (pull_request) Successful in 4s

CI / Detect changes (pull_request) Successful in 5s

E2E API Smoke Test / detect-changes (pull_request) Successful in 6s

E2E Chat / detect-changes (pull_request) Successful in 7s

E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s

Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s

Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s

Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s

gate-check-v3 / gate-check (pull_request_target) Successful in 6s

Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s

sop-checklist / review-refire (pull_request_target) Has been skipped

qa-review / approved (pull_request_target) Failing after 5s

sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2

sop-checklist / na-declarations (pull_request) N/A: (none)

security-review / approved (pull_request_target) Failing after 6s

sop-checklist / all-items-acked (pull_request_target) Successful in 6s

sop-tier-check / tier-check (pull_request_target) Successful in 39s

Harness Replays / detect-changes (pull_request) Successful in 55s

lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m0s

CI / Platform (Go) (pull_request) Successful in 2s

CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s

E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s

E2E Chat / E2E Chat (pull_request) Successful in 2s

E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s

Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s

Harness Replays / Harness Replays (pull_request) Successful in 2s

CI / Canvas (Next.js) (pull_request) Successful in 6m42s

CI / Canvas Deploy Reminder (pull_request) Has been skipped

CI / all-required (pull_request) Successful in 29s

audit-force-merge / audit (pull_request_target) Successful in 7s

b80816a3b0

E2E Staging Canvas (Playwright) / "Canvas tabs E2E" went red on main HEAD
b9d2f023. The actual failure (runner-6 task 258160) is in the Playwright
globalSetup, NOT in any spec assertion:

  [staging-setup] Workspace created: 8e5c7354-...
  Error: Workspace failed: (no last_sample_error) full body:
    {... "runtime":"hermes","status":"failed","uptime_seconds":0,
     "last_sample_error":null ...}
    at canvas/e2e/staging-setup.ts:272 (waitFor "workspace online")

Root cause — NOT a canvas/test regression and NOT timing fragility. It is
a deterministic consequence of workspace-server #2162 (merged 2026-06-03,
"platform-managed workspace must fail-closed when CP proxy env absent"),
which is a correct production safety fix. The canvas E2E creates a bare
hermes/gpt-4o workspace that defaults closed to platform_managed; on a
staging tenant without MOLECULE_LLM_BASE_URL / MOLECULE_LLM_USAGE_TOKEN,
the agent now aborts at boot with MISSING_PLATFORM_PROXY — surfacing as
the pre-start credential-abort shape (status:"failed", uptime_seconds:0,
no last_sample_error). Pre-#2162 the same workspace booted credential-less
(the bug #2162 fixed) so the old harness happened to pass.

The fix is in the harness, because this test does not need a booted agent:
staging-tabs.spec.ts only opens the 13 side-panel tabs and asserts no hard
crash / no "Failed to load" toast. It makes zero LLM calls and even mocks
/cp/auth/me + 401→200. All it needs is a workspace ROW so the node + tabs
render.

So step 6 now waits for RENDERABLE instead of strictly online:
  - online                                 -> happy path (staging with proxy env)
  - failed + uptime_seconds==0 + no sample -> pre-start credential-abort:
      agent never ran, row still renders -> proceed, with a loud console.warn
  - any other failed (last_sample_error present, OR uptime_seconds>0 i.e.
      the agent started then crashed)      -> still hard-throws (no masking)

Real infra/provision failure stays loud one step earlier at the org level
(instance_status === "failed", unchanged).

Verification: tsc clean for canvas/e2e/staging-* (pre-existing tsc errors
are all in unrelated __tests__ files); `playwright test --list` resolves
globalSetup + the single spec. Full live run needs staging CP creds not
available locally; the changed branch is the globalSetup readiness gate,
verified by inspection against the captured failing-run body.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

claude-ceo-assistant merged commit 793d376a1a into main

2026-06-04 05:18:12 +00:00

core-devops commented

2026-06-04 05:18:16 +00:00

Owner force-merged (honest bypass). Clears canvas-tabs Playwright red #2199: #2162 fail-closed (correct) made the credential-less platform_managed test workspace abort pre-start; staging-setup now waits for RENDERABLE (the tabs spec needs a row, not a booted agent) while still hard-throwing on genuine failures. Required CI green. Token revoked.

molecule-code-reviewer commented

2026-06-04 05:29:37 +00:00

New recurrence/evidence on molecule-core main 793d376a1a30: E2E Staging Canvas / Canvas tabs E2E job 275093 now fails earlier than the prior last_sample_error path. canvas/e2e/staging-setup.ts:237-245 still creates runtime:"hermes" with model:"gpt-4o"; staging rejects the workspace create with UNREGISTERED_MODEL_FOR_RUNTIME (model "gpt-4o" is not a registered model for runtime "hermes"). This confirms the #2202 fix direction should update the harness to a registered/renderable runtime-model pair (or omit model if the API supplies a valid default), not just wait longer after creation.
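
One way to picture the suggested direction is a payload builder that sends a model only when it is known to be registered for the runtime, and otherwise omits it. This is a rough sketch under stated assumptions: `ModelRegistry`, `CreateWorkspacePayload`, and `buildCreatePayload` are hypothetical names, the registry contents would have to come from staging or its config, and omitting `model` only helps if the API really does supply a valid default.

```typescript
// Hypothetical: maps a runtime name to the models registered for it.
// The real source of this data (API call, config file) is not shown in this thread.
type ModelRegistry = Record<string, readonly string[]>;

// Hypothetical create-request shape: only the two fields the comment mentions.
interface CreateWorkspacePayload {
  runtime: string;
  model?: string;
}

function buildCreatePayload(
  registry: ModelRegistry,
  runtime: string,
  preferredModel?: string,
): CreateWorkspacePayload {
  const registered = registry[runtime] ?? [];

  // Send the model only when the pair is known to be registered.
  if (preferredModel !== undefined && registered.includes(preferredModel)) {
    return { runtime, model: preferredModel };
  }

  // Otherwise omit `model` entirely and rely on a server-side default
  // (an assumption; the comment above only raises it as a possibility).
  return { runtime };
}
```

With a registry that does not list `gpt-4o` for `hermes`, the builder returns `{ runtime: "hermes" }` with no `model` key, which sidesteps the pair that staging rejects.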



Reference: molecule-ai/molecule-core#2202
