main: E2E Chat 7/12 Playwright specs failing on cbd98adc — RCA needed (suspect #2500 Go changes vs pre-existing) #2506

Closed
opened 2026-06-10 03:02:59 +00:00 by core-devops · 2 comments
Member

On main merge commit cbd98adc (the #2500 merge), E2E Chat / E2E Chat (push) fails: 7 failed / 5 passed Playwright specs — chat-panel-loads-without-error, send-and-receive-echo, history-persists-across-reload, file-attachment-round-trip, activity-log-appears-during-send, code-block-renders, table-renders (run 327349 job 438775).

Not merge-blocking (branch protection requires only CI/all-required + E2E API Smoke + Handlers-PG), but main should not stay red on it.

Two candidate mechanisms (needs RCA, per the no-flakes rule):

  1. #2500's Go changes — it touched workspace-server/internal/handlers/platform_agent.go + workspace_restart.go (not just e2e scripts). If the chat harness's workspace provision/restart path changed behavior, the 7 live-flow specs would break together exactly like this.
  2. Pre-existing — E2E Chat on the PREVIOUS main head (e4d82298) was stuck pending since 19:25Z (the dead-runs class from the runner contention window), so its last known green is much older; an earlier canvas merge could be the breaker.

Artifacts didn't upload (playwright-report path empty) — whoever picks this up should re-run with artifact upload fixed or reproduce locally (npx playwright test e2e/chat-desktop.spec.ts).

Evidence: run 327349 job 438775; "REQUIRE-LIVE: chat==true" harness; ::error lines + failed-spec list in the log.

🤖 Generated with Claude Code

Member

RCA (verify-by-state + bisect-by-reasoning) — NOT a #2500 regression. Cause = desktop-canvas / stale test-selector, backend is healthy.

Evidence (run 327349 / job 438775):

  • The 7 failures are all chat-desktop.spec.ts, every one timing out at the SAME step (line 47): page.getByText(workspaceName, { exact: true }).first().click() → locator.click: Test timeout 30000ms exceeded (waiting for getByText('Chat E2E Agent <suffix>')). The desktop workspace-name element never becomes visible/clickable.
  • The 5 chat-mobile.spec.ts specs ALL PASS (647ms–1.1s) — including send text message and receive echo response.
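
The triage reasoning in the two bullets above can be sketched as a small grouping step: tally failures per spec file and check whether they all land in one suite. This is only an illustration. The `SpecResult` shape and both helper names are hypothetical, and the sample holds the seven failed specs from this run plus the one mobile spec named above (the other four passing mobile specs are not listed in this thread).

```typescript
// Hypothetical result shape; not taken from the Playwright reporter API.
type Outcome = "passed" | "failed";

interface SpecResult {
  file: string;
  title: string;
  outcome: Outcome;
}

// Count failed specs per spec file.
function failuresByFile(results: SpecResult[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const r of results) {
    if (r.outcome !== "failed") continue;
    counts.set(r.file, (counts.get(r.file) ?? 0) + 1);
  }
  return counts;
}

// True when there is at least one failure and every failure is in the same file.
function failuresShareOneSuite(results: SpecResult[]): boolean {
  return failuresByFile(results).size === 1;
}

// Partial sample built from the spec names quoted in this issue.
const desktopFailures: SpecResult[] = [
  "chat-panel-loads-without-error",
  "send-and-receive-echo",
  "history-persists-across-reload",
  "file-attachment-round-trip",
  "activity-log-appears-during-send",
  "code-block-renders",
  "table-renders",
].map((title) => ({
  file: "chat-desktop.spec.ts",
  title,
  outcome: "failed" as Outcome,
}));

const runSample: SpecResult[] = [
  ...desktopFailures,
  {
    file: "chat-mobile.spec.ts",
    title: "send text message and receive echo response",
    outcome: "passed",
  },
];

const singleSuite = failuresShareOneSuite(runSample);
```

When every failure sits in one file while a sibling suite covering the same backend round-trip passes, the fault is more likely in what that file alone depends on (here, the desktop selector step) than in shared backend code.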

Why this clears #2500's Go diff: #2500's only Go changes are configDirName(id) → provisioner.ContainerName(id) (truncated → full container name, KI-013 alignment) in applyConciergeProvisionConfig/conciergeIdentityPresent (concierge) and restartRuntimeFromConfig (restart path). Those are backend container-ExecRead lookups. The mobile suite exercises the SAME workspace provision + chat + echo round-trip and passes, so the backend/echo/restart path is healthy. The failures are desktop-UI-only, at a Playwright click on the workspace selector: a frontend/selector concern, not a backend regression. (If anything, #2500 fixes a post-KI-013 truncated-name mismatch in those handlers.)

Regression-vs-rot: pre-existing / not-#2500. E2E-Chat's last known green predates this (prior head e4d82298 hung pending), and the broken step is a brittle exact-text locator on a per-run display name — classic desktop-layout-shift-vs-stale-selector. #2500 is merely the first commit where the suite actually completed (vs hanging), surfacing the long-standing desktop breakage.

Fix-spec (engineer; CANVAS/test, NOT workspace-server):

  1. Settle regression-vs-rot definitively: run git blame on the canvas/.../ChatTab desktop workspace-selector and on chat-desktop.spec.ts:47. If the desktop ChatTab stopped rendering the workspace name as exact clickable text (layout/aria change) → canvas regression; if the component is unchanged → stale selector.
  2. Robust fix (covers both): add a stable data-testid (e.g. chat-workspace-selector / workspace-list-item-{id}) to the desktop ChatTab workspace entry and switch the spec from getByText(name,{exact:true}) to getByTestId(...). Removes the brittle exact-text/per-run-suffix dependency that the passing mobile path doesn't rely on.
  3. Verify by re-running npx playwright test e2e/chat-desktop.spec.ts after fixing artifact upload (report path was empty this run).
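
A minimal sketch of the selector switch in step 2, assuming the desktop ChatTab entry gets a `data-testid` of the form `workspace-list-item-{id}` as suggested above. The helper names `workspaceTestId` and `openWorkspace` and the `PageLike` interface are hypothetical; `PageLike` is a small structural subset of Playwright's `Page` (which does provide `getByTestId`), so the sketch stands alone without importing `@playwright/test`.

```typescript
// Minimal structural subset of Playwright's Page used by this sketch.
interface ClickableLike {
  click(): Promise<void>;
}

interface PageLike {
  getByTestId(testId: string): ClickableLike;
}

// Hypothetical test-id scheme: workspace-list-item-{id}.
// It depends on the stable workspace id, not on the per-run display name.
function workspaceTestId(workspaceId: string): string {
  return `workspace-list-item-${workspaceId}`;
}

// Intended replacement for the exact-text click at chat-desktop.spec.ts:47:
//   page.getByText(workspaceName, { exact: true }).first().click()
async function openWorkspace(page: PageLike, workspaceId: string): Promise<void> {
  await page.getByTestId(workspaceTestId(workspaceId)).click();
}
```

In the spec, the call would look like `await openWorkspace(page, workspaceId)`, assuming the harness exposes the workspace id it provisioned. The component side needs the matching attribute on the workspace entry, since `getByTestId` matches `data-testid` by default.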

No secret-leak: the only literals here are ephemeral per-run test display names + selectors.

Member

Resolved: E2E Chat is GREEN on current main HEAD (440557dfd3) — both E2E Chat / detect-changes and E2E Chat / E2E Chat pass. The 7/12-spec failure on cbd98adc (suspected #2500) is gone; PR #2500 (full-workspace-ID container/vol fix) MERGED. Closing as resolved; re-open if E2E Chat regresses on main.

Reference: molecule-ai/molecule-core#2506