[main-red] molecule-ai/molecule-core: 1f9b2f4f0b #2802

Closed
opened 2026-06-14 01:07:17 +00:00 by gitea-actions · 7 comments

Main is RED on molecule-ai/molecule-core at 1f9b2f4f0b

Commit: https://git.moleculesai.app/molecule-ai/molecule-core/commit/1f9b2f4f0beb89c8592c5111cddf6400aa8822fa

Auto-filed by .gitea/workflows/main-red-watchdog.yml (Option C of the main-never-red directive, https://git.moleculesai.app/molecule-ai/molecule-core/issues/420). Per feedback_no_such_thing_as_flakes + feedback_fix_root_not_symptom: investigate the root cause; do NOT revert as a reflex. The watchdog itself never reverts.

Failed status contexts

  • E2E Chat / E2E Chat (push): failure (logs: /molecule-ai/molecule-core/actions/runs/361732/jobs/493098)
    • Failing after 4m16s

Resolution path

  1. Read the failed logs (links above).
  2. If reproducible locally, fix forward in a PR targeting main.
  3. If the failure is a real flake — STOP. Per feedback_no_such_thing_as_flakes, intermittent failures are real bugs. Investigate to root cause; do not mark as flake.
  4. If the failure is blocking unrelated work for >1 hour, file a follow-up issue and assign someone. Do NOT revert without a human GO per feedback_prod_apply_needs_hongming_chat_go (branch protection is a prod surface).

Debug

{
  "all_contexts": [
    {
      "context": "CI / Python Lint & Test (push)",
      "state": "success"
    },
    {
      "context": "E2E Peer Visibility (literal MCP list_peers) / detect-changes (push)",
      "state": "success"
    },
    {
      "context": "Handlers Postgres Integration / detect-changes (push)",
      "state": "success"
    },
    {
      "context": "E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push)",
      "state": "skipped"
    },
    {
      "context": "Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push)",
      "state": "success"
    },
    {
      "context": "Secret scan / Scan diff for credential-shaped strings (push)",
      "state": "success"
    },
    {
      "context": "Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push)",
      "state": "success"
    },
    {
      "context": "E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push)",
      "state": "success"
    },
    {
      "context": "Harness Replays / detect-changes (push)",
      "state": "success"
    },
    {
      "context": "Harness Replays / Harness Replays (push)",
      "state": "success"
    },
    {
      "context": "E2E Chat / detect-changes (push)",
      "state": "success"
    },
    {
      "context": "E2E Staging Canvas (Playwright) / detect-changes (push)",
      "state": "success"
    },
    {
      "context": "E2E API Smoke Test / detect-changes (push)",
      "state": "success"
    },
    {
      "context": "CI / Detect changes (push)",
      "state": "success"
    },
    {
      "context": "E2E API Smoke Test / E2E API Smoke Test (push)",
      "state": "success"
    },
    {
      "context": "CI / Shellcheck (E2E scripts) (push)",
      "state": "success"
    },
    {
      "context": "CI / Platform (Go) (push)",
      "state": "success"
    },
    {
      "context": "lint-no-coe-on-required / lint-no-coe-on-required (push)",
      "state": "success"
    },
    {
      "context": "Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push)",
      "state": "success"
    },
    {
      "context": "Handlers Postgres Integration / Handlers Postgres Integration (push)",
      "state": "success"
    },
    {
      "context": "Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push)",
      "state": "success"
    },
    {
      "context": "publish-canvas-image / Build & push canvas image (push)",
      "state": "success"
    },
    {
      "context": "publish-workspace-server-image / build-and-push (push)",
      "state": "success"
    },
    {
      "context": "CI / Canvas (Next.js) (push)",
      "state": "success"
    },
    {
      "context": "CI / Canvas Deploy Status (push)",
      "state": "success"
    },
    {
      "context": "CI / all-required (push)",
      "state": "success"
    },
    {
      "context": "E2E Chat / E2E Chat (push)",
      "state": "failure"
    },
    {
      "context": "publish-canvas-image / Promote canvas :latest to CI-green build (push)",
      "state": "success"
    },
    {
      "context": "publish-workspace-server-image / Production auto-deploy (push)",
      "state": "success"
    },
    {
      "context": "E2E Staging Canvas (Playwright) / Canvas tabs E2E (push)",
      "state": "success"
    }
  ],
  "branch": "main",
  "combined_state": "failure",
  "failed_contexts": [
    "E2E Chat / E2E Chat (push)"
  ],
  "recheck_combined_state": "failure",
  "recheck_failed_contexts": [
    "E2E Chat / E2E Chat (push)"
  ],
  "sha": "1f9b2f4f0beb89c8592c5111cddf6400aa8822fa"
}

This issue is idempotent: the watchdog runs hourly at :05 and edits this body in place. When main returns to green, the watchdog will close this issue automatically with a "main returned to green" comment.
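The relationship between `all_contexts`, `failed_contexts`, and `combined_state` in the Debug payload can be illustrated with a minimal sketch. This is not the watchdog's actual code: the type names, the function names, and the rule that only `failure` and `error` count as red (so `skipped` does not) are assumptions made for illustration.

```typescript
// Hypothetical shapes mirroring the Debug JSON; not the watchdog's real code.
type StatusState = "success" | "failure" | "skipped" | "pending" | "error";

interface StatusContext {
  context: string;
  state: StatusState;
}

// Assumption: only "failure" and "error" are red; "skipped" and "pending" are not.
function failedContexts(all: StatusContext[]): string[] {
  return all
    .filter((c) => c.state === "failure" || c.state === "error")
    .map((c) => c.context);
}

// Assumption: the combined state is red as soon as one context is red.
function combinedState(all: StatusContext[]): "success" | "failure" {
  return failedContexts(all).length > 0 ? "failure" : "success";
}

// Three entries copied from the Debug payload.
const sample: StatusContext[] = [
  { context: "CI / all-required (push)", state: "success" },
  {
    context:
      "E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push)",
    state: "skipped",
  },
  { context: "E2E Chat / E2E Chat (push)", state: "failure" },
];

console.log(failedContexts(sample));
console.log(combinedState(sample));
```

Under these assumptions a single red lane is enough to turn the combined state red even when `CI / all-required (push)` is green, which matches the payload.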

Member

MECHANISM: Current molecule-core main is red in E2E Chat, but the merge/required path does not gate on that lane. .gitea/workflows/e2e-chat.yml:107-139 still marks the lane as promotion-pending/non-required and continue-on-error: true; .gitea/workflows/gitea-merge-queue.yml:73-90 explicitly says E2E Chat is non-required and only checks CI / all-required (push); .gitea/workflows/ci.yml:523-544 documents that CI / all-required cannot cover sibling workflows. Result: a chat/canvas regression can land with CI / all-required green while E2E Chat is red, leaving only the main-red watchdog to catch it after merge.

EVIDENCE: On current main d985301685e649b2a0cdd89ff48a98c52198f6ca, the commit status shows CI / all-required (push) success and E2E Chat / E2E Chat (push) failure (/actions/runs/362293/jobs/494143). The failed Playwright run executed 25 tests; the log reports 9 failed and 16 passed. Failures include chat-desktop.spec.ts:70 waiting for Echo: What is the weather? and multiple chat-separation.spec.ts tab assertions. Existing issue #2802 is the right tracker for this main-red class; I am updating it rather than filing a duplicate.

RECOMMENDED FIX SHAPE: CI/DevOps owner should make E2E Chat a real merge gate for chat/canvas/workspace-server changes: either add E2E Chat / E2E Chat (pull_request) to branch protection/merge-queue required contexts when detect-changes says chat=true, or have gate-check-v3/merge-queue explicitly wait for this sibling workflow before merging. Keep the product/harness red itself routed separately to Canvas/frontend or the existing chat-rendering RCA (#2598/#2802), but the durable process fix is in .gitea/workflows/e2e-chat.yml plus branch-protection/merge-queue config.
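The first option in the fix shape, requiring E2E Chat only when `detect-changes` reports `chat=true`, amounts to a conditional required-context list. The sketch below models that decision as pure functions. `ChangeFlags`, `requiredContexts`, and `canMerge` are hypothetical names, and nothing here reflects how gate-check-v3 or the merge queue is actually implemented.

```typescript
// Hypothetical model of a conditional merge gate; all names are illustrative.
interface ChangeFlags {
  chat: boolean; // assumed shape of the detect-changes output
}

type GateState = "success" | "failure" | "skipped" | "pending";

// Context names taken from the comment above.
const ALWAYS_REQUIRED: string[] = ["CI / all-required (push)"];
const CHAT_GATE = "E2E Chat / E2E Chat (pull_request)";

// E2E Chat becomes required only when the change set touches chat.
function requiredContexts(changes: ChangeFlags): string[] {
  return changes.chat ? ALWAYS_REQUIRED.concat([CHAT_GATE]) : ALWAYS_REQUIRED.slice();
}

// A required context that is missing or not "success" blocks the merge.
function canMerge(statuses: Record<string, GateState>, changes: ChangeFlags): boolean {
  return requiredContexts(changes).every((ctx) => statuses[ctx] === "success");
}

const statuses: Record<string, GateState> = {
  "CI / all-required (push)": "success",
  "E2E Chat / E2E Chat (pull_request)": "failure",
};

console.log(canMerge(statuses, { chat: true }));
console.log(canMerge(statuses, { chat: false }));
```

Treating a missing required context as blocking is a deliberate choice in this sketch: it prevents a lane that never reported from being read as green.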

Member

FOLLOW-UP RCA — CURRENT E2E Chat red is a real regression, not infra.

MECHANISM: The red is local Playwright against the repo's own workspace-server + canvas + echo runtime; it is not staging/LLM/network. The first bad boundary is 37c2821a (merge #2769) with E2E Chat / E2E Chat (push) success, then 78a922ce (merge #2759) with the same lane failing. #2759 changed the desktop chat send/socket path (canvas/src/components/tabs/chat/hooks/useChatSend.ts, useChatSocket.ts, ChatTab.tsx) around per-message messageId correlation and guard release. In the failing run, the UI sends via POST /workspaces/:id/a2a and gets fast 200s, but the desktop ChatTab never renders the synchronous echo response.

EVIDENCE: First red run for 78a922ce is /actions/runs/360399/jobs/490711: chat-desktop.spec.ts:70, :80, :101 fail waiting for Echo: What is the weather?, Echo: Persistence test, and Echo: Please read this file. The server log in that same job shows local POST /workspaces/.../a2a returning 200 in ~5-14ms, so this is not connection/500/401/LLM infra. Later main d9853016 (/actions/runs/362293/jobs/494143) still fails the same desktop echo assertions and additionally fails chat-separation tab assertions; mobile/source-filter tests pass.

RECOMMENDED FIX SHAPE: Route to Canvas/chat owner. Reproduce from 78a922ce vs 37c2821a and repair the #2759 desktop useChatSend/useChatSocket handling so a successful synchronous A2A 200 with result.parts always appends the agent reply to ChatTab My Chat, independent of WS message_id completion. Add an E2E/fixture assertion that the echo runtime was actually reached and that the desktop panel renders the returned Echo: text. The separate CI-gating gap remains valid, but there is also a real code/harness regression to fix now.
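The invariant asked for above, that a successful synchronous A2A 200 carrying `result.parts` always appends the agent reply regardless of WS completion state, can be sketched as a pure state transition. The types and the `applySendResponse` function are hypothetical stand-ins and are not the code in `useChatSend.ts` or `useChatSocket.ts`. The real response shape may differ.

```typescript
// Hypothetical shapes; the real A2A response and ChatTab state may differ.
interface TextPart {
  kind: "text";
  text: string;
}

type A2AResponse =
  | { status: "queued" } // poll-mode acknowledgement, reply arrives later
  | { result: { parts: TextPart[] } }; // push-mode synchronous result

interface ChatMessage {
  role: "user" | "agent";
  text: string;
}

interface SendState {
  messages: ChatMessage[];
  guardReleased: boolean; // true once a WS completion already released the send guard
}

// Returns the next state for an HTTP 200 response to a send.
function applySendResponse(state: SendState, response: A2AResponse): SendState {
  if ("status" in response) {
    // Poll mode: nothing to render yet, keep waiting for the later delivery.
    return state;
  }
  const text = response.result.parts.map((p) => p.text).join("");
  // Deliberately no branch on state.guardReleased: the synchronous result
  // is appended even if the WS completion won the race.
  return {
    messages: state.messages.concat([{ role: "agent", text: text }]),
    guardReleased: true,
  };
}

const afterWsCompletion: SendState = {
  messages: [{ role: "user", text: "What is the weather?" }],
  guardReleased: true,
};

const next = applySendResponse(afterWsCompletion, {
  result: { parts: [{ kind: "text", text: "Echo: What is the weather?" }] },
});
console.log(next.messages);
```

A real fix would likely also need to avoid rendering the same reply twice when both the HTTP result and a WS event carry it. That deduplication is outside this sketch.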

Member

BLAST-RADIUS SWEEP — no additional hidden non-required E2E regressions found on current main.

MECHANISM: I checked current molecule-core main 688e6ff8e74e5c89cc76af130cae721fa346308d for the other non-required E2E/advisory lanes after the E2E Chat gate-gap finding. The only red status context on this SHA is still E2E Chat / E2E Chat (push). The named non-required lanes are not showing additional live regressions: Peer Visibility push is green, Harness Replays is green, E2E Staging Canvas is green, and both Local Provision Lifecycle lanes are green. E2E Staging SaaS did not emit on this push because its workflow path filter was not touched, so there is no current red SaaS result to classify.

EVIDENCE: Commit status for 688e6ff8 shows CI / all-required (push) success and E2E Chat / E2E Chat (push) failure (/actions/runs/362344/jobs/494229). The same status set shows E2E Peer Visibility ... / E2E Peer Visibility (push) success, Harness Replays / Harness Replays (push) success, E2E Staging Canvas ... / Canvas tabs E2E (push) success, Local Provision Lifecycle E2E / ... stub success, and Local Provision Lifecycle E2E / ... real image + MiniMax LLM, advisory success.

RECOMMENDED FIX SHAPE: No new component dispatch from this sweep. The live hidden-regression blast radius is still the already diagnosed E2E Chat/#2759 Canvas-chat regression plus the CI gating gap documented above. Keep E2E Chat routed to Canvas/chat; keep the gating fix routed to CI/DevOps. Treat absent Staging SaaS on this SHA as path-filter non-execution, not a green or red signal.
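The rule that a lane which did not emit on a SHA is neither green nor red can be made explicit with a three-way classification. The following is an illustrative sketch only. `LaneSignal` and `classifyLane` are hypothetical names, and the `E2E Staging SaaS` key is a placeholder because the thread does not give that lane's full context name.

```typescript
// Hypothetical three-way lane classification for a single commit SHA.
type LaneSignal = "green" | "red" | "no-signal";

function classifyLane(statuses: Record<string, string>, context: string): LaneSignal {
  const state = Object.prototype.hasOwnProperty.call(statuses, context)
    ? statuses[context]
    : undefined;
  if (state === "success") return "green";
  if (state === "failure" || state === "error") return "red";
  // Absent, skipped, or pending: the lane produced no verdict on this SHA.
  return "no-signal";
}

// Status contexts that the comment above spells out in full for 688e6ff8.
const sweep: Record<string, string> = {
  "CI / all-required (push)": "success",
  "E2E Chat / E2E Chat (push)": "failure",
  "Harness Replays / Harness Replays (push)": "success",
};

console.log(classifyLane(sweep, "E2E Chat / E2E Chat (push)"));
console.log(classifyLane(sweep, "Harness Replays / Harness Replays (push)"));
// Placeholder key: the lane did not emit a status on this push.
console.log(classifyLane(sweep, "E2E Staging SaaS"));
```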


main returned to green at SHA 9595757acf4ec5392d4bb79d0df7d9aeb1003af5 (https://git.moleculesai.app/molecule-ai/molecule-core/commit/9595757acf4ec5392d4bb79d0df7d9aeb1003af5). Closing automatically. If the underlying root cause is not yet understood, reopen this issue and file a postmortem — green-by-flake is still a bug per feedback_no_such_thing_as_flakes.

gitea-actions bot closed this issue 2026-06-14 03:05:44 +00:00
Member

Detailed relay from real E2E Chat workflow_dispatch run 362748 / job 494910 on head 401ff02d.

The run was real, not the PR no-op: it executed npx playwright test e2e/chat-desktop.spec.ts e2e/chat-mobile.spec.ts e2e/chat-separation.spec.ts, 26 tests total, result 12 failed / 14 passed.

Failing tests:

  1. e2e/chat-desktop.spec.ts:81 - Desktop ChatTab › send text message and receive echo response
  2. e2e/chat-desktop.spec.ts:91 - Desktop ChatTab › history persists across reload
  3. e2e/chat-desktop.spec.ts:112 - Desktop ChatTab › file attachment round-trip
  4. e2e/chat-desktop.spec.ts:131 - Desktop ChatTab › activity log appears during send
  5. e2e/chat-desktop.spec.ts:183 - Desktop ChatTab — Markdown rendering › code block renders <pre>
  6. e2e/chat-desktop.spec.ts:196 - Desktop ChatTab — Markdown rendering › table renders <table>
  7. e2e/chat-separation.spec.ts:130 - Chat Sub-Tabs › chat tab shows My Chat and Agent Comms sub-tabs
  8. e2e/chat-separation.spec.ts:136 - Chat Sub-Tabs › My Chat is selected by default
  9. e2e/chat-separation.spec.ts:143 - Chat Sub-Tabs › switching to Agent Comms shows different content
  10. e2e/chat-separation.spec.ts:154 - Chat Sub-Tabs › My Chat has input box, Agent Comms does not
  11. e2e/chat-separation.spec.ts:165 - Chat Sub-Tabs › switching back to My Chat preserves messages
  12. e2e/chat-separation.spec.ts:380 - No JS Errors › page loads without errors with chat sub-tabs

Important passes / split:

  • The new PR regression test "Desktop ChatTab › echo fixture workspace is configured for push delivery" PASSED, so the DB row is delivery_mode='push' after seeding.
  • Mobile echo tests PASSED (MobileChat › send text message and receive echo response, history, file attachment).
  • Activity API source-filter tests PASSED, so this run is not failing in the source_id/source-filter path.

Desktop echo details:

  • First desktop echo test failure waits for locator('#panel-chat [data-testid=chat-panel]:visible').getByText('Echo: What is the weather?'); timeout 15000ms; element(s) not found.
  • The user message did render in that test: the preceding assertion chat.getByText('What is the weather?', { exact: true }) passed before the echo assertion failed.
  • Other desktop echo waits fail similarly for Echo: Persistence test, Echo: Please read this file, Echo: ```js, and Echo: | A | B |.
  • Activity-log test waits for #panel-chat [data-testid='activity-log'] and gets element(s) not found after 10000ms.
  • Platform access log shows the desktop sends hit POST /workspaces/<id>/a2a and return HTTP 200 in ~5-9ms, e.g. at 03:54:36, 03:54:53, 03:55:10, 03:55:27, 03:55:39, 03:55:56. The log does not include response bodies, so I cannot say from this artifact whether the body was {status:queued} vs JSON-RPC result; however, because the push-mode DB regression passed and the HTTP status is 200, the remaining failure is no longer explained by the workspace being poll-mode.
  • I do not see console/pageerror detail in the job log. The job references screenshots/error-context files, but no Playwright report artifact was uploaded (No files were found with the provided path: canvas/playwright-report/).

Chat-separation sub-tab details:

  • My Chat / Agent Comms failures are element(s) not found, not present-but-hidden. The locators are under #panel-chat:
    • locator('#panel-chat').getByRole('button', { name: 'My Chat' }) not found.
    • locator('#panel-chat').getByRole('button', { name: 'Agent Comms' }) click waits time out.
  • This means the sub-tab buttons are absent from the role tree under #panel-chat in this run, not merely failing an aria-selected assertion.
  • The source-filter/API section of chat-separation.spec.ts passed, so the remaining chat-separation failure is a UI mount/render/selector issue for the desktop ChatTab sub-tab UI, not the Activity API source bucketing.

My read:

  • #2819's fixture-push change is necessary but not sufficient. It proves and fixes the fixture delivery-mode piece, but the real lane still fails.
  • The desktop echo symptom now aligns with the product-side push-mode render race I originally flagged around #2759/#2816: desktop ChatTab sends reach /a2a and the local user message renders, but the push response is not rendered; mobile echo against the same fixture passes. That points to desktop ChatTab/useChatSend response handling rather than echo-runtime creation still being poll-mode.
  • The chat-separation sub-tab failures look like a separate desktop ChatTab UI/mount/selector issue: the My Chat and Agent Comms buttons are absent under #panel-chat. This is not fixed by changing fixture delivery_mode.

Recommended next fix shape:

  • Keep the #2819 delivery_mode:'push' seed + DB fallback and its regression test.
  • Add the product/render fix equivalent to #2816 for desktop push-mode replies: process/render the synchronous push-mode JSON-RPC result even when any token/WS completion guard has already changed state; do not gate push-mode HTTP response rendering like poll-mode {status:'queued'}.
  • Separately inspect desktop ChatTab sub-tab rendering in the E2E route: why #panel-chat lacks accessible My Chat / Agent Comms buttons while the panel itself exists.
Member

Run 362805/job 495009 comparison against run 362748/job 494910.

MECHANISM: Adding the #2816-style product-render changes materially reduced the E2E Chat failure set. Run 362748 (fixture-only, head 401ff02d) failed 12/26: 6 desktop echo/render tests plus 6 chat-separation sub-tab tests. Run 362805 (consolidated fixture push + product-render + sub-tab attempt, head 5ba99a91) failed 6/26, and all desktop echo/render tests now passed. Therefore the product-render change is necessary/helpful and should stay; fixture-only was insufficient, but product-render fixed the desktop echo half. Remaining root cause is the chat-separation sub-tab UI/selector path: the tests still cannot find My Chat / Agent Comms role buttons under the visible desktop chat panel.

EVIDENCE: In run 362805, these formerly-red desktop tests are green: Desktop ChatTab › send text message and receive echo response, history persists across reload, file attachment round-trip, activity log appears during send, code block renders <pre>, and table renders <table>. Remaining failures are exactly: chat-separation.spec.ts:132 sub-tabs visible, :138 My Chat selected by default, :145 switch to Agent Comms, :156 My Chat input/Agent Comms no input, :167 switch back preserves messages, :382 no JS errors with chat sub-tabs. Representative failure: locator('#panel-chat [data-testid=chat-panel]:visible').getByRole('button', { name: 'My Chat' }) timed out with element(s) not found; Agent Comms clicks likewise time out waiting for the role button. Source-filter and seeded-history tests pass.

RECOMMENDED FIX SHAPE: Keep the #2816 product-render/push-response changes in #2819. Continue fixing only the remaining chat-separation sub-tab surface: inspect what desktop `ChatTab` actually renders under `#panel-chat [data-testid=chat-panel]:visible` in the E2E route and align the UI/selector/test with the real tab roles/labels. This is no longer an echo delivery problem.

Member

MECHANISM: Harness Replays PR detect-changes is not failing because it compares the wrong refs. The running job expands `.gitea/workflows/harness-replays.yml` pull_request refs to `BASE=main` and `HEAD=test/2737-canary-smoke-a2a-pong-harness-capture`, then calls `/compare/$BASE...$HEAD` from the decide step. That is the same compare shape I ran manually, and the manual API returns the harness changes in `commits[*].files`. The parser also matches this Gitea shape (`.gitea/scripts/compare-api-diff-files.py` reads `data["commits"][*]["files"][*]["filename"]`). The blank diagnostics are a separate workflow bug: the detect job only exports `run` at `.gitea/workflows/harness-replays.yml:75-76`, while the downstream jobs read `needs.detect-changes.outputs.debug` at lines 184/193. The decide step writes `debug=...` to `$GITHUB_OUTPUT`, but it is never promoted to a job output, so no useful `debug=` line can appear downstream.

EVIDENCE: run 363293/job 495839 expands the PR path to `BASE="main"` and `HEAD="test/2737-canary-smoke-a2a-pong-harness-capture"`. Manual Compare API for `main...test/2737-canary-smoke-a2a-pong-harness-capture` returned HTTP 200, `top_files=0`, `commits=12`, and commit files including `.gitea/workflows/harness-replays.yml`, `.gitea/scripts/compare-api-diff-files.py`, `tests/harness/compose.yml`, `tests/harness/seed.sh`, `tests/harness/replays/canary-smoke-a2a-pong.sh`, and `tests/harness/replays/canary-smoke-org-create-400-capture.sh`. The no-op downstream job 495840 printed `Debug:` empty, matching the missing job output rather than an absent compare diagnostic.
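For reference, the `commits[*].files` shape described above can be walked as in the sketch below. This is not the contents of `.gitea/scripts/compare-api-diff-files.py`; `changed_files` is a hypothetical name, and the sample payload is a trimmed, hand-written stand-in that contains only the keys named in this comment.

```python
import json


def changed_files(compare):
    """Collect unique filenames from commits[*].files[*].filename."""
    names = set()
    for commit in compare.get("commits") or []:
        for entry in commit.get("files") or []:
            names.add(entry["filename"])
    return sorted(names)


# Hand-written stand-in for a Compare response; a real response carries
# many more fields per commit.
sample = json.loads("""
{
  "commits": [
    {"files": [
      {"filename": ".gitea/workflows/harness-replays.yml"},
      {"filename": "tests/harness/seed.sh"}
    ]},
    {"files": [
      {"filename": "tests/harness/seed.sh"},
      {"filename": "tests/harness/compose.yml"}
    ]}
  ]
}
""")

for name in changed_files(sample):
    print(name)
# prints:
# .gitea/workflows/harness-replays.yml
# tests/harness/compose.yml
# tests/harness/seed.sh
```

A file touched by several commits appears once, and a payload with no `commits` key yields an empty list rather than raising.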

RECOMMENDED FIX SHAPE: in `.gitea/workflows/harness-replays.yml`, add `debug: ${{ steps.decide.outputs.debug }}` to `jobs.detect-changes.outputs`, and keep/apply the outer-shell curl/status handling already present in #2821 so Compare API failures cannot set `run=true` in a subshell and later overwrite to `run=false`. For the broader #2802 durable class, apply the same pattern to sibling detect-changes workflows: expose diagnostics as job outputs, parse this Gitea Compare shape (`commits[*].files`), and make Compare API failure fail-open to real execution rather than silent no-op. I do not recommend changing the PR BASE/HEAD computation for Harness Replays; the live evidence shows it already matches the working manual compare.
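The fail-open rule can be sketched as a single decision function that always produces both a `run` value and a `debug` string. This is an illustration under assumptions, not the #2821 change: `decide`, `output_lines`, and the `WATCHED` prefixes are hypothetical names chosen here, and the real decide step is shell.

```python
# Hypothetical watched paths for the Harness Replays lane.
WATCHED = ("tests/harness/", ".gitea/workflows/harness-replays.yml")


def decide(http_status, changed, watched=WATCHED):
    """Return (run, debug). Any Compare API failure fails open to run=True."""
    if http_status != 200 or changed is None:
        return True, f"compare_failed status={http_status} -> fail-open"
    hits = [path for path in changed if path.startswith(tuple(watched))]
    return bool(hits), f"status=200 changed={len(changed)} matched={len(hits)}"


def output_lines(run, debug):
    """Format key=value lines as they would be appended to $GITHUB_OUTPUT."""
    return [f"run={'true' if run else 'false'}", f"debug={debug}"]


print(output_lines(*decide(500, None)))
# prints: ['run=true', 'debug=compare_failed status=500 -> fail-open']
print(output_lines(*decide(200, ["docs/readme.md"])))
# prints: ['run=false', 'debug=status=200 changed=1 matched=0']
```

Computing `run` in one place and returning it once avoids the subshell-then-overwrite pattern. The `debug` line still only reaches downstream jobs if the workflow maps it under `jobs.detect-changes.outputs`.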

Reference: molecule-ai/molecule-core#2802