fix(core#2723 client-side): raise chat-send timeout 120s→30min to match #2727 server-side raise #2731

Merged
devops-engineer merged 1 commit from fix/core2723-client-side-chat-timeout-align into main 2026-06-13 08:27:11 +00:00

Why this is a separate, completing PR

#2727 (commit cd3666d7, MERGED) raised the server-side canvas idle watchdog from 5min to 30min. That's half of the #2723 fix. The other half — the client-side AbortSignal.timeout in useChatSend.ts:240 — was left at 120s.

Why 120s was a problem pre-#2727 too: the 120s timeout was BELOW the prior 5min server idle, so the isClientTimeout handler at useChatSend.ts:261-280 had to silently swallow the AbortSignal firing (surfacing it as 'agent may be unreachable' would be wrong for a still-working agent). With the 30min server raise, a 300s chain still dies client-side at 120s, which leaves #2723 half-fixed.

Fix

Raise the client-side AbortSignal.timeout from 120s to 30min, matching #2727. The isClientTimeout silent-swallow handler is unchanged — it's the right defense-in-depth path for the rare case where the client really does time out (browser tab suspended, network gone for 30+ min, etc).
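For illustration only, a minimal sketch of the pattern described above, not the actual contents of useChatSend.ts. The names CHAT_SEND_TIMEOUT_MS, SendOutcome, and sendChat are hypothetical, and the body of isClientTimeout is an assumption: it relies on the standard behavior that AbortSignal.timeout aborts with a DOMException named "TimeoutError".

```typescript
// Hypothetical constant name; the value mirrors the server-side
// defaultIdleTimeoutDuration = 30 * time.Minute from #2727.
const CHAT_SEND_TIMEOUT_MS = 30 * 60 * 1000;

// Assumed implementation: treat any error-like value whose name is
// "TimeoutError" as a client-side timeout.
function isClientTimeout(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { name?: unknown }).name === "TimeoutError"
  );
}

// Hypothetical result type: either the POST returned, or the client gave up
// waiting and the reply is expected via the AGENT_MESSAGE WS event instead.
type SendOutcome = "responded" | "awaiting-ws-reply";

// Hypothetical send helper showing where the timeout and the swallow sit.
async function sendChat(
  url: string,
  payload: unknown,
  timeoutMs: number = CHAT_SEND_TIMEOUT_MS,
): Promise<SendOutcome> {
  try {
    await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
      signal: AbortSignal.timeout(timeoutMs),
    });
    return "responded";
  } catch (err) {
    // Silent swallow: a client-side timeout does not mean the agent failed.
    if (isClientTimeout(err)) return "awaiting-ws-reply";
    throw err; // every other failure still surfaces to the caller
  }
}
```

With the old value of 120 * 1000, a 300s tool chain would take the swallow path long before the server watchdog had any reason to act; at 30min the swallow path should only trigger in the rare cases listed above.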

Diff

canvas/src/components/tabs/chat/hooks/useChatSend.ts | 17 +++++++ (incl. doc comment)
1 file changed, 16 insertions(+), 1 deletion(-)

Refs

  • #2723 (the 3-part CTO-priority issue)
  • #2727 (the server-side raise, MERGED)
  • #2728 (the prior #2723 PR I opened, closed as superseded by #2727 — this PR is the completing client-side half that #2728 had bundled)
  • useChatSend.ts:240 (the site being patched)
  • useChatSend.ts:261-280 (the isClientTimeout handler — unchanged; the silent-swallow contract is the right behavior at 30min too)
  • a2a_proxy.go:1002 const defaultIdleTimeoutDuration = 30 * time.Minute (the server-side target this PR aligns to)
agent-dev-b added 1 commit 2026-06-13 08:20:55 +00:00
fix(core#2723 client-side): raise chat-send timeout 120s→30min to match #2727 server-side raise
CI / Python Lint & Test (pull_request) Successful in 5s
E2E Peer Visibility (literal MCP list_peers) / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 16s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request_target) Has been skipped
E2E Chat / detect-changes (pull_request) Successful in 19s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 19s
E2E API Smoke Test / detect-changes (pull_request) Successful in 26s
reserved-path-review / reserved-path-review (pull_request_target) Successful in 7s
sop-checklist / all-items-acked (pull_request_target) Successful in 9s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 17s
gate-check-v3 / gate-check (pull_request_target) Failing after 13s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 1s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 16s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 7s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Successful in 31s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 4m19s
CI / Canvas Deploy Status (pull_request) Successful in 5s
CI / all-required (pull_request) Successful in 4s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 4m43s
reserved-path-review / reserved-path-review (pull_request_review) Successful in 8s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
qa-review / approved (pull_request_review) Successful in 9s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 10s
audit-force-merge / audit (pull_request_target) Successful in 8s
cbb373a198
Pre-#2727 the server-side canvas idle watchdog was 5min
(workspace-server/internal/handlers/a2a_proxy.go,
defaultIdleTimeoutDuration). The 120s client-side
AbortSignal.timeout in useChatSend.ts:240 was BELOW the server
idle, so the
isClientTimeout handler at useChatSend.ts:261-280 had to
silently swallow the AbortSignal firing — surfaced as 'agent may
be unreachable' otherwise, which is wrong for a still-working
agent (the message was DELIVERED; the reply just arrives via
the AGENT_MESSAGE WS event, not the synchronous POST response).

#2727 (commit cd3666d7) raised the server-side default to 30min.
The 120s client timeout is now even MORE wrong: a 300s chain dies
client-side at 120s while the server would happily have held the
connection for 30min. Half-fixed #2723 — the server side is at
30min, the client side is still at 120s.

This PR completes the fix: raise the client-side AbortSignal to
30min, matching the server. The isClientTimeout handler stays
unchanged — it's a defense-in-depth path for the rare case
where the client really does time out (browser tab in background
+ suspended timer, network gone for 30+ min, etc), and the
silent-swallow-then-wait-for-WS contract is the right behavior
for both.

Refs #2723, #2727, #2728 (closed as superseded).
agent-reviewer-cr2 approved these changes 2026-06-13 08:26:57 +00:00
agent-reviewer-cr2 left a comment

APPROVED on head cbb373a1.

5-axis review: the client timeout is now 30 * 60 * 1000, matching the merged server-side default defaultIdleTimeoutDuration = 30 * time.Minute from #2727, so the browser no longer races the server watchdog for long 300s+ tool chains. The isClientTimeout silent-swallow path is unchanged, which preserves the defense-in-depth behavior if the browser/network really does time out after the larger window. No auth/security surface changes and no performance concern beyond deliberately allowing the already-server-permitted long turn.

On test judgment: a value-pin test would be nice but I do not require it for this one-line client alignment; the load-bearing contract is the existing timeout-swallow behavior plus matching the server constant, and current CI covers the touched Canvas path. CI / Canvas, Canvas Deploy Status, and CI / all-required are green on cbb373a1. /sop-ack
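As a rough sketch of what such a value-pin check could look like (it is not part of this PR, and every name here is hypothetical), the client timeout can be compared against a hand-copied mirror of the server constant. Keeping that mirror in sync with a2a_proxy.go is itself an assumption of the sketch.

```typescript
const MS_PER_MINUTE = 60 * 1000;

// Assumed hand-copied mirror of the Go constant
// defaultIdleTimeoutDuration = 30 * time.Minute.
const SERVER_IDLE_TIMEOUT_MINUTES = 30;

// Assumed client value, written the way the review quotes it.
const CHAT_SEND_TIMEOUT_MS = 30 * 60 * 1000;

// Returns a description of the mismatch, or null when the two values agree.
function timeoutDrift(clientMs: number, serverMinutes: number): string | null {
  const serverMs = serverMinutes * MS_PER_MINUTE;
  if (clientMs === serverMs) return null;
  const side =
    clientMs < serverMs ? "client aborts first" : "server idles out first";
  return `client ${clientMs}ms vs server ${serverMs}ms (${side})`;
}
```

Here timeoutDrift(CHAT_SEND_TIMEOUT_MS, SERVER_IDLE_TIMEOUT_MINUTES) returns null, while the pre-fix value, timeoutDrift(120 * 1000, 30), returns "client 120000ms vs server 1800000ms (client aborts first)".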

Member

/sop-ack

devops-engineer merged commit e186e119aa into main 2026-06-13 08:27:11 +00:00

Reference: molecule-ai/molecule-core#2731