molecule-core/canvas
Hongming Wang 057876cb0c fix(delegation): runtime handles 202+queued; canvas surfaces delegation rows
Two bugs that compounded into the "Director does the work itself" UX:

1. workspace/builtin_tools/delegation.py: _execute_delegation only
   handled HTTP 200 in the response branch. When the peer's a2a-proxy
   returned HTTP 202 + {queued: true} (single-SDK-session bottleneck
   on the peer), the loop fell through. Two iterations later the
   `if "error" in result` check tried to access an unbound `result`,
   the goroutine ended quietly, and the delegation stayed at FAILED
   with error="None". The LLM checking status saw "failed" + the
   platform's "Delegation queued — target at capacity" log line in
   chat context, concluded the peer was permanently unavailable, and
   bypassed delegation to do the work itself.

   Fix: explicit 202+queued branch. Adds DelegationStatus.QUEUED,
   marks the local delegation as QUEUED, mirrors to the platform,
   and returns cleanly without retrying. The retry loop is for
   transient transport errors — queueing is a real ack, not a failure
   to retry against (retrying would just re-queue the same task).

   check_delegation_status docstring extended with explicit per-status
   guidance: pending/in_progress → wait, queued → wait (peer busy on
   prior task, reply WILL arrive), completed → use result, failed →
   real error in error field; only fall back on failed, never queued.

2. canvas/src/components/tabs/chat/AgentCommsPanel.tsx: filter dropped
   every delegation row because it whitelisted only a2a_send /
   a2a_receive. activity_type='delegation' rows (written by the
   platform's /delegate handler with method='delegate' or
   'delegate_result') never reached toCommMessage. User saw "No
   agent-to-agent communications yet" while 6+ delegations existed
   in the DB.

   Fix: include "delegation" in the both the initial filter and the
   WS push filter, plus a delegation branch in toCommMessage that
   maps the row as outbound (always — platform proxies on our behalf)
   and uses summary as the primary text source.

Tests:
  - 3 new Python tests cover the 202+queued path: status becomes
    QUEUED not FAILED; no retry on queued (counted by URL match
    against the A2A target since the mock is shared across all
    AsyncClient calls); bare 202 without {queued:true} still
    falls through to the existing retry-then-FAILED path.
  - 3 new TS tests cover the delegation mapper: 'delegate' row
    maps as outbound to target with summary text; queued
    'delegate_result' preserves status='queued' (load-bearing for
    the LLM's wait-vs-bypass decision); missing target_id returns
    null instead of rendering a ghost.

Does NOT solve: the underlying single-SDK-session bottleneck that
causes peers to queue in the first place. Tracked as task #102
(parallel SDK sessions per workspace) — real architectural work.
This PR makes the runtime handle the queueing correctly so the LLM
doesn't bail out, and makes the delegations visible in Agent Comms
so operators can see what's happening.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 15:01:50 -07:00
..
e2e chore(simplify): trim SHA-rot comments + harden TENANT_HOST scheme/port stripping 2026-04-26 11:44:54 -07:00
public chore: replace brand icon and add HANDOFF.md 2026-04-13 13:03:40 -07:00
src fix(delegation): runtime handles 202+queued; canvas surfaces delegation rows 2026-04-26 15:01:50 -07:00
.env.example fix(canvas): close 4 gaps in WS status indicator (env, toast, tests) 2026-04-14 08:26:38 +00:00
.gitignore feat(canvas): SaaS cross-origin — slug header + cookie credentials (Phase F) 2026-04-14 20:08:39 -07:00
components.json chore(canvas): initialize shadcn/ui — components.json + cn utility 2026-04-18 07:57:17 -07:00
Dockerfile chore(canvas): upgrade node:20-alpine → node:22-alpine 2026-04-24 18:54:30 +00:00
next.config.ts fix(canvas,dotenv): review-driven hardening of fit gate + parser parity 2026-04-24 22:23:51 -07:00
package-lock.json fix(canvas): cascade-delete UX — require checkbox before Delete All (#1314) 2026-04-21 07:06:45 +00:00
package.json fix(quickstart): make README cp-paste flow bugless end-to-end (#1871) 2026-04-23 19:53:43 +00:00
playwright.config.ts initial commit — Molecule AI platform 2026-04-13 11:55:37 -07:00
playwright.staging.config.ts feat(e2e): canary + canvas Playwright workflows; delegation mechanics 2026-04-21 04:15:10 -07:00
postcss.config.js initial commit — Molecule AI platform 2026-04-13 11:55:37 -07:00
tailwind.config.ts initial commit — Molecule AI platform 2026-04-13 11:55:37 -07:00
tsconfig.json initial commit — Molecule AI platform 2026-04-13 11:55:37 -07:00
vitest.config.ts initial commit — Molecule AI platform 2026-04-13 11:55:37 -07:00