Previously Phase 3 only checked the workspace-server's poll-mode short-circuit
emit shape ({"status":"queued","delivery_mode":"poll","method":"..."}); the
matching client-side classification was tested in isolation against fixture
dicts in test_a2a_response.py.
This phase closes the loop by piping the actual on-the-wire response from a
real workspace-server back through the wheel's a2a_response.parse() and
asserting it classifies as the Queued variant with the right method +
delivery_mode. A regression in EITHER the server emit shape OR the client
parser will now fail this E2E, eliminating the gap that allowed the original
"unexpected response shape" production bug to ship despite green unit tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI run on PR #2355 surfaced `pq: invalid input syntax for type uuid:
ws-poll-e2e-1777529293-3363` — workspaces.id is UUID-typed and the
hand-rolled "ws-<tag>" shape fails the cast. Phase 1 returned
generic 'registration failed' which cascaded into Phase 3 'lookup
failed' (resolveAgentURL on a non-existent row) and Phase 4 'missing
workspace auth token' (no token extracted because Phase 1 didn't run
the bootstrap path).
Generate v4 UUIDs via uuidgen (with a python3 fallback), one each
for the poll workspace, the caller workspace, and the Phase 2
invalid-mode probe.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end coverage for the canvas-chat unblocker. Exercises every
moving part of the #2339 stack against a real platform instance:
Phase 1 — register a workspace as delivery_mode=poll WITHOUT a URL;
verify the response carries delivery_mode=poll.
Phase 2 — invalid delivery_mode rejected with 400 (typo defense).
Phase 3 — POST A2A to the poll-mode workspace; verify proxyA2ARequest
short-circuits and returns 200 {status:queued, delivery_mode:poll,
method:message/send} without ever resolving an agent URL.
Phase 4 — verify the queued message appears in /activity?type=a2a_receive
with the right method + payload (the polling agent reads from here).
Phase 5 — since_id cursor returns ASC-ordered rows STRICTLY AFTER the
cursor; the cursor row itself must NOT be replayed. Sends two
follow-up messages and asserts ordering: rows[0] is the older new
event, rows[-1] is the newer.
Phase 6 — unknown / pruned cursor returns 410 Gone with an explanation.
Phase 7 — cross-workspace cursor isolation: a UUID belonging to one
workspace cannot be used to peek at another workspace's feed (returns
410, same as pruned, no info leak).
Idempotent: per-run unique workspace ids (date+pid). Trap-based cleanup
deletes the test rows on exit; no e2e_cleanup_all_workspaces call (see
feedback_never_run_cluster_cleanup_tests_on_live_platform.md).
Wired into .github/workflows/e2e-api.yml so it runs on every PR that
touches workspace-server/, tests/e2e/, or the workflow file itself —
same gate as the existing test_a2a_e2e + test_notify_attachments suites.
Stacked on #2354 (PR 3: since_id cursor).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>