test_start_poller_thread_is_daemon spawned a daemon thread with no stop
mechanism — the leaked thread polled every 10ms with the test's patched
httpx.Client mock STILL ACTIVE for ~50ms after the test scope. Later
tests that re-patched httpx.Client + asserted call counts on
fetch_and_stage / Client construction got their assertions inflated by
the leaked thread's iterations.
Symptoms: test_poll_once_skips_chat_upload_row_from_queue saw
fetch_and_stage called twice instead of once on Python 3.11 CI;
test_batch_fetcher_owns_client_when_not_supplied saw two Client
constructions instead of one in the full local suite. Both surfaced
only after Phase 5b's BatchFetcher refactor changed the timing window
that allowed the leaked thread to fire mid-test.
Fix: extend start_poller_thread with an optional stop_event kwarg
(backward compatible — production callers pass None and rely on the
daemon flag for process-exit cleanup). The test now signals + joins
on stop_event before exiting scope, so the thread is gone before any
later test patches httpx.
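A minimal sketch of the kwarg shape and the test-side teardown (the loop
body, interval parameter, and poll_once argument are illustrative; the
real function's signature and home module may differ):

```python
import threading
import time


def start_poller_thread(poll_once, interval_secs=5.0, stop_event=None):
    """Spawn the poll loop as a daemon thread.

    stop_event is optional and backward compatible: production callers pass
    None and rely on daemon=True for process-exit cleanup; tests pass an
    Event so they can signal and join before leaving scope."""

    def _loop():
        while stop_event is None or not stop_event.is_set():
            poll_once()
            if stop_event is not None:
                if stop_event.wait(interval_secs):   # interruptible sleep
                    break
            else:
                time.sleep(interval_secs)

    thread = threading.Thread(target=_loop, daemon=True)
    thread.start()
    return thread


# Test-side teardown: signal and join so the thread cannot outlive the
# test's httpx patches.
stop = threading.Event()
thread = start_poller_thread(lambda: None, interval_secs=0.01, stop_event=stop)
stop.set()
thread.join(timeout=1.0)
assert not thread.is_alive()
```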
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolves the two remaining findings from the Phase 1-4 retrospective
review (the Python-side counterparts to phase 5a):
1. Important — inbox_uploads.fetch_and_stage blocked the inbox poll
loop synchronously per row. A user dragging 4 files into chat at
once would stall the poller for 4× per-fetch latency before the
chat message reached the agent. Add BatchFetcher: a thread-pool
wrapper (default 4 workers) that submits fetches concurrently and
exposes wait_all() as the barrier the inbox loop calls before
processing the chat-message row that references the uploads.
The drain barrier is the correctness invariant: rewrite_request_body
must observe a populated URI cache when it walks the chat-message
row's parts. _poll_once now drains the BatchFetcher inline before
the first non-upload row AND at end-of-batch; the latter covers the
case where a batch contains only upload rows and the chat message
arrives in a later poll, closing the race between that future poll
and the still-in-flight fetches. A sketch follows this list.
2. Nit — fetch_and_stage created two httpx.Client instances per row
(one for GET /content, one for POST /ack). Refactor so a single
client serves both calls. When called from BatchFetcher, the
batch-shared client serves every row's GET + ack, so later fetches
reuse the TCP+TLS connection established by the first.
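A minimal sketch of both pieces (the fetch_fn constructor parameter keeps
the sketch self-contained; the real BatchFetcher in
workspace/inbox_uploads.py submits fetch_and_stage directly and owns its
own error handling):

```python
import concurrent.futures

import httpx


class BatchFetcher:
    """Thread-pool wrapper that runs per-row fetches concurrently while
    sharing one httpx.Client across every row's GET /content + POST /ack."""

    def __init__(self, fetch_fn, client=None, max_workers=4):
        self._fetch_fn = fetch_fn              # typically fetch_and_stage
        self._owns_client = client is None
        self._client = client if client is not None else httpx.Client()
        self._pool = concurrent.futures.ThreadPoolExecutor(max_workers=max_workers)
        self._futures = []

    def submit(self, row):
        # Shared client: no per-row TCP+TLS handshake. ThreadPoolExecutor
        # raises RuntimeError on submit-after-close.
        self._futures.append(
            self._pool.submit(self._fetch_fn, row, client=self._client)
        )

    def wait_all(self):
        """Drain barrier: return only once every submitted fetch has finished
        (or failed), so the URI cache is populated before the chat-message
        row that references the uploads is processed."""
        if self._futures:
            concurrent.futures.wait(self._futures)
            self._futures.clear()

    def close(self):
        self._pool.shutdown(wait=True)
        if self._owns_client:
            self._client.close()


# _poll_once-style usage (illustrative):
#   fetcher = BatchFetcher(fetch_and_stage)
#   for row in batch:
#       if is_chat_upload_row(row):
#           fetcher.submit(row)
#           continue
#       fetcher.wait_all()   # barrier before the first non-upload row
#       process(row)
#   fetcher.wait_all()       # end-of-batch drain for upload-only batches
#   fetcher.close()
```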
Comprehensive tests:
- 13 new inbox_uploads tests:
- fetch_and_stage with supplied client: zero httpx.Client
constructions, GET+POST through the same client, caller's client
not closed (lifecycle owned by caller).
- fetch_and_stage without supplied client: exactly one
httpx.Client constructed (was 2 pre-fix), closed on the way out.
- BatchFetcher: 3 rows × 120ms = parallel completion < 250ms
(vs. ~360ms serial), URI cache hot when wait_all returns,
per-row failure isolation, single-client reuse across all
submits, idempotent close, submit-after-close raises,
owned-vs-supplied client lifecycle, no-op wait_all on empty
batch, graceful httpx-missing degradation.
- 3 new inbox tests:
- poll_once drains uploads before processing the chat-message row
(in-place mutation of row['request_body'] proves the URI was
rewritten BEFORE message_from_activity returned).
- poll_once with only upload rows still drains at end-of-batch.
- poll_once with no upload rows never constructs a BatchFetcher
(zero overhead on the no-upload happy path).
133 total inbox + inbox_uploads tests pass; 0 regressions.
Closes the chat-upload poll-mode-perf gap end-to-end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Workspace-side fetcher for the platform-staged chat uploads written by
phase 1. Stacks atop feat/poll-mode-chat-upload-phase1.
Wire shape — the platform writes one activity_logs row per uploaded
file with `activity_type=a2a_receive`, `method=chat_upload_receive`,
and a `request_body={file_id, name, mimeType, size, uri}` carrying
the synthetic `platform-pending:<wsid>/<fid>` URI.
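For concreteness, one such row as the poller might see it (all field
values are illustrative):

```python
example_row = {
    "activity_type": "a2a_receive",
    "method": "chat_upload_receive",
    "request_body": {
        "file_id": "f_123",                      # illustrative values throughout
        "name": "report.pdf",
        "mimeType": "application/pdf",
        "size": 48213,
        "uri": "platform-pending:ws_abc/f_123",  # synthetic <wsid>/<fid> URI
    },
}
```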
Workspace-side flow (new module workspace/inbox_uploads.py):
1. Fetch via GET /workspaces/:id/pending-uploads/:file_id/content
2. Stage to /workspace/.molecule/chat-uploads/<32-hex>-<sanitized>
(same on-disk shape as internal_chat_uploads.py — agent-side
URI resolvers see no contract change)
3. POST /workspaces/:id/pending-uploads/:file_id/ack
4. Cache `platform-pending: → workspace:` so the eventual chat
message that REFERENCES the upload (separate, later activity row)
gets URI-rewritten before the agent sees it.
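A minimal sketch of steps 1-4, assuming fetch_and_stage takes the row
plus a base URL, workspace id, optional client, and a sanitizer; the
real module also owns error handling and the exact workspace: URI
format:

```python
import secrets
from pathlib import Path

import httpx

STAGE_DIR = Path("/workspace/.molecule/chat-uploads")
URI_CACHE = {}  # platform-pending:<wsid>/<fid> -> workspace: URI


def fetch_and_stage(row, base_url, workspace_id, client=None, sanitize=lambda n: n):
    body = row["request_body"]
    file_id, name = body["file_id"], body["name"]
    owns_client = client is None
    client = client if client is not None else httpx.Client()
    try:
        # 1. fetch the staged bytes from the platform
        resp = client.get(
            f"{base_url}/workspaces/{workspace_id}/pending-uploads/{file_id}/content"
        )
        resp.raise_for_status()
        # 2. stage under a collision-proof 32-hex prefix
        STAGE_DIR.mkdir(parents=True, exist_ok=True)
        staged = STAGE_DIR / f"{secrets.token_hex(16)}-{sanitize(name)}"
        staged.write_bytes(resp.content)
        # 3. acknowledge so the platform can release its copy
        client.post(
            f"{base_url}/workspaces/{workspace_id}/pending-uploads/{file_id}/ack"
        )
        # 4. remember the rewrite for the chat message that arrives later
        #    (exact workspace: URI format is an assumption)
        URI_CACHE[body["uri"]] = f"workspace:{staged}"
    finally:
        if owns_client:
            client.close()
```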
Inbox poller extension (workspace/inbox.py):
- is_chat_upload_row(row) discriminator on `method`
- upload-receive rows trigger fetch_and_stage and are NOT enqueued
as InboxMessages (they're side-effect rows, not chat messages)
- cursor advances past them regardless of fetch outcome — a
permanent /content failure must not stall the cursor and block
real chat traffic
- message_from_activity calls rewrite_request_body to swap
platform-pending: URIs to local workspace: URIs in subsequent
chat messages' file parts. Cache miss leaves the URI untouched
so the agent surfaces an unresolvable URI rather than the inbox
silently dropping the part.
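Sketches of the two hooks, assuming the chat-message body carries its
file references under a parts list with a uri field (the
rewrite_request_body signature shown is an assumption):

```python
def is_chat_upload_row(row):
    """Discriminate upload-receive rows from ordinary chat activity."""
    return row.get("method") == "chat_upload_receive"


def rewrite_request_body(body, uri_cache):
    """Swap platform-pending: URIs for local workspace: URIs in file parts.
    A cache miss leaves the URI untouched so the agent surfaces an
    unresolvable URI rather than the part being silently dropped."""
    for part in body.get("parts", []):
        uri = part.get("uri")
        if uri and uri.startswith("platform-pending:") and uri in uri_cache:
            part["uri"] = uri_cache[uri]
    return body
```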
Filename sanitization mirrors
workspace-server/internal/handlers/chat_files.go::SanitizeFilename and
workspace/internal_chat_uploads.py::sanitize_filename — pinned by the
existing parity test suites.
Coverage: 100% on inbox_uploads.py; the inbox.py extension is fully
covered by three new tests in test_inbox.py (skip-from-queue,
cursor-advance-past-broken-fetch, URI-rewrite ordering).
The workspace-server's `/notify` handler writes the agent's own
send_message_to_user POSTs to activity_logs as activity_type=
'a2a_receive', method='notify', source_id=NULL so the canvas
chat-history loader can restore those bubbles after a page reload.
The activity API exposes the row to /workspaces/:id/activity?
type=a2a_receive, so the inbox poller picks it up and pushes the
agent's own outbound back as an inbound `← molecule: Agent
message: ...` — confirmed live 2026-05-01.
Add an `_is_self_notify_row` predicate that matches on (method='notify'
AND no source_id) and call it from `_poll_once` before enqueue. The
predicate combines BOTH discriminators so a future caller using
method='notify' with a real peer_id still passes through. Cursor
advances past skipped rows so we don't re-poll the same self-notify
on every iteration.
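A sketch of the predicate and where _poll_once consults it (row field
names are assumptions):

```python
def _is_self_notify_row(row):
    """True only for the agent's own /notify echo: method='notify' AND no
    source_id. A notify row with a real source_id still passes through."""
    return row.get("method") == "notify" and not row.get("source_id")


# In _poll_once, before enqueue (illustrative):
#   if _is_self_notify_row(row):
#       cursor = row["id"]   # advance past it so it isn't re-polled
#       continue
```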
Belt-and-braces: long-term fix lives in workspace-server (rename
the misclassified activity_type to 'agent_outbound' — RFC at
#2469). This guard stays regardless because it only excludes rows
we never want.
Tests: 7 new — predicate true/false matrix + integrated _poll_once
behavior (skip, cursor advance, notification suppression).
Mutation-verified: reverting inbox.py to the prior shape fails 7/7;
applied state passes 48/48.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The runtime persists per-workspace state (`.auth_token`,
`.platform_inbound_secret`, `.mcp_inbox_cursor`) under `/configs` —
the workspace-EC2 mount path. Inside a container it is writable and
agent-owned. Outside a container, `/configs` either doesn't exist or
isn't writable by an unprivileged user.
The default broke the external-runtime path (`pip install
molecule-ai-workspace-runtime` + `molecule-mcp` on a Mac/Linux
laptop). First heartbeat tries to persist `.platform_inbound_secret`
and crashes:
[Errno 30] Read-only file system: '/configs'
The heartbeat thread logs and dies. Workspace flips offline within
a minute. Operator sees no actionable error.
Adds workspace/configs_dir.py — single resolution point with a tiered
fallback:
1. CONFIGS_DIR env var, if set — explicit operator override
(preserves existing tests + custom deployments verbatim).
2. /configs — if it exists AND is writable. In-container default;
unchanged behavior for every prod workspace.
3. ~/.molecule-workspace — created with mode 0700 so per-file 0600
perms aren't undermined by a world-readable parent.
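A sketch of the resolver, following the three tiers above (the real
workspace/configs_dir.py may differ in details such as error handling):

```python
import os
from pathlib import Path


def resolve():
    """Single resolution point for the per-workspace state directory."""
    # 1. explicit operator override
    override = os.environ.get("CONFIGS_DIR")
    if override:
        return Path(override)
    # 2. in-container default: /configs, if present and writable
    configs = Path("/configs")
    if configs.is_dir() and os.access(configs, os.W_OK):
        return configs
    # 3. laptop fallback; 0700 so per-file 0600 perms keep their meaning
    fallback = Path.home() / ".molecule-workspace"
    fallback.mkdir(mode=0o700, exist_ok=True)
    return fallback
```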
Migrates the four readers (platform_auth, platform_inbound_auth,
mcp_cli, inbox) to call configs_dir.resolve() instead of
inlining `Path(os.environ.get("CONFIGS_DIR", "/configs"))`.
Existing tests that asserted the old `/configs`-as-default contract are
updated to assert the new one: when CONFIGS_DIR is unset, the path
resolves to a writable location — `/configs` if present, the fallback
otherwise. Tests skip the fallback branch on hosts that DO have a
writable `/configs` (CI containers).
Verified the original repro is fixed: with no CONFIGS_DIR set on
macOS, configs_dir.resolve() returns ~/.molecule-workspace, the dir
exists, and writes succeed.
Test suite: 1454 passed, 3 skipped, 2 xfailed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a notification seam to the universal molecule-mcp wheel so push-
notification-capable MCP hosts (Claude Code today; any compliant
client tomorrow) get inbound A2A messages as conversation interrupts
instead of having to poll wait_for_message / inbox_peek.
Wire-up:
- inbox.py: module-level _NOTIFICATION_CALLBACK + set_notification_callback().
The callback fires from InboxState.record() AFTER lock release, with
the same dict shape inbox_peek returns. Best-effort — a raising
callback never prevents the message from landing in the queue.
- a2a_mcp_server.py: _build_channel_notification() pure helper +
bridge wiring in main() that schedules notifications via
asyncio.run_coroutine_threadsafe (poller is a daemon thread, MCP
loop is asyncio).
- Method name 'notifications/claude/channel' matches the contract
documented in molecule-mcp-claude-channel/server.ts:509.
- wheel_smoke.py: pin set_notification_callback as a published name,
same regression class as the 0.1.16 main_sync incident.
Pollers (wait_for_message / inbox_peek) keep working unchanged for
runtimes without notification support.
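A sketch of the seam on both sides, assuming the MCP server exposes an
async send_notification callable and an event loop to hand to the
bridge; _fire_notification and wire_bridge are illustrative names, and
the real payload comes from _build_channel_notification():

```python
# inbox.py side (sketch): module-level callback slot.
_NOTIFICATION_CALLBACK = None


def set_notification_callback(callback):
    """Register the host-notification hook; pass None to clear it."""
    global _NOTIFICATION_CALLBACK
    _NOTIFICATION_CALLBACK = callback


def _fire_notification(message_dict):
    # Called from InboxState.record() after the lock is released.
    cb = _NOTIFICATION_CALLBACK
    if cb is None:
        return
    try:
        cb(message_dict)   # same dict shape inbox_peek returns
    except Exception:
        pass               # best-effort: a raising callback never blocks the queue


# a2a_mcp_server.py side (sketch): bridge the poller's daemon thread into
# the MCP asyncio loop.
import asyncio


def wire_bridge(loop, send_notification):
    def on_message(message_dict):
        payload = {
            "method": "notifications/claude/channel",  # pinned method name
            "params": message_dict,
        }
        asyncio.run_coroutine_threadsafe(send_notification(payload), loop)

    set_notification_callback(on_message)
```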
Tests: 6 new in test_inbox.py (callback fires once on record, dedupe
short-circuits before fire, raising cb doesn't break inbox, set/clear
semantics), 5 new in test_a2a_mcp_server.py (method name pin, content
mapping, meta routing, no-id JSON-RPC notification spec, missing-
field tolerance). All 59 combined tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CodeQL flagged the bare `assert state.pop(...) is None` — under
`python -O` asserts are stripped, which would skip the call entirely
and the test would silently pass without exercising the code. Bind
the result first so the call always runs.
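The pattern, with a hypothetical dict key standing in for the elided
argument:

```python
state = {"some_key": None}   # "some_key" is a stand-in, not the real key

# Before (flagged): under `python -O` the whole statement is stripped,
# so pop() never runs and the test checks nothing.
#   assert state.pop("some_key") is None

# After: pop() always executes; only the assertion can be optimized away.
popped = state.pop("some_key")
assert popped is None
```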
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The universal MCP server (a2a_mcp_server.py) was outbound-only — agents
in standalone runtimes (Claude Code, hermes, codex, etc.) could
delegate, list peers, and write memories, but never observed the
canvas-user or peer-agent messages addressed to them. This blocked
"constantly responding" loops without forcing operators back onto a
runtime-specific channel plugin.
This PR closes the inbound gap with a poller-fed in-memory queue and
three new MCP tools:
- wait_for_message(timeout_secs?) — block until next message arrives
- inbox_peek(limit?) — list pending messages (non-destructive)
- inbox_pop(activity_id) — drop a handled message
A daemon thread polls /workspaces/:id/activity?type=a2a_receive every
5s, fills the queue from the cursor (since_id), and persists the cursor
to ${CONFIGS_DIR}/.mcp_inbox_cursor so a restart doesn't replay backlog.
On 410 (cursor pruned) we fall back to since_secs=600 for a bounded
recovery window. Activity-row → InboxMessage extraction mirrors the
molecule-mcp-claude-channel plugin's extractText (envelope shapes #1-3
+ summary fallback).
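A sketch of the poll loop under these assumptions: the activity endpoint
returns a JSON list of rows with an id field, the queue is Queue-like,
and the no-cursor bootstrap also uses the bounded since_secs window:

```python
import os
import time
from pathlib import Path

import httpx

CURSOR_FILE = Path(os.environ.get("CONFIGS_DIR", "/configs")) / ".mcp_inbox_cursor"


def _poll_loop(base_url, workspace_id, queue, interval_secs=5.0):
    client = httpx.Client()
    cursor = CURSOR_FILE.read_text().strip() if CURSOR_FILE.exists() else None
    while True:
        params = {"type": "a2a_receive"}
        if cursor:
            params["since_id"] = cursor
        else:
            # no cursor yet, or the server pruned it: bounded recovery window
            params["since_secs"] = 600
        resp = client.get(
            f"{base_url}/workspaces/{workspace_id}/activity", params=params
        )
        if resp.status_code == 410:
            cursor = None            # cursor pruned; retry with since_secs
            continue
        for row in resp.json():
            queue.put(row)           # real code extracts an InboxMessage here
            cursor = str(row["id"])
        if cursor:
            CURSOR_FILE.write_text(cursor)   # restart doesn't replay backlog
        time.sleep(interval_secs)
```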
mcp_cli.main starts the poller alongside the existing register +
heartbeat threads. In-container runtimes (which have push delivery via
canvas WebSocket) skip activation, so inbox tools return an
informational "(inbox not enabled)" message instead of double-delivery.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>