Self-review of PR #2949 surfaced two pre-existing defects that the
SSOT consolidation inherited from the original /notify handler. Both
are addressable in a small follow-up; shipping them as a separate PR
keeps the consolidation and the bug-fix individually reviewable.
Critical: byte-slice preview truncation produces invalid UTF-8
-------------------------------------------------------------
Pre-fix:
if len(preview) > 80 {
preview = preview[:80] + "…"
}
`len()` returns BYTES; `preview[:80]` slices on a byte boundary. For
agent-authored chat in CJK / emoji / accented characters, byte 80
lands mid-codepoint → invalid UTF-8 → Postgres JSONB rejects → INSERT
fails → activity_log row never written → message vanishes from chat
history on the next reload. The persistence-failure log fires but
operators have to grep to find it, and the user-visible regression
mode is identical to reno-stars.
Fix: extract `truncatePreviewRunes(s, maxRunes)` that walks the rune
boundary using `for i := range s` (Go's range over string yields rune
start indices). Cap at 80 RUNES not bytes — UI-friendly count, not
storage count.
Important: workspace-lookup error path swallows real DB errors
--------------------------------------------------------------
Pre-fix:
if err := w.db.QueryRowContext(...).Scan(&wsName); err != nil {
return ErrWorkspaceNotFound
}
Conflates `sql.ErrNoRows` (legit not-found → caller 404) with real
DB errors (connection drop, query timeout, pool exhaustion → caller
should 503). During a Postgres outage every notify call surfaced as
"workspace not found" — masking the actual incident in alerting and
making the symptom indistinguishable from "you typed a bad workspace
ID".
Fix: distinguish via `errors.Is(err, sql.ErrNoRows)` and wrap
non-not-found errors with `fmt.Errorf("agent_message: workspace
lookup: %w", err)`. Callers' existing fallback path (return 500 /
return error wrapped) handles the new shape correctly without any
changes — verified by running existing TestNotify_* and
TestMCPHandler_SendMessage_* tests.
Tests added (3 new, 11 total writer tests)
------------------------------------------
- TestTruncatePreviewRunes_RuneBoundary: 8-case table — ASCII, CJK,
exactly-at-max, emoji prefix. Asserts both correct visible output
AND `utf8.ValidString` on every result so the bug shape (invalid
UTF-8) can't recur.
- TestAgentMessageWriter_Send_NonASCIIMessagePersists: end-to-end
with a 200-rune CJK message (exceeds the 80-rune cap, would have
hit the byte-slice bug). Pins the INSERT summary contains valid
UTF-8 with exactly 80-rune body + ellipsis.
- TestAgentMessageWriter_Send_DBErrorOnLookupReturnsWrapped: pins the
DB-outage path returns a wrapped non-ErrWorkspaceNotFound error so
alerting can distinguish 404 from 503. Verified via mock
ExpectQuery returning a transient error.
Verified
--------
- `go vet ./internal/handlers/` clean
- `go build ./...` clean
- All 14 writer + caller tests pass (8 original + 3 new + AST gate +
TestNotify_* + TestMCPHandler_SendMessage_* sibling tests)
Per memory feedback_assert_exact_not_substring.md: every new test
asserts boundary behavior directly (UTF-8 validity, exact rune count,
errors.Is comparison) rather than substring-match in stringified
output.
Refs RFC #2945, PR #2949, PR #2944.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The External Connect modal's Codex and OpenClaw tabs were rendering
this MCP server config:
command = "python3"
args = ["-m", "molecule_runtime.a2a_mcp_server"]
That spawns the bare MCP dispatcher with no presence wiring. The
``molecule-mcp`` console-script wrapper (mcp_cli.main) is what calls
``POST /registry/register`` at startup and runs the 20s heartbeat
thread alongside the MCP stdio loop. Without the wrapper, the canvas
flips the workspace back to ``awaiting_agent`` (OFFLINE) within
60-90s — even while tools work — because nothing is heartbeating.
Operator-side this looks like: the workspace is registered and tools
work fine when invoked, but the canvas shows "offline" / "Restart"
CTA, peer agents see the workspace as awaiting_agent in list_peers
output, and inbound A2A delivery silently fails the readiness check.
A new external-Codex operator (#2957) hit this and spent debugging
time on what should have been a copy-paste install.
Fix: switch both Codex and OpenClaw templates to
``command = "molecule-mcp"`` / ``args = []``, matching the universal
MCP template that already handles this correctly. Inline comment in
each template explains the wrapper-vs-bare-module tradeoff so a
future template author doesn't regress to the shorter form.
Hermes-channel intentionally still spawns the bare module — the
hermes plugin owns the platform plugin path and runs its own
register_platform/heartbeat code in-process; double-heartbeating
would race. Universal/Codex/OpenClaw all need the wrapper.
Regression gate: TestExternalMcpTemplates_UseMoleculeMcpWrapper
asserts the three templates that must use the wrapper actually do,
and explicitly fails on the old ``-m molecule_runtime.a2a_mcp_server``
shape. Verified the test FAILS on pre-fix source by stashing only
external_connection.go and re-running.
Source: molecule-core#2957 issue 1 (item 4 of the report — the
``(codex returned empty output)`` / opaque-canvas-error / stale-
session items live in codex-channel-molecule and are tracked
separately).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the v1 LOCAL/TEAM/GLOBAL tab trio (mapped to the deprecated
shared_context model) with a v2 plugin-driven UI. Without this,
canvas Memory tab was reading the frozen agent_memories table while
all post-cutover agent writes went to the plugin's memory_records —
the tab silently displayed stale data.
## Backend (workspace-server)
New routes under wsAuth, all behind the existing per-tenant token:
GET /workspaces/:id/v2/namespaces → readable + writable lists
GET /workspaces/:id/v2/memories → plugin search proxy
DELETE /workspaces/:id/v2/memories/:mid → plugin forget proxy
memories_v2.go — slim handler:
- Server-side ACL: every search request is intersected with the
resolver's readable-namespaces set (canvas-supplied namespace
that the workspace can't read returns [] not 403, matches v1
existence-non-inferring shape).
- Returns 503 with "set MEMORY_PLUGIN_URL" hint when plugin
isn't wired (canvas surfaces a banner).
- Maps plugin not_found → 404, other plugin errors → 502.
- View shaping: NamespaceView.label rendered server-side
("Workspace (abc-1234)", "Team (t-99)", "Org (acme)", custom)
so canvas doesn't parse namespace names. MemoryView surfaces
pin/expires_at/score/source_workspace_id from Propagation.
memories_v2_test.go — 100% line + 100% function coverage:
- 503 path on every endpoint when unwired
- Namespaces success + readable/writable error paths
- Search: empty intersection, full-path query/kind/limit
propagation, namespace=/no-namespace branches, propagation
map missing/wrong-type, intersect error, plugin error
- Forget: success, plugin not_found→404, other plugin
errors→502, missing memoryId→400
- Helpers: namespaceLabel for all 4 kinds + truncation,
parseLimit edge cases (default/0/negative/over-cap/non-num),
memoryToView field round-trip, indexOfColon, shortID
## Frontend (canvas)
MemoryInspectorPanel rewritten for v2:
- Drop LOCAL/TEAM/GLOBAL trio. Namespace dropdown driven by
GET /v2/namespaces.readable, "All namespaces" default.
- New per-row badges: kind (F/S/C), source (agent/runtime/user),
pin (📌), TTL countdown (⌛12h / "expired"), score% on
semantic search, source-workspace ⇡ws-pee for propagated.
- Drop Edit button — v2 plugin contract has no PATCH; the
model is forget + recommit. Forget stays.
- Plugin-unavailable banner with operator hint when /v2/*
returns 503.
- Bug fix surfaced by test: rollback-on-failed-delete order
of operations (loadEntries() called setError(null) AFTER
we set the failure message, wiping it). Reload first, then
set the error.
MemoryEditorDialog deleted — Add was POST /memories which v2
doesn't support from canvas (writes go via MCP). The legacy
Edit-flow tests go with it.
## Test results
Backend: `go test ./internal/handlers/` — all pass
Backend coverage on memories_v2.go: 100% lines, 100% functions
Canvas: `vitest run` — 91 files, 1273 tests pass (26 new)
Canvas coverage on MemoryInspectorPanel.tsx: 100% lines,
100% functions, 96.7% statements, 84.7% branches
(uncovered branches are defensive `?? fallback` for
contract-impossible kind/source values)
## Migration note
The legacy v1 GET/POST/PATCH/DELETE on /workspaces/:id/memories
remains in place for the back-compat MCP shim (mcp_tools_memory_v2's
legacy routing) and admin export/import. PR-9 (#283) drops
agent_memories along with the v1 endpoints once the cutover
verification window closes.
Closes#2934 item 6 — the deferred follow-up from Ryan's onboarding-
friction report. Quote: "this single command would have saved me
30 of the 45 minutes."
When push delivery fails or the install half-works, the operator
today has no signal — they hand-grep the Claude Code binary or
chase the `from versions: none` red herring. Doctor renders six
checks in one screen with concrete next-step suggestions:
1. Python version >=3.11? (wheel's pin)
2. Wheel install molecule-ai-workspace-runtime importable +
version surfaced
3. PATH for binary `molecule-mcp` resolves on PATH; if not,
prints the resolved user-site bin dir to
add (or recommends pipx)
4. Env vars PLATFORM_URL + WORKSPACE_ID + token (env or
*_FILE or .auth_token)
5. Platform reach GET ${PLATFORM_URL}/healthz returns 2xx
6. Registry register POST /registry/register with the resolved
token returns 2xx — end-to-end auth check
Each line: `[OK|WARN|FAIL] <label>: <status>` plus a `next:` hint
when not OK. ANSI colors auto-disable on non-TTY / NO_COLOR.
Exit code: 0 on all-OK or only-WARN, 1 on any FAIL — scriptable
from CI install-checks.
## Files
`workspace/mcp_doctor.py` (new) — six check functions + `run()`
entry point. Uses urllib (stdlib)
so doctor works even on a partial
install where `requests` is missing.
`workspace/mcp_cli.py` Subcommand dispatch:
molecule-mcp doctor → mcp_doctor.run()
molecule-mcp --help → usage banner
molecule-mcp → server (unchanged)
`workspace/tests/test_mcp_doctor.py` (new) — 10 tests covering each
check's pass/fail/skip path
plus the end-to-end exit-code
contract on a stripped env.
`scripts/build_runtime_package.py` Adds `mcp_doctor` to
TOP_LEVEL_MODULES so the
wheel ships the new module.
## Out of scope (deferred follow-ups)
- Claude Code-specific checks (parse ~/.claude.json, verify each
MCP entry is plugin-sourced + dev-channels flag set). That's a
separate Claude-Code-shaped doctor; lives in the channel plugin.
- Automated remediation. Doctor is diagnostic — tells the operator
what's wrong + how to fix it, doesn't apply changes.
## Verification
- python -m pytest tests/test_mcp_doctor.py -v → 10/10 PASS
- python -m pytest tests/test_mcp_cli*.py → 67/67 PASS
(existing CLI suite still green; subcommand dispatch added
before env-validation, doesn't disturb the server-boot path)
- manual: `molecule-mcp doctor` on a stripped env renders 4 FAIL
+ 2 WARN + exit code 1, with each `next:` hint actionable
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pre-RFC-#2945 the broadcast + activity_log INSERT for "agent → user
chat" was duplicated across two handlers — activity.go's Notify (HTTP
/notify) and mcp_tools.go's toolSendMessageToUser (MCP tools/call).
The duplication is exactly what produced the reno-stars production
data-loss regression (PR #2944): the persistence-half fix landed for
one handler and silently lagged for the other for months, dropping
every long-form external-agent message on reload.
PR #2944 added the missing INSERT to mcp_tools.go and a forward-
looking AST gate. This PR removes the duplication at the source.
What changes
------------
NEW: workspace-server/internal/handlers/agent_message_writer.go
- AgentMessageWriter struct + NewAgentMessageWriter ctor.
- Send(ctx, workspaceID, message, attachments) error: workspace
lookup → broadcast WS AGENT_MESSAGE → INSERT activity_logs.
- ErrWorkspaceNotFound for the lookup-miss path so callers can
return 404 / JSON-RPC error cleanly.
- Best-effort persistence: INSERT failure logs only, returns nil so
the broadcast success isn't undone (matches previous behavior in
both call sites — pinned by test).
- Takes events.EventEmitter (interface) so tests can substitute a
capturing fake without nil-panicking inside hub.Broadcast.
UPDATED: activity.go:Notify
- Replaced ~75 lines of inline broadcast+INSERT with a 12-line
call to AgentMessageWriter.Send.
- Attachment shape conversion (NotifyAttachment → AgentMessageAttachment)
is local to the HTTP handler; the writer's API doesn't import the
HTTP-binding-tagged type.
UPDATED: mcp_tools.go:toolSendMessageToUser
- Replaced ~40 lines (the post-#2944 broadcast+INSERT pair) with a
6-line call to the writer.
- Attachments is nil today because the MCP tool args don't expose
attachments yet. When the schema adds it, build the slice and
pass through; the writer half is ready.
Tests
-----
agent_message_writer_test.go (8 tests, comprehensive):
- TestAgentMessageWriter_Send_Success_NoAttachments — happy path,
pins JSON `{"result":"hi"}`.
- TestAgentMessageWriter_Send_Success_WithAttachments — pins file
parts shape (kind=file, file.{uri,name,mimeType,size}). Uses a
jsonMatcher that decodes + asserts via predicate (tolerant of
map key ordering, exact on shape).
- TestAgentMessageWriter_Send_WorkspaceNotFound — pins
ErrWorkspaceNotFound + asserts NO broadcast NO INSERT.
- TestAgentMessageWriter_Send_DBInsertFailureStillReturnsNil — pins
best-effort persistence contract.
- TestAgentMessageWriter_Send_PreviewTruncation — pins ≤80-char
preview + ellipsis (Ryan's onboarding-friction report would have
bloated activity_logs.summary by 2KB without this).
- TestAgentMessageWriter_Send_BroadcastsAgentMessageEvent — pins WS
event name + payload shape via capturingEmitter.
- TestAgentMessageWriter_Send_OmitsAttachmentsKeyWhenEmpty — pins
the "no key when nil" wire contract.
The existing AST gate from #2944
(TestAgentMessageBroadcastsArePersisted) still holds: any future
function emitting AGENT_MESSAGE without an INSERT fails the test.
With the writer in place that's now redundant — both producers go
through it — but the gate is cheap to keep as defense-in-depth.
Verified: go vet clean; all writer + caller tests pass; existing
TestNotify_* + TestMCPHandler_SendMessage_* + the AST gate all green.
Refs RFC #2945, PR #2944.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Self-review of #2935 turned up two real defects:
1. Stale README issue references — the build_runtime_package.py
README template said "(issue #2934 follow-up)" twice, but the
marketplace-plugin and `doctor` items now have dedicated tracking
issues. Updated to point at #2936 and #2937 respectively.
2. Silent fallthrough on broken MOLECULE_WORKSPACE_TOKEN_FILE — when
an operator EXPLICITLY pointed TOKEN_FILE at a path that didn't
exist / wasn't readable / was blank / contained internal whitespace,
the resolver silently returned the generic "set one of these three
vars" error. That's exactly the silent failure mode #2934 flagged
("a new user has no chance"). Refactor `_read_token_from_file_env`
to return `(token, error)`; surface the SPECIFIC failure when the
operator's intent was clearly the file path. Skip the CONFIGS_DIR
fallback in that case so the operator's config bug isn't masked
by a different source happening to work.
Adds 2 renames + 2 new tests in test_mcp_cli_split.py:
- test_missing_file_returns_specific_error (asserts "does not exist")
- test_empty_file_returns_specific_error (asserts "is empty")
- test_multi_line_file_rejected (asserts "internal whitespace")
- test_token_file_error_skips_configs_dir_fallback (asserts a valid
CONFIGS_DIR/.auth_token does NOT silently rescue a broken
TOKEN_FILE)
All 81 mcp_cli + mcp_cli_multi_workspace + mcp_cli_split tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per user request: audit all similar tools + write comprehensive tests
including E2E for the persistence-of-AGENT_MESSAGE-broadcasts contract.
Audit (all BroadcastOnly call sites in workspace-server/internal/):
| Site | Event | Persisted? | Notes |
|---|---|:---:|---|
| a2a_proxy_helpers.go:275 | A2A_RESPONSE | ✓ | LogActivity above |
| activity.go:486 (Notify) | AGENT_MESSAGE | ✓ | INSERT line 535 |
| activity.go:701 (LogActivity) | ACTIVITY_LOGGED | ✓ | self-emits inside DB write |
| mcp_tools.go:341 (toolSendMessageToUser) | AGENT_MESSAGE | ✓ NEW (this PR) |
| registry.go:575 | TASK_UPDATED | N/A | transient progress, not chat |
| registry.go:596 | WORKSPACE_HEARTBEAT | N/A | infra ping, not chat |
Only one chat-bearing broadcast was missing persistence (the just-
fixed mcp bridge path). No other regressions found.
Tests added (4 new, total 5 send_message_to_user tests):
1. TestAgentMessageBroadcastsArePersisted — AST gate that walks every
non-test .go in the package, finds funcs that BroadcastOnly with
"AGENT_MESSAGE", asserts each ALSO contains an
"INSERT INTO activity_logs". Forward-looking regression block:
any future chat tool that broadcasts without persisting fails the
test with a clear file:func diagnostic. Mutation-tested locally:
removing the INSERT block from toolSendMessageToUser reliably
produces the expected failure.
2. TestMCPHandler_SendMessageToUser_DBErrorLogsAndStill200s — pins
the "best-effort persistence" contract. DB INSERT failures must
NOT abort the tool response (the WS broadcast already succeeded;
retrying would double-render in the live chat). Matches /notify.
3. TestMCPHandler_SendMessageToUser_ResponseBodyShape — pins the
exact `{"result": "<message>"}` JSON shape stored in
response_body. The canvas hydrater (extractResponseText in
historyHydration.ts) reads body.result; any drift here silently
breaks chat history without failing the INSERT. Per memory
feedback_assert_exact_not_substring.md, asserts the literal JSON
shape, not a substring.
4. TestMCPHandler_SendMessageToUser_PersistsToActivityLog (existing,
from previous commit) — pins INSERT shape with regex on
'a2a_receive' + 'notify' literals.
5. TestMCPHandler_SendMessageToUser_Blocked_WhenEnvNotSet (existing)
— env-gate aborts before DB.
Test fixture cleanup: newMCPHandler now uses newTestBroadcaster (real
ws.Hub) instead of events.NewBroadcaster(nil) — the latter nil-panics
inside hub.Broadcast on the AGENT_MESSAGE path. Same broadcaster
shape every other handler test uses.
E2E note: the AST gate is the strongest forward-looking guarantee.
A real-DB integration test would add value for CI but is largely
duplicative of the sqlmock contract tests above (sqlmock pins SQL
shape with much faster feedback). Left as a future enhancement when
the handlers Postgres-integration suite extends MCP coverage.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported on production tenant reno-stars: an external claude-code agent
(CEO Ryan PC workspace) sent a long-form message via send_message_to_user;
the user saw it live in the chat panel but it vanished after a refresh.
Confirmed via direct production query — the message is NOT in
activity_logs at all (only short test pings around it are persisted).
Root cause: there are TWO server-side handlers for send_message_to_user:
1. HTTP `/workspaces/:id/notify` (activity.go:Notify) — broadcasts WS
AND inserts a row into activity_logs. This is the path the
in-container runtime's tool_send_message_to_user calls.
2. MCP-bridge `tools/call name=send_message_to_user`
(mcp_tools.go:toolSendMessageToUser) — broadcasts WS only,
**never persisted**. This is the path EXTERNAL agents using
molecule-mcp's send_message_to_user tool route through.
The persistence fix landed for path 1 months ago but was never mirrored
on path 2. External agents — exactly the case in reno-stars/CEO Ryan PC
— have been silently losing every long-form notification on reload.
Fix: mirror the activity.go INSERT shape inside toolSendMessageToUser:
INSERT INTO activity_logs
(workspace_id, activity_type, method, summary, response_body, status)
VALUES ($1, 'a2a_receive', 'notify', $2, $3::jsonb, 'ok')
Same wire shape as /notify so the canvas's chat-history hydration
(`type=a2a_receive&source=canvas`) treats both writers identically.
Errors are log-only — broadcast already succeeded, persistence failure
shouldn't block the tool response (matches /notify behavior; downside
is the same data-loss-on-DB-error risk, surfaced via log.Printf).
Tests
-----
- `TestMCPHandler_SendMessageToUser_PersistsToActivityLog` — pins both
the workspace-name lookup AND the INSERT shape. Regex-matches
`'a2a_receive'` + `'notify'` literals so a future refactor that
changes activity_type or method breaks the test loud, not silently
re-introducing the data-loss bug.
- Updated newMCPHandler to use newTestBroadcaster() (real ws.Hub) —
events.NewBroadcaster(nil) crashes inside hub.Broadcast in the
send_message_to_user path. Same shape every other handler test uses.
Verified `go test ./internal/handlers/ -run TestMCPHandler_SendMessage`
green; full vet clean.
Refs reno-stars production incident 2026-05-05.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Continues the OSS-shape refactor. After iters 4a-4d (rbac, delegation,
memory, messaging) the only behavior left in ``a2a_tools.py`` was
``report_activity`` plus three thin inbox-tool wrappers and the
``_enrich_inbound_for_agent`` helper. This iter extracts the inbox
slice to ``a2a_tools_inbox.py`` so the kitchen-sink module shrinks
from 280 LOC to ~165 LOC of imports + report_activity + back-compat
re-export blocks.
Extracted symbols:
- ``_INBOX_NOT_ENABLED_MSG`` (sentinel)
- ``_enrich_inbound_for_agent`` (poll-path peer enrichment helper)
- ``tool_inbox_peek``
- ``tool_inbox_pop``
- ``tool_wait_for_message``
Re-exports (`from a2a_tools_inbox import …`) preserve the public
``a2a_tools.tool_inbox_*`` surface so existing tests + call sites
continue to resolve unchanged.
New tests in test_a2a_tools_inbox_split.py:
1. **Drift gate (5)** — every previously-public symbol on a2a_tools
is the EXACT same object as a2a_tools_inbox.foo (`is`, not `==`),
catches a future "wrap with logging" refactor that silently loses
existing test coverage.
2. **Import contract (1)** — a2a_tools_inbox does NOT eagerly import
a2a_tools at module load. Pins the layered architecture: the
extracted slice depends on ``inbox`` + a lazy ``a2a_client``
import, never on the kitchen-sink that re-exports it.
3. **_enrich_inbound_for_agent branches (5)** — peer_id-empty
(canvas_user) returns dict unchanged; missing peer_id key same;
a2a_client unavailable (test harness, partial install) degrades
gracefully with a bare envelope; registry hit populates
peer_name + peer_role + agent_card_url; registry miss still
surfaces agent_card_url (constructable from peer_id alone).
The full timeout-clamp / validation / JSON-shape behavior matrix for
the three wrappers stays in test_a2a_tools_inbox_wrappers.py — those
tests pass identically against both the alias and the underlying impl.
Wiring updates:
- ``scripts/build_runtime_package.py``: add ``a2a_tools_inbox`` to
``TOP_LEVEL_MODULES`` so it ships in the runtime wheel and the
drift gate doesn't fail the next publish.
- ``.github/workflows/ci.yml``: add ``a2a_tools_inbox.py`` to
``CRITICAL_FILES`` so the 75% MCP/inbox/auth per-file floor
applies — this is now where the inbox-delivery code actually
lives.
Ryan's bug report (#2934) walked through ~45 min of debugging a stock
external-runtime install. This PR fixes the four items he flagged that
have a small surface, and stubs out the larger ones for follow-up.
Fixed in this PR
================
#1 — Python floor disclosure (README in publish bundle)
Add an explicit "Requires Python ≥3.11" section that calls out the
cryptic "Could not find a version that satisfies the requirement"
failure mode; recommend `pipx install` over `pip install` so the
binary lands on PATH automatically; show the explicit `pip install
--user` alternative with the PATH caveat.
#3 — MOLECULE_WORKSPACE_TOKEN_FILE support (mcp_workspace_resolver.py)
Add a third resolution step between the inline env var and the
in-container CONFIGS_DIR fallback. Operators can write the bearer to
a 0600 file (e.g. ~/.config/molecule/token) and point
MOLECULE_WORKSPACE_TOKEN_FILE at it, keeping the secret out of
~/.zsh_history and out of plaintext in MCP-host configs like
~/.claude.json. Inline TOKEN still wins on conflict so rotation flows
are predictable. README documents the safer option as the
recommended path. 6 new tests pin every leg (file resolves, inline
wins, missing/empty file falls through, blank env unset-equivalent,
help text advertises it).
#4 — Push delivery 3-condition gating (README in publish bundle)
Document that real-time push on Claude Code requires (a) the server
to declare experimental.claude/channel (we do), (b) the server to be
marketplace-plugin-sourced (operators must scaffold their own until
the official marketplace lands — see #2934 follow-up), and (c) the
--dangerously-load-development-channels flag on the claude
invocation. Until any of the three is in place, delivery silently
falls back to poll mode with no diagnostic. The README now says all
of this explicitly so a new operator doesn't grep the binary for
channel_enable to figure it out.
#8 — serverInfo.name mismatch (a2a_mcp_server.py)
The server reported `serverInfo.name = "a2a-delegation"` while
operators register it as `molecule` (the name in `claude mcp add
molecule …`). Harmless on tool routing today but matters for any
future Claude Code allowlist that gates push by hardcoded server
name. Renamed to "molecule" with an inline comment explaining the
invariant.
Deferred (separate issues to track)
===================================
#2 — covered transitively by #1's pipx recommendation; no separate fix.
#5 — `moleculesai/claude-code-plugin` marketplace repo (substantial new
repo work; the README references it as a documented follow-up).
#6 — `molecule-mcp doctor` subcommand (substantial new CLI surface;
mentioned in the README's push-vs-poll section as the planned
diagnostic for silent push fallback).
#7 — `--dangerously-load-development-channels` rename — not in our
control; that's Claude Code's flag.
Tests
=====
164/164 mcp_cli + a2a_mcp_server tests pass locally
(WORKSPACE_ID=00000000-0000-0000-0000-000000000001 pytest …) including
6 new TestTokenFileEnv cases. Wheel builds successfully via
scripts/build_runtime_package.py with the new README markers verified
in the output.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After RFC #2873 iter 4d extracted messaging tools to
``a2a_tools_messaging.py``, the only behavior left in ``a2a_tools.py``
is ``report_activity`` (covered by test_a2a_tools_impl) plus three
thin wrappers around inbox state — ``tool_inbox_peek``,
``tool_inbox_pop``, ``tool_wait_for_message`` — which were never
directly exercised at the module level.
Per-file critical-path coverage dropped to 54.4% on the iter 4d
branch, breaking the 75% MCP/inbox/auth floor in ci.yml.
Adds ``test_a2a_tools_inbox_wrappers.py`` — 14 focused tests on the
three wrappers covering: inbox-disabled fallback (via the
_INBOX_NOT_ENABLED_MSG sentinel), input validation
(empty/non-str activity_id, non-int peek limit), the timeout clamp
contract on wait_for_message (300s ceiling, 0s floor, non-numeric
fallback to 60s), JSON-shape pinning, and the limit/activity_id
forwarding contract.
Result: a2a_tools.py back to 100% covered with the existing impl-tests
suite, gate green.
Two related fixes to the Connect-External-Agent flow that the user
flagged: the "Need help?" disclosure block in the modal is for the
operator's eyes only — but the agent reading the pasted snippet has
no access to that context. And the docs URL was pointing at a
hostname that doesn't resolve.
User-visible problems:
1. The agent doesn't see the install link, docs link, or the common-
error/check pairs that the human pasted. When the agent fails to
register or hits ConnectionRefused, it can't self-diagnose because
the troubleshooting context lives in a separate UI block.
2. https://docs.molecule.ai → DNS NXDOMAIN. Every "Documentation"
link in the modal was a dead link.
## Fixes
### Move help INTO the snippet (not a separate human-only UI block)
Each of the 7 server-rendered templates in
`workspace-server/internal/handlers/external_connection.go` now
appends a `# Need help?` section with: install link, correct docs
link, and the top common errors as `# • symptom — check` pairs.
Templates updated: curl / channel (Claude Code) / mcp (Universal MCP) /
python / hermes / codex / openclaw. Agents reading the paste now have
the same diagnostic context the human did.
### Drop the duplicated UI block in the canvas modal
`canvas/src/components/ExternalConnectModal.tsx`:
- Removed the `TAB_HELP` per-tab metadata constant (152 lines).
- Removed the `HelpBlock` component (62 lines).
- Removed the `<HelpBlock help={TAB_HELP[tab]} />` render call.
The snippet is now the single source of truth for tab-level help.
### Fix the wrong docs hostname
The actual docs site is `doc.moleculesai.app` (singular `doc`,
`.app` not `.ai`), confirmed by:
- `package.json` description in `Molecule-AI/docs` repo →
"Molecule AI documentation site — doc.moleculesai.app"
- HTTP HEAD on the new URL → 200 for both
`/docs/guides/mcp-server-setup` and
`/docs/guides/external-agent-registration`
- HTTP HEAD on old `docs.molecule.ai` → 000 (NXDOMAIN)
All template docs URLs now point at `doc.moleculesai.app`.
## Verification
- `go build ./...` clean
- `go test ./internal/handlers/... -count=1` green
- `pnpm test` → 1291/1291 pass (unchanged)
- `tsc --noEmit` clean
- 219 LOC removed (canvas duplicate UI), 69 LOC added (snippet help)
- Net `-150 LOC` while gaining the agent-readable help
## Out of scope (deferred, captured in followups)
- One blog post still has `canonical: "https://docs.molecule.ai/blog/..."`
in `src/app/blog/2026-04-20-chrome-devtools-mcp/page.mdx` — separate
blog-content fix.
- Comment in `theme-provider.tsx` references `docs.moleculesai.app`
(with `s`) — comment-only, not a runtime URL.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User-reported friction: pip install molecule-ai-workspace-runtime on a
3.10 interpreter fails with "Could not find a version that satisfies the
requirement (from versions: none)" — pip's requires_python filter
silently drops the only available artifact before attempting install,
so the error doesn't mention Python at all. Operators see
"package missing", file a bug, and chase a phantom CDN/visibility
issue.
Two changes mirror the requirement at the two operator-touch surfaces:
1. workspace-server/internal/handlers/external_connection.go:
the externalUniversalMcpTemplate snippet (rendered into the
canvas Connect-External-Agent modal) now leads with a brief
"Requires Python >= 3.11" block + diagnostic + upgrade paths.
2. docs/workspace-runtime-package.md: same callout at the top of
the doc, before the Overview, so anyone landing here from search
gets the answer immediately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bot lint flagged the two imports as unused (correct — neither is
referenced after the file shrank during review). Resolves the two
unresolved review threads silently blocking merge per the staging
"all conversations resolved" gate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Iter 4c (#2890) moved tool_commit_memory + tool_recall_memory into
a2a_tools_memory.py, which has its own top-level `import httpx`.
test_mcp_memory.py + the secret-redact memory tests still patched
`a2a_tools.httpx.AsyncClient`, which after the move is the WRONG
module's reference — the real call inside the moved tool resolves to
`a2a_tools_memory.httpx.AsyncClient` and reaches the network. CI
catches this as 7 failures: JSONDecodeError on empty bodies and
"All connection attempts failed" on the recall side.
Update 7 patch sites to `a2a_tools_memory.httpx.AsyncClient`. The
existing tests in `test_a2a_tools_impl.py` were already updated by
the iter-4c PR; only these two files were missed.
Verified: pytest workspace/tests/test_mcp_memory.py +
test_secret_redact.py — 43/43 pass after the fix (both files were
red on the iter-4c branch CI).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three small hardening passes from #2872's optional/important findings,
batched into one polish PR:
1. errors.Is(err, sql.ErrNoRows) instead of err == sql.ErrNoRows.
The bare equality breaks if any future caller wraps the error via
fmt.Errorf("…: %w", err) — the no-rows happy path would fall
through to the "real DB error" branch and abort the import.
errors.Is unwraps. New test
TestLookupExistingChild_WrappedNoRows_TreatedAsNotFound pins the
fix; verified the test fails on the old `==` shape (build break
on unused-import + assertion failure once import dropped).
2. Bounded 5s timeout on lookupExistingChild instead of
context.Background().
The createWorkspaceTree call site runs in goroutines spawned from
the /org/import handler, so plumbing the request context here
would cascade-cancel into provisionWorkspaceAuto and abort
in-flight EC2 provisioning if the client disconnected mid-import
— that's the wrong tradeoff. A short bounded timeout protects the
per-row SELECT against a wedged DB without taking the
drop-everything-on-disconnect behaviour. The lookup is a single
~10ms query; 5s leaves 500x headroom for transient slow paths.
3. Godoc clarifications on the skip-path block.
- /org/import is ADDITIVE-ONLY, never destructive. Children
present in the existing tree but absent from the new template
are preserved (no DELETE on diff).
- Skip-path does NOT propagate updates to existing nodes — a
re-import that adds an initial_memory or schedule to an
existing workspace is silently dropped. Document the limitation
so future operators know to delete-and-re-import or reach for
a future /org/sync route.
Verification:
- go build ./... → clean
- go test ./internal/handlers/... → all passing (TestLookup* +
TestCreateWorkspaceTree* + TestClass1* + TestGate*)
- 4 lookup tests + 1 new wrap-safety test → 5/5 PASS
- Full handlers suite → green
Refs molecule-core#2872 (Optional findings — wrap-safety + ctx, godoc
clarifications for additive-only + skip-path-update-limitation)
Out of scope (deferred):
- PR-D partial unique index migration + ON CONFLICT — sequenced
after Phase 4 cleanup verified clean per #2872 plan
- PR-E full createWorkspaceTree integration test for partial-match
— needs heavier sqlmock scaffolding for downstream
workspaces_audit/canvas_layouts/secrets/channels INSERTs;
follow-up
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User-visible problem: agent-comms panel opens mid-conversation on long
histories (the same chat-opens-in-middle bug PR #2903 fixed for
my-chat) and silently renders empty state when the history fetch fails
(no retry button, no diagnostic).
Three changes mirror the my-chat patterns from ChatTab:
1. Initial-mount instant scroll.
Adds hasInitialScrollRef + switches the scroll hook from useEffect
to useLayoutEffect. First arrival of messages → scrollIntoView
`instant`; subsequent appends → `smooth` as before. useLayoutEffect
runs before paint so the user never sees the panel jump for one
frame on every append.
2. Error UI with Retry button.
Adds `loadError` state. The history-load .catch now sets the
error message; a new branch in the render renders a red alert
with the failure text and a Retry button that re-invokes
`loadInitial`. Same shape as ChatTab MyChatPanel's `loadError`
handling — both surfaces should fail loud, not silent.
3. Extracted `loadInitial` callback.
The history-load body becomes a useCallback so the retry button
has a stable reference to call. Mirrors ChatTab's loadInitial.
Tests (4 new in AgentCommsPanel.render.test.tsx):
- Loading state renders the loading copy.
- Error state with Retry button renders on rejection; clicking
Retry fires a second api.get.
- Empty state renders when load succeeds with zero rows.
- scrollIntoView is called with behavior=instant on first message
arrival (pins the chat-opens-in-middle prevention).
Verification:
- pnpm test → 1284/1284 pass (1280 prior + 4 new)
- tsc --noEmit → clean
- 92 → 93 test files, no existing test broken
Closes the parity gap raised in chat. The two surfaces now share:
loading copy / error UI / empty-state placeholder / scroll behaviour /
useLayoutEffect timing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers the user-visible flow that Phase 1-5b shipped (RFC #2891):
register a poll-mode workspace, POST a multi-file /chat/uploads, verify
the activity feed shows one chat_upload_receive row per file, fetch the
bytes via /pending-uploads/:fid/content, ack each row, and confirm a
post-ack fetch returns 404. Also pins cross-workspace bleed protection
(workspace B's bearer on A's URL → 401, B's URL with A's file_id →
404) and the file_id-UUID-parse 400 path.
23 assertions, all green against a local platform (Postgres+Redis+
platform-server stack matches the e2e-api.yml CI recipe verbatim).
Why a new script instead of extending test_poll_mode_e2e.sh: that
script tests A2A short-circuit + since_id cursor semantics; this one
tests the chat-upload path. They share zero handler code on the
platform side and would dilute each other's failure messages if
combined.
Why not the bearerless-401 strict-mode assertion: the platform's
wsauth fail-opens for bearerless requests when MOLECULE_ENV=development
(see middleware/devmode.go). The CI workflow doesn't set that var, but
some local-dev .env files do — the assertion would flap by environment
without testing the poll-mode upload contract. The middleware's own
unit tests cover strict-mode 401.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds class1_ast_gate_test.go — a per-package AST walk that fails the
build if any handler function INSERTs INTO workspaces inside a range
loop body without one of three escape hatches:
1. A call to a registered preflight helper (lookupExistingChild today;
extend preflightCallNames as new helpers are introduced).
2. An ON CONFLICT clause in the same SQL literal (idempotent UPSERT,
like registry.go).
3. An explicit `// class1-gate: idempotent-by-design` comment in the
function body (deliberately awkward — forces a code-review beat).
Why this is broader than the existing
TestCreateWorkspaceTree_CallsLookupBeforeInsert gate in
org_import_idempotency_test.go: that one is hard-coded to one function
in one file. This one walks every non-test .go file in the handlers
package and applies a structural rule independent of file/function
names. A future handler written from scratch in a new file would not
have been covered before — now it is.
Detection mechanism (per AST):
- Collect spans (Lbrace..Rbrace) of every RangeStmt body in each
function. Position-based instead of stack-based — ast.Inspect's
nil-callback ordering doesn't give per-node pop semantics, so a
naive push/pop stack silently miscounts. Position spans are
deterministic.
- Walk every BasicLit, regex-match `^\s*INSERT INTO workspaces\(`
(tightened from bytes.Index "INSERT INTO workspaces" so
workspaces_audit literals don't false-positive — same regex used
by the existing createWorkspaceTree gate).
- For each match: record insertLine, hasONCONFLICT, and the
innermost enclosing RangeStmt line (or 0 if not inside any range).
- Fail the function if INSERT is inside a range AND no preflight
AND no ON CONFLICT AND no allowlist annotation.
Self-tests (per `feedback_assert_exact_not_substring.md` —
verify gate fails on the bug shape before merging):
- TestClass1_GateFiresOnSyntheticBuggySource: synthetic source
where INSERT is inside `for _, child := range children` body
must trigger the gate's three guards (enclosingRangeLine!=0,
hasONCONFLICT=false, no preflight call).
- TestClass1_GateAllowsONCONFLICT: synthetic INSERT...ON CONFLICT
must NOT trigger the gate (idempotent UPSERT case).
- TestClass1_GateAllowsAllowlistAnnotation: function with
`// class1-gate: idempotent-by-design` must be skipped.
- TestClass1_NoUnpreflightedInsertInsideRange: production sweep
over every handler .go file. Currently passes because
org_import.go preflights, registry.go ON-CONFLICTs, and
workspace.go's Create has no INSERT inside a range body.
Verification:
- go test ./internal/handlers/... -run TestClass1_ -count=1
→ 4/4 PASS
- go test ./internal/handlers/... -count=1 → suite green
(no pre-existing test broken by the new file)
Refs molecule-core#2867 (PR-A Class 1 generic AST gate)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>