molecule-core

Author	SHA1	Message	Date
Hongming Wang	700d44ec3d	feat(mcp): multi-workspace routing for memory + chat_history + workspace_info PR-3 of the multi-workspace MCP rollout. PR-1 made the MCP server itself multi-workspace aware (one process, N workspace memberships). PR-2 added source_workspace_id threading to delegate_task / list_peers. This change closes the remaining workspace-scoped tools so a single agent registered into multiple workspaces no longer leaks memories or chat history across tenants. Tools now accepting `source_workspace_id`: - tool_commit_memory(content, scope, source_workspace_id=None) — routes POST to /workspaces/{src}/memories with the source workspace's Bearer token. Body still embeds source_workspace_id for the platform's audit + namespace-isolation enforcement. - tool_recall_memory(query, scope, source_workspace_id=None) — GET /workspaces/{src}/memories with the source workspace's token and ?workspace_id={src} query so the platform scopes the read to the caller's tenant view (PR-1 / multi-workspace mode). - tool_chat_history(peer_id, limit, before_ts, source_workspace_id=None) — auto-routes via the _peer_to_source cache populated by list_peers, with explicit override winning. Falls back to module-level WORKSPACE_ID if neither is available. URL: /workspaces/{src}/chat-history. - tool_get_workspace_info(source_workspace_id=None) — GET /workspaces/{src} with the source workspace's token. Useful for introspecting any workspace the agent is registered into, not just the primary. In every path, `src = source_workspace_id or WORKSPACE_ID`, so single-workspace operators see no behavior change. Tokens are resolved per-workspace via auth_headers(src) / _auth_headers_for_heartbeat(src), which fall through to the legacy AUTH_TOKEN env when not in multi-workspace mode. Also updates input_schemas in platform_tools/registry.py so the new optional parameter is advertised to LLM clients (claude-code, hermes-agent, langchain wrappers). Tests (4 new classes in test_a2a_multi_workspace.py, 21 new tests): - TestCommitMemorySourceRouting — URL + Authorization header per source - TestRecallMemorySourceRouting — URL + query param + Authorization - TestChatHistorySourceRouting — peer-cache auto-route + explicit override - TestGetWorkspaceInfoSourceRouting — URL + Authorization Inbox tools (peek/pop/wait_for_message) already multi-workspace aware since PR-1 — inbox.py spawns per-workspace pollers and tags every InboxMessage with arrival_workspace_id. No further plumbing needed. Suite: 1700 passed, 3 skipped, 2 xfailed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 14:17:58 -07:00
Hongming Wang	1161b97faf	feat(mcp): cross-workspace delegation routing (multi-ws PR-2) PR-2 of the multi-workspace external-agent stack. PR-1 (#2739) landed per-workspace auth + heartbeat + inbox. This PR threads ``source_workspace_id`` through the A2A client + tool surface so an agent registered against multiple workspaces can list peers across all of them and delegate from a specific source. Changes ------- * ``a2a_client``: ``discover_peer``, ``send_a2a_message``, ``get_peers_with_diagnostic``, and ``enrich_peer_metadata`` now accept ``source_workspace_id``. Routing uses it for both the X-Workspace-ID header and (transitively, via ``auth_headers(src)``) the bearer token. Defaults to module-level WORKSPACE_ID for back-compat. * ``a2a_client._peer_to_source``: a new lock-free cache mapping each discovered peer back to the source workspace whose registry surfaced it. ``tool_list_peers`` populates the cache on every call; ``tool_delegate_task`` consults it for auto-routing. * ``a2a_tools.tool_list_peers(source_workspace_id=None)``: when multiple workspaces are registered (MOLECULE_WORKSPACES) and no explicit source is passed, aggregates peers across every registered workspace and tags each entry with ``via: <src[:8]>``. Single-workspace mode is unchanged — no ``via:`` annotation, same output shape. * ``a2a_tools.tool_delegate_task`` and ``tool_delegate_task_async`` resolve source via ``source_workspace_id arg → _peer_to_source[target] → WORKSPACE_ID``. Agents almost never need to specify ``source_`` explicitly — call ``list_peers`` first and the cache handles the rest. ``tool_delegate_task_async`` idempotency key now includes the source workspace, so the same task delegated from two registered workspaces produces two distinct delegations (the right behavior — one per tenant audit trail). * ``platform_auth.list_registered_workspaces()``: new helper for the tool layer to enumerate the multi-ws registry. Lock-free reads matched by the existing single-writer-per-workspace contract from PR-1. * ``platform_auth.self_source_headers``: now passes ``workspace_id`` through to ``auth_headers`` — without this, a multi-workspace POST source-tagged with ``X-Workspace-ID=ws_b`` was authenticating with ws_a's token (or no token if MOLECULE_WORKSPACE_TOKEN unset). Latent PR-1 bug exposed by the new tool surface. * ``a2a_mcp_server`` tool dispatch passes ``source_workspace_id`` from the tool call arguments. * ``platform_tools.registry``: add ``source_workspace_id`` to the delegate_task, delegate_task_async, check_task_status, list_peers input schemas with copy explaining when to use it (rarely — the cache handles it). Tests (15 new, all passing) --------------------------- ``test_a2a_multi_workspace.py``: * TestDiscoverPeerSourceRouting (3): src arg drives header+token, fallback to module ws when omitted, invalid target short-circuits before any HTTP attempt. * TestSendA2AMessageSourceRouting (1): X-Workspace-ID source header + Authorization bearer both come from the source arg via the patched self_source_headers chain. * TestGetPeersSourceRouting (1): URL path AND headers use the source workspace id. * TestToolListPeersAggregation (4): aggregates across multiple registered workspaces, tags origin, leaves single-workspace path unchanged, explicit src arg overrides aggregation, diagnostic joining when every workspace returns empty. * TestToolDelegateTaskAutoRouting (3): cache-driven auto-route, explicit override beats cache, single-workspace fallback to module WORKSPACE_ID. * TestListRegisteredWorkspaces (3): registry enumeration helper. Plus ``tests/snapshots/a2a_instructions_mcp.txt`` regenerated to absorb the new ``source_workspace_id`` schema entries. Back-compat ----------- Every change defaults ``source_workspace_id=None``; legacy single-workspace operators (no MOLECULE_WORKSPACES) see identical behavior — same URLs, same headers, same tool output. The 24 PR-1 tests + 125 existing A2A tests all still pass. Out of scope (PR-3) ------------------- Memory namespacing per registered workspace lands after the new memory system v2 PR (#2740) settles in production. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:32:24 -07:00
Hongming Wang	829ab66462	mcp: support multi-workspace external-agent registration (PR-1) External MCP agents (e.g. Claude Code installed on a company PC) can now register against MULTIPLE workspaces from a single process — the agent participates as a peer in workspace A (company) AND workspace B (personal) simultaneously, with one merged inbox tagged so replies route to the correct tenant. Use case (verbatim from operator): "I have this computer AI thats in company's PC, he is going to be put in company's workspace, but personally, I want to register it to my own workspace as well, so that I can talk to it and asking him to do work." ## What changed Wire format — new env var: MOLECULE_WORKSPACES='[ {"id":"<company-wsid>","token":"<company-tok>"}, {"id":"<personal-wsid>","token":"<personal-tok>"} ]' When set, mcp_cli iterates the array and spawns one (register + heartbeat + inbox poller) trio per workspace. Single-workspace mode (WORKSPACE_ID + MOLECULE_WORKSPACE_TOKEN) is unchanged — every existing operator's setup keeps working bit-for-bit. Per-workspace token registry (platform_auth.py): register_workspace_token(wsid, tok) — populated by mcp_cli once per workspace before any thread spawns; thread-safe registration + lock-free reads on the hot path. auth_headers(workspace_id=...) routes to the per-workspace token; auth_headers() with no arg uses the legacy resolution path unchanged (back-compat). Per-workspace inbox cursors (inbox.py): InboxState now supports cursor_paths={wsid: Path,...}. Each poller advances its own cursor — one workspace's slow poll can't stall another, and a 410 only resets the affected workspace's cursor. Single-workspace constructor (cursor_path=Path(...)) still works exactly as before via __post_init__ promotion to the empty-string key. Cursor filenames disambiguated by workspace_id[:8] when multi-workspace; single-workspace keeps the legacy filename so upgrade doesn't invalidate on-disk state. Arrival workspace tagging (inbox.py): InboxMessage.arrival_workspace_id — tells the agent which OF ITS workspaces the inbound message arrived on. Set by the poller from the cursor key. to_dict() omits the field when empty so single- workspace consumers see no shape change. Reply routing (a2a_tools.py + a2a_mcp_server.py + registry.py): send_message_to_user(workspace_id=...) — optional override that selects which workspace's /notify endpoint to POST to (and which token authenticates). Multi-workspace agents pass the inbound message's arrival_workspace_id; single-workspace agents omit it and route to the only registered workspace via the legacy URL. ## Out of scope (future PRs) - PR-2: cross-workspace delegation auto-routing — when an agent receives a request from personal-ws "delegate to ops-bot" and ops-bot lives in company-ws, the agent should auto-pick its company-ws identity for the outbound delegate_task. Today the agent must pass via_workspace explicitly (or fall through to primary workspace). - PR-3: memory namespacing — commit_memory() still writes to the primary workspace's memory regardless of inbound context. Will revisit when the new memory system (PR #2733 just landed) settles. ## Tests workspace/tests/test_mcp_cli_multi_workspace.py — 24 new tests: * MOLECULE_WORKSPACES JSON parsing (valid + 6 error shapes) * Token registry register / lookup / rotation / clear * auth_headers routing by workspace_id with legacy fallback * Per-workspace cursor save/load/reset isolation * arrival_workspace_id present-when-set, omitted-when-empty * default_cursor_path namespacing All 110 pre-existing tests in test_mcp_cli.py / test_inbox.py / test_platform_auth.py still pass — back-compat is mechanical. Refs: project memory entry "External agent multi-workspace registration", design questions answered 2026-05-04 by user (JSON env var; explicit memory writes deferred to PR-3). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:06:00 -07:00
Hongming Wang	09e99a09c6	feat(a2a-mcp): add chat_history tool for prior turns with a peer When a peer_agent push lands and the agent needs context from prior turns with that workspace ("what task did this peer assign me last hour?", "what did I tell them?"), the only options today are re-deriving from memory (lossy) or scrolling activity_logs in the canvas (no agent-facing tool). Surface the platform's existing audit log directly via a new MCP tool so agents can read both sides of an A2A conversation in chronological order. Implementation: - a2a_tools.py: new tool_chat_history(peer_id, limit=20, before_ts="") hits /workspaces/<self>/activity?peer_id=X&limit=N (the new server filter from molecule-core#2472). Reverses the DESC response into chronological order so the agent reads top-down. Graceful error envelope on validation/network/non-200 — never crashes the MCP server, agent can branch on Error: prefix. - platform_tools/registry.py: ToolSpec wired into the A2A section so the rendered system-prompt block automatically includes it. Same pattern as the existing inbox_peek/inbox_pop/wait_for_message. - a2a_mcp_server.py: dispatch in handle_tool_call. - executor_helpers.py: _CLI_A2A_COMMAND_KEYWORDS gets a None entry (CLI runtimes don't expose chat history today; flip to a keyword when a2a_cli grows a `history` subcommand). - snapshots/a2a_instructions_mcp.txt regenerated. Tests: 10 new branches in TestChatHistory (validation / param forwarding / limit cap / before_ts pass-through / DESC→chronological reorder / 400 verbatim / 500 generic / network exc / non-list resp). Mutation-verified: reverting a2a_tools.py fails 10/10. Full test suite remains green at 1516 passed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 17:54:23 -07:00
Hongming Wang	b47d4ceb00	feat(workspace-runtime): add inbox polling for standalone molecule-mcp path The universal MCP server (a2a_mcp_server.py) was outbound-only — agents in standalone runtimes (Claude Code, hermes, codex, etc.) could delegate, list peers, and write memories, but never observed the canvas-user or peer-agent messages addressed to them. This blocked "constantly responding" loops without forcing operators back onto a runtime-specific channel plugin. This PR closes the inbound gap with a poller-fed in-memory queue and three new MCP tools: - wait_for_message(timeout_secs?) — block until next message arrives - inbox_peek(limit?) — list pending messages (non-destructive) - inbox_pop(activity_id) — drop a handled message A daemon thread polls /workspaces/:id/activity?type=a2a_receive every 5s, fills the queue from the cursor (since_id), and persists the cursor to ${CONFIGS_DIR}/.mcp_inbox_cursor so a restart doesn't replay backlog. On 410 (cursor pruned) we fall back to since_secs=600 for a bounded recovery window. Activity-row → InboxMessage extraction mirrors the molecule-mcp-claude-channel plugin's extractText (envelope shapes #1-3 + summary fallback). mcp_cli.main starts the poller alongside the existing register + heartbeat threads. In-container runtimes (which have push delivery via canvas WebSocket) skip activation, so inbox tools return an informational "(inbox not enabled)" message instead of double-delivery. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 16:32:48 -07:00
hongmingwang-moleculeai	a18d116606	Merge pull request #2261 from Molecule-AI/fix/harness-cleanup-failed-event harness: SaaS routing + provider-agnostic config for RFC #2251 measurement	2026-04-29 05:35:43 +00:00
Hongming Wang	00e4766046	docs: registry pattern + harness scripts READMEs Two docs covering load-bearing patterns from today's work that weren't previously discoverable: 1. workspace/platform_tools/README.md — explains the ToolSpec single-source-of-truth pattern (#2240), the CLI-block alignment gap that hand-maintained generation can't close (#2258), the snapshot golden files + LF-pinning (#2260), and the add/rename/ remove playbook. The next reader who lands in workspace/platform_tools/ now has the design rationale + the safe-edit procedure colocated with the code. 2. scripts/README.md — disambiguates the three measure-coordinator- task-bounds.sh files that now exist across two repos: - scripts/measure-coordinator-task-bounds.sh (canonical OSS, this repo) - scripts/measure-coordinator-task-bounds-runner.sh (Hermes/MiniMax variant, this repo) - scripts/measure-coordinator-task-bounds.sh (production-shape, in molecule-controlplane) Cross-references reference_harness_pair_pattern (auto-memory) for the cross-repo design rationale. Documents the common safety pattern (cleanup trap, DRY_RUN, non-target guard, cleanup_*_failed events) and the heartbeat-trace caveat. Refs: #2240, #2254, #2257, #2258, #2259, #2260; molecule-controlplane#321. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 22:19:40 -07:00
Hongming Wang	ddf6720498	chore(registry): snapshot tests + CLI-block alignment for #2240 Two follow-ups from the #2240 code review: 1. Snapshot tests for the rendered tool-instruction blocks. The structural tests added in #2240 guarantee tool NAMES are present; these new tests pin the SHAPE — bullet ordering, heading style, footer placement — so a future contributor who reorders fields in `_render_section` or rewrites a `when_to_use` paragraph sees the diff in CI rather than shipping a silently-different system prompt. Golden files live under workspace/tests/snapshots/. 2. CLI-block alignment test + corrected source-of-truth comment. `_A2A_INSTRUCTIONS_CLI` is a separate hand-maintained surface for ollama and other non-MCP runtimes — the registry can't auto-generate it because the CLI subprocess interface uses different command shapes (`peers` vs `list_peers`, etc.). A new `_CLI_A2A_COMMAND_KEYWORDS` mapping declares the registry-tool → CLI-keyword correspondence (or explicit `None` for tools not exposed via subprocess). Two tests enforce coverage: - every a2a tool in the registry is keyed in the mapping - every non-None subcommand keyword literally appears in `_A2A_INSTRUCTIONS_CLI` Caught one real gap: `send_message_to_user` is in the registry but has no CLI subcommand. Mapped to `None` with an explanatory comment. The "no other source of truth" claim in registry.py's docstring was wrong post-#2240 (the CLI block survived) — corrected to describe the two surfaces explicitly and point at the alignment tests as the gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 20:42:15 -07:00
Hongming Wang	e9a59cda3b	feat(platform): single-source-of-truth tool registry — adapters consume, no drift Establishes workspace/platform_tools/registry.py as THE place tool naming and docs live. Every consumer reads from it; nothing duplicates the source. Closes the architectural gap behind the doc/tool drift discussion 2026-04-28 — adding hundreds of future runtime SDK adapters should not require touching tool names anywhere except the registry. What the registry owns ToolSpec dataclass with: name, short (one-line description), when_to_use (multi-paragraph agent-facing usage guidance), input_schema (JSON Schema), impl (the actual coroutine in a2a_tools.py), section ('a2a' \| 'memory'). TOOLS list with 8 entries — delegate_task, delegate_task_async, check_task_status, list_peers, get_workspace_info, send_message_to_user, commit_memory, recall_memory. What now reads from the registry - workspace/a2a_mcp_server.py The hardcoded TOOLS list (167 lines of hand-maintained dicts) is gone. Replaced with a 6-line list comprehension over the registry. MCP description = spec.short. inputSchema = spec.input_schema. - workspace/executor_helpers.py get_a2a_instructions(mcp=True) and get_hma_instructions() now GENERATE the agent-facing system-prompt text from the registry. Heading + per-tool bullet (spec.short) + per-tool when_to_use + a section-specific footer. No more hand-maintained instruction blocks that drift from reality. - workspace/builtin_tools/delegation.py Renamed delegate_to_workspace -> delegate_task_async to match registry. check_delegation_status -> check_task_status. Added sync delegate_task @tool wrapping a2a_tools.tool_delegate_task (was missing for LangChain runtimes — CP review Issue 3). - workspace/builtin_tools/memory.py Renamed search_memory -> recall_memory to match registry. - workspace/adapter_base.py, workspace/main.py Bundle all 7 core tools (was 6) into all_tools / base_tools. - workspace/coordinator.py, shared_runtime.py, policies/routing.py Updated system-prompt-text references to use the registry names. Structural alignment tests workspace/tests/test_platform_tools.py — 9 tests pin every registry-to-adapter mapping: - registry names are unique - a2a + memory partition is complete (no orphans) - by_name lookup works - MCP server registers exactly the registry's tool set - MCP description equals registry.short for every tool - MCP inputSchema equals registry.input_schema for every tool - get_a2a_instructions text contains every a2a tool name - get_hma_instructions text contains every memory tool name - pre-rename names (delegate_to_workspace, search_memory, check_delegation_status) cannot leak back Adding a future tool means adding one ToolSpec; the test failure list tells the author exactly which adapter to update. Adapter pattern for future SDK support When (e.g.) AutoGen or Pydantic AI gets adapters, the only work needed for tool surfacing is "wrap registry.TOOLS in your SDK's tool format." Names, descriptions, schemas, impl come from the registry — adapter author writes zero strings. Why this needed to ship now PR #2237 (already in staging) injected MCP-world docs as the default system-prompt content. Without the registry, those docs said "delegate_task" while LangChain runtimes only had "delegate_to_workspace" — workers see docs for tools that don't exist (CP review Issue 1+3). PR #2239 was a tactical rename; this PR is the structural fix that prevents the same class of drift from recurring as new adapters ship. PR #2239 was closed in favor of this — same renames, plus the registry, plus structural tests. Single coherent change. Tests: 1232 pass, 2 xfailed (pre-existing). 9 new in test_platform_tools.py; 4 alignment tests in test_prompt.py from #2237 still pass; original test_executor_helpers tests adapted to the registry-driven world. Refs: CP review Issues 1, 2, 3, 5; project memory project_runtime_native_pluggable.md (platform owns A2A); project memory feedback_doc_tool_alignment.md (this is the structural fix for the tactical lesson).	2026-04-28 17:11:36 -07:00

9 Commits