molecule-core

Author	SHA1	Message	Date
Hongming Wang	e72f9ad107	refactor(workspace): extract delegation handlers from a2a_tools.py to a2a_tools_delegation.py (RFC #2873 iter 4b) Second slice of the a2a_tools.py split (stacked on iter 4a). Owns the three delegation MCP tools + the RFC #2829 PR-5 sync-via-polling helper they share: * tool_delegate_task — synchronous delegation * tool_delegate_task_async — fire-and-forget * tool_check_task_status — poll the platform's /delegations log * _delegate_sync_via_polling — durable async + poll for terminal status * _SYNC_POLL_INTERVAL_S / _SYNC_POLL_BUDGET_S constants a2a_tools.py shrinks from 915 → 609 LOC (−306). Stacked on iter 4a's RBAC extraction; uses `from a2a_tools_rbac import auth_headers_for_heartbeat` as its auth-header source. The lazy `from a2a_tools import report_activity` inside tool_delegate_task breaks the circular-import cycle (a2a_tools imports the delegation re-exports at module-load; delegation handler needs report_activity at CALL time). A dedicated test pins this contract. Tests: * 77 existing test_a2a_tools_impl.py tests pass after retargeting 20 patch sites in TestToolDelegateTask + TestToolDelegateTaskAsync + TestToolCheckTaskStatus from `a2a_tools.foo` to `a2a_tools_delegation.foo` (foo ∈ {discover_peer, send_a2a_message, httpx.AsyncClient}). The patches need to target the new module because that's where the call sites live now. * test_a2a_tools_delegation.py adds 8 new tests: - 6 alias drift gates (`a2a_tools.tool_delegate_task is …`) - 2 import-contract tests (no top-level circular dep + a2a_tools surfaces every delegation symbol) - 1 sync-poll budget invariant 113 tests total (77 impl + 28 rbac + 8 delegation), all green. Refs RFC #2873.	2026-05-05 05:00:52 -07:00
Hongming Wang	0c461eb9f1	refactor(workspace): extract RBAC helpers from a2a_tools.py to a2a_tools_rbac.py (RFC #2873 iter 4a) First slice of the a2a_tools.py (991 LOC) split — single-concern module for the workspace's RBAC + auth-header layer: * _ROLE_PERMISSIONS canonical table * _get_workspace_tier * _check_memory_write_permission * _check_memory_read_permission * _is_root_workspace * _auth_headers_for_heartbeat a2a_tools.py shrinks from 991 → 915 LOC. Internal call sites (15 references) work unchanged because the bare names are re-imported at module-level — Python's local-then-module name resolution still finds them in a2a_tools's namespace, so existing tests' patch("a2a_tools._foo", …) keeps working. The RBAC layer can now evolve independently of the 18 tool handlers. Adding a new role or capability action touches one file, not the kitchen-sink module. Tests: * 77 existing test_a2a_tools_impl.py pass unchanged. * test_a2a_tools_rbac.py adds 28 focused tests: - 6 alias drift-gate tests (`_foo is rbac.foo`) - 4 get_workspace_tier env+config branches - 2 is_root_workspace tier branches - 6 check_memory_write_permission roles + override branches - 3 check_memory_read_permission scenarios - 3 auth_headers_for_heartbeat platform_auth branches - 4 ROLE_PERMISSIONS table invariants * Direct coverage for the helper module (was previously only exercised through 991-LOC tool-handler tests). Refs RFC #2873.	2026-05-05 04:43:16 -07:00
Hongming Wang	5950d4cd81	feat(delegations): agent-side cutover — sync delegate uses async+poll path (RFC #2829 PR-5) Behind feature flag DELEGATION_SYNC_VIA_INBOX (default off). When set, tool_delegate_task no longer holds an HTTP message/send connection through the platform proxy waiting for the callee's reply. Instead: 1. POST /workspaces/<src>/delegate (returns 202 + delegation_id) — platform's executeDelegation goroutine handles A2A dispatch in the background. No client-side timeout dependency on the platform holding a connection open. 2. Poll GET /workspaces/<src>/delegations every 3s for a row with matching delegation_id reaching terminal status (completed/failed). 3. Return the response_preview text on completed; surface the wrapped _A2A_ERROR_PREFIX error on failed (so caller error detection stays unchanged). This closes the bug class that broke Hongming's home hermes on 2026-05-05 ("message/send queued but result not available after 600s timeout" while the callee was actively heartbeating "iteration 14/90"). ## Compatibility Default-off feature flag — flag-off path is byte-identical to the legacy send_a2a_message behavior, pinned by TestFlagOffLegacyPath::test_flag_off_uses_send_a2a_message_not_polling. Idempotency-key derivation matches tool_delegate_task_async (SHA-256 of source:target:task) so a restart-mid-delegation gets the same key and the platform returns the existing delegation_id. ## Recovery on timeout If the polling budget (DELEGATION_TIMEOUT, default 300s) elapses without a terminal status, the error message includes the delegation_id + a "call check_task_status('<id>') to retrieve later" hint. The platform's durable row is still live — work is NOT lost, just the synchronous wait is over. Caller can poll for the result later via the existing check_task_status tool. ## Stack with PR-2 PR-2 added the SERVER-SIDE result-push to the caller's a2a_receive inbox row. PR-5 (this PR) adds the AGENT-SIDE cutover. Together they remove the proxy-blocked sync path entirely. PR-2 default-off keeps existing behavior; PR-5 default-off keeps existing behavior. Operators flip both for full effect after staging burn-in. ## Coverage 9 unit tests: - flag off → byte-identical to legacy (send_a2a_message called, _delegate_sync_via_polling NOT called) - dispatch HTTP exception → wrapped error - dispatch non-2xx → wrapped error mentioning HTTP code - dispatch missing delegation_id → wrapped error - completed first poll → response_preview returned - failed status → wrapped error with error_detail - transient poll error → keeps polling, eventually succeeds - deadline exceeded → wrapped timeout error mentions delegation_id + check_task_status hint for recovery - filters by delegation_id (other delegations' rows ignored) All passing locally. CI will run the same suite on a clean env. Refs RFC #2829.	2026-05-04 21:31:11 -07:00
Hongming Wang	700d44ec3d	feat(mcp): multi-workspace routing for memory + chat_history + workspace_info PR-3 of the multi-workspace MCP rollout. PR-1 made the MCP server itself multi-workspace aware (one process, N workspace memberships). PR-2 added source_workspace_id threading to delegate_task / list_peers. This change closes the remaining workspace-scoped tools so a single agent registered into multiple workspaces no longer leaks memories or chat history across tenants. Tools now accepting `source_workspace_id`: - tool_commit_memory(content, scope, source_workspace_id=None) — routes POST to /workspaces/{src}/memories with the source workspace's Bearer token. Body still embeds source_workspace_id for the platform's audit + namespace-isolation enforcement. - tool_recall_memory(query, scope, source_workspace_id=None) — GET /workspaces/{src}/memories with the source workspace's token and ?workspace_id={src} query so the platform scopes the read to the caller's tenant view (PR-1 / multi-workspace mode). - tool_chat_history(peer_id, limit, before_ts, source_workspace_id=None) — auto-routes via the _peer_to_source cache populated by list_peers, with explicit override winning. Falls back to module-level WORKSPACE_ID if neither is available. URL: /workspaces/{src}/chat-history. - tool_get_workspace_info(source_workspace_id=None) — GET /workspaces/{src} with the source workspace's token. Useful for introspecting any workspace the agent is registered into, not just the primary. In every path, `src = source_workspace_id or WORKSPACE_ID`, so single-workspace operators see no behavior change. Tokens are resolved per-workspace via auth_headers(src) / _auth_headers_for_heartbeat(src), which fall through to the legacy AUTH_TOKEN env when not in multi-workspace mode. Also updates input_schemas in platform_tools/registry.py so the new optional parameter is advertised to LLM clients (claude-code, hermes-agent, langchain wrappers). Tests (4 new classes in test_a2a_multi_workspace.py, 21 new tests): - TestCommitMemorySourceRouting — URL + Authorization header per source - TestRecallMemorySourceRouting — URL + query param + Authorization - TestChatHistorySourceRouting — peer-cache auto-route + explicit override - TestGetWorkspaceInfoSourceRouting — URL + Authorization Inbox tools (peek/pop/wait_for_message) already multi-workspace aware since PR-1 — inbox.py spawns per-workspace pollers and tags every InboxMessage with arrival_workspace_id. No further plumbing needed. Suite: 1700 passed, 3 skipped, 2 xfailed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 14:17:58 -07:00
Hongming Wang	1161b97faf	feat(mcp): cross-workspace delegation routing (multi-ws PR-2) PR-2 of the multi-workspace external-agent stack. PR-1 (#2739) landed per-workspace auth + heartbeat + inbox. This PR threads ``source_workspace_id`` through the A2A client + tool surface so an agent registered against multiple workspaces can list peers across all of them and delegate from a specific source. Changes ------- * ``a2a_client``: ``discover_peer``, ``send_a2a_message``, ``get_peers_with_diagnostic``, and ``enrich_peer_metadata`` now accept ``source_workspace_id``. Routing uses it for both the X-Workspace-ID header and (transitively, via ``auth_headers(src)``) the bearer token. Defaults to module-level WORKSPACE_ID for back-compat. * ``a2a_client._peer_to_source``: a new lock-free cache mapping each discovered peer back to the source workspace whose registry surfaced it. ``tool_list_peers`` populates the cache on every call; ``tool_delegate_task`` consults it for auto-routing. * ``a2a_tools.tool_list_peers(source_workspace_id=None)``: when multiple workspaces are registered (MOLECULE_WORKSPACES) and no explicit source is passed, aggregates peers across every registered workspace and tags each entry with ``via: <src[:8]>``. Single-workspace mode is unchanged — no ``via:`` annotation, same output shape. * ``a2a_tools.tool_delegate_task`` and ``tool_delegate_task_async`` resolve source via ``source_workspace_id arg → _peer_to_source[target] → WORKSPACE_ID``. Agents almost never need to specify ``source_`` explicitly — call ``list_peers`` first and the cache handles the rest. ``tool_delegate_task_async`` idempotency key now includes the source workspace, so the same task delegated from two registered workspaces produces two distinct delegations (the right behavior — one per tenant audit trail). * ``platform_auth.list_registered_workspaces()``: new helper for the tool layer to enumerate the multi-ws registry. Lock-free reads matched by the existing single-writer-per-workspace contract from PR-1. * ``platform_auth.self_source_headers``: now passes ``workspace_id`` through to ``auth_headers`` — without this, a multi-workspace POST source-tagged with ``X-Workspace-ID=ws_b`` was authenticating with ws_a's token (or no token if MOLECULE_WORKSPACE_TOKEN unset). Latent PR-1 bug exposed by the new tool surface. * ``a2a_mcp_server`` tool dispatch passes ``source_workspace_id`` from the tool call arguments. * ``platform_tools.registry``: add ``source_workspace_id`` to the delegate_task, delegate_task_async, check_task_status, list_peers input schemas with copy explaining when to use it (rarely — the cache handles it). Tests (15 new, all passing) --------------------------- ``test_a2a_multi_workspace.py``: * TestDiscoverPeerSourceRouting (3): src arg drives header+token, fallback to module ws when omitted, invalid target short-circuits before any HTTP attempt. * TestSendA2AMessageSourceRouting (1): X-Workspace-ID source header + Authorization bearer both come from the source arg via the patched self_source_headers chain. * TestGetPeersSourceRouting (1): URL path AND headers use the source workspace id. * TestToolListPeersAggregation (4): aggregates across multiple registered workspaces, tags origin, leaves single-workspace path unchanged, explicit src arg overrides aggregation, diagnostic joining when every workspace returns empty. * TestToolDelegateTaskAutoRouting (3): cache-driven auto-route, explicit override beats cache, single-workspace fallback to module WORKSPACE_ID. * TestListRegisteredWorkspaces (3): registry enumeration helper. Plus ``tests/snapshots/a2a_instructions_mcp.txt`` regenerated to absorb the new ``source_workspace_id`` schema entries. Back-compat ----------- Every change defaults ``source_workspace_id=None``; legacy single-workspace operators (no MOLECULE_WORKSPACES) see identical behavior — same URLs, same headers, same tool output. The 24 PR-1 tests + 125 existing A2A tests all still pass. Out of scope (PR-3) ------------------- Memory namespacing per registered workspace lands after the new memory system v2 PR (#2740) settles in production. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:32:24 -07:00
Hongming Wang	829ab66462	mcp: support multi-workspace external-agent registration (PR-1) External MCP agents (e.g. Claude Code installed on a company PC) can now register against MULTIPLE workspaces from a single process — the agent participates as a peer in workspace A (company) AND workspace B (personal) simultaneously, with one merged inbox tagged so replies route to the correct tenant. Use case (verbatim from operator): "I have this computer AI thats in company's PC, he is going to be put in company's workspace, but personally, I want to register it to my own workspace as well, so that I can talk to it and asking him to do work." ## What changed Wire format — new env var: MOLECULE_WORKSPACES='[ {"id":"<company-wsid>","token":"<company-tok>"}, {"id":"<personal-wsid>","token":"<personal-tok>"} ]' When set, mcp_cli iterates the array and spawns one (register + heartbeat + inbox poller) trio per workspace. Single-workspace mode (WORKSPACE_ID + MOLECULE_WORKSPACE_TOKEN) is unchanged — every existing operator's setup keeps working bit-for-bit. Per-workspace token registry (platform_auth.py): register_workspace_token(wsid, tok) — populated by mcp_cli once per workspace before any thread spawns; thread-safe registration + lock-free reads on the hot path. auth_headers(workspace_id=...) routes to the per-workspace token; auth_headers() with no arg uses the legacy resolution path unchanged (back-compat). Per-workspace inbox cursors (inbox.py): InboxState now supports cursor_paths={wsid: Path,...}. Each poller advances its own cursor — one workspace's slow poll can't stall another, and a 410 only resets the affected workspace's cursor. Single-workspace constructor (cursor_path=Path(...)) still works exactly as before via __post_init__ promotion to the empty-string key. Cursor filenames disambiguated by workspace_id[:8] when multi-workspace; single-workspace keeps the legacy filename so upgrade doesn't invalidate on-disk state. Arrival workspace tagging (inbox.py): InboxMessage.arrival_workspace_id — tells the agent which OF ITS workspaces the inbound message arrived on. Set by the poller from the cursor key. to_dict() omits the field when empty so single- workspace consumers see no shape change. Reply routing (a2a_tools.py + a2a_mcp_server.py + registry.py): send_message_to_user(workspace_id=...) — optional override that selects which workspace's /notify endpoint to POST to (and which token authenticates). Multi-workspace agents pass the inbound message's arrival_workspace_id; single-workspace agents omit it and route to the only registered workspace via the legacy URL. ## Out of scope (future PRs) - PR-2: cross-workspace delegation auto-routing — when an agent receives a request from personal-ws "delegate to ops-bot" and ops-bot lives in company-ws, the agent should auto-pick its company-ws identity for the outbound delegate_task. Today the agent must pass via_workspace explicitly (or fall through to primary workspace). - PR-3: memory namespacing — commit_memory() still writes to the primary workspace's memory regardless of inbound context. Will revisit when the new memory system (PR #2733 just landed) settles. ## Tests workspace/tests/test_mcp_cli_multi_workspace.py — 24 new tests: * MOLECULE_WORKSPACES JSON parsing (valid + 6 error shapes) * Token registry register / lookup / rotation / clear * auth_headers routing by workspace_id with legacy fallback * Per-workspace cursor save/load/reset isolation * arrival_workspace_id present-when-set, omitted-when-empty * default_cursor_path namespacing All 110 pre-existing tests in test_mcp_cli.py / test_inbox.py / test_platform_auth.py still pass — back-compat is mechanical. Refs: project memory entry "External agent multi-workspace registration", design questions answered 2026-05-04 by user (JSON env var; explicit memory writes deferred to PR-3). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:06:00 -07:00
Hongming Wang	ff3dcd37f6	fix(chat-history): correct docstring inversion + pin empty-history JSON shape (#2485 ) Two follow-ups from the multi-axis review of #2474: 1. Docstring inversion in tool_chat_history. The doc said '(source_id=peer)' meant 'this workspace is the sender' — actually it means the peer is the sender (source_id is where the activity came FROM). Reframed to 'where the peer is either the sender or the recipient' to match the underlying SQL semantics. 2. Empty-history test. TestChatHistory had 10 tests but no 200+[] happy-path pin. Added test_empty_history_returns_empty_json_list asserting result == '[]' on exact-equality (per assert-exact memory — substring '[]' would match envelope shapes too). Both changes are pure docs+tests — no behaviour change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 10:09:15 -07:00
Hongming Wang	1282b6f3d6	docs(a2a-tools): drop stale comment — before_ts is now server-supported Self-review on PR #2474 + #2476: the comment said we don't forward before_ts, but the code below does. Misleading after #2476 added the server-side filter. Replace with a one-liner that just states the forward-and-validate contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 18:13:51 -07:00
Hongming Wang	09e99a09c6	feat(a2a-mcp): add chat_history tool for prior turns with a peer When a peer_agent push lands and the agent needs context from prior turns with that workspace ("what task did this peer assign me last hour?", "what did I tell them?"), the only options today are re-deriving from memory (lossy) or scrolling activity_logs in the canvas (no agent-facing tool). Surface the platform's existing audit log directly via a new MCP tool so agents can read both sides of an A2A conversation in chronological order. Implementation: - a2a_tools.py: new tool_chat_history(peer_id, limit=20, before_ts="") hits /workspaces/<self>/activity?peer_id=X&limit=N (the new server filter from molecule-core#2472). Reverses the DESC response into chronological order so the agent reads top-down. Graceful error envelope on validation/network/non-200 — never crashes the MCP server, agent can branch on Error: prefix. - platform_tools/registry.py: ToolSpec wired into the A2A section so the rendered system-prompt block automatically includes it. Same pattern as the existing inbox_peek/inbox_pop/wait_for_message. - a2a_mcp_server.py: dispatch in handle_tool_call. - executor_helpers.py: _CLI_A2A_COMMAND_KEYWORDS gets a None entry (CLI runtimes don't expose chat history today; flip to a keyword when a2a_cli grows a `history` subcommand). - snapshots/a2a_instructions_mcp.txt regenerated. Tests: 10 new branches in TestChatHistory (validation / param forwarding / limit cap / before_ts pass-through / DESC→chronological reorder / 400 verbatim / 500 generic / network exc / non-list resp). Mutation-verified: reverting a2a_tools.py fails 10/10. Full test suite remains green at 1516 passed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 17:54:23 -07:00
Hongming Wang	993f8c494e	refactor(workspace-runtime): send_a2a_message takes peer_id, validates UUID Two cleanups stacked on PR #2418: 1. Refactor `send_a2a_message(target_url, msg)` → `send_a2a_message(peer_id, msg)`. After #2418 every caller passes `${PLATFORM_URL}/workspaces/{peer_id}/a2a` — the function's parameter pretended to accept arbitrary URLs but in practice only one shape is meaningful. Owning URL construction inside the function makes the contract honest and centralises the peer-id validation introduced below. 2. Add `_validate_peer_id` UUID-shape check at the trust boundary. `discover_peer` and `send_a2a_message` are the entry points where agent-controlled strings flow into URL paths; rejecting non-UUID input at this layer eliminates the URL-interpolation class of bug (`workspace_id="../admin"` etc.) regardless of how the rest of the codebase interpolates ids elsewhere. Auth was already gating malicious access — this is consistency + clear failure over silent platform 4xx. In-container tests cover positive UUIDs, malformed input (``"ws-abc"``, ``"../admin"``, empty), and the contract that ``tool_delegate_task`` hands the peer_id to ``send_a2a_message`` without building URLs itself. Live-verified: external delegation 8dad3e29 → 97ac32e9 returned "refactor verified" from Claude Code Agent through the refactored code; ``_validate_peer_id`` rejects ``"ws-abc"`` and ``"../admin"`` and accepts canonical UUIDs. Stacked on PR #2418 (proxy-routing fix). Will rebase onto staging once #2418 merges. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 17:43:01 -07:00
Hongming Wang	aefb44aff2	fix(workspace-runtime): route delegate_task through platform A2A proxy tool_delegate_task was POSTing directly to peer["url"], which is the Docker-internal hostname (e.g. http://ws-X-Y:8000) for in- container peers. External callers — the standalone molecule-mcp wrapper running on an operator's laptop — get [Errno 8] nodename nor servname every single delegation, breaking the universal-MCP path's last "ride the same code as in-container" claim. The platform's /workspaces/:peer-id/a2a proxy endpoint already handles internal forwarding for in-container peers AND is the only path external runtimes can use. Unify on it: in-container callers pay one extra HTTP hop on the same Docker bridge (microseconds); external callers get a working delegation path for the first time. discover_peer is still called for access-control + online-status detection — only the routing target changes. Verified live on 2026-04-30 against workspace 8dad3e29 (external mac runtime) → 97ac32e9 (Claude Code Agent in-container): direct POST returned ConnectError, proxy POST returned "acknowledged from claude code agent" as requested. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 17:13:50 -07:00
Hongming Wang	b47d4ceb00	feat(workspace-runtime): add inbox polling for standalone molecule-mcp path The universal MCP server (a2a_mcp_server.py) was outbound-only — agents in standalone runtimes (Claude Code, hermes, codex, etc.) could delegate, list peers, and write memories, but never observed the canvas-user or peer-agent messages addressed to them. This blocked "constantly responding" loops without forcing operators back onto a runtime-specific channel plugin. This PR closes the inbound gap with a poller-fed in-memory queue and three new MCP tools: - wait_for_message(timeout_secs?) — block until next message arrives - inbox_peek(limit?) — list pending messages (non-destructive) - inbox_pop(activity_id) — drop a handled message A daemon thread polls /workspaces/:id/activity?type=a2a_receive every 5s, fills the queue from the cursor (since_id), and persists the cursor to ${CONFIGS_DIR}/.mcp_inbox_cursor so a restart doesn't replay backlog. On 410 (cursor pruned) we fall back to since_secs=600 for a bounded recovery window. Activity-row → InboxMessage extraction mirrors the molecule-mcp-claude-channel plugin's extractText (envelope shapes #1-3 + summary fallback). mcp_cli.main starts the poller alongside the existing register + heartbeat threads. In-container runtimes (which have push delivery via canvas WebSocket) skip activation, so inbox tools return an informational "(inbox not enabled)" message instead of double-delivery. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 16:32:48 -07:00
Hongming Wang	3b34dfefbc	feat(workspace): surface peer-discovery failure reason instead of "may be isolated" Closes #2397. Today, every empty-peer condition (true empty, 401/403, 404, 5xx, network) collapses to a single message: "No peers available (this workspace may be isolated)". The user has no way to tell whether they need to provision more workspaces (true isolation), restart the workspace (auth), re-register (404), page on-call (5xx), or check network (timeout) — five different operator actions, one ambiguous string. Wire: - new helper get_peers_with_diagnostic() in a2a_client.py returns (peers, error_summary). error_summary is None on 200; a short actionable string on every other branch. - get_peers() now shims through it so non-tool callers (system-prompt formatters) keep the bare-list contract. - tool_list_peers() switches to the diagnostic helper and surfaces the actual reason. The "may be isolated" string is removed; true empty now reads "no peers in the platform registry." Tests: - TestGetPeersWithDiagnostic: 200, 200-empty, 401, 403, 404, 5xx, network exception, 200-but-non-list-body, and the bare-list-shim regression guard. - TestToolListPeers: each diagnostic branch surfaces its reason + explicit assertion that "may be isolated" is gone. Coverage 91.53% (floor 86%). 122 a2a tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 11:09:26 -07:00
Hongming Wang	af664e3e87	feat(tools): borrow hermes-style discipline — error/summary caps + sharper MCP descriptions Three small wins from the hermes-agent design survey, bundled because each is too small for its own PR but they all improve the priority adapters (claude-code + hermes) immediately. 1. Hermes-style cap on telemetry fields, applied INSIDE report_activity so every caller benefits without remembering. error_detail capped at 4096 (hermes' value); summary capped at 256 (one-liner ceiling). The existing call site in tool_delegate_task already truncated error_detail at 4096, but moving the cap into the helper closes the door on a future caller pasting a giant traceback. response_text is NOT capped (it's the agent's user-visible reply; truncating would silently drop content). Pinned by 4 new tests including a negative-pin that response_text MUST stay untruncated. 2. Sharper MCP tool descriptions for commit_memory + recall_memory — hermes' delegate_task description literally says "WAIT for the response" and delegate_task_async says "Returns immediately." LLMs pick the right tool variant from descriptions; ambiguity costs accuracy. - commit_memory now states it APPENDS (each call creates a row, no overwrite) and that GLOBAL requires tier 0. - recall_memory now states it's case-insensitive substring search with no pagination, returns all matches, and that empty-query is cheap and safer than a narrow keyword. 3. (no code change) Filed task #120 for the bigger user-flow win — a per-workspace tool enable/disable menu in Canvas Config — and task #121 for model-string passthrough (depends on #87 universal-runtime refactor). Verification: - 1312/1312 Python pytest pass (was 1308, +4 new) See task #119 for the architectural follow-ups (event-log layer, declarative skill compat, observability config block) and project memory `project_runtime_native_pluggable.md`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:25:54 -07:00
Hongming Wang	6eaacf175b	fix(notify): review-flagged Critical + Required findings on PR #2130 Two Critical bugs caught in code review of the agent→user attachments PR: 1. Empty-URI attachments slipped past validation. Gin's go-playground/validator does NOT iterate slice elements without `dive` — verified zero `dive` usage anywhere in workspace-server — so the inner `binding:"required"` tags on NotifyAttachment.URI/Name were never enforced. `attachments: [{"uri":"","name":""}]` would pass validation, broadcast empty-URI chips that render blank in canvas, AND persist them in activity_logs for every page reload to re-render. Added explicit per-element validation in Notify (returns 400 with `attachment[i]: uri and name are required`) plus defence-in-depth in the canvas filter (rejects empty strings, not just non-strings). 3-case regression test pins the rejection. 2. Hardcoded application/octet-stream stripped real mime types. `_upload_chat_files` always passed octet-stream as the multipart Content-Type. chat_files.go:Upload reads `fh.Header.Get("Content-Type")` FIRST and only falls back to extension-sniffing when the header is empty, so every agent-attached file lost its real type forever — broke the canvas's MIME-based icon/preview logic. Now sniff via `mimetypes.guess_type(path)` and only fall back to octet-stream when sniffing returns None. Plus three Required nits: - `sqlmockArgMatcher` was misleading — the closure always returned true after capture, identical to `sqlmock.AnyArg()` semantics, but named like a custom matcher. Renamed to `sqlmockCaptureArg(*string)` so the intent (capture for post-call inspection, not validate via driver-callback) is unambiguous. - Test asserted notify call by `await_args_list[1]` index — fragile to any future _upload_chat_files refactor that adds a pre-flight POST. Now filter call list by URL suffix `/notify` and assert exactly one match. - Added `TestNotify_RejectsAttachmentWithEmptyURIOrName` (3 cases) covering empty-uri, empty-name, both-empty so the Critical fix stays defended. Deferred to follow-up: - ORDER BY tiebreaker for same-millisecond notifies — pre-existing risk, not regression. - Streaming multipart upload — bounded by the platform's 50MB total cap so RAM ceiling is fixed; switch to streaming if cap rises. - Symlink rejection — agent UID can already read whatever its filesystem perms allow via the shell tool; rejecting symlinks doesn't materially shrink the attack surface. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 19:47:31 -07:00
Hongming Wang	d028fe19ff	feat(notify): agent → user file attachments via send_message_to_user Closes the gap where the Director would say "ZIP is ready at /tmp/foo.zip" in plain text instead of attaching a download chip — the runtime literally had no API for outbound file attachments. The canvas + platform's chat-uploads infrastructure already supported the inbound (user → agent) direction (commit `94d9331c`); this PR wires the outbound side. End-to-end shape: agent: send_message_to_user("Done!", attachments=["/tmp/build.zip"]) ↓ runtime POST /workspaces/<self>/chat/uploads (multipart) ↓ platform /workspace/.molecule/chat-uploads/<uuid>-build.zip → returns {uri: workspace:/...build.zip, name, mimeType, size} ↓ runtime POST /workspaces/<self>/notify {message: "Done!", attachments: [{uri, name, mimeType, size}]} ↓ platform Broadcasts AGENT_MESSAGE with attachments + persists to activity_logs with response_body = {result: "Done!", parts: [{kind:file, file:{...}}]} ↓ canvas WS push: canvas-events.ts adds attachments to agentMessages queue Reload: ChatTab.loadMessagesFromDB → extractFilesFromTask sees parts[] Either path → ChatTab renders download chip via existing path Files changed: workspace-server/internal/handlers/activity.go - NotifyAttachment struct {URI, Name, MimeType, Size} - Notify body accepts attachments[], broadcasts in payload, persists as response_body.parts[].kind="file" canvas/src/store/canvas-events.ts - AGENT_MESSAGE handler reads payload.attachments, type-validates each entry, attaches to agentMessages queue - Skips empty events (was: skipped only when content empty) workspace/a2a_tools.py - tool_send_message_to_user(message, attachments=[paths]) - New _upload_chat_files helper: opens each path, multipart POSTs to /chat/uploads, returns the platform's metadata - Fail-fast on missing file / upload error — never sends a notify with a half-rendered attachment chip workspace/a2a_mcp_server.py - inputSchema declares attachments param so claude-code SDK surfaces it to the model - Defensive filter on the dispatch path (drops non-string entries if the model sends a malformed payload) Tests: - 4 new Python: success path, missing file, upload 5xx, no-attach backwards compat - 1 new Go: Notify-with-attachments persists parts[] in response_body so chat reload reconstructs the chip Why /tmp paths work even though they're outside the canvas's allowed roots: the runtime tool reads the bytes locally and re-uploads through /chat/uploads, which lands the file under /workspace (an allowed root). The agent can specify any readable path. Does NOT include: agent → agent file transfer. Different design problem (cross-workspace download auth: peer would need a credential to call sender's /chat/download). Tracked as a follow-up under task #114. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 19:35:58 -07:00
Hongming Wang	c159d85eb5	fix(a2a): review-driven hardening — prefix-anchored type check, error_detail cap, shared hint module Three required fixes from the bundle review of `391e1872`: 1. workspace/a2a_client.py: substring `type_name in msg` could miss the diagnostic prefix when an exception's message embedded a different class name mid-string (e.g. `OSError("see ConnectionError below")` → printed as plain msg, type lost). Switched to a prefix-anchored check (`msg.startswith(f"{type_name}:")` etc.) so the type label is always added when not already at the start of the message. 2. workspace/a2a_tools.py: `activity_logs.error_detail` is unbounded TEXT on the platform (handlers/activity.go does not validate length). A buggy or hostile peer could stream arbitrarily large error messages into the caller's activity log. Cap at 4096 chars at the producer — comfortably above any real exception traceback, well below an obvious-DoS threshold. 3. New regression test for JSON-RPC `code=0` — pins the `code is not None` semantics so the code is preserved in the detail rather than collapsing into the no-code path. Code=0 is not valid per the spec, but a malformed peer can still emit it and we want it visible for diagnosis. Plus one optional taken: extracted the A2A-error → hint mapping into canvas/src/components/tabs/chat/a2aErrorHint.ts. The two prior copies (AgentCommsPanel.inferCauseHint + ActivityTab.inferA2AErrorHint) had already drifted — Activity tab gained `not found`/`offline` cases the chat panel never picked up, AgentCommsPanel handled empty-input explicitly while Activity didn't. The shared module is the merged superset, with 10 unit tests pinning each named pattern + the "most specific first" ordering (Claude SDK wedge wins over generic timeout). Skipped (per analysis): - Unicode-naive 120-char slice — Python str[:N] slices on code points, not bytes. Safe. - Nested [A2A_ERROR] confusion — non-issue per reviewer; outer prefix winning still produces a structured render. - MessagePreview + JsonBlock dual render on errors — intentional drilldown; raw JSON is below the fold for operators who need it. - console.warn dedup — refetches don't happen per-event so spam risk is low. - str(data)[:200] materialization — A2A response bodies aren't typically MB-sized. Verified: 1005 canvas tests pass (10 new hint tests); 10 Python send_a2a_message tests pass (1 new for code=0); tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 23:47:44 -07:00
Hongming Wang	391e187281	fix(a2a,canvas): make delivery failures comprehensive instead of "[A2A_ERROR] " Symptom: Activity tab and Agent Comms surfaced bare "[A2A_ERROR] " (prefix + nothing) for failed delegations. Operator had no signal to act on — no exception type, no target, no hint about what went wrong, no next step. Fix is in three layers. 1. workspace/a2a_client.py — every error path now produces an actionable detail string: - except branch: some httpx exceptions (RemoteProtocolError, ConnectionReset variants) stringify to "". Pre-fix the catch was `f"{_A2A_ERROR_PREFIX}{e}"` → bare prefix. Now falls back to `<TypeName> (no message — likely connection reset or silent timeout)` and always appends `[target=<url>]` for traceability in chained delegations. - JSON-RPC error branch: previously dropped error.code on the floor and printed "unknown" when message was missing. Now surfaces both, including the well-defined "JSON-RPC error with no message (code=N)" path. - "neither result nor error" branch: pre-fix returned str(payload) which the canvas rendered as a successful response block. Now tagged as A2A_ERROR with a payload snippet so downstream UI routes through the error path. 2. workspace/a2a_tools.py — tool_delegate_task now passes error_detail (the stripped error message) through to the activity-log POST. The platform's activity_logs.error_detail column is the canvas's red error chip source; populating it makes the failure visible in the row header without the user having to expand into raw response_body JSON. The summary line also gets a 120-char prefix of the cause so the collapsed row reads "React Engineer failed: ConnectionResetError: ... [target=...]" instead of "React Engineer failed". 3. canvas/src/components/tabs/ActivityTab.tsx — MessagePreview now detects [A2A_ERROR]-prefixed bodies and renders a structured error block (red chip, stripped detail, cause hint) instead of the previous gray text-block that showed the literal "[A2A_ERROR]" string. inferA2AErrorHint mirrors the patterns from AgentCommsPanel.inferCauseHint so the same symptom reads the same way in both surfaces (Claude SDK init wedge → restart workspace; timeout → busy/stuck; connection-reset → transient blip then check logs). Tests: 9 send_a2a_message tests pass (including a new regression test for the empty-stringifying-exception case that the user reported); 995 canvas tests pass; tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 23:40:05 -07:00
Molecule AI Marketing Lead	e00797ba35	fix(security): prevent cross-tenant memory contamination in commit_memory/recall_memory (GH#1610) Two critical gaps in a2a_tools.py let any tenant workspace poison org-wide (GLOBAL) memory and bypass all RBAC enforcement: 1. tool_commit_memory had no RBAC check — any agent could write any scope. 2. tool_commit_memory had no root-workspace enforcement for GLOBAL scope — Tenant A could POST scope=GLOBAL and pollute the shared memory store that Tenant B's agent reads as trusted context. Fix adds: - _ROLE_PERMISSIONS table (mirrors builtin_tools/audit.py) so a2a_tools has isolated RBAC logic without depending on memory.py. - _check_memory_write_permission() / _check_memory_read_permission() helpers: evaluate RBAC roles from WorkspaceConfig; fail closed (deny) on errors. - _is_root_workspace() / _get_workspace_tier(): read WorkspaceConfig.tier (0 = root/org, 1+ = tenant) from config.yaml; fall back to WORKSPACE_TIER env var. - tool_commit_memory now (a) checks memory.write RBAC, (b) rejects GLOBAL scope for non-root workspaces, (c) embeds workspace_id in the POST body so the platform can namespace-isolate and audit cross-workspace writes. - tool_recall_memory now checks memory.read RBAC before any HTTP call, and always sends workspace_id as a GET param for platform cross-validation. Security regression tests added: - GLOBAL scope denied for non-root (tier>0) workspaces. - RBAC denial blocks all scope levels (including LOCAL) on write. - RBAC denial blocks recall entirely. - workspace_id present in POST body and GET params. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-23 10:21:34 -07:00
molecule-ai[bot]	64ccf8e179	fix: CWE-78 rm scope, go vet failures, delegation idempotency * refactor: split 4 oversized handler files into focused sub-files - org.go (1099 lines) → org.go + org_import.go + org_helpers.go - mcp.go (1001 lines) → mcp.go + mcp_tools.go - workspace.go (934 lines) → workspace.go + workspace_crud.go - a2a_proxy.go (825 lines) → a2a_proxy.go + a2a_proxy_helpers.go No functional changes — same package, same exports, same tests. All files stay under 635 lines. Note: isSafeURL and isPrivateOrMetadataIP are duplicated between mcp_tools.go and a2a_proxy_helpers.go — this is a pre-existing issue from the original mcp.go and a2a_proxy.go, not introduced by this split. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(runtime+scheduler): increment/decrement active_tasks counter (refs #1386) * docs(tutorials): add Self-Hosted AI Agents guide — Docker, Fly Machines, bare metal * docs: add Remote Agents feature + Phase 30 blog links to docs index * docs(marketing): update Phase 30 brief — Action 5 complete, docs/index.md update noted * docs(api-ref): add workspace file copy API reference (#1281) Documents TemplatesHandler.copyFilesToContainer (container_files.go): - Endpoint overview: PUT /workspaces/:id/files/path - Parameter descriptions for all four function parameters - CWE-22 path traversal protection (PRs #1267/1270/1271) - Defense-in-depth: validateRelPath at handler + archive boundary - Full error code table (400/404/500) - curl example with success and path-traversal rejection cases Also covers: writeViaEphemeral routing, findContainer fallback, allowed roots allow-list, and related links to platform-api.md. Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> fix(security): CWE-78/CWE-22 — block shell injection in deleteViaEphemeral (#1310) ## Summary Issue #1273: deleteViaEphemeral interpolated filePath directly into rm command, enabling both shell injection (CWE-78) and path traversal (CWE-22) attacks. ## Changes 1. Added validateRelPath(filePath) guard before constructing the rm command. validateRelPath blocks absolute paths and ".." traversal sequences. 2. Changed Cmd from "/configs/"+filePath (string interpolation) to []string{"rm", "-rf", "/configs", filePath} (exec form). This eliminates shell injection entirely — filePath is a plain argument, never interpreted as shell code. ## Security properties - validateRelPath: blocks "../" and absolute paths before they reach Docker - Exec form: filePath cannot inject shell metacharacters even if validation is somehow bypassed - "/configs" as separate arg: rm has exactly two arguments, no room for injected args Closes #1273. Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> * fix(security): backport SSRF defence (CWE-918) to main — isSafeURL in a2a_proxy.go (#1292) (#1302) * fix(security): backport SSRF defence (CWE-918) to main — isSafeURL in mcp.go and a2a_proxy.go Issue #1042: 3 CodeQL SSRF findings across mcp.go and a2a_proxy.go. staging already ships the fix (PRs #1147, #1154 → merged); main did not include it. - mcp.go: add isSafeURL() + isPrivateOrMetadataIP() helpers; validate agentURL before outbound calls in mcpCallTool (line ~529) and toolDelegateTaskAsync (line ~607) - a2a_proxy.go: add identical isSafeURL() + isPrivateOrMetadataIP() helpers; call isSafeURL() before dispatchA2A in resolveAgentURL() (blocks finding #1 at line 462) - mcp_test.go: 19 new tests covering all blocked URL patterns: file://, ftp://, 127.0.0.1, ::1, 169.254.169.254, 10.x.x.x, 172.16.x.x, 192.168.x.x, empty hostname, invalid URL, isPrivateOrMetadataIP across all private/CGNAT/metadata ranges 1. URL scheme enforcement — http/https only 2. IP literal blocking — loopback, link-local, RFC-1918, CGNAT, doc/test ranges 3. DNS hostname resolution — blocks internal hostnames resolving to private IPs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci-blocker): remove duplicate isSafeURL/isPrivateOrMetadataIP from mcp.go Issue #1292: PR #1274 duplicated isSafeURL + isPrivateOrMetadataIP in mcp.go — both functions already exist on main at lines 829 and 876. Kept the mcp.go definitions (the originals) and removed the 70-line duplicate appended at end of file. a2a_proxy.go functions are unchanged — they serve the same purpose via a separate code path. * fix: remove orphaned commit-text lines from a2a_proxy.go Three lines from the PR/commit title were accidentally baked into the file during the rebase from #1274 to #1302, causing a Go syntax error (a bare string literal at statement level followed by dangling braces). Deletion restores: } return agentURL, nil } Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app> * fix(canvas/test): patch test regressions from PR #1243 + proximity hitbox fix (#1313) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. ContextMenu.keyboard.test.tsx — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. orgs-page.test.tsx — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct (#1324) (#1327) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. ContextMenu.keyboard.test.tsx — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. orgs-page.test.tsx — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct Fixes #1324 — TypeScript strict mode flags budget.budget_used as possibly undefined in the progressPct ternary, even though the outer condition checks budget_limit > 0. Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0% when the backend returns a partial shape (provisioning-stuck workspaces). Also adds a test covering the undefined-budget_used case with the progress bar aria-valuenow and fill width both at 0%. Closes #1324. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct (issue #1324) (#1329) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. ContextMenu.keyboard.test.tsx — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. orgs-page.test.tsx — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct Fixes #1324 — TypeScript strict mode flags budget.budget_used as possibly undefined in the progressPct ternary, even though the outer condition checks budget_limit > 0. Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0% when the backend returns a partial shape (provisioning-stuck workspaces). Also adds a test covering the undefined-budget_used case with the progress bar aria-valuenow and fill width both at 0%. Closes #1324. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(platform): unblock SaaS workspace registration end-to-end Every workspace in the cross-EC2 SaaS provisioning shape was failing registration, heartbeat, or A2A routing. Four distinct blockers sat between "EC2 is up" and "agent responds"; three are platform-side and fixed here (the fourth is in the CP user-data, separate PR). 1. SSRF validator blocked RFC-1918 (registry.go + mcp.go) validateAgentURL and isPrivateOrMetadataIP rejected 172.16.0.0/12, which contains the AWS default VPC range (172.31.x.x) that every sibling workspace EC2 registers from. Registration returned 400 and the 10-min provision sweep flipped status to failed. RFC-1918 + IPv6 ULA are now gated behind saasMode(); link-local (169.254/16), loopback, IPv6 metadata (fe80::/10, ::1), and TEST-NET stay blocked unconditionally in both modes. saasMode() resolution order: 1. MOLECULE_DEPLOY_MODE=saas\|self-hosted (explicit operator flag) 2. MOLECULE_ORG_ID presence (legacy implicit signal, kept for back-compat so existing deployments don't need a config change) isPrivateOrMetadataIP now actually checks IPv6 — previously it returned false on any non-IPv4 input, which would let a registered [::1] or [fe80::...] URL bypass the SSRF check entirely. 2. Orphan auth-token minting (workspace_provision.go) issueAndInjectToken mints a token and stuffs it into cfg.ConfigFiles[".auth_token"]. The Docker provisioner writes that file into the /configs volume — the CP provisioner ignores it (only cfg.EnvVars crosses the wire). Result: live token in DB, no plaintext on disk, RegistryHandler.requireWorkspaceToken 401s every /registry/register attempt because the workspace is no longer in the "no live token → bootstrap-allowed" state. Now no-ops in SaaS mode; the register handler already mints on first successful register and returns the plaintext in the response body for the runtime to persist locally. Also removes the redundant wsauth.IssueToken call at the bottom of provisionWorkspaceCP, which created the same orphan-token pattern a second time. 3. Compaction artefacts (bundle/importer.go, handlers/org_tokens.go, scheduler.go, workspace_provision.go) Four pre-existing compile errors on main from an earlier session's code truncation: missing tuple destructuring on ExecContext / redactSecrets / orgTokenActor, missing close-brace in Scheduler.fireSchedule's panic recovery. All one-line mechanical fixes; without them the binary would not build. Tests ----- ssrf_test.go adds: * TestSaasMode — covers the env resolution ladder (explicit flag wins over legacy signal, case-insensitive, whitespace tolerant) * TestIsPrivateOrMetadataIP_SaaSMode — asserts RFC-1918 + IPv6 ULA flip to allowed, metadata/loopback/TEST-NET still blocked * TestIsPrivateOrMetadataIP_IPv6 — regression guard for the old "returns false for all IPv6" behaviour Follow-up issue for CP-sourced workspace_id attestation will be filed separately — closes the residual intra-VPC SSRF + token-race windows the SaaS-mode relaxation introduces. Verified end-to-end today on workspace 6565a2e0 (hermes runtime, OpenAI provider) — agent returned "PONG" in 1.4s after register → heartbeat → A2A proxy → runtime. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(runtime+scheduler): increment/decrement active_tasks + max_concurrent (#1408) Runtime (shared_runtime.py): - set_current_task now increments active_tasks on task start, decrements on completion (was binary 0/1) - Counter never goes below 0 (max(0, n-1)) - Pushes heartbeat immediately on BOTH increment and decrement (#1372) Scheduler (scheduler.go): - Reads max_concurrent_tasks from DB (default 1, backward compatible) - Skips cron only when active_tasks >= max_concurrent_tasks (was > 0) - Leaders can be configured with max_concurrent_tasks > 1 to accept A2A delegations while a cron runs Platform: - Added max_concurrent_tasks column to workspaces (migration 037) - Workspace model + list/get queries include the new field - API exposes max_concurrent_tasks in workspace JSON Config.yaml support (future): runtime_config.max_concurrent_tasks Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(review): address 3 critical issues from code review 1. BLOCKER: executor_helpers.py now uses increment/decrement too (was still binary 0/1, stomping the counter for CLI + SDK executors) 2. BUG: asymmetric getattr defaults fixed — both paths use default 0 (was 0 on increment, 1 on decrement) 3. UX: current_task preserved when active_tasks > 0 on decrement (was clearing task description even when other tasks still running) 4. Scheduler polling loop re-reads max_concurrent_tasks on each poll (was using stale value from initial query) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Hongming Wang <hongmingwangrabbit@gmail.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com> * docs: workspace files API reference, skill catalog, and links * docs: fix secrets endpoint path across docs The workspace secrets endpoint is `/workspaces/:id/secrets`, not `/secrets/values`. This was wrong in quickstart.md (Path 2: Remote Agent) and workspace-runtime.md (registration flow example and comparison table). The external-agent-registration guide already had the correct path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: fix broken blog cross-link in skills-vs-bundled-tools post Link path had an extra `/docs/` segment: `/docs/blog/...` instead of `/blog/...`. Nextra resolves blog posts directly under `/blog/<slug>`, not under `/docs/blog/`. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: add skill-catalog.md guide Linked from the skills-vs-bundled-tools blog post as a reference for TTS/image-generation/web-search skills. The blog promises "install directly via the CLI" with a skill catalog — this page fills that promise by documenting available skill types, install commands, version management, custom skill authoring, and removal. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(marketing): update Phase 30 brief — Action 5 complete, docs/index.md update noted * docs(api-ref): add workspace file copy API reference Documents TemplatesHandler.copyFilesToContainer (container_files.go): - Endpoint overview: PUT /workspaces/:id/files/path - Parameter descriptions for all four function parameters - CWE-22 path traversal protection (PRs #1267/1270/1271) - Defense-in-depth: validateRelPath at handler + archive boundary - Full error code table (400/404/500) - curl example with success and path-traversal rejection cases Also covers: writeViaEphemeral routing, findContainer fallback, allowed roots allow-list, and related links to platform-api.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> fix(handlers): add saasMode() gating to isPrivateOrMetadataIP in a2a_proxy_helpers.go Issue #1421 / #1401: PR #1363 (handler split) moved isPrivateOrMetadataIP into a2a_proxy_helpers.go but kept the OLD pre-SaaS version — it unconditionally blocks RFC-1918 addresses, regressing the fix in commits `1125a02` / `cf10733`. The A2A proxy path now has the same SaaS-gated logic as registry.go: - Cloud metadata (169.254/16, fe80::/10, ::1) always blocked in both modes - RFC-1918 (10/8, 172.16/12, 192.168/16) + IPv6 ULA (fc00::/7) blocked in self-hosted, allowed in SaaS cross-EC2 mode - IPv6 addresses now properly checked (previous version returned false for all) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(marketing): Discord adapter Day 2 Reddit + HN community copy * fix(tests): supply events.Broadcaster pointer to captureBroadcaster Cannot use captureBroadcaster as events.Broadcaster when the struct embeds events.Broadcaster as a value — must initialize as a named field. Fixes go vet error in workspace_provision_test.go: cannot use broadcaster (captureBroadcaster) as events.Broadcaster value Merge pull request #1429 from fix/canvas-tooltip-clear-timer Without this, a 400ms setTimeout from onFocus/onMouseEnter that fires after onBlur will re-show a tooltip the user just dismissed. The setShow(false) in onBlur closes the tooltip immediately but leaves the timer pending — Tab-blur followed by timer-fire would re-show it. Fix: add clearTimeout(timerRef.current) at the top of onBlur, mirroring the pattern already used in onMouseLeave and onFocus. Refs: PR #1367 (a11y keyboard support — this was a pre-existing gap) Co-authored-by: Molecule AI App-FE <app-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): add missing children:[] to setPendingDelete expectation (#1426) PR #1252 (cascade-delete UX) updated setPendingDelete to pass a children array for cascade-warning rendering. The keyboard-a11y test assertion was not updated to match. Test: clicking 'Delete' hoists state to the store and closes the menu Co-authored-by: Molecule AI Core-QA <core-qa@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): add children:[] to setPendingDelete + \' entity fix (closes #1380) (#1427) * ci: retry — trigger fresh runner allocation * fix(canvas/test): add children:[] to setPendingDelete assertion setPendingDelete now includes children:[] (PR #1383 extended the pendingDelete type). The keyboard accessibility test at line 225 used exact object matching which omitted the new field, causing a failure after staging merged #1383. Issue: #1380 * fix(canvas): replace ' HTML entity with straight apostrophe JSX does not entity-decode ' — it renders the literal text "'" instead of "'". Found at line 157 (payment confirmed) and line 321 (empty org list). Replaced with a straight apostrophe, which JSX handles correctly. Ref: issue #1375 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: DevOps Engineer <devops@molecule.ai> Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * Merge pull request #1430 from fix/1421-saas-ssrf-helpers Issue #1421 / #1401: PR #1363 (handler split) moved isPrivateOrMetadataIP into a2a_proxy_helpers.go but kept the OLD pre-SaaS version — it unconditionally blocks RFC-1918 addresses, regressing the fix in commits `1125a02` / `cf10733`. The A2A proxy path now has the same SaaS-gated logic as registry.go: - Cloud metadata (169.254/16, fe80::/10, ::1) always blocked in both modes - RFC-1918 (10/8, 172.16/12, 192.168/16) + IPv6 ULA (fc00::/7) blocked in self-hosted, allowed in SaaS cross-EC2 mode - IPv6 addresses now properly checked (previous version returned false for all) Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(P0): CWE-22 path traversal in copyFilesToContainer + ContextMenu test Issue #1434 — CWE-22 Path Traversal Regression: PR #1280 (`dc218212`) correctly used cleaned path in tar header. PR #1363 (`e9615af`) regressed to using uncleaned `name`. Fix: use `clean` in filepath.Join AND add defence-in-depth escape check. Issue #1422 — ContextMenu Test Regression: PR #1340 expanded pendingDelete store type to include `children:[]`. Test assertion missing the field — add `children:[]` to match. Note: ssrf.go created (shared isSafeURL/isPrivateOrMetadataIP) to prepare for the handler-split refactor fix — current branch has no build error, but the shared file will prevent regression when PR #1363 is merged. isSafeURL/isPrivateOrMetadataIP retained in both files for now to avoid breaking callers while the split is finalized. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: resolve 3 go vet failures + add idempotency_key to delegate_task_async - workspace_provision_test.go: add missing mock := setupTestDB(t) to TestSeedInitialMemories_Truncation — mock was referenced but never declared, causing "undefined: mock" vet error - orgtoken/tokens_test.go: discard unused orgID return value with _ in Validate call — "declared and not used" vet error - a2a_tools.py: delegate_task_async now sends idempotency_key (SHA-256 of workspace_id + task) to POST /workspaces/:id/delegate, fixing duplicate task execution when an agent restarts mid-delegation (#1456) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: airenostars <airenostars@gmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Hongming Wang <hongmingwangrabbit@gmail.com> Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com> Co-authored-by: Molecule AI Community Manager <community-manager@agents.moleculesai.app> Co-authored-by: Molecule AI App-FE <app-fe@agents.moleculesai.app> Co-authored-by: Molecule AI Core-QA <core-qa@agents.moleculesai.app> Co-authored-by: DevOps Engineer <devops@molecule.ai> Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app> Co-authored-by: Molecule AI Dev Lead <dev-lead@agents.moleculesai.app>	2026-04-21 18:22:30 +00:00
Hongming Wang	d8026347e5	chore: open-source restructure — rename dirs, remove internal files, scrub secrets Renames: - platform/ → workspace-server/ (Go module path stays as "platform" for external dep compat — will update after plugin module republish) - workspace-template/ → workspace/ Removed (moved to separate repos or deleted): - PLAN.md — internal roadmap (move to private project board) - HANDOFF.md, AGENTS.md — one-time internal session docs - .claude/ — gitignored entirely (local agent config) - infra/cloudflare-worker/ → Molecule-AI/molecule-tenant-proxy - org-templates/molecule-dev/ → standalone template repo - .mcp-eval/ → molecule-mcp-server repo - test-results/ — ephemeral, gitignored Security scrubbing: - Cloudflare account/zone/KV IDs → placeholders - Real EC2 IPs → <EC2_IP> in all docs - CF token prefix, Neon project ID, Fly app names → redacted - Langfuse dev credentials → parameterized - Personal runner username/machine name → generic Community files: - CONTRIBUTING.md — build, test, branch conventions - CODE_OF_CONDUCT.md — Contributor Covenant 2.1 All Dockerfiles, CI workflows, docker-compose, railway.toml, render.yaml, README, CLAUDE.md updated for new directory names. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-18 00:24:44 -07:00

21 Commits