molecule-core

Author	SHA1	Message	Date
Hongming Wang	1161b97faf	feat(mcp): cross-workspace delegation routing (multi-ws PR-2) PR-2 of the multi-workspace external-agent stack. PR-1 (#2739) landed per-workspace auth + heartbeat + inbox. This PR threads ``source_workspace_id`` through the A2A client + tool surface so an agent registered against multiple workspaces can list peers across all of them and delegate from a specific source. Changes ------- * ``a2a_client``: ``discover_peer``, ``send_a2a_message``, ``get_peers_with_diagnostic``, and ``enrich_peer_metadata`` now accept ``source_workspace_id``. Routing uses it for both the X-Workspace-ID header and (transitively, via ``auth_headers(src)``) the bearer token. Defaults to module-level WORKSPACE_ID for back-compat. * ``a2a_client._peer_to_source``: a new lock-free cache mapping each discovered peer back to the source workspace whose registry surfaced it. ``tool_list_peers`` populates the cache on every call; ``tool_delegate_task`` consults it for auto-routing. * ``a2a_tools.tool_list_peers(source_workspace_id=None)``: when multiple workspaces are registered (MOLECULE_WORKSPACES) and no explicit source is passed, aggregates peers across every registered workspace and tags each entry with ``via: <src[:8]>``. Single-workspace mode is unchanged — no ``via:`` annotation, same output shape. * ``a2a_tools.tool_delegate_task`` and ``tool_delegate_task_async`` resolve source via ``source_workspace_id arg → _peer_to_source[target] → WORKSPACE_ID``. Agents almost never need to specify ``source_`` explicitly — call ``list_peers`` first and the cache handles the rest. ``tool_delegate_task_async`` idempotency key now includes the source workspace, so the same task delegated from two registered workspaces produces two distinct delegations (the right behavior — one per tenant audit trail). * ``platform_auth.list_registered_workspaces()``: new helper for the tool layer to enumerate the multi-ws registry. 
Lock-free reads matched by the existing single-writer-per-workspace contract from PR-1. * ``platform_auth.self_source_headers``: now passes ``workspace_id`` through to ``auth_headers`` — without this, a multi-workspace POST source-tagged with ``X-Workspace-ID=ws_b`` was authenticating with ws_a's token (or no token if MOLECULE_WORKSPACE_TOKEN unset). Latent PR-1 bug exposed by the new tool surface. * ``a2a_mcp_server`` tool dispatch passes ``source_workspace_id`` from the tool call arguments. * ``platform_tools.registry``: add ``source_workspace_id`` to the delegate_task, delegate_task_async, check_task_status, list_peers input schemas with copy explaining when to use it (rarely — the cache handles it). Tests (15 new, all passing) --------------------------- ``test_a2a_multi_workspace.py``: * TestDiscoverPeerSourceRouting (3): src arg drives header+token, fallback to module ws when omitted, invalid target short-circuits before any HTTP attempt. * TestSendA2AMessageSourceRouting (1): X-Workspace-ID source header + Authorization bearer both come from the source arg via the patched self_source_headers chain. * TestGetPeersSourceRouting (1): URL path AND headers use the source workspace id. * TestToolListPeersAggregation (4): aggregates across multiple registered workspaces, tags origin, leaves single-workspace path unchanged, explicit src arg overrides aggregation, diagnostic joining when every workspace returns empty. * TestToolDelegateTaskAutoRouting (3): cache-driven auto-route, explicit override beats cache, single-workspace fallback to module WORKSPACE_ID. * TestListRegisteredWorkspaces (3): registry enumeration helper. Plus ``tests/snapshots/a2a_instructions_mcp.txt`` regenerated to absorb the new ``source_workspace_id`` schema entries. Back-compat ----------- Every change defaults ``source_workspace_id=None``; legacy single-workspace operators (no MOLECULE_WORKSPACES) see identical behavior — same URLs, same headers, same tool output. 
The 24 PR-1 tests + 125 existing A2A tests all still pass. Out of scope (PR-3) ------------------- Memory namespacing per registered workspace lands after the new memory system v2 PR (#2740) settles in production. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:32:24 -07:00
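Note on 1161b97faf: the source-resolution order in that commit (explicit ``source_workspace_id`` argument, then the ``_peer_to_source`` cache, then the module-level ``WORKSPACE_ID``) can be sketched as below. This is a minimal illustration, not the repository's code: ``resolve_source`` and ``record_peers`` are hypothetical helper names, and the workspace ids are made up.

```python
from typing import Dict, List, Optional

# Hypothetical stand-ins for the module-level state the commit describes.
WORKSPACE_ID = "ws_default"
_peer_to_source: Dict[str, str] = {}


def record_peers(source_workspace_id: str, peer_ids: List[str]) -> None:
    """Remember which workspace's registry surfaced each peer (list_peers would do this)."""
    for peer_id in peer_ids:
        _peer_to_source[peer_id] = source_workspace_id


def resolve_source(target_peer_id: str, source_workspace_id: Optional[str] = None) -> str:
    """Pick the source workspace: explicit arg, then cache, then module default."""
    if source_workspace_id:
        return source_workspace_id
    cached = _peer_to_source.get(target_peer_id)
    if cached:
        return cached
    return WORKSPACE_ID
```

Calling ``record_peers`` first mirrors the commit's advice to call ``list_peers`` before delegating, so that the cache can route the delegation without an explicit source.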
Hongming Wang	0f46c7eefe	Merge pull request #2739 from Molecule-AI/feat/mcp-multi-workspace-pr1 mcp: support multi-workspace external-agent registration (PR-1 of stack)	2026-05-04 15:19:03 +00:00
Hongming Wang	8aea1f008c	Merge pull request #2740 from Molecule-AI/feat/memory-v2-pr8-cutover Memory v2 PR-8: cutover — admin export/import via plugin	2026-05-04 15:18:17 +00:00
Hongming Wang	3195657837	fix: bot-lint nits — drop unused imports, add reason to except Resolves three github-code-quality threads blocking PR-2739 merge: - workspace/tests/test_mcp_cli_multi_workspace.py: remove unused `import os` and `from unittest.mock import patch` (left over from an earlier test draft that mocked at the os.environ layer). - workspace/mcp_cli.py:523: replace bare `pass` in the register_workspace_token ImportError handler with a debug log line + one-line comment explaining the silent-degrade contract (older installs that don't yet ship the helper fall back to the legacy single-token path; single-workspace operators see no behavior change). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:16:12 -07:00
Hongming Wang	7b0bd32957	Memory v2 PR-8: cutover — admin export/import via plugin Builds on merged PR-1..7. Adds the operator-controlled cutover flag that flips admin export/import from the legacy direct-DB path to the v2 plugin path. Activation: MEMORY_V2_CUTOVER=true AND the v2 plugin is wired via WithMemoryV2. Both must be true to take the new path; either being false falls through to the existing legacy SQL code unchanged. What ships: * AdminMemoriesHandler gains plugin + resolver fields, wired via WithMemoryV2 (production) / withMemoryV2APIs (tests) * Export: enumerates workspaces, asks resolver for each one's readable namespaces, searches each via plugin, deduplicates by memory id, applies SAFE-T1201 redaction on emitted content (F1084 parity). Returns the legacy memoryExportEntry shape so existing tooling keeps working. * Import: scope→namespace translation mirrors PR-6 shim. Uses UpsertNamespace + CommitMemory; runs SAFE-T1201 redaction BEFORE the plugin sees the content (F1085 parity). * Helpers: legacyScopeFromNamespace + namespaceKindFromLegacyScope (lifted out so admin_memories doesn't depend on MCP handler helpers). skipImport typed error. Operational rollout (cutover sequencing): 1. Today: MEMORY_V2_CUTOVER unset → legacy DB path. 2. After PR-7 backfill applied + smoke verified: operator sets MEMORY_V2_CUTOVER=true. 3. From that point, admin export/import operate on plugin storage; legacy agent_memories table is read-only for the ~60-day grace window before PR-9 drops it. 
Coverage on new paths: * cutoverActive: 100% * WithMemoryV2 / withMemoryV2APIs: 100% * importViaPlugin: 100% * exportViaPlugin: 97.2% (one defensive scan-error branch in the workspace-list loop) * scopeToWritableNamespaceForImport: 76.9% (resolver-error and no-matching-kind branches exercised end-to-end via Import) * legacyScopeFromNamespace + namespaceKindFromLegacyScope: 100% Edge cases pinned: * Cutover flag matrix (env unset/true/false × wired/unwired) * Export deduplicates memories shared across team (one row per id) * Export tolerates per-workspace failures (resolver / plugin) and keeps going on the rest * Export returns 500 only when the top-level workspace query fails * Empty readable namespaces → empty export (no panic) * Export redacts secrets in plugin path * Import: unknown workspace skipped, unknown scope skipped, plugin upsert/commit errors counted as errors * Import redacts secrets BEFORE plugin sees content * Legacy export/import path unchanged when cutover flag unset	2026-05-04 08:15:10 -07:00
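Note on 7b0bd32957: the activation rule (cutover flag set AND plugin wired, otherwise the legacy path) is a two-input gate. The sketch below is in Python for brevity, while the commit's own code is Go. It assumes the flag is compared against the exact string ``true``; the commit does not say how other spellings are treated. ``cutover_active`` here is an illustrative function, not the repository's ``cutoverActive``.

```python
import os
from typing import Mapping, Optional


def cutover_active(plugin_wired: bool, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when the flag is set AND the v2 plugin is wired."""
    env = os.environ if env is None else env
    # Assumption: only the literal string "true" enables the cutover.
    flag_on = env.get("MEMORY_V2_CUTOVER", "") == "true"
    return plugin_wired and flag_on


# The flag matrix the commit says is pinned by tests (env unset/true/false x wired/unwired).
for value in (None, "true", "false"):
    env = {} if value is None else {"MEMORY_V2_CUTOVER": value}
    for wired in (True, False):
        path = "plugin" if cutover_active(wired, env) else "legacy"
        print(f"MEMORY_V2_CUTOVER={value!r:8} wired={wired!s:5} -> {path}")
```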
Hongming Wang	6fb9bc9bcd	mcp: regenerate platform_auth signature snapshot for auth_headers(workspace_id=...) PR-1's auth_headers added an optional workspace_id parameter for multi-workspace token routing; the signature drift gate (test_platform_auth_signature_matches_snapshot) caught the change as expected. Snapshot regenerated to capture the new shape — diff is visible in the PR for reviewers + template repos that depend on this surface. Behavior unchanged: auth_headers() with no arg still routes through the legacy resolution path (back-compat exact); the workspace_id arg is opt-in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:11:23 -07:00
Hongming Wang	9cd2c02f14	Merge branch 'staging' into feat/mcp-multi-workspace-pr1	2026-05-04 08:07:34 -07:00
Hongming Wang	9929f73e80	Merge pull request #2738 from Molecule-AI/feat/memory-v2-pr7-backfill Memory v2 PR-7: one-shot backfill CLI (dry-run + apply)	2026-05-04 15:07:14 +00:00
Hongming Wang	829ab66462	mcp: support multi-workspace external-agent registration (PR-1) External MCP agents (e.g. Claude Code installed on a company PC) can now register against MULTIPLE workspaces from a single process — the agent participates as a peer in workspace A (company) AND workspace B (personal) simultaneously, with one merged inbox tagged so replies route to the correct tenant. Use case (verbatim from operator): "I have this computer AI thats in company's PC, he is going to be put in company's workspace, but personally, I want to register it to my own workspace as well, so that I can talk to it and asking him to do work." ## What changed Wire format — new env var: MOLECULE_WORKSPACES='[ {"id":"<company-wsid>","token":"<company-tok>"}, {"id":"<personal-wsid>","token":"<personal-tok>"} ]' When set, mcp_cli iterates the array and spawns one (register + heartbeat + inbox poller) trio per workspace. Single-workspace mode (WORKSPACE_ID + MOLECULE_WORKSPACE_TOKEN) is unchanged — every existing operator's setup keeps working bit-for-bit. Per-workspace token registry (platform_auth.py): register_workspace_token(wsid, tok) — populated by mcp_cli once per workspace before any thread spawns; thread-safe registration + lock-free reads on the hot path. auth_headers(workspace_id=...) routes to the per-workspace token; auth_headers() with no arg uses the legacy resolution path unchanged (back-compat). Per-workspace inbox cursors (inbox.py): InboxState now supports cursor_paths={wsid: Path,...}. Each poller advances its own cursor — one workspace's slow poll can't stall another, and a 410 only resets the affected workspace's cursor. Single-workspace constructor (cursor_path=Path(...)) still works exactly as before via __post_init__ promotion to the empty-string key. Cursor filenames disambiguated by workspace_id[:8] when multi-workspace; single-workspace keeps the legacy filename so upgrade doesn't invalidate on-disk state. 
Arrival workspace tagging (inbox.py): InboxMessage.arrival_workspace_id — tells the agent which OF ITS workspaces the inbound message arrived on. Set by the poller from the cursor key. to_dict() omits the field when empty so single- workspace consumers see no shape change. Reply routing (a2a_tools.py + a2a_mcp_server.py + registry.py): send_message_to_user(workspace_id=...) — optional override that selects which workspace's /notify endpoint to POST to (and which token authenticates). Multi-workspace agents pass the inbound message's arrival_workspace_id; single-workspace agents omit it and route to the only registered workspace via the legacy URL. ## Out of scope (future PRs) - PR-2: cross-workspace delegation auto-routing — when an agent receives a request from personal-ws "delegate to ops-bot" and ops-bot lives in company-ws, the agent should auto-pick its company-ws identity for the outbound delegate_task. Today the agent must pass via_workspace explicitly (or fall through to primary workspace). - PR-3: memory namespacing — commit_memory() still writes to the primary workspace's memory regardless of inbound context. Will revisit when the new memory system (PR #2733 just landed) settles. ## Tests workspace/tests/test_mcp_cli_multi_workspace.py — 24 new tests: * MOLECULE_WORKSPACES JSON parsing (valid + 6 error shapes) * Token registry register / lookup / rotation / clear * auth_headers routing by workspace_id with legacy fallback * Per-workspace cursor save/load/reset isolation * arrival_workspace_id present-when-set, omitted-when-empty * default_cursor_path namespacing All 110 pre-existing tests in test_mcp_cli.py / test_inbox.py / test_platform_auth.py still pass — back-compat is mechanical. Refs: project memory entry "External agent multi-workspace registration", design questions answered 2026-05-04 by user (JSON env var; explicit memory writes deferred to PR-3). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:06:00 -07:00
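Note on 829ab66462: a minimal sketch of parsing a ``MOLECULE_WORKSPACES``-style value. The commit mentions six rejected error shapes without listing them, so the validation rules below (non-empty array, object entries, non-empty string ``id`` and ``token``, no duplicate ids) are assumptions, and ``parse_workspaces`` is a hypothetical name.

```python
import json
from typing import Dict, List


def parse_workspaces(raw: str) -> List[Dict[str, str]]:
    """Parse a JSON array of {"id": ..., "token": ...} objects."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"MOLECULE_WORKSPACES is not valid JSON: {exc}") from exc
    if not isinstance(data, list) or not data:
        raise ValueError("MOLECULE_WORKSPACES must be a non-empty JSON array")

    seen = set()
    result: List[Dict[str, str]] = []
    for entry in data:
        if not isinstance(entry, dict):
            raise ValueError("each entry must be a JSON object")
        wsid = entry.get("id")
        token = entry.get("token")
        if not isinstance(wsid, str) or not wsid:
            raise ValueError("each entry needs a non-empty string 'id'")
        if not isinstance(token, str) or not token:
            raise ValueError("each entry needs a non-empty string 'token'")
        if wsid in seen:
            raise ValueError(f"duplicate workspace id: {wsid}")
        seen.add(wsid)
        result.append({"id": wsid, "token": token})
    return result
```

Rejecting bad input before any thread starts fits the commit's description of the token registry being populated once per workspace before the pollers spawn.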
Hongming Wang	3b3e821a60	Merge pull request #2736 from Molecule-AI/feat/memory-v2-pr6-compat-shim Memory v2 PR-6: backward-compat shim — legacy tools route to v2	2026-05-04 15:05:14 +00:00
Hongming Wang	a08eaa6ca2	Merge pull request #2735 from Molecule-AI/auto-sync/main-51e7d946 chore: sync main → staging (auto, ff to `51e7d946`)	2026-05-04 08:04:43 -07:00
Hongming Wang	c5322f318a	Memory v2 PR-7: one-shot backfill CLI (dry-run + apply) Builds on merged PR-1..6. Operator runs this once at cutover to copy agent_memories rows into the v2 plugin's storage. Usage: memory-backfill -dry-run # count + diff, no writes memory-backfill -apply # actually copy memory-backfill -apply -limit=10000 # cap rows per run memory-backfill -apply -workspace=<uuid> # one workspace only Required env: DATABASE_URL + MEMORY_PLUGIN_URL. Translation matches the PR-6 legacy shim: LOCAL → workspace:<workspace_id> TEAM → team:<root_id> (resolved via the same namespace.Resolver the runtime uses) GLOBAL → org:<root_id> Idempotent: each row is keyed by its UUID; re-running the backfill does not duplicate writes (plugin handles deduplication). What ships: * cmd/memory-backfill/main.go: CLI entry, run() driver, backfill() workhorse, mapScopeToNamespace + namespaceKindFromString helpers * main_test.go: 100% on the functional logic (mapScopeToNamespace, namespaceKindFromString, backfill(), all CLI validation paths) Coverage: 80.2% of statements. The 19.8% gap is main()'s body (log.Fatalf — not unit-testable) and run()'s real-DB integration (sql.Open + db.PingContext + new client/resolver — requires a live postgres). Integration coverage for this path lives in PR-11 (E2E plugin-swap test). Edge cases pinned (in functional logic): * Every legacy scope → namespace mapping * Unknown scope → skip with diagnostic, increment skipped counter * Resolver error → propagate, abort run * No-matching-kind in writable list → skip with error message * Plugin UpsertNamespace error → increment errors, continue * Plugin CommitMemory error → increment errors, continue * Query error → propagate, abort * Scan error → increment errors, continue * Mid-iteration row error → propagate, abort * Workspace filter passes through to SQL WHERE clause * Dry-run mode never calls plugin * CLI: rejects both/neither modes, missing env vars, bad flags	2026-05-04 08:04:07 -07:00
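Note on c5322f318a: the scope-to-namespace translation used by the backfill can be sketched as a lookup. The sketch is in Python for brevity, while the commit's CLI is Go. The root id is passed in directly here, whereas the commit resolves it through ``namespace.Resolver``; ``map_scope_to_namespace`` is an illustrative name, and returning ``None`` stands in for the "unknown scope, skip with diagnostic" case.

```python
from typing import Optional


def map_scope_to_namespace(scope: str, workspace_id: str, root_id: str) -> Optional[str]:
    """Translate a legacy scope to a v2 namespace string, or None for unknown scopes."""
    mapping = {
        "LOCAL": f"workspace:{workspace_id}",
        "TEAM": f"team:{root_id}",
        "GLOBAL": f"org:{root_id}",
    }
    return mapping.get(scope)


# Counting skipped rows, as the commit's edge-case list describes for unknown scopes.
rows = [("LOCAL", "ws-1"), ("TEAM", "ws-1"), ("BOGUS", "ws-1")]
skipped = sum(1 for scope, ws in rows if map_scope_to_namespace(scope, ws, "root-1") is None)
print("skipped:", skipped)
```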
Hongming Wang	290e6dfdc3	Memory v2 PR-6: backward-compat shim — legacy tools route to v2 Builds on merged PR-1..5. Adds the bridge that lets legacy commit_memory / recall_memory tools route through the v2 plugin path when MEMORY_PLUGIN_URL is wired, otherwise fall through to the existing DB-backed code unchanged. What ships: * handlers/mcp_tools_memory_legacy_shim.go — translation helpers: scopeToWritableNamespace, scopeToReadableNamespaces, commitMemoryLegacyShim, recallMemoryLegacyShim, namespaceKindToLegacyScope * handlers/mcp_tools.go — toolCommitMemory + toolRecallMemory now delegate to the shim when memv2 is wired Translation: commit: LOCAL → workspace:<self> TEAM → team:<root> (resolver picks at runtime) empty → defaults to LOCAL (preserves legacy default) GLOBAL → still rejected at MCP bridge (C3 preserved) recall: LOCAL → search restricted to workspace:<self> TEAM → workspace:<self> + team:<root> empty → all readable (matches v2 default behavior) GLOBAL → blocked at MCP bridge (C3 preserved) Response shapes are preserved exactly: commit: {"id":"...","scope":"LOCAL"\|"TEAM"} — agents see no diff recall: [{"id":"...","content":"...","scope":"LOCAL"\|...,"created_at":"..."}, ...] org-namespace memories get the same [MEMORY id=... scope=ORG ns=...] prefix as v2 search; legacy scope label comes back as "GLOBAL" Operational rollout: * Today: MEMORY_PLUGIN_URL unset on most operators → legacy DB path * After PR-7 backfill: operators set MEMORY_PLUGIN_URL → all writes flow through plugin transparently * After PR-8 cutover: dual-write removed, plugin is the only path * After PR-9 (~60 days later): legacy tool entries dropped entirely Coverage: 100% on every helper, 100% on recallMemoryLegacyShim, 94.7% on commitMemoryLegacyShim. The 1 uncovered line is a defensive guard against a v2-response-parse error that's unreachable when the v2 tool is operating correctly (it always returns valid JSON). 
Edge cases pinned: * scope translation for every legacy value + invalid scope * resolver error propagation * plugin error propagation * GLOBAL still blocked * default-scope fallback (LOCAL) * empty content rejected * No-op when v2 unwired (legacy SQL path exercised via sqlmock) * org-namespace memory wrap on recall + GLOBAL scope label round-trip * No-results returns "No memories found." (legacy message preserved)	2026-05-04 08:01:41 -07:00
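Note on 290e6dfdc3: the recall-side translation differs from the commit side because one legacy scope can expand to several namespaces. A rough Python sketch follows; the commit's code is Go. The function name, the exception types, and passing the root id and readable list as arguments are all assumptions made for illustration.

```python
from typing import List


def scope_to_readable_namespaces(
    scope: str, self_id: str, root_id: str, readable: List[str]
) -> List[str]:
    """Expand a legacy recall scope into the namespaces to search."""
    if scope == "GLOBAL":
        # The commit keeps GLOBAL blocked at the MCP bridge.
        raise PermissionError("GLOBAL scope is blocked at the MCP bridge")
    if scope == "LOCAL":
        return [f"workspace:{self_id}"]
    if scope == "TEAM":
        return [f"workspace:{self_id}", f"team:{root_id}"]
    if scope == "":
        # Empty scope searches everything the workspace can read.
        return list(readable)
    raise ValueError(f"invalid scope: {scope!r}")
```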
Hongming Wang	f74fff6ae4	Merge pull request #2734 from Molecule-AI/feat/memory-v2-pr5-mcp-tools Memory v2 PR-5: 6 new MCP tools wired through the plugin	2026-05-04 14:53:45 +00:00
Hongming Wang	5bfa4b1d80	Memory v2 PR-5: 6 new MCP tools wired through the plugin Builds on PR-1, PR-2, PR-3, PR-4 (all merged). Adds the agent-facing v2 surface for the memory plugin contract. What ships (all in handlers/mcp_tools_memory_v2.go, no edits to the legacy commit_memory / recall_memory paths): commit_memory_v2 — write to a namespace; default workspace:self search_memory — search across namespaces; default = all readable commit_summary — kind=summary, 30-day default TTL, runtime-overridable list_writable_namespaces — discover what you can write to list_readable_namespaces — discover what you can read from forget_memory — delete by id, only in namespaces you can write to Workspace-server is the security perimeter — every layer the plugin mustn't be trusted with runs here: * SAFE-T1201 redactSecrets BEFORE every plugin write * Server-side ACL re-validation: CanWrite + IntersectReadable run on EVERY request, never trusting client-supplied namespaces (a canvas re-parent between list_writable and commit would otherwise let a stale namespace slip through) * org:* writes audited to activity_logs (SHA256, not plaintext) — matches memories.go:201-221 so the schema stays uniform * Audit failure does NOT block the write (logged + continue) — failing closed would deny org-scope writes whenever activity_logs is unhappy * org:* memories get the [MEMORY id=... scope=ORG ns=...]: prefix on read — preserves the prompt-injection mitigation from memories.go:455-461 Coexistence design: legacy commit_memory + recall_memory still wired to their old code paths in mcp_tools.go. PR-6 will alias them to delegate to these v2 implementations. PR-9 (60 days post-cutover) removes the legacy entries. 
Wiring: * MCPHandler gains a memv2 field (nil-safe; tools return a clear error when MEMORY_PLUGIN_URL is unset rather than crashing) * WithMemoryV2(plugin, resolver) is the production wiring API main.go calls at boot * withMemoryV2APIs(plugin, resolver) is the test-injectable variant against the memoryPluginAPI / namespaceResolverAPI interfaces Coverage: 100.0% on every new function in mcp_tools_memory_v2.go. Edge cases pinned: * empty/whitespace content → reject before plugin * plugin unconfigured → clear error, no crash * ACL violation → clear error * resolver error → wrapped error * plugin error → wrapped error * malformed expires_at → silently ignored (no exception) * org write audit failure → logged, write proceeds * search namespace intersection drops foreign entries * search with all-foreign namespaces → empty result, plugin not called * search org memories get delimiter wrap, workspace memories do not * forget with explicit + default namespace * forget cross-scope rejected * pickStr / pickStringSlice handle missing keys, wrong types, mixed slices * wrapOrgDelimiter format is exact-match * dispatch wires all 6 tools (no "unknown tool" error)	2026-05-04 07:50:26 -07:00
Hongming Wang	51e7d94605	Merge pull request #2724 from Molecule-AI/staging staging → main: auto-promote `3f4c5f8`	2026-05-04 07:50:20 -07:00
Hongming Wang	f2397bf138	Merge pull request #2733 from Molecule-AI/feat/memory-v2-pr3-postgres-plugin Memory v2 PR-3: built-in postgres plugin server + schema migrations	2026-05-04 14:37:24 +00:00
Hongming Wang	ff5f4cbf7c	Memory v2 PR-3: built-in postgres plugin server + schema migrations Builds on merged PR-1 (#2729), independent of PR-2/PR-4. Implements every endpoint of the v1 plugin contract behind an HTTP server (cmd/memory-plugin-postgres/) backed by postgres. Operators run this binary next to workspace-server; it's the default implementation MEMORY_PLUGIN_URL points at. What ships: - cmd/memory-plugin-postgres/main.go: boot, signal-driven shutdown, boot-time migrations, configurable LISTEN/DATABASE/MIGRATION_DIR - cmd/memory-plugin-postgres/migrations/001_memory_v2.up.sql: memory_namespaces (PK on name, kind CHECK, expires_at, metadata) memory_records (FK to namespaces with CASCADE, kind+source CHECK, pgvector embedding, FTS tsvector, ivfflat partial index on embedding, partial index on expires_at) - internal/memory/pgplugin/store.go: storage layer using lib/pq - internal/memory/pgplugin/handlers.go: HTTP layer (no router dep — a switch on URL.Path keeps the binary's dep surface tiny) - 100% statement coverage on store.go + handlers.go Schema notes: - These tables live next to the plugin binary, NOT in workspace- server/migrations/. When operators swap the plugin, these tables become orphaned (operator drops manually). Documented in PR-10. - Search supports semantic (pgvector cosine) → FTS (>=2 char query) → ILIKE (1-char query) → recent-listing (no query), with a TTL filter applied uniformly across all paths. - DELETE on namespace cascades to memory_records (FK ON DELETE CASCADE) — a deleted namespace immediately frees its memories. 
Coverage corner cases pinned: - Health: ok, degraded (db ping fails), no-ping fn - Every CRUD endpoint: happy path, bad name, bad JSON, bad body, not-found, store errors, exec/scan/marshal errors - Search: FTS, semantic, short-query (ILIKE), no-query (recent), kinds filter, store errors, scan errors, mid-iteration row error - Routing edge cases: unknown path, empty namespace, unknown sub, method-not-allowed, GET on /v1/health (allowed), POST on /v1/health (404), GET on /v1/search (404) - Helper internals: marshalMetadata (nil/happy/unmarshalable), nullTime (nil/non-nil), vectorString (empty/format), nullVectorString (empty/non-empty), scanNamespace + scanMemory metadata-decode errors No callers in workspace-server yet; integration starts in PR-5 (MCP handlers wire the plugin client through to MCP tools).	2026-05-04 07:31:56 -07:00
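Note on ff5f4cbf7c: the search fallback order (semantic, then FTS for queries of two or more characters, then ILIKE for one-character queries, then a recent listing when there is no query) amounts to a small dispatch. The Python sketch below only picks the mode; it runs no SQL, and the commit's plugin is Go. It assumes semantic search is chosen whenever an embedding is supplied, which the commit does not spell out.

```python
from typing import List, Optional


def pick_search_mode(query: str, embedding: Optional[List[float]] = None) -> str:
    """Choose which search path to run for a request."""
    if embedding:
        return "semantic"  # pgvector cosine distance in the commit's plugin
    if len(query) >= 2:
        return "fts"       # full-text search over the tsvector column
    if len(query) == 1:
        return "ilike"     # a single character is too short for FTS
    return "recent"        # no query: list recent records
```

Per the commit, the TTL filter applies on every path, so it would sit outside this dispatch.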
Hongming Wang	c53b2b104f	Merge pull request #2730 from Molecule-AI/feat/memory-v2-pr4-namespace-resolver Memory v2 PR-4: namespace resolver + tests (stacked on PR-1)	2026-05-04 14:28:22 +00:00
Hongming Wang	01b653d6b0	Memory v2 PR-4: namespace resolver + tests Stacked on PR-1 (#2729). Computes the readable/writable namespace lists for a workspace from the live workspaces tree at request time. No precomputed columns, no migrations — re-parenting on canvas takes effect immediately on the next memory call. What ships: - workspace-server/internal/memory/namespace/resolver.go - walkChain: recursive CTE, walks parent_id chain to root, capped at depth 50 to defend against malformed/cyclic data - derive: maps a chain to (workspace, team, org) namespace strings - ReadableNamespaces / WritableNamespaces: the public API - CanWrite + IntersectReadable: server-side ACL helpers MCP handlers (PR-5) will call before talking to the plugin - resolver_test.go: 100% statement coverage Design choices worth flagging: - Today's tree is depth-1 (root + children). The recursive CTE handles arbitrary depth so we don't have to revisit the resolver when the tree deepens. - GLOBAL→org write restriction (memories.go:167-174) is preserved by gating the org namespace's Writable flag on parent_id IS NULL. - Removed-status workspaces are NOT filtered from the chain walk — matches today's TEAM behavior (memories.go:367-372 filters on read, not on tree walk). - IntersectReadable with empty `requested` returns ALL readable namespaces (default-search-everything semantic from the discovery tools spec). This package has zero callers in this PR; integration starts in PR-5.	2026-05-04 07:25:33 -07:00
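Note on 01b653d6b0: the depth-capped chain walk and the namespace derivation can be illustrated over an in-memory parent map. The commit does this with a recursive CTE against the live workspaces table; the dictionary-based version below is a Python stand-in, and the ``team:<root>`` / ``org:<root>`` string shapes are taken from the later backfill commit's mapping.

```python
from typing import Dict, List, Optional

MAX_DEPTH = 50  # same cap the commit uses against malformed or cyclic data


def walk_chain(workspace_id: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Follow parent_id links from a workspace toward the root, stopping at MAX_DEPTH."""
    chain: List[str] = []
    current: Optional[str] = workspace_id
    while current is not None and len(chain) < MAX_DEPTH:
        chain.append(current)
        current = parents.get(current)
    return chain


def derive(chain: List[str]) -> List[str]:
    """Map a chain to its (workspace, team, org) namespace strings."""
    self_id, root_id = chain[0], chain[-1]
    return [f"workspace:{self_id}", f"team:{root_id}", f"org:{root_id}"]
```

Because nothing is precomputed, changing an entry in ``parents`` changes the result of the next call, which is the property the commit relies on for re-parenting to take effect immediately.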
Hongming Wang	f05633f5b0	Merge pull request #2732 from Molecule-AI/fix/canary-timeout-tail-latency ci(canary): bump synth timeout 12→20 min to absorb apt tail latency	2026-05-04 14:04:53 +00:00
Hongming Wang	ff1003e5f6	ci(canary): bump timeout-minutes 12 → 20 to absorb apt tail latency Today's 4 cancelled canaries (25319625186 / 25320942822 / 25321618230 / 25322499952) were all blown by the workflow timeout despite the underlying tenant boot completing successfully (PR molecule-controlplane#455 fix verified — boot events all reach `boot_script_finished/ok`). Why the budget was wrong: The tenant user-data install phase runs apt-get update + install of docker.io / jq / awscli / caddy / amazon-ssm-agent FROM RAW UBUNTU on every tenant boot — none of it is pre-baked into the tenant AMI (EC2_AMI=ami-0ea3c35c5c3284d82, raw Jammy 22.04). Empirical fetch_secrets/ok timing across today's canaries: 51s debug-mm-1777888039 (09:47Z) 82s 25319625186 (12:42Z) 143s 25320942822 (13:11Z) 625s 25322499952 (13:43Z) Same EC2_AMI, same instance type (t3.small), same user-data install sequence — variance is entirely apt-mirror tail latency. A 12-min job budget leaves only ~2 min for the workspace on slow-apt days; the workspace itself needs ~3.5 min for claude-code cold boot, so the budget is structurally too tight whenever apt is slow. 20 min absorbs even the 10+ min boot worst-case and still leaves the workspace its full ~7 min budget. Cap stays well under the runner's 6-hour ubuntu-latest job ceiling. Real fix: pre-bake caddy + ssm-agent into the tenant AMI so the boot phase is no-ops on cached pkgs (will file controlplane#TBD as follow-up — packer/install-base.sh today only bakes the WORKSPACE thin AMI, not the tenant AMI; tenants always boot from raw Ubuntu). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 07:02:12 -07:00
Hongming Wang	d9fb57092c	Merge pull request #2731 from Molecule-AI/feat/memory-v2-pr2-client Memory v2 PR-2: HTTP plugin client + circuit breaker + capability negotiation	2026-05-04 14:00:40 +00:00
Hongming Wang	c1cff3169f	Memory v2 PR-2: HTTP plugin client + breaker + capability negotiation Builds on PR-1 (#2729). Implements every endpoint in the OpenAPI spec plus two operational concerns the agent never sees: 1. Capability negotiation. Boot/Refresh probes /v1/health and captures the plugin's capability list. MCP handlers (PR-5) ask SupportsCapability before exposing capability-gated features — e.g., agents can only request semantic search when "embedding" is reported. 2. Circuit breaker. Three consecutive failures open the breaker for 60 seconds; while open, calls fail fast with ErrBreakerOpen. Picked these constants because: - 3 failures: long enough to skip transient blips, short enough to react before all in-flight handlers stack on the timeout - 60s cooldown: long enough to back off a flapping plugin, short enough that recovery is felt within a single session 4xx responses do NOT count toward the breaker (those are client bugs, not plugin health issues); 5xx + transport errors do. What ships: - workspace-server/internal/memory/client/client.go - client_test.go: 100% statement coverage Coverage corner cases pinned: - env-var success branches in New (parseDurationEnv applied) - json.Marshal error (via channel in Propagation) - http.NewRequestWithContext error (via unbalanced bracket in BaseURL) - 204 NoContent on endpoint that normally has a body - 4xx vs 5xx breaker behavior (4xx must NOT trip) - breaker cooldown elapsed → reset on next success - all 6 public endpoints fail-fast when breaker is open This package has no callers in this PR; integration starts in PR-5.	2026-05-04 06:57:24 -07:00
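Note on c1cff3169f: a minimal sketch of the breaker rule (three consecutive failures open it for 60 seconds; 4xx responses do not count; 5xx and transport errors do). It is in Python for brevity, while the commit's client is Go. The class and method names are hypothetical, a status code of 0 is used here to represent a transport error, and leaving the failure counter untouched on 4xx is an assumption.

```python
import time
from typing import Callable, Optional


class BreakerOpenError(Exception):
    """Raised while the breaker is open (stand-in for the commit's ErrBreakerOpen)."""


class CircuitBreaker:
    def __init__(
        self,
        threshold: int = 3,
        cooldown_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.threshold = threshold
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def before_call(self) -> None:
        """Fail fast if the breaker is open and the cooldown has not elapsed."""
        if self.opened_at is not None and self.clock() - self.opened_at < self.cooldown_s:
            raise BreakerOpenError("memory plugin breaker is open")

    def record(self, status_code: int) -> None:
        """Record an outcome; pass 0 for a transport error."""
        if status_code == 0 or status_code >= 500:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()
        elif status_code < 400:
            # Success closes the breaker and clears the streak.
            self.failures = 0
            self.opened_at = None
        # 4xx: a client bug, not a plugin health signal, so nothing changes.
```

Injecting the clock keeps the cooldown testable without sleeping for a minute.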
Hongming Wang	f52de74b7b	Merge pull request #2729 from Molecule-AI/feat/memory-v2-pr1-contract Memory v2 PR-1: OpenAPI plugin contract + Go bindings	2026-05-04 13:51:56 +00:00
Hongming Wang	53d823e719	Memory v2 PR-1: OpenAPI plugin contract + Go bindings First of 11 PRs implementing the memory-system plugin refactor (RFC #2728). This PR is pure additive scaffolding — no behavior change, no integration yet. It defines the wire shape between workspace-server and a memory plugin so PR-2 (HTTP client) and PR-3 (built-in postgres plugin) can be built against a single source of truth. What ships: - docs/api-protocol/memory-plugin-v1.yaml: OpenAPI 3.0.3 spec covering /v1/health, namespace upsert/patch/delete, memory commit, search, forget. Auth-free (private network only); workspace-server is the only sanctioned client and the security perimeter. - workspace-server/internal/memory/contract: typed Go bindings with Validate() methods on every wire object so both client (PR-2) and server (PR-3) self-check at the boundary. - Round-trip JSON tests for every type (catch asymmetric tag bugs). - 5 golden vector files under testdata/ pinning the exact wire shape; update via UPDATE_GOLDENS=1. Coverage: 100% of statements in contract.go. The validation rules encode design decisions worth flagging in review: - SearchRequest with empty Namespaces is REJECTED at plugin level — workspace-server is required to intersect the readable set server-side; an empty list reaching the plugin is a bug. - NamespacePatch with no fields is REJECTED — empty patches are pointless round-trips. - MemoryWrite with whitespace-only Content is REJECTED — zero-info memories pollute search results. No code yet calls into this package; integration starts in PR-2.	2026-05-04 06:45:52 -07:00
Hongming Wang	4511659a9e	Merge pull request #2727 from Molecule-AI/ci/synth-e2e-bump-cadence-to-10min ci: bump continuous-synth-e2e cadence 3→6 fires/hour, clean slots	2026-05-04 12:13:40 +00:00
Hongming Wang	032c011b37	ci: bump continuous-synth-e2e cadence 3→6 fires/hour, all clean slots Change cron from '10,30,50' (3 fires/hour) to '2,12,22,32,42,52' (6 fires/hour). All new slots are 1-3 min away from any other cron, avoiding both the cf-sweep collisions (:15, :45) and the :30 heavy slot (canary-staging /30, sweep-aws-secrets, sweep-stale-e2e-orgs every :15). Why: empirically 2026-05-04 the canary fired only once per hour on the 10,30,50 schedule (see #2726). Bumping fires-per-hour gives more chances to land a survived fire under GH's load- related drop ratio, and keeping all slots in clean lanes minimizes the per-fire drop probability. At empirically-observed ~67% drop ratio, 6 attempts/hour yields ~2 effective fires = ~30 min cadence; closer to the 20-min target than the current shape and provides a real degradation alarm if drops get worse. Cost: ~$0.50/day → ~$1/day. Negligible. Closes #2726. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 05:10:48 -07:00
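Note on 032c011b37: the cadence estimate in that commit is simple arithmetic, worked through below. The drop ratio of two thirds comes from the commit's observation that only one of three scheduled fires survived per hour; the function name is illustrative.

```python
def effective_cadence_minutes(fires_per_hour: int, drop_ratio: float) -> float:
    """Average minutes between fires that actually run, given a drop ratio."""
    surviving_per_hour = fires_per_hour * (1.0 - drop_ratio)
    return 60.0 / surviving_per_hour


drop = 2.0 / 3.0  # roughly the 67% drop ratio cited in the commit
old = effective_cadence_minutes(3, drop)  # cron '10,30,50'
new = effective_cadence_minutes(6, drop)  # cron '2,12,22,32,42,52'
print(f"old cadence: about {old:.0f} min, new cadence: about {new:.0f} min")
```

With these inputs the old schedule works out to roughly one surviving fire per hour and the new one to roughly two, matching the commit's figure of about 30 minutes between effective fires.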
Hongming Wang	c0997a5703	Merge pull request #2722 from Molecule-AI/auto-sync/main-25cb17c9 chore: sync main → staging (auto, ff to `25cb17c9`)	2026-05-04 10:46:46 +00:00
Hongming Wang	1d3d18fd66	Merge pull request #2725 from Molecule-AI/fix/team-expand-routes-via-auto-dispatcher fix(team): route Expand children through provisionWorkspaceAuto so SaaS gets per-workspace EC2	2026-05-04 10:46:44 +00:00
Hongming Wang	be997883c9	Centralize backend selection in provisionWorkspaceAuto User-reported 2026-05-04: deploying a team org-template ("Design Director" + 6 sub-agents) on a SaaS tenant produced 7-of-7 WORKSPACE_PROVISION_FAILED with the misleading message "container started but never called /registry/register". Diagnose returned "docker client not configured on this workspace-server" and the workspace rows had no instance_id. Root cause: TeamHandler.Expand hardcoded h.wh.provisionWorkspace — the Docker leg of WorkspaceHandler. WorkspaceHandler.Create branched on h.cpProv to pick CP-managed EC2 (SaaS) vs local Docker (self-hosted), but Expand never used that branch. On SaaS the docker goroutine ran but had no socket, so children silently sat in "provisioning" until the 600s sweeper marked them failed. Architectural principle (user): templates own runtime/config/prompts/files/plugins; the platform owns where it runs. Backend selection belongs in one helper. Fix: - Extract WorkspaceHandler.provisionWorkspaceAuto: picks CP when cpProv is set, Docker when only provisioner is set, returns false when neither (caller marks failed). - WorkspaceHandler.Create routes through Auto. - TeamHandler.Expand routes through Auto. Tests pin three invariants: - TestProvisionWorkspaceAuto_NoBackendReturnsFalse — Auto signals fall-through correctly so the caller can persist + mark-failed. - TestProvisionWorkspaceAuto_RoutesToCPWhenSet — when cpProv is wired, Start lands on CP (the user-visible regression target). Discipline-verified: removing the cpProv branch fails this. - TestTeamExpand_UsesAutoNotDirectDockerPath — source-level guard against future refactors reintroducing the hardcoded Docker call. Discipline-verified: reverting team.go fails this with a clear message naming the bug class. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 03:43:41 -07:00
Hongming Wang	3f4c5f8076	Merge pull request #2723 from Molecule-AI/fix/communication-overlay-rate-limit fix(canvas): CommunicationOverlay rate-limit storm — cap fan-out, gate on visibility, slow cadence	2026-05-04 10:22:12 +00:00
Hongming Wang	e1c99cd24c	Pin the visibility gate behavior, not just cadence Self-review on PR #2723 caught a coverage gap: the existing "visibility gate" describe block actually tested cadence (10s/30s timing), not the gate itself. If a refactor dropped the `if (!visible) return` line, the cadence test would still pass because the effect would still fire every 30s — the regression would silently ship. New test renders with comms-returning mock so the panel renders, clicks the close button, advances 60s, asserts no further fetches occur. Discipline-verified: removed `if (!visible) return` from the source, test fails as expected. Restored, test passes. Same failure mode as PR #434 (test asserted broken behavior) — pin what you claim to fix, not the easy substring. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 03:18:42 -07:00
Hongming Wang	26b5b21238	Fix CommunicationOverlay rate-limit storm: cap fan-out + gate on visibility User report 2026-05-04: 8+ workspace tenant (Design Director + 6 sub-agents + 3 standalones) saw sustained 429s in canvas console hitting /workspaces/<id>/activity?limit=5. Server-side rate limit is 600 req/min/IP. Three compounding issues in CommunicationOverlay: 1. Polled regardless of visibility — collapsed panel still hammered the API 2. 10s cadence — 6 req every 10s = 36 req/min from this overlay alone 3. Fan-out cap of 6 workspaces — scaled linearly with workspace count Fix: - Gate setInterval on `visible` (effect re-runs when collapsed/expanded) - Cadence 10s → 30s - Fan-out cap 6 → 3 Combined: ~36 req/min worst case → 6 req/min worst case (6x reduction), 0 req/min when collapsed. Tests: - Fan-out cap: 6 online nodes mounted → exactly 3 fetches (was 6) - Offline gate: offline workspace never polled - Cadence: timer at 10s = no new fetch; timer at 30s = next batch fires Each test would fail if the corresponding dial regressed. Follow-up (out of scope): structurally right fix is to consume the WORKSPACE_ACTIVITY WS broadcast instead of polling per-workspace. Server already publishes the events; canvas just isn't subscribing yet. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 03:18:42 -07:00
molecule-ai[bot]	25cb17c906	Merge pull request #2721 from Molecule-AI/staging staging → main: auto-promote `238f4d4`	2026-05-04 03:03:32 -07:00
Hongming Wang	238f4d45df	Merge pull request #2720 from Molecule-AI/fix/chat-upload-poll-mode-distinct-error fix: distinguish poll-mode workspace from transient empty-URL on chat upload	2026-05-04 09:46:05 +00:00
Hongming Wang	bcea8ac822	Broaden empty-URL 422 to cover NULL delivery_mode (production reality) Live-probed user's tenant: three of three external-runtime workspaces register with delivery_mode = NULL, not "poll". The earlier narrow poll-only check fell through to the misleading 503 for the actually-observed shape. Invariant we want: URL empty + not-exactly-"push" → no dispatch path will ever exist → 422. Only push-mode with empty URL is genuinely transient (mid-boot, restart in progress) → 503. Added TestChatUpload_NullModeEmptyURL using the user's actual workspace ID. Existing TestChatUpload_NoURL switched to explicit "push" mode (was relying on default — unsafe given the new branching). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 02:42:46 -07:00
Hongming Wang	87ae691e67	Distinguish poll-mode workspace from transient empty-URL on chat upload External-runtime workspaces that register in poll mode have no callback URL by design — the platform never dispatches to them, so chat upload (HTTP-forward by design) can't proceed. Returning 503 + "workspace url not registered yet" was misleading: the "yet" implied transient state, but the URL would never arrive. Caught externally on 2026-05-04: user uploading an image to an external "mac laptop" runtime workspace saw the 503 and assumed they should retry. The workspace's poll mode meant retrying would never help. Fix: include delivery_mode in the workspace lookup. When URL is empty: - poll mode → 422 + "re-register in push mode with a public URL" (Unprocessable Entity — this request can't succeed against this workspace's configuration; no retry will help) - push mode → 503 + "not registered yet" (genuine transient state — retry after next heartbeat is correct) Test: TestChatUpload_PollModeEmptyURL pins the new 422 path; existing TestChatUpload_NoURL strengthened to assert the "not registered yet" substring stays on the push branch (it would have silently passed if the new 422 path had clobbered both branches). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 02:42:46 -07:00
Hongming Wang	99f6481acc	Merge pull request #2719 from Molecule-AI/auto-sync/main-2c4bfd83 chore: sync main → staging (auto, ff to `2c4bfd83`)	2026-05-04 09:08:18 +00:00
molecule-ai[bot]	2c4bfd83e4	Merge pull request #2718 from Molecule-AI/staging staging → main: auto-promote `9e8aa39`	2026-05-04 09:04:19 +00:00
Hongming Wang	9e8aa39692	Merge pull request #2717 from Molecule-AI/fix/a2a-timeout-cold-llm e2e: bump A2A timeout from 30s → 90s for cold MiniMax workspace	2026-05-04 08:52:03 +00:00
Hongming Wang	b7f0b279eb	e2e: bump A2A timeout from 30s → 90s for cold MiniMax workspace After #2710 + #2714 + the MOLECULE_STAGING_MINIMAX_API_KEY repo secret landed (2026-05-04 08:37Z), the next dispatched canary (run 25309323698) cleared every previous failure point but timed out at step 8/11 with `curl: (28) Operation timed out after 30002 ms`. The canary creates a fresh org per run, so every A2A POST hits a cold workspace + cold MiniMax endpoint: workspace boot → claude-code adapter starts event loop → first prompt ships → TLS handshake to api.minimax.io → cold model warmup → first-token generation Cold-call P95 lands around 25-30s on MiniMax-M2.7-highspeed; the 30-second `CURL_COMMON --max-time` is right on the edge and the run that timed out was 30.002s of zero bytes received. Fix: override `--max-time` for the canary's A2A POST only — 90s gives ~3x headroom. Subsequent A2A turns to the same workspace are sub-second, so this only widens step 8 of the canary's first turn. The shared CURL_COMMON timeout stays at 30s for everything else (provision, register, terminal, peers, teardown), where 30s is right. Verifies the rest of the canary script (provision, DNS, terminal-EIC, A2A round-trip) is platform-correct and the only operational gap is this latency knob.	2026-05-04 01:49:42 -07:00
Hongming Wang	fa3353a3ca	Merge pull request #2716 from Molecule-AI/auto-sync/main-1187a66d chore: sync main → staging (auto, ff to `1187a66d`)	2026-05-04 08:34:59 +00:00
molecule-ai[bot]	1187a66d2e	Merge pull request #2715 from Molecule-AI/staging staging → main: auto-promote `d360c34`	2026-05-04 01:20:07 -07:00
Hongming Wang	d360c34a30	Merge pull request #2714 from Molecule-AI/feat/anthropic-direct-e2e-path e2e: add direct-Anthropic LLM-key path alongside MiniMax + OpenAI	2026-05-04 07:53:26 +00:00
Hongming Wang	287961375f	Merge pull request #2713 from Molecule-AI/auto-sync/main-f1840d46 chore: sync main → staging (auto, ff to `f1840d46`)	2026-05-04 07:53:16 +00:00
Hongming Wang	98f883cb99	e2e: add direct-Anthropic LLM-key path alongside MiniMax + OpenAI Adds a third secrets-injection branch in test_staging_full_saas.sh behind a new E2E_ANTHROPIC_API_KEY env var, wired into all three auto-running E2E workflows (canary-staging, e2e-staging-saas, continuous-synth-e2e) via a new MOLECULE_STAGING_ANTHROPIC_API_KEY repo secret slot. Operator motivation: after #2578 (the staging OpenAI key went over quota and stayed dead 36+ hours) we shipped #2710 to migrate the canary + full-lifecycle E2E to claude-code+MiniMax. Discovered post-merge that MOLECULE_STAGING_MINIMAX_API_KEY had never been set after the synth-E2E migration on 2026-05-03 either — synth has been red the whole time, not just OpenAI quota. Setting up a MiniMax billing account from scratch is non-trivial (needs platform-specific signup, KYC, top-up). Operators who already have an Anthropic API key for their own Claude Code session can now just set MOLECULE_STAGING_ANTHROPIC_API_KEY and have all three auto-running E2E gates green within one cron firing. Priority chain in test_staging_full_saas.sh (first non-empty wins): 1. E2E_MINIMAX_API_KEY → MiniMax (cheapest) 2. E2E_ANTHROPIC_API_KEY → direct Anthropic (cheaper than gpt-4o, lower setup friction than MiniMax) 3. E2E_OPENAI_API_KEY → langgraph/hermes paths Verify-key case-statement in all three workflows accepts EITHER MiniMax OR Anthropic for runtime=claude-code; error message names both options so operators know they don't have to register a MiniMax account if they already have an Anthropic key. Pinned to runtime=claude-code — hermes/langgraph use OpenAI-shaped envs and won't honour ANTHROPIC_API_KEY without further wiring. After this lands + secret is set, the dispatched canary verifies the new path: gh workflow run canary-staging.yml --repo Molecule-AI/molecule-core --ref staging	2026-05-04 00:51:14 -07:00
molecule-ai[bot]	f1840d467c	Merge pull request #2712 from Molecule-AI/staging staging → main: auto-promote `563e58a`	2026-05-04 07:38:58 +00:00
Hongming Wang	5596cb52ef	Merge pull request #2711 from Molecule-AI/auto-sync/main-170e037a chore: sync main → staging (auto, ff to `170e037a`)	2026-05-04 07:25:30 +00:00
Hongming Wang	563e58a835	Merge pull request #2710 from Molecule-AI/fix/canary-staging-migrate-to-minimax canary-staging: migrate from hermes+OpenAI to claude-code+MiniMax	2026-05-04 07:23:37 +00:00

4095 Commits