molecule-core

Author	SHA1	Message	Date
Hongming Wang	6b445aae2d	Memory v2 fixup I5: workspace purge cleans up plugin namespace Self-review #291. When a workspace is hard-purged, its `workspace:<id>` namespace stays in the plugin storage. Over time deleted workspaces accumulate as orphan namespaces. Fix: optional namespaceCleanupFn hook on WorkspaceHandler. The purge path (workspace_crud.go ~line 520) iterates each purged id and calls the hook best-effort. main.go wires the hook to plugin.DeleteNamespace when MEMORY_PLUGIN_URL is set; operators who haven't enabled the plugin keep the no-op default. Why a hook (not direct plugin import): * Keeps WorkspaceHandler decoupled from the memory contract package (easier to test, smaller blast radius if the contract bumps) * Tests inject a captureCleanupHook stub without standing up a real plugin client * Production wiring stays a one-liner in main.go What gets cleaned up: * `workspace:<id>` for each purged workspace * NOT `team:<root>` / `org:<root>` — those may still be referenced by other workspaces under the same root, so dropping them on a single workspace's purge would orphan team/org data for the survivors. Operator can purge those manually after confirming the entire root is gone. What stays untouched: * Soft-removed workspaces (status='removed', no ?purge=true). The grace window is by design — the data should still be there if the operator unremoves. Tests: * TestWithNamespaceCleanup_DefaultIsNil pins the safe default * TestWithNamespaceCleanup_NilStaysNil pins the explicit-nil case * TestWithNamespaceCleanup_AttachesFn pins the wiring * TestPurge_CallsCleanupHookPerID exercises the per-id loop body * TestPurge_NilHookIsSkipped pins the nil guard A full end-to-end Delete-handler test requires mocking broadcaster + provisioner + descendant SQL chain, which is out-of-scope for a single fixup. Integration coverage for the wired path lives in PR-11's E2E swap test (#293 follow-up).	2026-05-04 09:20:37 -07:00
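The optional-hook pattern from the commit above can be sketched as follows. This is a minimal Python illustration only: the real handler is Go, every name here is hypothetical, and whether the hook receives the bare id or the full `workspace:<id>` namespace is an assumption (this sketch passes the id).

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical stand-in for the Go WorkspaceHandler; names are illustrative.
CleanupFn = Callable[[str], None]


class WorkspaceHandler:
    def __init__(self) -> None:
        # Safe default: no hook, so operators without the plugin see a no-op.
        self.namespace_cleanup_fn: Optional[CleanupFn] = None

    def with_namespace_cleanup(self, fn: Optional[CleanupFn]) -> "WorkspaceHandler":
        self.namespace_cleanup_fn = fn
        return self

    def purge(self, purged_ids: Iterable[str]) -> List[str]:
        """Call the hook once per purged id, best-effort; return ids whose cleanup failed."""
        failed: List[str] = []
        if self.namespace_cleanup_fn is None:  # nil guard
            return failed
        for ws_id in purged_ids:
            try:
                self.namespace_cleanup_fn(ws_id)
            except Exception:
                # Best-effort: one failing cleanup must not abort the purge.
                failed.append(ws_id)
        return failed
```

A test can inject `captured.append` as the hook, which mirrors the captureCleanupHook stub idea without a real plugin client.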
Hongming Wang	6fc328ef44	Merge pull request #2747 from Molecule-AI/fix/memory-v2-c2-backfill-verify Memory v2 fixup C2: backfill -verify mode (parity check)	2026-05-04 16:08:27 +00:00
Hongming Wang	bb3212ad37	Merge branch 'staging' into fix/memory-v2-c2-backfill-verify	2026-05-04 09:08:21 -07:00
Hongming Wang	d297e75fc9	Merge pull request #2746 from Molecule-AI/fix/memory-v2-i1-i4-small Memory v2 fixup I1+I4: expires_at validation + audit JSON marshal	2026-05-04 16:05:02 +00:00
Hongming Wang	3ae0513209	Merge pull request #2744 from Molecule-AI/fix/memory-v2-c1-backfill-idempotent Memory v2 fixup C1: backfill idempotency via MemoryWrite.id	2026-05-04 16:04:54 +00:00
Hongming Wang	4b6373861c	Memory v2 fixup C2: backfill -verify mode (parity check) Self-review missed deliverable from PR-7's task spec. Operators had no way to confirm a -apply produced equivalent search results to the legacy agent_memories direct queries; this PR ships that. Usage: memory-backfill -verify # 50-workspace random sample memory-backfill -verify -verify-sample=200 # bigger sample memory-backfill -verify -workspace=<uuid> # one specific workspace Algorithm: 1. Pick N random workspaces (or use -workspace if specified) 2. For each: query agent_memories direct, query plugin search via the workspace's readable namespace list 3. Multiset-compare contents: every legacy row must have a matching plugin row. Plugin having MORE rows is OK (team-shared content may be visible from sibling workspaces). 4. Print mismatches with content excerpt; non-zero mismatches/errors yield a non-zero exit so CI can gate cutover. SQL: - Sampling uses ORDER BY random() LIMIT N (TABLESAMPLE has surprising distribution at small populations). - Filters out status='removed' workspaces. Test coverage: * pickWorkspaceSample: single-ws short-circuit, random sampling, query error, scan error * queryLegacyMemories: happy path, error path * verifyParity: - all match → 1 match, 0 mismatch - missing-from-plugin → 1 mismatch with content excerpt - plugin-extra rows → 1 match (legacy is subset of plugin) - legacy query error → 1 error counter - resolver error → 1 error counter - plugin search error → 1 error counter - no readable namespaces + empty legacy → match - no readable namespaces + non-empty legacy → mismatch - pickSample error → propagated up * CLI: -verify+-apply rejected as mutually exclusive; -verify alone is a valid mode Note: namespaceResolverAdapter bridges *namespace.Resolver to the verify package's verifyResolver interface so verify.go has zero dependency on the namespace package — keeps test stubs minimal.	2026-05-04 09:01:31 -07:00
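Step 3 of the algorithm above (multiset comparison, with plugin-side extras tolerated) can be sketched with `collections.Counter`. The function name is hypothetical and the real tool is Go; this only illustrates the comparison rule.

```python
from collections import Counter
from typing import Iterable, List


def missing_from_plugin(legacy: Iterable[str], plugin: Iterable[str]) -> List[str]:
    """Return legacy contents with no matching plugin row (multiset semantics).

    Counter subtraction keeps only positive counts, so rows that exist only
    on the plugin side (e.g. team-shared content) never count as mismatches.
    """
    deficit = Counter(legacy) - Counter(plugin)
    return sorted(deficit.elements())
```

An empty result means the workspace passes; any returned content is a mismatch that would be printed and would drive the non-zero exit.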
Hongming Wang	3886e8fb9f	Merge pull request #2745 from Molecule-AI/fix/harness-stub-auth-headers-1arg fix(harness): stub platform_auth with *args lambdas (#2743 fallout)	2026-05-04 15:58:24 +00:00
Hongming Wang	d48693144b	Memory v2 fixup I1+I4: expires_at validation + audit JSON marshal Two small Important findings from self-review, bundled because both are <20 line changes touching the same file. I1: expires_at silent drop - mcp_tools_memory_v2.go:130 had `if t, err := ...; err == nil { ... }` which dropped malformed timestamps without telling the agent. Agent passes `expires_at: "tomorrow"`, gets a 200, and the memory has no TTL. - Now returns a clear error: "invalid expires_at: must be RFC3339" - Test renamed: TestCommitMemoryV2_BadExpiresIsIgnored (which codified the bug) → TestCommitMemoryV2_BadExpiresReturnsError (which pins the fix). I4: audit log JSON via Sprintf-%q - auditOrgWrite was building activity_logs.metadata via fmt.Sprintf with %q. Go-quoted strings happen to coincide with JSON-quoted for ASCII (and today's values are pure ASCII: UUID + hex digest) so the bug was latent. - Replaced with json.Marshal of map[string]string. Same wire shape today, but won't silently produce invalid JSON if metadata grows to include arbitrary content snippets. - New test TestAuditOrgWrite_MetadataIsValidJSON uses a custom sqlmock.Argument matcher (jsonValidMatcher) that fails the test if the metadata column isn't parseable JSON. The test runs auditOrgWrite with a content string containing quotes, backslashes, and a control byte — values where %q would diverge from JSON-quote. Both pre-existing tests (TestCommitMemoryV2_AuditsOrgWrites etc.) remain green.	2026-05-04 08:57:58 -07:00
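The I1 change (reject a malformed `expires_at` instead of dropping it silently) can be sketched as below. This is an approximation, not the Go implementation: `datetime.fromisoformat` is somewhat more lenient than strict RFC3339, and the helper name is hypothetical. Only the error text is taken from the commit message.

```python
from datetime import datetime


def parse_expires_at(raw: str) -> datetime:
    """Parse a timestamp or raise; never fall through to 'no TTL'."""
    # Older Python versions do not accept a trailing "Z", so normalise it.
    candidate = (raw[:-1] + "+00:00") if raw.endswith(("Z", "z")) else raw
    try:
        parsed = datetime.fromisoformat(candidate)
    except ValueError:
        raise ValueError("invalid expires_at: must be RFC3339") from None
    if parsed.tzinfo is None:
        # RFC3339 requires an explicit offset; date-only or naive values fail.
        raise ValueError("invalid expires_at: must be RFC3339")
    return parsed
```

With this shape, `expires_at: "tomorrow"` surfaces an error to the caller rather than producing a memory without a TTL.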
Hongming Wang	1b207b214d	fix(harness): stub platform_auth with *args lambdas (#2743 fallout) PR #2743 (multi-workspace MCP PR-2) made auth_headers accept an optional ``workspace_id`` arg and self_source_headers stayed 1-arg-required. The peer-discovery-404 harness replay stubbed both with 0-arg lambdas, so the helper call inside the replay raised: TypeError: <lambda>() takes 0 positional arguments but 1 was given …and the diagnostic captured by the replay was the TypeError text, not the platform-404 string the assertion grep'd for. Caught by PR-2737 (auto-promote staging→main) — the replay went red right after #2743 merged into staging. Switching both stubs to ``*args, **kwargs`` makes them tolerant of both the legacy 0-arg call shape AND the new 1-arg-with-workspace call shape, so neither the harness nor the in-tree unit tests need to know which version of the runtime helpers ran the call. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:55:42 -07:00
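The failure and the fix described above can be reproduced in a few lines. The two `call_*` functions are hypothetical stand-ins for the runtime helpers' legacy and multi-workspace call shapes.

```python
# Hypothetical call sites: the legacy 0-arg shape and the new 1-arg shape.
def call_legacy(auth_headers):
    return auth_headers()


def call_multi_ws(auth_headers):
    return auth_headers("ws_b")


# A 0-arg stub breaks as soon as a caller passes a workspace id ...
zero_arg_stub = lambda: {}

# ... while an *args/**kwargs stub tolerates both call shapes.
tolerant_stub = lambda *args, **kwargs: {}
```

Calling `call_multi_ws(zero_arg_stub)` raises the TypeError quoted in the commit message; `tolerant_stub` works with either caller.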
Hongming Wang	1e97fb9a16	Memory v2 fixup C1: backfill idempotency via MemoryWrite.id Self-review (post-merge) flagged that the backfill claimed to be idempotent on re-run but actually duplicates every row because the plugin's INSERT uses gen_random_uuid() and ignores any id passed in. Fix is contract-level: extend MemoryWrite with an optional `id` idempotency key. When supplied, the plugin MUST treat the write as upsert keyed on this id; when omitted, the plugin generates a fresh UUID (production agent commits keep working unchanged). Changes: * docs/api-protocol/memory-plugin-v1.yaml: add id field with description that flags it as idempotency key * internal/memory/contract/contract.go: add ID to MemoryWrite struct, update memory_write_minimal golden vector * internal/memory/pgplugin/store.go: split CommitMemory into two paths — upsert when body.ID set (INSERT ... ON CONFLICT (id) DO UPDATE), plain INSERT otherwise * cmd/memory-backfill/main.go: pass agent_memories.id to MemoryWrite, fix the false comment about 409 deduplication New tests: * pgplugin: TestCommitMemory_WithIDUpserts pins the upsert SQL is used when id is set; TestCommitMemory_UpsertScanError covers the error branch * backfill: TestBackfill_PassesSourceUUIDAsIdempotencyKey pins the forwarding behavior; TestBackfill_RerunIsIdempotent simulates a retry and asserts both runs pass the same uuid (plugin upsert is what makes this safe) Why this matters: operators retrying a failed backfill (which they will — networks fail, transactions abort) would otherwise create N duplicates per memory. The duplicates aren't visible until search results show obvious dupes — debugging that under prod load is bad. Production agent commits are unaffected: they leave id empty, the plugin generates a fresh UUID via gen_random_uuid(), zero behavior change for the hot path.	2026-05-04 08:54:13 -07:00
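The contract change above (an optional `id` acts as an idempotency key; omitting it generates a fresh UUID) can be illustrated with an in-memory stand-in. The class below is hypothetical and replaces the postgres upsert with a dict assignment, so it shows the semantics only.

```python
import uuid
from typing import Dict, Optional


class InMemoryStore:
    """Toy stand-in for the plugin store; a dict plays the role of the table."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}

    def commit_memory(self, content: str, id: Optional[str] = None) -> str:
        # With an id: upsert keyed on it, so a retry overwrites the same row.
        # Without an id: generate a fresh key, as production agent commits do.
        row_id = id or str(uuid.uuid4())
        self.rows[row_id] = content
        return row_id
```

Re-running a backfill that forwards the source row's UUID therefore leaves one row per memory, while writes without an id keep producing new rows.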
Hongming Wang	7cffff844b	Merge pull request #2743 from Molecule-AI/feat/mcp-multi-workspace-pr2 feat(mcp): cross-workspace delegation routing (multi-ws PR-2)	2026-05-04 15:43:20 +00:00
Hongming Wang	4a0d7cd545	Merge branch 'staging' into feat/mcp-multi-workspace-pr2	2026-05-04 08:37:20 -07:00
Hongming Wang	35b3ea598a	test: fix WORKSPACE_ID assert to match module attr (CI portability) CI's pytest harness pre-sets WORKSPACE_ID=test in the env before test collection, so a2a_client's module-level WORKSPACE_ID (captured at import time, line 24) holds "test" — but the local fixture's monkeypatch.setenv("WORKSPACE_ID", ...) only affects the ENV value seen on later os.environ reads, NOT the already-bound module attribute. Assert against a2a_client.WORKSPACE_ID directly so the test is portable across local + CI runs without monkey-patching the module itself (which a future test reload might undo). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:35:48 -07:00
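The import-time capture described above can be demonstrated without pytest. The module built here is a hypothetical stand-in for `a2a_client`, and the second assignment to `os.environ` plays the role of a later `monkeypatch.setenv`.

```python
import os
import types

# CI pre-sets the variable before the module is imported.
os.environ["WORKSPACE_ID"] = "test"

# Hypothetical stand-in for a2a_client: the value is bound at import time.
fake_client = types.ModuleType("fake_a2a_client")
exec('import os\nWORKSPACE_ID = os.environ.get("WORKSPACE_ID", "")', fake_client.__dict__)

# A later env change (what monkeypatch.setenv does) is seen by new
# os.environ reads, but not by the already-bound module attribute.
os.environ["WORKSPACE_ID"] = "ws-local"
```

After this runs, `fake_client.WORKSPACE_ID` still holds the import-time value, which is why asserting against the module attribute is portable across local and CI runs.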
Hongming Wang	1161b97faf	feat(mcp): cross-workspace delegation routing (multi-ws PR-2) PR-2 of the multi-workspace external-agent stack. PR-1 (#2739) landed per-workspace auth + heartbeat + inbox. This PR threads ``source_workspace_id`` through the A2A client + tool surface so an agent registered against multiple workspaces can list peers across all of them and delegate from a specific source. Changes ------- * ``a2a_client``: ``discover_peer``, ``send_a2a_message``, ``get_peers_with_diagnostic``, and ``enrich_peer_metadata`` now accept ``source_workspace_id``. Routing uses it for both the X-Workspace-ID header and (transitively, via ``auth_headers(src)``) the bearer token. Defaults to module-level WORKSPACE_ID for back-compat. * ``a2a_client._peer_to_source``: a new lock-free cache mapping each discovered peer back to the source workspace whose registry surfaced it. ``tool_list_peers`` populates the cache on every call; ``tool_delegate_task`` consults it for auto-routing. * ``a2a_tools.tool_list_peers(source_workspace_id=None)``: when multiple workspaces are registered (MOLECULE_WORKSPACES) and no explicit source is passed, aggregates peers across every registered workspace and tags each entry with ``via: <src[:8]>``. Single-workspace mode is unchanged — no ``via:`` annotation, same output shape. * ``a2a_tools.tool_delegate_task`` and ``tool_delegate_task_async`` resolve source via ``source_workspace_id arg → _peer_to_source[target] → WORKSPACE_ID``. Agents almost never need to specify ``source_`` explicitly — call ``list_peers`` first and the cache handles the rest. ``tool_delegate_task_async`` idempotency key now includes the source workspace, so the same task delegated from two registered workspaces produces two distinct delegations (the right behavior — one per tenant audit trail). * ``platform_auth.list_registered_workspaces()``: new helper for the tool layer to enumerate the multi-ws registry. 
Lock-free reads matched by the existing single-writer-per-workspace contract from PR-1. * ``platform_auth.self_source_headers``: now passes ``workspace_id`` through to ``auth_headers`` — without this, a multi-workspace POST source-tagged with ``X-Workspace-ID=ws_b`` was authenticating with ws_a's token (or no token if MOLECULE_WORKSPACE_TOKEN unset). Latent PR-1 bug exposed by the new tool surface. * ``a2a_mcp_server`` tool dispatch passes ``source_workspace_id`` from the tool call arguments. * ``platform_tools.registry``: add ``source_workspace_id`` to the delegate_task, delegate_task_async, check_task_status, list_peers input schemas with copy explaining when to use it (rarely — the cache handles it). Tests (15 new, all passing) --------------------------- ``test_a2a_multi_workspace.py``: * TestDiscoverPeerSourceRouting (3): src arg drives header+token, fallback to module ws when omitted, invalid target short-circuits before any HTTP attempt. * TestSendA2AMessageSourceRouting (1): X-Workspace-ID source header + Authorization bearer both come from the source arg via the patched self_source_headers chain. * TestGetPeersSourceRouting (1): URL path AND headers use the source workspace id. * TestToolListPeersAggregation (4): aggregates across multiple registered workspaces, tags origin, leaves single-workspace path unchanged, explicit src arg overrides aggregation, diagnostic joining when every workspace returns empty. * TestToolDelegateTaskAutoRouting (3): cache-driven auto-route, explicit override beats cache, single-workspace fallback to module WORKSPACE_ID. * TestListRegisteredWorkspaces (3): registry enumeration helper. Plus ``tests/snapshots/a2a_instructions_mcp.txt`` regenerated to absorb the new ``source_workspace_id`` schema entries. Back-compat ----------- Every change defaults ``source_workspace_id=None``; legacy single-workspace operators (no MOLECULE_WORKSPACES) see identical behavior — same URLs, same headers, same tool output. 
The 24 PR-1 tests + 125 existing A2A tests all still pass. Out of scope (PR-3) ------------------- Memory namespacing per registered workspace lands after the new memory system v2 PR (#2740) settles in production. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:32:24 -07:00
Hongming Wang	059962a0a3	Merge pull request #2742 from Molecule-AI/feat/memory-v2-pr11-e2e-swap Memory v2 PR-11: E2E test — flat-plugin swap proves contract works	2026-05-04 15:29:56 +00:00
Hongming Wang	b07575c710	Merge branch 'staging' into feat/memory-v2-pr11-e2e-swap	2026-05-04 08:24:26 -07:00
Hongming Wang	586fa5f84e	Merge pull request #2741 from Molecule-AI/feat/memory-v2-pr10-docs Memory v2 PR-10: operator docs for writing a custom memory plugin	2026-05-04 15:20:35 +00:00
Hongming Wang	b937415e1e	Memory v2 PR-11: E2E test — flat-plugin swap proves contract works Final implementation PR. Builds on PR-1..10 (all merged or queued). Proves the central design property of the plugin contract: ANY plugin satisfying the v1 OpenAPI spec works as a drop-in replacement for the built-in postgres plugin. If this test fails after a refactor, the contract has drifted in a way that breaks ecosystem plugins. What ships: * internal/memory/e2e/swap_test.go — five E2E tests against a deliberately minimal "flat-memory" stub plugin (~50 LOC, single map, zero capabilities) * MCPHandler.Dispatch — small exported wrapper around dispatch so out-of-package E2E tests can drive tools by name without duplicating the whole MCP RPC stack E2E coverage: * TestE2E_FlatPluginRoundTrip: full lifecycle - list_writable_namespaces returns 3 entries - commit_memory_v2 writes through plugin - search_memory finds it back - commit_summary writes a summary - forget_memory deletes - search after forget excludes the deleted memory * TestE2E_LegacyShimRoutesThroughFlatPlugin: PR-6 shim wired up - Legacy commit_memory(scope=LOCAL) ends up in plugin storage - Legacy recall_memory finds it back through plugin search - Response shapes preserved (scope:LOCAL stays scope:LOCAL) * TestE2E_OrgMemoriesDelimiterWrap: prompt-injection mitigation - Org-namespace memory committed - Audit INSERT into activity_logs verified - Search returns content with [MEMORY id=... scope=ORG ns=...] 
prefix applied * TestE2E_StubPluginCapabilitiesAreEmpty: capability negotiation - Stub plugin reports zero capabilities - Client.SupportsCapability returns false for FTS, embedding - Confirms graceful degradation when plugin doesn't support a feature * TestE2E_PluginUnreachable_AgentSeesClearError: failure surface - Plugin URL pointing at bogus port - commit_memory_v2 returns informative error - No nil-pointer dereference; error message is actionable The flat plugin is intentionally minimal — it has no namespaces table distinct from memory records, no FTS, no semantic search, no TTL. The test proves operators can drop in a 50-line plugin and the agent behavior is identical (modulo capability-gated features).	2026-05-04 08:20:35 -07:00
Hongming Wang	0f46c7eefe	Merge pull request #2739 from Molecule-AI/feat/mcp-multi-workspace-pr1 mcp: support multi-workspace external-agent registration (PR-1 of stack)	2026-05-04 15:19:03 +00:00
Hongming Wang	8aea1f008c	Merge pull request #2740 from Molecule-AI/feat/memory-v2-pr8-cutover Memory v2 PR-8: cutover — admin export/import via plugin	2026-05-04 15:18:17 +00:00
Hongming Wang	8417bce50d	Memory v2 PR-10: operator docs for writing a custom memory plugin Builds on merged PR-1..7 (PR-8 in queue). Pure docs; no code. What ships: * docs/memory-plugins/README.md — contract overview, capability negotiation, deployment models, replacement workflow * docs/memory-plugins/testing-your-plugin.md — using the contract test harness to validate wire compatibility, what the harness DOES NOT cover (capability accuracy, TTL eviction, concurrency) * docs/memory-plugins/pinecone-example/README.md — worked example of a Pinecone-backed plugin: capability mapping (only embedding, no FTS), wire mapping (memory → vector + metadata), production- hardening checklist Documentation strategy: * Lead with what workspace-server takes care of (security perimeter, redaction, ACL, GLOBAL audit, prompt-injection wrap) so plugin authors don't reimplement those layers * Show three deployment models (same machine / separate container / self-managed) so operators see their topology * Capability table makes it explicit what each capability gates so a plugin that supports only one (e.g. semantic search) is still a useful plugin * Pinecone example is honest: shows the skeleton, the wire mapping, and explicitly calls out what's MISSING from the sketch (batch commits, TTL janitor, circuit breaker, metrics)	2026-05-04 08:17:03 -07:00
Hongming Wang	3195657837	fix: bot-lint nits — drop unused imports, add reason to except Resolves three github-code-quality threads blocking PR-2739 merge: - workspace/tests/test_mcp_cli_multi_workspace.py: remove unused `import os` and `from unittest.mock import patch` (left over from an earlier test draft that mocked at the os.environ layer). - workspace/mcp_cli.py:523: replace bare `pass` in the register_workspace_token ImportError handler with a debug log line + one-line comment explaining the silent-degrade contract (older installs that don't yet ship the helper fall back to the legacy single-token path; single-workspace operators see no behavior change). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:16:12 -07:00
Hongming Wang	7b0bd32957	Memory v2 PR-8: cutover — admin export/import via plugin Builds on merged PR-1..7. Adds the operator-controlled cutover flag that flips admin export/import from the legacy direct-DB path to the v2 plugin path. Activation: MEMORY_V2_CUTOVER=true AND the v2 plugin is wired via WithMemoryV2. Both must be true to take the new path; either being false falls through to the existing legacy SQL code unchanged. What ships: * AdminMemoriesHandler gains plugin + resolver fields, wired via WithMemoryV2 (production) / withMemoryV2APIs (tests) * Export: enumerates workspaces, asks resolver for each one's readable namespaces, searches each via plugin, deduplicates by memory id, applies SAFE-T1201 redaction on emitted content (F1084 parity). Returns the legacy memoryExportEntry shape so existing tooling keeps working. * Import: scope→namespace translation mirrors PR-6 shim. Uses UpsertNamespace + CommitMemory; runs SAFE-T1201 redaction BEFORE the plugin sees the content (F1085 parity). * Helpers: legacyScopeFromNamespace + namespaceKindFromLegacyScope (lifted out so admin_memories doesn't depend on MCP handler helpers). skipImport typed error. Operational rollout (cutover sequencing): 1. Today: MEMORY_V2_CUTOVER unset → legacy DB path. 2. After PR-7 backfill applied + smoke verified: operator sets MEMORY_V2_CUTOVER=true. 3. From that point, admin export/import operate on plugin storage; legacy agent_memories table is read-only for the ~60-day grace window before PR-9 drops it. 
Coverage on new paths: * cutoverActive: 100% * WithMemoryV2 / withMemoryV2APIs: 100% * importViaPlugin: 100% * exportViaPlugin: 97.2% (one defensive scan-error branch in the workspace-list loop) * scopeToWritableNamespaceForImport: 76.9% (resolver-error and no-matching-kind branches exercised end-to-end via Import) * legacyScopeFromNamespace + namespaceKindFromLegacyScope: 100% Edge cases pinned: * Cutover flag matrix (env unset/true/false × wired/unwired) * Export deduplicates memories shared across team (one row per id) * Export tolerates per-workspace failures (resolver / plugin) and keeps going on the rest * Export returns 500 only when the top-level workspace query fails * Empty readable namespaces → empty export (no panic) * Export redacts secrets in plugin path * Import: unknown workspace skipped, unknown scope skipped, plugin upsert/commit errors counted as errors * Import redacts secrets BEFORE plugin sees content * Legacy export/import path unchanged when cutover flag unset	2026-05-04 08:15:10 -07:00
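The activation rule for the cutover (flag set AND plugin wired, otherwise the legacy path) reduces to a two-input check. The function below is a sketch: the name mirrors `cutoverActive` from the coverage list, but the exact string comparison is an assumption.

```python
import os
from typing import Mapping, Optional


def cutover_active(plugin_wired: bool, env: Optional[Mapping[str, str]] = None) -> bool:
    """True only when the plugin is wired AND MEMORY_V2_CUTOVER is 'true'."""
    env = os.environ if env is None else env
    # Assumption: only the literal "true" enables the new path.
    return plugin_wired and env.get("MEMORY_V2_CUTOVER", "") == "true"
```

Walking the flag matrix (env unset/true/false × wired/unwired) gives exactly one combination that takes the plugin path.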
Hongming Wang	6fb9bc9bcd	mcp: regenerate platform_auth signature snapshot for auth_headers(workspace_id=...) PR-1's auth_headers added an optional workspace_id parameter for multi-workspace token routing; the signature drift gate (test_platform_auth_signature_matches_snapshot) caught the change as expected. Snapshot regenerated to capture the new shape — diff is visible in the PR for reviewers + template repos that depend on this surface. Behavior unchanged: auth_headers() with no arg still routes through the legacy resolution path (back-compat exact); the workspace_id arg is opt-in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:11:23 -07:00
Hongming Wang	9cd2c02f14	Merge branch 'staging' into feat/mcp-multi-workspace-pr1	2026-05-04 08:07:34 -07:00
Hongming Wang	9929f73e80	Merge pull request #2738 from Molecule-AI/feat/memory-v2-pr7-backfill Memory v2 PR-7: one-shot backfill CLI (dry-run + apply)	2026-05-04 15:07:14 +00:00
Hongming Wang	829ab66462	mcp: support multi-workspace external-agent registration (PR-1) External MCP agents (e.g. Claude Code installed on a company PC) can now register against MULTIPLE workspaces from a single process — the agent participates as a peer in workspace A (company) AND workspace B (personal) simultaneously, with one merged inbox tagged so replies route to the correct tenant. Use case (verbatim from operator): "I have this computer AI thats in company's PC, he is going to be put in company's workspace, but personally, I want to register it to my own workspace as well, so that I can talk to it and asking him to do work." ## What changed Wire format — new env var: MOLECULE_WORKSPACES='[ {"id":"<company-wsid>","token":"<company-tok>"}, {"id":"<personal-wsid>","token":"<personal-tok>"} ]' When set, mcp_cli iterates the array and spawns one (register + heartbeat + inbox poller) trio per workspace. Single-workspace mode (WORKSPACE_ID + MOLECULE_WORKSPACE_TOKEN) is unchanged — every existing operator's setup keeps working bit-for-bit. Per-workspace token registry (platform_auth.py): register_workspace_token(wsid, tok) — populated by mcp_cli once per workspace before any thread spawns; thread-safe registration + lock-free reads on the hot path. auth_headers(workspace_id=...) routes to the per-workspace token; auth_headers() with no arg uses the legacy resolution path unchanged (back-compat). Per-workspace inbox cursors (inbox.py): InboxState now supports cursor_paths={wsid: Path,...}. Each poller advances its own cursor — one workspace's slow poll can't stall another, and a 410 only resets the affected workspace's cursor. Single-workspace constructor (cursor_path=Path(...)) still works exactly as before via __post_init__ promotion to the empty-string key. Cursor filenames disambiguated by workspace_id[:8] when multi-workspace; single-workspace keeps the legacy filename so upgrade doesn't invalidate on-disk state. 
Arrival workspace tagging (inbox.py): InboxMessage.arrival_workspace_id — tells the agent which OF ITS workspaces the inbound message arrived on. Set by the poller from the cursor key. to_dict() omits the field when empty so single- workspace consumers see no shape change. Reply routing (a2a_tools.py + a2a_mcp_server.py + registry.py): send_message_to_user(workspace_id=...) — optional override that selects which workspace's /notify endpoint to POST to (and which token authenticates). Multi-workspace agents pass the inbound message's arrival_workspace_id; single-workspace agents omit it and route to the only registered workspace via the legacy URL. ## Out of scope (future PRs) - PR-2: cross-workspace delegation auto-routing — when an agent receives a request from personal-ws "delegate to ops-bot" and ops-bot lives in company-ws, the agent should auto-pick its company-ws identity for the outbound delegate_task. Today the agent must pass via_workspace explicitly (or fall through to primary workspace). - PR-3: memory namespacing — commit_memory() still writes to the primary workspace's memory regardless of inbound context. Will revisit when the new memory system (PR #2733 just landed) settles. ## Tests workspace/tests/test_mcp_cli_multi_workspace.py — 24 new tests: * MOLECULE_WORKSPACES JSON parsing (valid + 6 error shapes) * Token registry register / lookup / rotation / clear * auth_headers routing by workspace_id with legacy fallback * Per-workspace cursor save/load/reset isolation * arrival_workspace_id present-when-set, omitted-when-empty * default_cursor_path namespacing All 110 pre-existing tests in test_mcp_cli.py / test_inbox.py / test_platform_auth.py still pass — back-compat is mechanical. Refs: project memory entry "External agent multi-workspace registration", design questions answered 2026-05-04 by user (JSON env var; explicit memory writes deferred to PR-3). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:06:00 -07:00
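Parsing the `MOLECULE_WORKSPACES` wire format shown in this commit might look like the sketch below. The function name and the specific rejected shapes are assumptions; the commit mentions six error shapes without listing them.

```python
import json
from typing import Dict


def parse_workspaces(raw: str) -> Dict[str, str]:
    """Parse the JSON array into {workspace_id: token}; raise ValueError on bad input."""
    entries = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(entries, list) or not entries:
        raise ValueError("MOLECULE_WORKSPACES must be a non-empty JSON array")
    tokens: Dict[str, str] = {}
    for entry in entries:
        if not isinstance(entry, dict):
            raise ValueError("each entry must be an object with id and token")
        wsid, tok = entry.get("id"), entry.get("token")
        if not isinstance(wsid, str) or not wsid or not isinstance(tok, str) or not tok:
            raise ValueError("each entry needs non-empty string id and token")
        if wsid in tokens:
            raise ValueError(f"duplicate workspace id: {wsid}")
        tokens[wsid] = tok
    return tokens
```

The resulting mapping is what a per-workspace token registry would be populated from before any poller thread starts.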
Hongming Wang	3b3e821a60	Merge pull request #2736 from Molecule-AI/feat/memory-v2-pr6-compat-shim Memory v2 PR-6: backward-compat shim — legacy tools route to v2	2026-05-04 15:05:14 +00:00
Hongming Wang	a08eaa6ca2	Merge pull request #2735 from Molecule-AI/auto-sync/main-51e7d946 chore: sync main → staging (auto, ff to `51e7d946`)	2026-05-04 08:04:43 -07:00
Hongming Wang	c5322f318a	Memory v2 PR-7: one-shot backfill CLI (dry-run + apply) Builds on merged PR-1..6. Operator runs this once at cutover to copy agent_memories rows into the v2 plugin's storage. Usage: memory-backfill -dry-run # count + diff, no writes memory-backfill -apply # actually copy memory-backfill -apply -limit=10000 # cap rows per run memory-backfill -apply -workspace=<uuid> # one workspace only Required env: DATABASE_URL + MEMORY_PLUGIN_URL. Translation matches the PR-6 legacy shim: LOCAL → workspace:<workspace_id> TEAM → team:<root_id> (resolved via the same namespace.Resolver the runtime uses) GLOBAL → org:<root_id> Idempotent: each row is keyed by its UUID; re-running the backfill does not duplicate writes (plugin handles deduplication). What ships: * cmd/memory-backfill/main.go: CLI entry, run() driver, backfill() workhorse, mapScopeToNamespace + namespaceKindFromString helpers * main_test.go: 100% on the functional logic (mapScopeToNamespace, namespaceKindFromString, backfill(), all CLI validation paths) Coverage: 80.2% of statements. The 19.8% gap is main()'s body (log.Fatalf — not unit-testable) and run()'s real-DB integration (sql.Open + db.PingContext + new client/resolver — requires a live postgres). Integration coverage for this path lives in PR-11 (E2E plugin-swap test). Edge cases pinned (in functional logic): * Every legacy scope → namespace mapping * Unknown scope → skip with diagnostic, increment skipped counter * Resolver error → propagate, abort run * No-matching-kind in writable list → skip with error message * Plugin UpsertNamespace error → increment errors, continue * Plugin CommitMemory error → increment errors, continue * Query error → propagate, abort * Scan error → increment errors, continue * Mid-iteration row error → propagate, abort * Workspace filter passes through to SQL WHERE clause * Dry-run mode never calls plugin * CLI: rejects both/neither modes, missing env vars, bad flags	2026-05-04 08:04:07 -07:00
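The scope translation listed above (LOCAL, TEAM, GLOBAL to `workspace:`, `team:`, `org:` namespaces) is a small lookup. This sketch borrows the `mapScopeToNamespace` idea but takes the root id as a plain argument, whereas the real tool resolves it through the namespace resolver.

```python
from typing import Optional


def map_scope_to_namespace(scope: str, workspace_id: str, root_id: str) -> Optional[str]:
    """Translate a legacy scope to a v2 namespace; None means 'skip this row'."""
    mapping = {
        "LOCAL": f"workspace:{workspace_id}",
        "TEAM": f"team:{root_id}",
        "GLOBAL": f"org:{root_id}",
    }
    return mapping.get(scope)
```

Returning `None` for an unknown scope corresponds to the "skip with diagnostic, increment skipped counter" edge case.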
Hongming Wang	290e6dfdc3	Memory v2 PR-6: backward-compat shim — legacy tools route to v2 Builds on merged PR-1..5. Adds the bridge that lets legacy commit_memory / recall_memory tools route through the v2 plugin path when MEMORY_PLUGIN_URL is wired, otherwise fall through to the existing DB-backed code unchanged. What ships: * handlers/mcp_tools_memory_legacy_shim.go — translation helpers: scopeToWritableNamespace, scopeToReadableNamespaces, commitMemoryLegacyShim, recallMemoryLegacyShim, namespaceKindToLegacyScope * handlers/mcp_tools.go — toolCommitMemory + toolRecallMemory now delegate to the shim when memv2 is wired Translation: commit: LOCAL → workspace:<self> TEAM → team:<root> (resolver picks at runtime) empty → defaults to LOCAL (preserves legacy default) GLOBAL → still rejected at MCP bridge (C3 preserved) recall: LOCAL → search restricted to workspace:<self> TEAM → workspace:<self> + team:<root> empty → all readable (matches v2 default behavior) GLOBAL → blocked at MCP bridge (C3 preserved) Response shapes are preserved exactly: commit: {"id":"...","scope":"LOCAL"|"TEAM"} — agents see no diff recall: [{"id":"...","content":"...","scope":"LOCAL"|...,"created_at":"..."}, ...] org-namespace memories get the same [MEMORY id=... scope=ORG ns=...] prefix as v2 search; legacy scope label comes back as "GLOBAL" Operational rollout: * Today: MEMORY_PLUGIN_URL unset on most operators → legacy DB path * After PR-7 backfill: operators set MEMORY_PLUGIN_URL → all writes flow through plugin transparently * After PR-8 cutover: dual-write removed, plugin is the only path * After PR-9 (~60 days later): legacy tool entries dropped entirely Coverage: 100% on every helper, 100% on recallMemoryLegacyShim, 94.7% on commitMemoryLegacyShim. The 1 uncovered line is a defensive guard against a v2-response-parse error that's unreachable when the v2 tool is operating correctly (it always returns valid JSON). 
Edge cases pinned: * scope translation for every legacy value + invalid scope * resolver error propagation * plugin error propagation * GLOBAL still blocked * default-scope fallback (LOCAL) * empty content rejected * No-op when v2 unwired (legacy SQL path exercised via sqlmock) * org-namespace memory wrap on recall + GLOBAL scope label round-trip * No-results returns "No memories found." (legacy message preserved)	2026-05-04 08:01:41 -07:00
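The recall-side translation in this commit (LOCAL, TEAM, empty, GLOBAL) can be sketched as follows. The function name echoes `scopeToReadableNamespaces`, but the signature, the exception types, and the error texts are assumptions made for illustration.

```python
from typing import List, Sequence


def scope_to_readable_namespaces(
    scope: str, self_id: str, root_id: str, readable: Sequence[str]
) -> List[str]:
    """Map a legacy recall scope to the namespaces the search is restricted to."""
    if scope == "":
        return list(readable)                 # empty scope: everything readable
    if scope == "LOCAL":
        return [f"workspace:{self_id}"]
    if scope == "TEAM":
        return [f"workspace:{self_id}", f"team:{root_id}"]
    if scope == "GLOBAL":
        # Assumed error type; the commit only says GLOBAL is blocked at the bridge.
        raise PermissionError("GLOBAL recall is blocked at the MCP bridge")
    raise ValueError(f"invalid scope: {scope!r}")
```

The commit-side mapping differs in one respect: there an empty scope defaults to LOCAL rather than widening to all readable namespaces.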
Hongming Wang	f74fff6ae4	Merge pull request #2734 from Molecule-AI/feat/memory-v2-pr5-mcp-tools Memory v2 PR-5: 6 new MCP tools wired through the plugin	2026-05-04 14:53:45 +00:00
Hongming Wang	5bfa4b1d80	Memory v2 PR-5: 6 new MCP tools wired through the plugin Builds on PR-1, PR-2, PR-3, PR-4 (all merged). Adds the agent-facing v2 surface for the memory plugin contract. What ships (all in handlers/mcp_tools_memory_v2.go, no edits to the legacy commit_memory / recall_memory paths): commit_memory_v2 — write to a namespace; default workspace:self search_memory — search across namespaces; default = all readable commit_summary — kind=summary, 30-day default TTL, runtime-overridable list_writable_namespaces — discover what you can write to list_readable_namespaces — discover what you can read from forget_memory — delete by id, only in namespaces you can write to Workspace-server is the security perimeter — every layer the plugin mustn't be trusted with runs here: * SAFE-T1201 redactSecrets BEFORE every plugin write * Server-side ACL re-validation: CanWrite + IntersectReadable run on EVERY request, never trusting client-supplied namespaces (a canvas re-parent between list_writable and commit would otherwise let a stale namespace slip through) * org:* writes audited to activity_logs (SHA256, not plaintext) — matches memories.go:201-221 so the schema stays uniform * Audit failure does NOT block the write (logged + continue) — failing closed would deny org-scope writes whenever activity_logs is unhappy * org:* memories get the [MEMORY id=... scope=ORG ns=...]: prefix on read — preserves the prompt-injection mitigation from memories.go:455-461 Coexistence design: legacy commit_memory + recall_memory still wired to their old code paths in mcp_tools.go. PR-6 will alias them to delegate to these v2 implementations. PR-9 (60 days post-cutover) removes the legacy entries. 
Wiring: * MCPHandler gains a memv2 field (nil-safe; tools return a clear error when MEMORY_PLUGIN_URL is unset rather than crashing) * WithMemoryV2(plugin, resolver) is the production wiring API main.go calls at boot * withMemoryV2APIs(plugin, resolver) is the test-injectable variant against the memoryPluginAPI / namespaceResolverAPI interfaces Coverage: 100.0% on every new function in mcp_tools_memory_v2.go. Edge cases pinned: * empty/whitespace content → reject before plugin * plugin unconfigured → clear error, no crash * ACL violation → clear error * resolver error → wrapped error * plugin error → wrapped error * malformed expires_at → silently ignored (no exception) * org write audit failure → logged, write proceeds * search namespace intersection drops foreign entries * search with all-foreign namespaces → empty result, plugin not called * search org memories get delimiter wrap, workspace memories do not * forget with explicit + default namespace * forget cross-scope rejected * pickStr / pickStringSlice handle missing keys, wrong types, mixed slices * wrapOrgDelimiter format is exact-match * dispatch wires all 6 tools (no "unknown tool" error)	2026-05-04 07:50:26 -07:00
Hongming Wang	51e7d94605	Merge pull request #2724 from Molecule-AI/staging staging → main: auto-promote `3f4c5f8`	2026-05-04 07:50:20 -07:00
Hongming Wang	f2397bf138	Merge pull request #2733 from Molecule-AI/feat/memory-v2-pr3-postgres-plugin Memory v2 PR-3: built-in postgres plugin server + schema migrations	2026-05-04 14:37:24 +00:00
Hongming Wang	ff5f4cbf7c	Memory v2 PR-3: built-in postgres plugin server + schema migrations Builds on merged PR-1 (#2729), independent of PR-2/PR-4. Implements every endpoint of the v1 plugin contract behind an HTTP server (cmd/memory-plugin-postgres/) backed by postgres. Operators run this binary next to workspace-server; it's the default implementation MEMORY_PLUGIN_URL points at. What ships: - cmd/memory-plugin-postgres/main.go: boot, signal-driven shutdown, boot-time migrations, configurable LISTEN/DATABASE/MIGRATION_DIR - cmd/memory-plugin-postgres/migrations/001_memory_v2.up.sql: memory_namespaces (PK on name, kind CHECK, expires_at, metadata) memory_records (FK to namespaces with CASCADE, kind+source CHECK, pgvector embedding, FTS tsvector, ivfflat partial index on embedding, partial index on expires_at) - internal/memory/pgplugin/store.go: storage layer using lib/pq - internal/memory/pgplugin/handlers.go: HTTP layer (no router dep — a switch on URL.Path keeps the binary's dep surface tiny) - 100% statement coverage on store.go + handlers.go Schema notes: - These tables live next to the plugin binary, NOT in workspace- server/migrations/. When operators swap the plugin, these tables become orphaned (operator drops manually). Documented in PR-10. - Search supports semantic (pgvector cosine) → FTS (>=2 char query) → ILIKE (1-char query) → recent-listing (no query), with a TTL filter applied uniformly across all paths. - DELETE on namespace cascades to memory_records (FK ON DELETE CASCADE) — a deleted namespace immediately frees its memories. 
Coverage corner cases pinned: - Health: ok, degraded (db ping fails), no-ping fn - Every CRUD endpoint: happy path, bad name, bad JSON, bad body, not-found, store errors, exec/scan/marshal errors - Search: FTS, semantic, short-query (ILIKE), no-query (recent), kinds filter, store errors, scan errors, mid-iteration row error - Routing edge cases: unknown path, empty namespace, unknown sub, method-not-allowed, GET on /v1/health (allowed), POST on /v1/health (404), GET on /v1/search (404) - Helper internals: marshalMetadata (nil/happy/unmarshalable), nullTime (nil/non-nil), vectorString (empty/format), nullVectorString (empty/non-empty), scanNamespace + scanMemory metadata-decode errors No callers in workspace-server yet; integration starts in PR-5 (MCP handlers wire the plugin client through to MCP tools).	2026-05-04 07:31:56 -07:00
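The search fallback order described in commit ff5f4cbf7c (semantic, then FTS for queries of 2 or more characters, then ILIKE for 1-character queries, then a recent listing when there is no query) can be sketched as a small selector. This is an illustration only: `SearchMode`, `pickSearchMode`, and the `hasEmbedding` flag are hypothetical names, and counting characters as runes is an assumption, since the commit message does not say how the plugin measures query length.

```go
package main

import "unicode/utf8"

// SearchMode names the query strategy. The type and its values are
// hypothetical; the real store.go may represent this differently.
type SearchMode string

const (
	ModeSemantic SearchMode = "semantic" // pgvector cosine similarity
	ModeFTS      SearchMode = "fts"      // full-text search, query of 2+ characters
	ModeILIKE    SearchMode = "ilike"    // substring match, 1-character query
	ModeRecent   SearchMode = "recent"   // no query: list most recent records
)

// pickSearchMode applies the fallback order from the commit message.
// Assumption: "semantic" is chosen whenever the caller supplies an embedding.
func pickSearchMode(query string, hasEmbedding bool) SearchMode {
	n := utf8.RuneCountInString(query) // assumption: length is counted in runes
	switch {
	case hasEmbedding:
		return ModeSemantic
	case n >= 2:
		return ModeFTS
	case n == 1:
		return ModeILIKE
	default:
		return ModeRecent
	}
}
```

Per the commit message, the TTL filter applies uniformly on every path, so in a design like this it would sit outside the selector rather than inside any one branch.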
Hongming Wang	c53b2b104f	Merge pull request #2730 from Molecule-AI/feat/memory-v2-pr4-namespace-resolver Memory v2 PR-4: namespace resolver + tests (stacked on PR-1)	2026-05-04 14:28:22 +00:00
Hongming Wang	01b653d6b0	Memory v2 PR-4: namespace resolver + tests Stacked on PR-1 (#2729). Computes the readable/writable namespace lists for a workspace from the live workspaces tree at request time. No precomputed columns, no migrations — re-parenting on canvas takes effect immediately on the next memory call. What ships: - workspace-server/internal/memory/namespace/resolver.go - walkChain: recursive CTE, walks parent_id chain to root, capped at depth 50 to defend against malformed/cyclic data - derive: maps a chain to (workspace, team, org) namespace strings - ReadableNamespaces / WritableNamespaces: the public API - CanWrite + IntersectReadable: server-side ACL helpers MCP handlers (PR-5) will call before talking to the plugin - resolver_test.go: 100% statement coverage Design choices worth flagging: - Today's tree is depth-1 (root + children). The recursive CTE handles arbitrary depth so we don't have to revisit the resolver when the tree deepens. - GLOBAL→org write restriction (memories.go:167-174) is preserved by gating the org namespace's Writable flag on parent_id IS NULL. - Removed-status workspaces are NOT filtered from the chain walk — matches today's TEAM behavior (memories.go:367-372 filters on read, not on tree walk). - IntersectReadable with empty `requested` returns ALL readable namespaces (default-search-everything semantic from the discovery tools spec). This package has zero callers in this PR; integration starts in PR-5.	2026-05-04 07:25:33 -07:00
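The two ACL helpers named in commit 01b653d6b0 can be sketched over plain string slices. This is a minimal sketch, not the resolver's actual code: the signatures below are assumptions (the real helpers presumably take a workspace ID and query the workspaces tree), and only the empty-`requested` behavior and the helper names come from the commit message.

```go
package main

// IntersectReadable returns the requested namespaces that the caller may
// read, preserving request order. An empty requested list means
// "search everything readable", as described in the commit message.
// The slice-based signature is hypothetical.
func IntersectReadable(readable, requested []string) []string {
	if len(requested) == 0 {
		// Copy so callers cannot mutate the resolver's slice.
		return append([]string(nil), readable...)
	}
	allowed := make(map[string]struct{}, len(readable))
	for _, ns := range readable {
		allowed[ns] = struct{}{}
	}
	out := make([]string, 0, len(requested))
	for _, ns := range requested {
		if _, ok := allowed[ns]; ok {
			out = append(out, ns) // foreign namespaces are silently dropped
		}
	}
	return out
}

// CanWrite reports whether ns is in the caller's writable set.
// The slice-based signature is hypothetical.
func CanWrite(writable []string, ns string) bool {
	for _, w := range writable {
		if w == ns {
			return true
		}
	}
	return false
}
```

For example, with a readable set of `workspace:a`, `team:r`, `org:r`, a request for `workspace:a` and `workspace:zzz` yields only `workspace:a`, while a request with no namespaces yields all three.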
Hongming Wang	f05633f5b0	Merge pull request #2732 from Molecule-AI/fix/canary-timeout-tail-latency ci(canary): bump synth timeout 12→20 min to absorb apt tail latency	2026-05-04 14:04:53 +00:00
Hongming Wang	ff1003e5f6	ci(canary): bump timeout-minutes 12 → 20 to absorb apt tail latency Today's 4 cancelled canaries (25319625186 / 25320942822 / 25321618230 / 25322499952) were all blown by the workflow timeout despite the underlying tenant boot completing successfully (PR molecule-controlplane#455 fix verified — boot events all reach `boot_script_finished/ok`). Why the budget was wrong: The tenant user-data install phase runs apt-get update + install of docker.io / jq / awscli / caddy / amazon-ssm-agent FROM RAW UBUNTU on every tenant boot — none of it is pre-baked into the tenant AMI (EC2_AMI=ami-0ea3c35c5c3284d82, raw Jammy 22.04). Empirical fetch_secrets/ok timing across today's canaries: 51s debug-mm-1777888039 (09:47Z) 82s 25319625186 (12:42Z) 143s 25320942822 (13:11Z) 625s 25322499952 (13:43Z) Same EC2_AMI, same instance type (t3.small), same user-data install sequence — variance is entirely apt-mirror tail latency. A 12-min job budget leaves only ~2 min for the workspace on slow-apt days; the workspace itself needs ~3.5 min for claude-code cold boot, so the budget is structurally too tight whenever apt is slow. 20 min absorbs even the 10+ min boot worst-case and still leaves the workspace its full ~7 min budget. Cap stays well under the runner's 6-hour ubuntu-latest job ceiling. Real fix: pre-bake caddy + ssm-agent into the tenant AMI so the boot phase is no-ops on cached pkgs (will file controlplane#TBD as follow-up — packer/install-base.sh today only bakes the WORKSPACE thin AMI, not the tenant AMI; tenants always boot from raw Ubuntu). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 07:02:12 -07:00
Hongming Wang	d9fb57092c	Merge pull request #2731 from Molecule-AI/feat/memory-v2-pr2-client Memory v2 PR-2: HTTP plugin client + circuit breaker + capability negotiation	2026-05-04 14:00:40 +00:00
Hongming Wang	c1cff3169f	Memory v2 PR-2: HTTP plugin client + breaker + capability negotiation Builds on PR-1 (#2729). Implements every endpoint in the OpenAPI spec plus two operational concerns the agent never sees: 1. Capability negotiation. Boot/Refresh probes /v1/health and captures the plugin's capability list. MCP handlers (PR-5) ask SupportsCapability before exposing capability-gated features — e.g., agents can only request semantic search when "embedding" is reported. 2. Circuit breaker. Three consecutive failures open the breaker for 60 seconds; while open, calls fail fast with ErrBreakerOpen. Picked these constants because: - 3 failures: long enough to skip transient blips, short enough to react before all in-flight handlers stack on the timeout - 60s cooldown: long enough to back off a flapping plugin, short enough that recovery is felt within a single session 4xx responses do NOT count toward the breaker (those are client bugs, not plugin health issues); 5xx + transport errors do. What ships: - workspace-server/internal/memory/client/client.go - client_test.go: 100% statement coverage Coverage corner cases pinned: - env-var success branches in New (parseDurationEnv applied) - json.Marshal error (via channel in Propagation) - http.NewRequestWithContext error (via unbalanced bracket in BaseURL) - 204 NoContent on endpoint that normally has a body - 4xx vs 5xx breaker behavior (4xx must NOT trip) - breaker cooldown elapsed → reset on next success - all 6 public endpoints fail-fast when breaker is open This package has no callers in this PR; integration starts in PR-5.	2026-05-04 06:57:24 -07:00
Hongming Wang	f52de74b7b	Merge pull request #2729 from Molecule-AI/feat/memory-v2-pr1-contract Memory v2 PR-1: OpenAPI plugin contract + Go bindings	2026-05-04 13:51:56 +00:00
Hongming Wang	53d823e719	Memory v2 PR-1: OpenAPI plugin contract + Go bindings First of 11 PRs implementing the memory-system plugin refactor (RFC #2728). This PR is pure additive scaffolding — no behavior change, no integration yet. It defines the wire shape between workspace-server and a memory plugin so PR-2 (HTTP client) and PR-3 (built-in postgres plugin) can be built against a single source of truth. What ships: - docs/api-protocol/memory-plugin-v1.yaml: OpenAPI 3.0.3 spec covering /v1/health, namespace upsert/patch/delete, memory commit, search, forget. Auth-free (private network only); workspace-server is the only sanctioned client and the security perimeter. - workspace-server/internal/memory/contract: typed Go bindings with Validate() methods on every wire object so both client (PR-2) and server (PR-3) self-check at the boundary. - Round-trip JSON tests for every type (catch asymmetric tag bugs). - 5 golden vector files under testdata/ pinning the exact wire shape; update via UPDATE_GOLDENS=1. Coverage: 100% of statements in contract.go. The validation rules encode design decisions worth flagging in review: - SearchRequest with empty Namespaces is REJECTED at plugin level — workspace-server is required to intersect the readable set server-side; an empty list reaching the plugin is a bug. - NamespacePatch with no fields is REJECTED — empty patches are pointless round-trips. - MemoryWrite with whitespace-only Content is REJECTED — zero-info memories pollute search results. No code yet calls into this package; integration starts in PR-2.	2026-05-04 06:45:52 -07:00
Hongming Wang	4511659a9e	Merge pull request #2727 from Molecule-AI/ci/synth-e2e-bump-cadence-to-10min ci: bump continuous-synth-e2e cadence 3→6 fires/hour, clean slots	2026-05-04 12:13:40 +00:00
Hongming Wang	032c011b37	ci: bump continuous-synth-e2e cadence 3→6 fires/hour, all clean slots Change cron from '10,30,50' (3 fires/hour) to '2,12,22,32,42,52' (6 fires/hour). All new slots are 1-3 min away from any other cron, avoiding both the cf-sweep collisions (:15, :45) and the :30 heavy slot (canary-staging /30, sweep-aws-secrets, sweep-stale-e2e-orgs every :15). Why: empirically 2026-05-04 the canary fired only once per hour on the 10,30,50 schedule (see #2726). Bumping fires-per-hour gives more chances to land a survived fire under GH's load- related drop ratio, and keeping all slots in clean lanes minimizes the per-fire drop probability. At empirically-observed ~67% drop ratio, 6 attempts/hour yields ~2 effective fires = ~30 min cadence; closer to the 20-min target than the current shape and provides a real degradation alarm if drops get worse. Cost: ~$0.50/day → ~$1/day. Negligible. Closes #2726. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 05:10:48 -07:00
Hongming Wang	c0997a5703	Merge pull request #2722 from Molecule-AI/auto-sync/main-25cb17c9 chore: sync main → staging (auto, ff to `25cb17c9`)	2026-05-04 10:46:46 +00:00
Hongming Wang	1d3d18fd66	Merge pull request #2725 from Molecule-AI/fix/team-expand-routes-via-auto-dispatcher fix(team): route Expand children through provisionWorkspaceAuto so SaaS gets per-workspace EC2	2026-05-04 10:46:44 +00:00
Hongming Wang	be997883c9	Centralize backend selection in provisionWorkspaceAuto User-reported 2026-05-04: deploying a team org-template ("Design Director" + 6 sub-agents) on a SaaS tenant produced 7-of-7 WORKSPACE_PROVISION_FAILED with the misleading message "container started but never called /registry/register". Diagnose returned "docker client not configured on this workspace-server" and the workspace rows had no instance_id. Root cause: TeamHandler.Expand hardcoded h.wh.provisionWorkspace — the Docker leg of WorkspaceHandler. WorkspaceHandler.Create branched on h.cpProv to pick CP-managed EC2 (SaaS) vs local Docker (self-hosted), but Expand never used that branch. On SaaS the docker goroutine ran but had no socket, so children silently sat in "provisioning" until the 600s sweeper marked them failed. Architectural principle (user): templates own runtime/config/prompts/files/plugins; the platform owns where it runs. Backend selection belongs in one helper. Fix: - Extract WorkspaceHandler.provisionWorkspaceAuto: picks CP when cpProv is set, Docker when only provisioner is set, returns false when neither (caller marks failed). - WorkspaceHandler.Create routes through Auto. - TeamHandler.Expand routes through Auto. Tests pin three invariants: - TestProvisionWorkspaceAuto_NoBackendReturnsFalse — Auto signals fall-through correctly so the caller can persist + mark-failed. - TestProvisionWorkspaceAuto_RoutesToCPWhenSet — when cpProv is wired, Start lands on CP (the user-visible regression target). Discipline-verified: removing the cpProv branch fails this. - TestTeamExpand_UsesAutoNotDirectDockerPath — source-level guard against future refactors reintroducing the hardcoded Docker call. Discipline-verified: reverting team.go fails this with a clear message naming the bug class. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 03:43:41 -07:00
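The selection rule that commit be997883c9 centralizes (CP when cpProv is set, Docker when only the provisioner is set, false when neither) can be sketched like this. The field and method names follow the commit message, but modelling each backend as a plain function and starting it synchronously are simplifying assumptions; the real handler's types and concurrency are not shown in the log.

```go
package main

// WorkspaceHandler is a stripped-down, hypothetical stand-in for the real
// handler. Each backend is modelled as a function that starts a workspace.
type WorkspaceHandler struct {
	cpProv      func(workspaceID string) // control-plane EC2 backend (SaaS); nil when unwired
	provisioner func(workspaceID string) // local Docker backend (self-hosted); nil when unwired
}

// provisionWorkspaceAuto picks the backend in one place so that every
// caller (Create, Expand) gets the same routing. It returns false when
// no backend is configured, so the caller can mark the workspace failed.
func (h *WorkspaceHandler) provisionWorkspaceAuto(workspaceID string) bool {
	switch {
	case h.cpProv != nil:
		h.cpProv(workspaceID) // CP wins whenever it is wired
		return true
	case h.provisioner != nil:
		h.provisioner(workspaceID)
		return true
	default:
		return false
	}
}
```

Keeping the branch in a single helper means a new caller cannot reintroduce the hardcoded Docker path without bypassing the helper, which is the bug class the commit's source-level guard test targets.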
Hongming Wang	3f4c5f8076	Merge pull request #2723 from Molecule-AI/fix/communication-overlay-rate-limit fix(canvas): CommunicationOverlay rate-limit storm — cap fan-out, gate on visibility, slow cadence	2026-05-04 10:22:12 +00:00

4113 Commits