The first version of the config.yaml round-trip gate (PR #2773)
captured curl output with `-w '\n%{http_code}\n'` and parsed it with
`tail -n 2 | head -n 1`. That broke because bash's `$(...)` strips
trailing newlines, leaving only 2 lines in the captured value:
line 1: <response body>
line 2: <status code>
`tail -n 2 | head -n 1` then returned line 1 (the body), not the
status code. The gate misreported 200-with-JSON-body responses as
"PUT returned <body>" and failed the canary post-merge at 22:06 UTC.
Fix: write body to a tempfile via `-o "$PUT_TMP"` and use
`-w '%{http_code}'` as the sole stdout. Status code is now
unambiguously the captured value, body is read separately from the
tempfile. No newline-counting heuristic needed.
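The failure mechanics reproduce without curl. A minimal Python sketch of the old parse, assuming a JSON body with no trailing newline of its own:

```python
def old_parse(captured: str) -> str:
    # Simulate bash command substitution: $(...) strips trailing newlines.
    stripped = captured.rstrip("\n")
    # Simulate `tail -n 2 | head -n 1` on the stripped value.
    lines = stripped.split("\n")
    return lines[-2:][0]

# curl -w '\n%{http_code}\n' emits <body>\n<code>\n; after stripping,
# only two lines remain, so the "status" parsed out is the body.
raw = '{"ok": true}\n200\n'
status = old_parse(raw)   # '{"ok": true}', not '200'
```

With `-o "$PUT_TMP"` plus `-w '%{http_code}'` the captured value is the bare status code and no line arithmetic is involved.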
Verification:
- bash -n clean
- shellcheck clean on the modified block
- Will be exercised by the next continuous-synth-e2e firing
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Today's user-visible bug ("PUT /workspaces/<id>/files/config.yaml: 500
… install: cannot create directory '/opt/configs': Permission denied",
fixed in #2769) shipped to production and was caught only when an
operator opened the Canvas Config tab and clicked Save & Restart on
a claude-code workspace. Two compounding root causes:
1. Path-map fall-through: claude-code wasn't in
workspaceFilePathPrefix, so it fell through to the /opt/configs
default — a path the workspace EC2 doesn't have (cloud-init only
creates /configs).
2. Permission: /configs is root-owned, but the SSH-as-ubuntu install
command had no sudo prefix, so the write would have failed with
EACCES even with the right path.
The synth E2E provisions a fresh workspace every cron firing but
never PUTs a file via the Files API. So neither failure mode could
fail the canary.
Add a new step 7c (between terminal-diagnose and A2A) that:
- PUTs a known marker into config.yaml on each provisioned workspace
- GETs it back and asserts the marker is present
- Fails with an actionable message that names the likely class of
regression (path map vs permission) so the next operator doesn't
have to re-discover today's debugging path
The marker includes the run ID so stale state from a prior canary
can't false-pass.
Why round-trip (not just PUT-and-200): a 200 from PUT only proves the
SSH install succeeded somewhere on disk; the GET-back proves the file
landed at the path the runtime actually reads from (i.e., that the
host:/configs → container:/configs bind-mount sees it). Without the
GET, a future bug that writes to a non-bind-mounted host path would
silently no-op from the runtime's POV but pass the gate.
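A shape the new step could take, sketched in Python with injected `put`/`get` callables — the helper name and message wording are illustrative, not the actual gate code:

```python
def config_roundtrip_gate(put, get, run_id: str) -> None:
    """PUT a run-scoped marker into config.yaml, GET it back, and fail
    with a hint naming the likely regression class."""
    marker = f"canary-marker-{run_id}"
    status = put("config.yaml", f"# {marker}\n")
    if status != 200:
        raise AssertionError(
            f"PUT returned {status}: likely path-map or permission regression")
    if marker not in get("config.yaml"):
        raise AssertionError(
            "PUT returned 200 but marker missing on GET: write likely landed "
            "outside the bind-mounted path the runtime reads from")

# Stubs standing in for the Files API; the run-scoped marker keeps a
# prior canary's state from false-passing.
store = {}
config_roundtrip_gate(
    put=lambda path, body: (store.__setitem__(path, body), 200)[1],
    get=lambda path: store.get(path, ""),
    run_id="run-20260504",
)
```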
Deferred (separate PR, requires AWS-creds wiring): a parallel gate
that runs `aws ec2 describe-instances` against the workspace EC2 and
asserts on the attached IamInstanceProfile.Arn — it would directly
catch the #466 IAM-profile-gap class. Punted because it needs
aws-actions/configure-aws-credentials added to continuous-synth-e2e.yml
plus a read-only IAM role provisioned on the AWS side. Tracked as
task #301.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Self-review of merged PR #2766 (multi-workspace MCP routing) revealed a
silent gap: PR #2766 added the ``source_workspace_id`` parameter to
``tool_commit_memory`` / ``tool_recall_memory`` / ``tool_chat_history``
/ ``tool_get_workspace_info`` AND advertised it in the registry's input
schemas, but the MCP server's dispatch arms in ``a2a_mcp_server.py``
were never updated to forward ``arguments["source_workspace_id"]`` to
those four tools.
Result: the schema lied. The LLM saw ``source_workspace_id`` as a valid
tool parameter, could correctly populate it from the inbound message's
``arrival_workspace_id``, but the dispatcher dropped it on the floor and
every memory commit / recall / chat-history fetch silently fell back to
the module-level ``WORKSPACE_ID``. The cross-tenant leak that PR #2766
was meant to prevent is NOT prevented for these four tools without this
follow-up.
Why the existing dispatcher tests didn't catch it:
the tests asserted return-value strings (``"working" in result``) but
never asserted what arguments the inner tool was called with. So the
dispatcher could ignore any kwarg and the tests would still pass.
Fix:
1. Wire ``source_workspace_id=arguments.get("source_workspace_id") or None``
into the four dispatch arms, mirroring the pattern already used for
``delegate_task`` / ``delegate_task_async`` / ``check_task_status`` /
``list_peers``.
2. Add five tests in ``test_a2a_mcp_server.py`` that assert the inner
tool was awaited with the exact source_workspace_id kwarg
(``assert_awaited_once_with(..., source_workspace_id="ws-X")``) —
substring-on-result tests can't catch this class of bug.
3. Add a fallback test ensuring single-workspace operators (no
source_workspace_id key) get ``source_workspace_id=None`` — pinning
the documented None contract over an accidental empty-string forward.
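The class of test that catches this, sketched with unittest.mock — the dispatcher and tool here are simplified stand-ins for the real ``a2a_mcp_server`` code:

```python
import asyncio
from unittest.mock import AsyncMock

async def dispatch(tool, name, arguments):
    # The fixed arm: forward the optional kwarg instead of dropping it.
    if name == "tool_commit_memory":
        return await tool(
            arguments["content"],
            source_workspace_id=arguments.get("source_workspace_id") or None,
        )

tool = AsyncMock(return_value="committed")
asyncio.run(dispatch(tool, "tool_commit_memory",
                     {"content": "note", "source_workspace_id": "ws-X"}))
# A substring check on the return value passes even when the kwarg is
# dropped; asserting the awaited call signature is what pins the wiring.
tool.assert_awaited_once_with("note", source_workspace_id="ws-X")
```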
Suite: 1705 passed (was 1700 + 5 new), 3 skipped, 2 xfailed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Root cause of the user-visible 500 ("install: cannot create directory
'/opt/configs': Permission denied") on PUT
/workspaces/<id>/files/config.yaml:
1. Path map fall-through. claude-code wasn't in workspaceFilePathPrefix,
so resolveWorkspaceFilePath returned the default `/opt/configs/...`.
That directory doesn't exist on the workspace EC2 — cloud-init in
provisioner/userdata_containerized.go runs `mkdir -p /configs` only.
Even if the SSH write had succeeded at /opt/configs, the docker
container's bind-mount is host:/configs → container:/configs,
so the file would have been invisible to the runtime.
2. /configs ownership. cloud-init runs as root, so /configs is
root-owned. The SSH-as-ubuntu install command can't write into it
without sudo. Hermes wasn't affected because its base path
(/home/ubuntu/.hermes) is ubuntu-owned.
Two-line fix:
- Add `claude-code: /configs` to the runtime → base-path map and flip
the default fall-through from `/opt/configs` to `/configs`. Leave the
pre-existing langgraph/external entries pointing at /opt/configs
pending a migration audit (no user report on those today, and
flipping them would silently relocate any files those runtimes
already wrote).
- Prefix the remote install command with `sudo -n` so the write
succeeds under the standard EC2 ubuntu/passwordless-sudo posture.
`-n` (non-interactive) ensures clean failure if that ever changes,
rather than a hang waiting for a password prompt.
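The resolution logic after the fix, mirrored as a Python sketch — the real code is Go in resolveWorkspaceFilePath, and the map contents/flags here are illustrative:

```python
# Runtime → base-path map after the fix; langgraph/external stay on
# /opt/configs pending the migration audit.
WORKSPACE_FILE_PATH_PREFIX = {
    "claude-code": "/configs",
    "langgraph": "/opt/configs",
    "external": "/opt/configs",
}

def resolve_base_path(runtime: str) -> str:
    # Default fall-through flipped from /opt/configs to /configs.
    return WORKSPACE_FILE_PATH_PREFIX.get(runtime.lower(), "/configs")

def install_command(runtime: str, filename: str) -> str:
    # sudo -n fails cleanly if passwordless sudo ever goes away,
    # instead of hanging on a password prompt.
    dest = f"{resolve_base_path(runtime)}/{filename}"
    return f"sudo -n install -m 0644 /tmp/upload {dest}"
```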
Tests:
- TestResolveWorkspaceFilePath_KnownRuntimes adds claude-code +
CLAUDE-CODE coverage and updates the empty/unknown default cases
to expect /configs. The langgraph/external rows stay green
(unchanged values), confirming the scope of the rename.
Verification:
- go build ./... clean
- go test ./internal/handlers/ green
- The user-reported bug
(PUT /workspaces/57fb7043-79a0-4a53-ae4a-efb39deb457f/files/config.yaml
→ 500 EACCES on /opt/configs) is the failure mode this fix addresses
on both axes (path + sudo).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR-3 of the multi-workspace MCP rollout. PR-1 made the MCP server itself
multi-workspace aware (one process, N workspace memberships). PR-2 added
source_workspace_id threading to delegate_task / list_peers. This change
closes the remaining workspace-scoped tools so a single agent registered
into multiple workspaces no longer leaks memories or chat history across
tenants.
Tools now accepting `source_workspace_id`:
- tool_commit_memory(content, scope, source_workspace_id=None) —
routes POST to /workspaces/{src}/memories with the source workspace's
Bearer token. Body still embeds source_workspace_id for the platform's
audit + namespace-isolation enforcement.
- tool_recall_memory(query, scope, source_workspace_id=None) —
GET /workspaces/{src}/memories with the source workspace's token and
?workspace_id={src} query so the platform scopes the read to the
caller's tenant view (PR-1 / multi-workspace mode).
- tool_chat_history(peer_id, limit, before_ts, source_workspace_id=None)
— auto-routes via the _peer_to_source cache populated by list_peers,
with explicit override winning. Falls back to module-level WORKSPACE_ID
if neither is available. URL: /workspaces/{src}/chat-history.
- tool_get_workspace_info(source_workspace_id=None) — GET /workspaces/{src}
with the source workspace's token. Useful for introspecting any
workspace the agent is registered into, not just the primary.
In every path, `src = source_workspace_id or WORKSPACE_ID`, so
single-workspace operators see no behavior change. Tokens are resolved
per-workspace via auth_headers(src) / _auth_headers_for_heartbeat(src),
which fall through to the legacy AUTH_TOKEN env when not in
multi-workspace mode.
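The routing pattern shared by all four tools, as a Python sketch — the token table and function names are illustrative; the real module resolves tokens via auth_headers(src):

```python
import os

WORKSPACE_ID = "ws-primary"                             # module-level default
TOKENS = {"ws-primary": "tok-a", "ws-other": "tok-b"}   # illustrative

def auth_headers(workspace_id: str) -> dict:
    # Per-workspace token, falling through to the legacy AUTH_TOKEN env
    # when not running in multi-workspace mode.
    token = TOKENS.get(workspace_id) or os.environ.get("AUTH_TOKEN", "")
    return {"Authorization": f"Bearer {token}"}

def memories_url_and_headers(source_workspace_id=None):
    # Shared pattern: src = source_workspace_id or WORKSPACE_ID, so
    # single-workspace callers see no behavior change.
    src = source_workspace_id or WORKSPACE_ID
    return f"/workspaces/{src}/memories", auth_headers(src)

url, headers = memories_url_and_headers("ws-other")
```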
Also updates input_schemas in platform_tools/registry.py so the new
optional parameter is advertised to LLM clients (claude-code,
hermes-agent, langchain wrappers).
Tests (4 new classes in test_a2a_multi_workspace.py, 21 new tests):
- TestCommitMemorySourceRouting — URL + Authorization header per source
- TestRecallMemorySourceRouting — URL + query param + Authorization
- TestChatHistorySourceRouting — peer-cache auto-route + explicit override
- TestGetWorkspaceInfoSourceRouting — URL + Authorization
Inbox tools (peek/pop/wait_for_message) already multi-workspace aware
since PR-1 — inbox.py spawns per-workspace pollers and tags every
InboxMessage with arrival_workspace_id. No further plumbing needed.
Suite: 1700 passed, 3 skipped, 2 xfailed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #2756 added a try/except around adapter.setup() so a missing LLM key
doesn't crash the workspace boot. Two paths that now run AFTER setup
succeeds were not similarly isolated, leaving small but real coupling
risks for future adapter authors.
1. **Skill metadata enrichment swap (main.py:248-259).** When
adapter.setup() returns, main.py reads adapter.loaded_skills and
replaces the static stubs in agent_card.skills with rich metadata
(description, tags, examples). The list comprehension assumes each
element exposes .metadata.{id,name,description,tags,examples}. A
future adapter that returns a non-canonical shape would raise
AttributeError, propagate to the outer except, capture as
adapter_error, and silently degrade an OK boot to the
not-configured state — even though setup() actually succeeded.
Extract to card_helpers.enrich_card_skills(card, loaded_skills) →
bool. Helper swallows enrichment failures, logs the cause, returns
False, leaves the static stubs in place. setup() success path
continues unchanged. 6 unit tests cover: None input, empty list,
canonical happy path, missing .metadata attr, partial .metadata
(missing one canonical field), atomic-failure-no-partial-swap.
2. **/transcript handler (main.py:513).** Calls await
adapter.transcript_lines(...) without try/except. BaseAdapter's
default returns {"supported": false} so today's 4 adapters never
trigger this — but a future adapter override that assumes setup()
ran would surface as a 500 from Starlette's default error handler
instead of a useful 503 with the exception class + message.
Inline try/except returns 503 with the reason, matching the
not-configured JSON-RPC handler's pattern.
Both changes match the architectural principle the PR #2756 chain
established: availability (workspace reachable) is decoupled from
configuration / adapter behavior. Operators see useful errors instead
of silent degradation; future adapter authors can't accidentally
break tenant readiness with a shape mismatch.
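A minimal sketch of the enrichment helper's contract — the real card_helpers.py works on AgentCard objects; plain dicts stand in here:

```python
import logging

log = logging.getLogger("card_helpers")

def enrich_card_skills(card: dict, loaded_skills) -> bool:
    """Swap rich skill metadata into the card. On any shape mismatch,
    log, return False, and leave the static stubs in place."""
    if not loaded_skills:
        return False
    try:
        enriched = [
            {
                "id": s.metadata.id,
                "name": s.metadata.name,
                "description": s.metadata.description,
                "tags": s.metadata.tags,
                "examples": s.metadata.examples,
            }
            for s in loaded_skills
        ]
    except AttributeError as exc:
        # Atomic failure: no partial swap ever reaches the card.
        log.warning("skill enrichment skipped: %s", exc)
        return False
    card["skills"] = enriched
    return True
```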
Adds:
- workspace/card_helpers.py (~50 lines, 100% covered)
- workspace/tests/test_card_helpers.py (6 tests)
- AgentCard/AgentSkill/AgentCapabilities/AgentInterface stubs to
workspace/tests/conftest.py so future card-related tests work
under the existing a2a-mock infrastructure
- card_helpers in TOP_LEVEL_MODULES (drift gate would have caught it)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Preflight was hard-failing the workspace boot when required env vars or
legacy auth_token_files were missing, raising SystemExit(1) before
main.py's PR #2756 try/except could mount the not-configured handler.
Result: codex/openclaw workspaces launched without OPENAI_API_KEY were
INVISIBLE — `/.well-known/agent-card.json` never returned 200, the bench
timed out at 600s, canvas had no actionable signal. PR #2756 fixed half
the puzzle (decouple agent-card from adapter.setup() failure); this
fixes the other half (decouple from preflight failure).
Caught by bench-provision-time run 25335853189 on 2026-05-04: codex and
openclaw both timed_out at 609s while claude-code (whose default model
needs no env) hit 86.7s on the same AMI. Hermes hit 147s because hermes
config doesn't declare top-level required_env.
After this change:
- Missing required_env: WARN (operator sees it in boot logs); workspace
proceeds to adapter.setup() which raises with the same env-name detail;
PR #2756's try/except mounts the not-configured handler;
/.well-known/agent-card.json serves 200; JSON-RPC POST / returns
-32603 "agent not configured" with the env-name in `error.data`.
- Missing auth_token_file (legacy path): same treatment.
- Other preflight failures (runtime adapter not installable, invalid
A2A port) STAY as fails — those are structural, the workspace truly
can't run.
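The warn-vs-fail split can be summarized in a few lines — a sketch, with Report fields and check names illustrative rather than the actual preflight types:

```python
from dataclasses import dataclass, field

# Failure classes downgraded to warnings: the workspace still boots far
# enough to serve the agent-card and a useful not-configured error.
SOFT_FAILURES = {"required_env", "auth_token_file"}

@dataclass
class Report:
    ok: bool = True
    warnings: list = field(default_factory=list)
    failures: list = field(default_factory=list)

def classify(report: Report, check: str, detail: str) -> None:
    if check in SOFT_FAILURES:
        report.warnings.append(f"{check}: {detail}")  # WARN, boot proceeds
    else:
        report.ok = False                             # structural: hard fail
        report.failures.append(f"{check}: {detail}")
```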
Updated 4 existing tests that asserted `report.ok is False` on
required_env / auth_token misses to assert `report.ok is True` and
check `report.warnings` instead. All 31 preflight tests pass; full
suite 1664 pass + 1 unrelated flake on staging.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Self-review of #2755 found two tests that didn't actually exercise the
production code path:
- TestNamespaceCleanupFn_NamespaceFormat asserted
"workspace:" + "abc-123" == "workspace:abc-123" — a compile-time
invariant, not runtime behavior. Provided no protection if the closure
in Bundle.NamespaceCleanupFn ever stopped using that prefix.
- TestNamespaceCleanupFn_FailureLogsButReturns built a *parallel*
cleanup closure inline with errors.New, then invoked the parallel
closure. The production closure was never exercised. A regression
in NamespaceCleanupFn (e.g. forgetting the deferred recover, calling
the plugin without nil-check) would still pass this test.
Replaced both with real integration:
- TestNamespaceCleanupFn_HitsPluginAtCorrectNamespace spins up
httptest.Server, points MEMORY_PLUGIN_URL at it, calls Build(),
invokes the production closure, and asserts the server actually
saw DELETE /v1/namespaces/workspace:abc-123.
- TestNamespaceCleanupFn_PluginErrorDoesNotPanic exercises the
failure path for real: server returns 500 on DELETE, closure must
log and return without propagating. defer-recover is belt-and-
suspenders since production calls this from a for-loop in
workspace_crud.go that has no recover.
Couldn't ship with #2755 because the merge queue locks the branch
once enqueued. Following up now that #2755 is merged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The wheel-build drift gate caught the new module added in this PR —
without registering it, the published wheel would ship `import
not_configured_handler` un-rewritten, which would raise
ModuleNotFoundError at runtime under `molecule_runtime.main`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Today, if `adapter.setup()` raises (most often: an LLM credential is
missing/rotated), main.py crashes before the agent-card route is mounted.
start.sh restart-loops, /.well-known/agent-card.json never returns 200,
and the workspace is invisible to the bench/canvas — operators see
"stuck booting forever" with no clear error to act on.
The agent-card is a static capability advertisement (name, version,
skills, supported protocols). It doesn't need a working LLM. Coupling
its mount to setup() conflates *availability* ("am I up?") with
*configuration* ("can I actually answer?"). They're different concerns.
This change:
- Builds AgentCard from `config.skills` (static names from config.yaml)
BEFORE adapter.setup(), so the route mounts independent of setup state.
- Wraps setup() + create_executor in try/except. On success, mounts
the real DefaultRequestHandler with rich loaded_skills metadata
swapped into the card in-place. On failure, mounts a JSON-RPC
handler that returns -32603 "agent not configured" with the
setup() exception in error.data.
- Heartbeat keeps running on misconfigured boots so the platform
marks the workspace as reachable-but-misconfigured rather than
crash-looping. Operators redeploy with corrected env without
chasing a restart loop.
- initial_prompt and idle_loop are skipped on misconfigured boots —
they self-fire to /, which would land in -32603 anyway, and the
marker would consume on the first useless attempt.
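The not-configured envelope, sketched as a plain function — the real module is a Starlette handler factory, so the request/response plumbing here is simplified:

```python
import json

def make_not_configured_handler(reason: str):
    def handle(raw_body: bytes) -> dict:
        # Echo the request id when the body parses as a single request;
        # fall back to None on malformed or batch bodies, never raise.
        req_id = None
        try:
            parsed = json.loads(raw_body)
            if isinstance(parsed, dict):
                req_id = parsed.get("id")
        except (ValueError, TypeError):
            pass
        return {
            "jsonrpc": "2.0",
            "id": req_id,
            "error": {
                "code": -32603,
                "message": "agent not configured",
                "data": {"reason": reason},
            },
        }
    return handle

handler = make_not_configured_handler("OPENAI_API_KEY is not set")
resp = handler(b'{"jsonrpc": "2.0", "id": 7, "method": "message/send"}')
```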
Bench impact (RFC #388 strict <120s): the codex/openclaw bench
timeouts were the agent-card-never-returns-200 symptom. With this fix those
runtimes serve the card immediately on EC2 boot, so the bench
measures infrastructure cold-start (claude-code class: ~50–80s)
instead of credential-coupled boot.
Adds workspace/not_configured_handler.py (factory + module-level so
behavior is unit-testable; main.py is `# pragma: no cover`) and
workspace/tests/test_not_configured_handler.py (6 tests covering
status code, JSON-RPC envelope shape, id-echo, malformed-body
fallback, reason surfacing, batch-body safety).
All 1665 existing workspace tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Caught during continued review: the entire v2 plugin system shipped
in PRs #2729-#2742 + #2744-#2751 was never actually invoked because
main.go and router.go don't construct the plugin client/resolver or
attach the WithMemoryV2 / WithNamespaceCleanup hooks.
Operators setting MEMORY_PLUGIN_URL=... saw zero behavior change
because nothing read it. Every fixup we shipped (idempotency, verify
mode, expires_at validation, audit JSON, namespace cleanup, O(N)
export, boot E2E) was also dormant for the same reason.
Root cause: when a multi-handler feature lands across many PRs, none
of them are individually responsible for wiring main.go — and the
master-task-tracking issue didn't gate-check that the wiring landed.
Add main.go integration to every multi-handler RFC checklist.
What ships:
* internal/memory/wiring/wiring.go: new package that constructs the
plugin client + resolver from MEMORY_PLUGIN_URL once. Returns nil
when unset (preserves zero-config legacy behavior). Probes
/v1/health at boot but doesn't fail-closed — the MCP layer's
circuit breaker handles ongoing unavailability.
* internal/memory/wiring/wiring_test.go: 6 tests covering the
nil/non-nil bundle paths + the namespace-cleanup closure
contract (nil-safe, format-stable, failure-tolerant).
* cmd/server/main.go: imports memwiring, calls Build(db.DB) once
after WorkspaceHandler creation, attaches WithNamespaceCleanup,
threads the bundle through router.Setup.
* internal/router/router.go: Setup signature gains *memwiring.Bundle
param. Inside, attaches WithMemoryV2 to AdminMemoriesHandler and
MCPHandler when the bundle is non-nil.
After this, the v2 plugin is reachable end-to-end:
Operator sets MEMORY_PLUGIN_URL → main.Build instantiates client +
resolver → WorkspaceHandler gets cleanup hook → router wires
AdminMemoriesHandler + MCPHandler with WithMemoryV2 → MCP tool
calls (commit_memory_v2, search_memory, etc.) actually do
something → admin export/import respects MEMORY_V2_CUTOVER.
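The construct-once, nil-when-unset shape of Build, mirrored as a Python sketch (the real wiring is Go; the Bundle fields and probe signature are illustrative):

```python
import os
from dataclasses import dataclass

@dataclass
class Bundle:
    plugin_url: str
    healthy: bool

def build(probe_health):
    url = os.environ.get("MEMORY_PLUGIN_URL")
    if not url:
        return None  # unset: preserve zero-config legacy behavior
    # Probe health at boot but don't fail-closed; ongoing unavailability
    # is the circuit breaker's job, not boot's.
    try:
        healthy = probe_health(url + "/v1/health")
    except Exception:
        healthy = False
    return Bundle(plugin_url=url, healthy=healthy)
```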
Prerequisite for #292 (staging verification) — without this, the
operator runbook's step 2 (set MEMORY_PLUGIN_URL, observe behavior)
silently no-ops.
Verified: all 9 affected test packages still green
(memory/{client,contract,e2e,namespace,pgplugin,wiring}, handlers,
router, plus the build).
ReadableNamespaces(rootID) returns {workspace:rootID, team:rootID,
org:rootID} — the only workspace: namespace it surfaces is the root's
own. The I3 batching change resolved namespaces once per root, which
silently dropped every child workspace's private memories from admin
export (workspace:childID never reached the plugin search).
Keep the per-root batching win for team:/org:/custom: namespaces;
inject each member's workspace:<id> + owner mapping explicitly so
coverage matches the legacy per-workspace iteration.
Cost stays at 1 SQL + N_roots resolver + 1 plugin search.
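The corrected namespace-set construction, sketched in Python (the real export path is Go; function and field names are illustrative):

```python
def export_namespaces(members_by_root, readable_namespaces):
    """Per-root batching for shared namespaces, plus every member's
    private workspace:<id> injected explicitly."""
    namespaces, owner = set(), {}
    for root_id, member_ids in members_by_root.items():
        # One resolver call per root covers the shared scopes.
        for ns in readable_namespaces(root_id):
            namespaces.add(ns)
            owner.setdefault(ns, member_ids[0])
        # The fix: the resolver only surfaces workspace:<root>, so each
        # member's private namespace must be added by hand.
        for member_id in member_ids:
            ns = f"workspace:{member_id}"
            namespaces.add(ns)
            owner[ns] = member_id
    return namespaces, owner
```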
Test changes:
- New TestExport_IncludesEveryMembersPrivateNamespace uses a
per-workspace resolver stub (mirrors real behaviour) and asserts
every member's workspace:<id> reaches the plugin search AND that
children's private memories appear in the response with correct
owner attribution. Verified to FAIL on the pre-fix code.
- TestExport_BatchesPluginCallsByRoot updated to expect 5 namespaces
(3 workspace + team + org) instead of 3 — it had pinned the buggy
3-namespace behaviour.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Companion to boot_e2e_test.go (just merged). Documents:
- When the E2E suite runs (build tag + env var)
- Local run with docker postgres
- CI integration example (label-gated workflow step)
- What each test pins
- Explicit gap list (migration drift, recovery, TTL)
Self-review #293. PR-11's E2E test uses sqlmock + httptest —
integration, not E2E. This adds the actual real-subprocess test:
build the binary with `go build`, start it pointing at real postgres,
drive HTTP via the real client.
What in-process tests miss that this catches:
- Binary build / boot-path panics (env var typos, mixed-key
interface bugs that only surface when start() runs)
- Wire encoding bugs that sqlmock smooths over (the pq.Array
regression from PR-3 development would have been caught here)
- HTTP+TCP-socket edge cases
- Real upsert behavior under postgres ON CONFLICT (C1 fix)
Build-tag gated so default CI doesn't require docker:
go test -tags memory_plugin_e2e -v ./cmd/memory-plugin-postgres/
Tests skip silently when MEMORY_PLUGIN_E2E_DB is unset.
Three tests:
1. TestE2E_BootAndHealth — capabilities advertised correctly
2. TestE2E_FullCommitSearchForgetRoundTrip — full agent flow
3. TestE2E_IdempotencyKey — C1 upsert against real postgres
Plus E2E.md operator runbook with docker quickstart + CI integration
example + explicit statement of what's still uncovered (migration
drift, recovery scenarios, TTL eviction over real time).
Self-review #291. When a workspace is hard-purged, its
`workspace:<id>` namespace stays in the plugin storage. Over time
deleted workspaces accumulate as orphan namespaces.
Fix: optional namespaceCleanupFn hook on WorkspaceHandler. The
purge path (workspace_crud.go ~line 520) iterates each purged id
and calls the hook best-effort. main.go wires the hook to
plugin.DeleteNamespace when MEMORY_PLUGIN_URL is set; operators
who haven't enabled the plugin keep the no-op default.
Why a hook (not direct plugin import):
* Keeps WorkspaceHandler decoupled from the memory contract
package (easier to test, smaller blast radius if the contract
bumps)
* Tests inject a captureCleanupHook stub without standing up a
real plugin client
* Production wiring stays a one-liner in main.go
What gets cleaned up:
* `workspace:<id>` for each purged workspace
* NOT `team:<root>` / `org:<root>` — those may still be
referenced by other workspaces under the same root, so dropping
them on a single workspace's purge would orphan team/org data
for the survivors. Operator can purge those manually after
confirming the entire root is gone.
What stays untouched:
* Soft-removed workspaces (status='removed', no ?purge=true). The
grace window is by design — the data should still be there if
the operator unremoves.
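The hook pattern in the purge path, sketched in Python (the real code is Go in workspace_crud.go; names are illustrative):

```python
import logging

log = logging.getLogger("purge")

class WorkspaceHandler:
    def __init__(self, namespace_cleanup_fn=None):
        # Default None: operators without the plugin keep a no-op.
        self.namespace_cleanup_fn = namespace_cleanup_fn

    def purge(self, workspace_ids):
        for wid in workspace_ids:
            # ... hard-delete rows for wid ...
            if self.namespace_cleanup_fn is None:
                continue  # nil guard
            try:
                # Best-effort: only the private namespace, never team:/org:.
                self.namespace_cleanup_fn(f"workspace:{wid}")
            except Exception as exc:
                log.warning("namespace cleanup failed for %s: %s", wid, exc)
```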
Tests:
* TestWithNamespaceCleanup_DefaultIsNil pins the safe default
* TestWithNamespaceCleanup_NilStaysNil pins the explicit-nil case
* TestWithNamespaceCleanup_AttachesFn pins the wiring
* TestPurge_CallsCleanupHookPerID exercises the per-id loop body
* TestPurge_NilHookIsSkipped pins the nil guard
A full end-to-end Delete-handler test requires mocking broadcaster
+ provisioner + descendant SQL chain, which is out-of-scope for a
single fixup. Integration coverage for the wired path lives in
PR-11's E2E swap test (#293 follow-up).
Self-review #289. The previous exportViaPlugin ran one resolver CTE
walk + one plugin search PER WORKSPACE. For a 1000-workspace tenant
that's 1000× of each, mostly redundant — workspaces sharing a
team/org root see identical readable namespaces.
New strategy:
1. Single SQL pass returns each workspace + its computed root_id
via a recursive CTE (loadWorkspacesWithRoots).
2. Group by root → unique tree count is typically << workspace
count.
3. Resolver runs ONCE per root (any member sees the same readable
list).
4. Build the union of all root namespaces; single plugin.Search
call.
5. Map each memory back to a workspace_name via pickOwnerForNamespace
(workspace:<id> → matching member; team:* / org:* / custom:* →
canonical first member of root group).
Net call cost: 1 SQL + N_roots resolver + 1 plugin call (vs
N_workspaces × resolver + N_workspaces × plugin in the old code).
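The step-5 attribution rules, sketched in Python (pickOwnerForNamespace is Go; this mirror covers the same cases with illustrative names):

```python
def pick_owner_for_namespace(ns: str, root_members: list) -> str:
    """Map a memory's namespace back to a workspace for attribution."""
    # workspace:<id> attributes to the matching member when present.
    if ns.startswith("workspace:"):
        wid = ns.split(":", 1)[1]
        if wid in root_members:
            return wid
    # team:/org:/custom: — and a workspace:<id> with no matching member —
    # fall back to the canonical first member of the root group; an
    # empty group yields "".
    return root_members[0] if root_members else ""
```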
Tests:
* TestExport_BatchesPluginCallsByRoot pins the new behavior
explicitly: 3 workspaces under 1 root → exactly 1 plugin search
(was 3 with the old code).
* TestPickOwnerForNamespace covers all five attribution cases:
workspace:<id> match, workspace:<id> no-match-fallback, team:*,
org:*, custom:* → first-member-of-root-group; plus empty-members
fallback.
* All 9 existing TestExport_* / TestImport_* / TestPickOwner /
TestNamespaceKindFromLegacyScope / TestSkipImport / etc. tests
remain green (verified with -run "Export").
The legacy DB path (when MEMORY_V2_CUTOVER unset) is unchanged.
Updates plugin-author and operator docs to reflect the four fixup
PRs (C1, C2, I1, I4) for self-review findings.
Stacked on C1+C2 so the docs reference behavior that lands in the
same wave; rebases to staging once those merge.
What changes:
* docs/memory-plugins/README.md
- New "Memory idempotency" section explaining MemoryWrite.id
contract: omit → plugin generates UUID; supplied → upsert
- "Replacing the built-in plugin" rewritten as a 6-step
operator runbook with concrete commands for -dry-run / -apply
/ -verify / MEMORY_V2_CUTOVER, including the failure path
("if -verify reports mismatches, do not flip the cutover flag")
- Added link to new CHANGELOG.md
* docs/memory-plugins/testing-your-plugin.md
- New TestMyPlugin_IDIsIdempotencyKey example: write same id
twice, assert single row + updated content
- "What the harness does NOT cover" expanded with two new
operational gates: backfill twice → no double; verify-mode
reports zero mismatches
* docs/memory-plugins/pinecone-example/README.md
- Wire-mapping table updated: id (caller-supplied) → Pinecone
vector id (upsert); id (omitted) → plugin-generated UUID
- Production-hardening checklist gained an idempotency-key item
* docs/memory-plugins/CHANGELOG.md (new)
- Captures the four fixup PRs in one place with severity-ordered
summary, plugin-author action items, and remaining open
follow-ups (#289, #291, #293) for transparency
No code changes. Docs-only PR.
Self-review missed deliverable from PR-7's task spec. Operators had
no way to confirm a -apply produced equivalent search results to the
legacy agent_memories direct queries; this PR ships that.
Usage:
memory-backfill -verify # 50-workspace random sample
memory-backfill -verify -verify-sample=200 # bigger sample
memory-backfill -verify -workspace=<uuid> # one specific workspace
Algorithm:
1. Pick N random workspaces (or use -workspace if specified)
2. For each: query agent_memories direct, query plugin search via
the workspace's readable namespace list
3. Multiset-compare contents: every legacy row must have a matching
plugin row. Plugin having MORE rows is OK (team-shared content
may be visible from sibling workspaces).
4. Print mismatches with content excerpt; non-zero mismatches/errors
yields a non-zero exit so CI can gate cutover.
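Step 3's multiset comparison can be sketched with collections.Counter — legacy must be a sub-multiset of the plugin results, and only the missing rows count as mismatches (function name and excerpt length are illustrative):

```python
from collections import Counter

def parity_mismatches(legacy_contents, plugin_contents):
    """Every legacy row needs a matching plugin row; extra plugin rows
    are fine (team-shared content visible from sibling workspaces)."""
    missing = Counter(legacy_contents) - Counter(plugin_contents)
    # Expand back to (content, excerpt) pairs for the mismatch report.
    return [(c, c[:60]) for c, n in missing.items() for _ in range(n)]

assert parity_mismatches(["a", "b"], ["a", "b", "shared"]) == []
```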
SQL:
- Sampling uses ORDER BY random() LIMIT N (TABLESAMPLE has surprising
distribution at small populations).
- Filters out status='removed' workspaces.
Test coverage:
* pickWorkspaceSample: single-ws short-circuit, random sampling,
query error, scan error
* queryLegacyMemories: happy path, error path
* verifyParity:
- all match → 1 match, 0 mismatch
- missing-from-plugin → 1 mismatch with content excerpt
- plugin-extra rows → 1 match (legacy is subset of plugin)
- legacy query error → 1 error counter
- resolver error → 1 error counter
- plugin search error → 1 error counter
- no readable namespaces + empty legacy → match
- no readable namespaces + non-empty legacy → mismatch
- pickSample error → propagated up
* CLI: -verify+-apply rejected as mutually exclusive; -verify alone
is a valid mode
Note: namespaceResolverAdapter bridges *namespace.Resolver to the
verify package's verifyResolver interface so verify.go has zero
dependency on the namespace package — keeps test stubs minimal.