molecule-core

Author	SHA1	Message	Date
Hongming Wang	28472f0d2d	Merge pull request #2764 from Molecule-AI/auto-sync/main-f42feb4e chore: sync main → staging (auto, ff to `f42feb4e`)	2026-05-04 19:51:06 +00:00
molecule-ai[bot]	f42feb4ed7	Merge pull request #2763 from Molecule-AI/staging staging → main: auto-promote `99e7f13`	2026-05-04 19:35:21 +00:00
Hongming Wang	99e7f13149	Merge pull request #2762 from Molecule-AI/fix/preflight-env-warn-not-fail fix(preflight): downgrade required_env + auth_token failures to warnings	2026-05-04 19:23:06 +00:00
Hongming Wang	6488ba09e7	fix(preflight): downgrade required_env + auth_token failures to warnings Preflight was hard-failing the workspace boot when required env vars or legacy auth_token_files were missing, raising SystemExit(1) before main.py's PR #2756 try/except could mount the not-configured handler. Result: codex/openclaw workspaces launched without OPENAI_API_KEY were INVISIBLE — `/.well-known/agent-card.json` never returned 200, the bench timed out at 600s, canvas had no actionable signal. PR #2756 fixed half the puzzle (decouple agent-card from adapter.setup() failure); this fixes the other half (decouple from preflight failure). Caught by bench-provision-time run 25335853189 on 2026-05-04: codex and openclaw both timed_out at 609s while claude-code (whose default model needs no env) hit 86.7s on the same AMI. Hermes hit 147s because hermes config doesn't declare top-level required_env. After this change: - Missing required_env: WARN (operator sees it in boot logs); workspace proceeds to adapter.setup() which raises with the same env-name detail; PR #2756's try/except mounts the not-configured handler; /.well-known/agent-card.json serves 200; JSON-RPC POST / returns -32603 "agent not configured" with the env-name in `error.data`. - Missing auth_token_file (legacy path): same treatment. - Other preflight failures (runtime adapter not installable, invalid A2A port) STAY as fails — those are structural, the workspace truly can't run. Updated 4 existing tests that asserted `report.ok is False` on required_env / auth_token misses to assert `report.ok is True` and check `report.warnings` instead. All 31 preflight tests pass; full suite 1664 pass + 1 unrelated flake on staging. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 12:20:34 -07:00
Hongming Wang	8176b5142d	Merge pull request #2759 from Molecule-AI/auto-sync/main-31427776 chore: sync main → staging (auto, ff to `31427776`)	2026-05-04 18:03:49 +00:00
Hongming Wang	314277769e	Merge pull request #2758 from Molecule-AI/staging staging → main: auto-promote `4f9e3fe`	2026-05-04 10:53:03 -07:00
hongmingwang-moleculeai	e0b567e992	Merge pull request #2757 from Molecule-AI/fix/memory-v2-wiring-real-tests Memory v2 wiring: replace decorative tests with real integration	2026-05-04 17:43:09 +00:00
Hongming Wang	707e4d7342	Memory v2 wiring: replace decorative tests with real integration Self-review of #2755 found two tests that didn't actually exercise the production code path: - TestNamespaceCleanupFn_NamespaceFormat asserted "workspace:" + "abc-123" == "workspace:abc-123" — a compile-time invariant, not runtime behavior. Provided no protection if the closure in Bundle.NamespaceCleanupFn ever stopped using that prefix. - TestNamespaceCleanupFn_FailureLogsButReturns built a parallel cleanup closure inline with errors.New, then invoked the parallel closure. The production closure was never exercised. A regression in NamespaceCleanupFn (e.g. forgetting the deferred recover, calling the plugin without nil-check) would still pass this test. Replaced both with real integration: - TestNamespaceCleanupFn_HitsPluginAtCorrectNamespace spins up httptest.Server, points MEMORY_PLUGIN_URL at it, calls Build(), invokes the production closure, and asserts the server actually saw DELETE /v1/namespaces/workspace:abc-123. - TestNamespaceCleanupFn_PluginErrorDoesNotPanic exercises the failure path for real: server returns 500 on DELETE, closure must log and return without propagating. defer-recover is belt-and- suspenders since production calls this from a for-loop in workspace_crud.go that has no recover. Couldn't ship with #2755 because the merge queue locks the branch once enqueued. Following up now that #2755 is merged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 10:38:59 -07:00
Hongming Wang	4f9e3feece	Merge pull request #2756 from Molecule-AI/fix/agent-card-decouple-from-setup fix(runtime): decouple agent-card readiness from adapter.setup()	2026-05-04 17:32:02 +00:00
Hongming Wang	10752fe330	Merge pull request #2755 from Molecule-AI/fix/memory-v2-main-wiring Memory v2 fixup CRITICAL: wire plugin from main.go (was fully dormant)	2026-05-04 17:31:01 +00:00
Hongming Wang	8f7122a9b6	Merge branch 'staging' into fix/agent-card-decouple-from-setup	2026-05-04 10:24:41 -07:00
Hongming Wang	b3982035b3	Merge branch 'staging' into fix/memory-v2-main-wiring	2026-05-04 10:24:31 -07:00
Hongming Wang	d1122f8d28	fix(build): register not_configured_handler in TOP_LEVEL_MODULES The wheel-build drift gate caught the new module added in this PR — without registering it, the published wheel would ship `import not_configured_handler` un-rewritten, which would `ModuleNotFoundError` at runtime under `molecule_runtime.main`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 10:24:02 -07:00
Hongming Wang	4b35d25d86	fix(runtime): decouple agent-card readiness from adapter.setup() Today, if `adapter.setup()` raises (most often: an LLM credential is missing/rotated), main.py crashes before the agent-card route is mounted. start.sh restart-loops, /.well-known/agent-card.json never returns 200, and the workspace is invisible to the bench/canvas — operators see "stuck booting forever" with no clear error to act on. The agent-card is a static capability advertisement (name, version, skills, supported protocols). It doesn't need a working LLM. Coupling its mount to setup() conflates availability ("am I up?") with configuration ("can I actually answer?"). They're different concerns. This change: - Builds AgentCard from `config.skills` (static names from config.yaml) BEFORE adapter.setup(), so the route mounts independent of setup state. - Wraps setup() + create_executor in try/except. On success, mounts the real DefaultRequestHandler with rich loaded_skills metadata swapped into the card in-place. On failure, mounts a JSON-RPC handler that returns -32603 "agent not configured" with the setup() exception in error.data. - Heartbeat keeps running on misconfigured boots so the platform marks the workspace as reachable-but-misconfigured rather than crash-looping. Operators redeploy with corrected env without chasing a restart loop. - initial_prompt and idle_loop are skipped on misconfigured boots — they self-fire to /, which would land in -32603 anyway, and the marker would consume on the first useless attempt. Bench impact (RFC #388 strict <120s): codex/openclaw bench-time-outs were the agent-card-never-returns-200 symptom. With this fix those runtimes serve the card immediately on EC2 boot, so the bench measures infrastructure cold-start (claude-code class: ~50–80s) instead of credential-coupled boot. 
Adds workspace/not_configured_handler.py (factory + module-level so behavior is unit-testable; main.py is `# pragma: no cover`) and workspace/tests/test_not_configured_handler.py (6 tests covering status code, JSON-RPC envelope shape, id-echo, malformed-body fallback, reason surfacing, batch-body safety). All 1665 existing workspace tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 10:22:31 -07:00
Hongming Wang	46731729d4	Memory v2 fixup Critical: wire plugin from main.go (was fully dormant) Caught during continued review: the entire v2 plugin system shipped in PRs #2729-#2742 + #2744-#2751 was never actually invoked because main.go and router.go don't construct the plugin client/resolver or attach the WithMemoryV2 / WithNamespaceCleanup hooks. Operators setting MEMORY_PLUGIN_URL=... saw zero behavior change because nothing read it. Every fixup we shipped (idempotency, verify mode, expires_at validation, audit JSON, namespace cleanup, O(N) export, boot E2E) was also dormant for the same reason. Root cause: when a multi-handler feature lands across many PRs, none of them are individually responsible for wiring main.go — and the master-task-tracking issue didn't gate-check that the wiring landed. Add main.go integration to every multi-handler RFC checklist. What ships: * internal/memory/wiring/wiring.go: new package that constructs the plugin client + resolver from MEMORY_PLUGIN_URL once. Returns nil when unset (preserves zero-config legacy behavior). Probes /v1/health at boot but doesn't fail-closed — the MCP layer's circuit breaker handles ongoing unavailability. * internal/memory/wiring/wiring_test.go: 6 tests covering the nil/non-nil bundle paths + the namespace-cleanup closure contract (nil-safe, format-stable, failure-tolerant). * cmd/server/main.go: imports memwiring, calls Build(db.DB) once after WorkspaceHandler creation, attaches WithNamespaceCleanup, threads the bundle through router.Setup. * internal/router/router.go: Setup signature gains *memwiring.Bundle param. Inside, attaches WithMemoryV2 to AdminMemoriesHandler and MCPHandler when the bundle is non-nil. After this, the v2 plugin is reachable end-to-end: Operator sets MEMORY_PLUGIN_URL → main.Build instantiates client + resolver → WorkspaceHandler gets cleanup hook → router wires AdminMemoriesHandler + MCPHandler with WithMemoryV2 → MCP tool calls (commit_memory_v2, search_memory, etc.) 
actually do something → admin export/import respects MEMORY_V2_CUTOVER. Prerequisite for #292 (staging verification) — without this, the operator runbook's step 2 (set MEMORY_PLUGIN_URL, observe behavior) silently no-ops. Verified: all 9 affected test packages still green (memory/{client,contract,e2e,namespace,pgplugin,wiring}, handlers, router, plus the build).	2026-05-04 10:22:30 -07:00
Hongming Wang	6dc2d907a2	Merge pull request #2754 from Molecule-AI/auto-sync/main-849bc973 chore: sync main → staging (auto, ff to `849bc973`)	2026-05-04 17:19:03 +00:00
molecule-ai[bot]	849bc97349	Merge pull request #2753 from Molecule-AI/staging staging → main: auto-promote `e13dcab`	2026-05-04 17:08:11 +00:00
Hongming Wang	e13dcab5e0	Merge pull request #2749 from Molecule-AI/fix/memory-v2-i3-export-on Memory v2 fixup I3: admin export O(workspaces) → O(N_roots+1)	2026-05-04 16:49:43 +00:00
Hongming Wang	721010307c	Merge pull request #2752 from Molecule-AI/auto-sync/main-73a949bb chore: sync main → staging (auto, ff to `73a949bb`)	2026-05-04 16:49:23 +00:00
Hongming Wang	9f47ecf86e	Merge branch 'staging' into fix/memory-v2-i3-export-on	2026-05-04 09:44:37 -07:00
Hongming Wang	ebc20794f3	fix(admin-memories): include each member's private namespace in export ReadableNamespaces(rootID) returns {workspace:rootID, team:rootID, org:rootID} — the workspace: namespace it surfaces is the root's only. The I3 batching change resolved namespaces once per root which silently dropped every child workspace's private memories from admin export (workspace:childID never reached the plugin search). Keep the per-root batching win for team:/org:/custom: namespaces; inject each member's workspace:<id> + owner mapping explicitly so coverage matches the legacy per-workspace iteration. Cost stays at 1 SQL + N_roots resolver + 1 plugin search. Test changes: - New TestExport_IncludesEveryMembersPrivateNamespace uses a per-workspace resolver stub (mirrors real behaviour) and asserts every member's workspace:<id> reaches the plugin search AND that children's private memories appear in the response with correct owner attribution. Verified to FAIL on the pre-fix code. - TestExport_BatchesPluginCallsByRoot updated to expect 5 namespaces (3 workspace + team + org) instead of 3 — it had pinned the buggy 3-namespace behaviour. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 09:44:06 -07:00
Hongming Wang	73a949bb5c	Merge pull request #2737 from Molecule-AI/staging staging → main: auto-promote `f74fff6`	2026-05-04 09:37:55 -07:00
Hongming Wang	281cb04163	Merge pull request #2751 from Molecule-AI/fix/memory-v2-opt2-boot-e2e Memory v2 fixup Opt-2: real-subprocess boot E2E	2026-05-04 16:27:56 +00:00
Hongming Wang	fe7ff5440d	Memory v2 fixup Opt-2: add E2E.md operator runbook Companion to boot_e2e_test.go (just merged). Documents: - When the E2E suite runs (build tag + env var) - Local run with docker postgres - CI integration example (label-gated workflow step) - What each test pins - Explicit gap list (migration drift, recovery, TTL)	2026-05-04 09:24:16 -07:00
Hongming Wang	5b0a75ab73	Memory v2 fixup Optional-2: real-subprocess boot E2E Self-review #293. PR-11's E2E test uses sqlmock + httptest — integration, not E2E. This adds the actual real-subprocess test: build the binary with `go build`, start it pointing at real postgres, drive HTTP via the real client. What in-process tests miss that this catches: - Binary build / boot-path panics (env var typos, mixed-key interface bugs that only surface when start() runs) - Wire encoding bugs that sqlmock smooths over (the pq.Array regression from PR-3 development would have been caught here) - HTTP+TCP-socket edge cases - Real upsert behavior under postgres ON CONFLICT (C1 fix) Build-tag gated so default CI doesn't require docker: go test -tags memory_plugin_e2e -v ./cmd/memory-plugin-postgres/ Tests skip silently when MEMORY_PLUGIN_E2E_DB is unset. Three tests: 1. TestE2E_BootAndHealth — capabilities advertised correctly 2. TestE2E_FullCommitSearchForgetRoundTrip — full agent flow 3. TestE2E_IdempotencyKey — C1 upsert against real postgres Plus E2E.md operator runbook with docker quickstart + CI integration example + explicit statement of what's still uncovered (migration drift, recovery scenarios, TTL eviction over real time).	2026-05-04 09:23:46 -07:00
Hongming Wang	a6dadc7ee0	Merge pull request #2750 from Molecule-AI/fix/memory-v2-i5-namespace-cleanup Memory v2 fixup I5: workspace purge cleans up plugin namespace	2026-05-04 16:23:41 +00:00
Hongming Wang	5e52a0fdad	Merge pull request #2748 from Molecule-AI/docs/memory-v2-fixup-docs Memory v2 docs update: idempotency key + verify mode + cutover runbook	2026-05-04 16:21:02 +00:00
Hongming Wang	6b445aae2d	Memory v2 fixup I5: workspace purge cleans up plugin namespace Self-review #291. When a workspace is hard-purged, its `workspace:<id>` namespace stays in the plugin storage. Over time deleted workspaces accumulate as orphan namespaces. Fix: optional namespaceCleanupFn hook on WorkspaceHandler. The purge path (workspace_crud.go ~line 520) iterates each purged id and calls the hook best-effort. main.go wires the hook to plugin.DeleteNamespace when MEMORY_PLUGIN_URL is set; operators who haven't enabled the plugin keep the no-op default. Why a hook (not direct plugin import): * Keeps WorkspaceHandler decoupled from the memory contract package (easier to test, smaller blast radius if the contract bumps) * Tests inject a captureCleanupHook stub without standing up a real plugin client * Production wiring stays a one-liner in main.go What gets cleaned up: * `workspace:<id>` for each purged workspace * NOT `team:<root>` / `org:<root>` — those may still be referenced by other workspaces under the same root, so dropping them on a single workspace's purge would orphan team/org data for the survivors. Operator can purge those manually after confirming the entire root is gone. What stays untouched: * Soft-removed workspaces (status='removed', no ?purge=true). The grace window is by design — the data should still be there if the operator unremoves. Tests: * TestWithNamespaceCleanup_DefaultIsNil pins the safe default * TestWithNamespaceCleanup_NilStaysNil pins the explicit-nil case * TestWithNamespaceCleanup_AttachesFn pins the wiring * TestPurge_CallsCleanupHookPerID exercises the per-id loop body * TestPurge_NilHookIsSkipped pins the nil guard A full end-to-end Delete-handler test requires mocking broadcaster + provisioner + descendant SQL chain, which is out-of-scope for a single fixup. Integration coverage for the wired path lives in PR-11's E2E swap test (#293 follow-up).	2026-05-04 09:20:37 -07:00
Hongming Wang	4f3d51bd61	Merge branch 'staging' into docs/memory-v2-fixup-docs	2026-05-04 09:18:49 -07:00
Hongming Wang	9a64aeaa2c	Memory v2 fixup I3: admin export O(workspaces) → O(N_roots+1) Self-review #289. The previous exportViaPlugin ran one resolver CTE walk + one plugin search PER WORKSPACE. For a 1000-workspace tenant that's 1000× of each, mostly redundant — workspaces sharing a team/org root see identical readable namespaces. New strategy: 1. Single SQL pass returns each workspace + its computed root_id via a recursive CTE (loadWorkspacesWithRoots). 2. Group by root → unique tree count is typically << workspace count. 3. Resolver runs ONCE per root (any member sees the same readable list). 4. Build the union of all root namespaces; single plugin.Search call. 5. Map each memory back to a workspace_name via pickOwnerForNamespace (workspace:<id> → matching member; team:* / org:* / custom:* → canonical first member of root group). Net call cost: 1 SQL + N_roots resolver + 1 plugin call (vs N_workspaces × resolver + N_workspaces × plugin in the old code). Tests: * TestExport_BatchesPluginCallsByRoot pins the new behavior explicitly: 3 workspaces under 1 root → exactly 1 plugin search (was 3 with the old code). * TestPickOwnerForNamespace covers all five attribution cases: workspace:<id> match, workspace:<id> no-match-fallback, team:, org:, custom:* → first-member-of-root-group; plus empty-members fallback. * All 9 existing TestExport_* / TestImport_* / TestPickOwner / TestNamespaceKindFromLegacyScope / TestSkipImport / etc. tests remain green (verified with -run "Export"). The legacy DB path (when MEMORY_V2_CUTOVER unset) is unchanged.	2026-05-04 09:17:30 -07:00
Hongming Wang	2d783b5ca6	Memory v2 docs update: idempotency key + verify mode + cutover runbook Updates plugin-author and operator docs to reflect the four fixup PRs (C1, C2, I1, I4) for self-review findings. Stacked on C1+C2 so the docs reference behavior that lands in the same wave; rebases to staging once those merge. What changes: * docs/memory-plugins/README.md - New "Memory idempotency" section explaining MemoryWrite.id contract: omit → plugin generates UUID; supplied → upsert - "Replacing the built-in plugin" rewritten as a 6-step operator runbook with concrete commands for -dry-run / -apply / -verify / MEMORY_V2_CUTOVER, including the failure path ("if -verify reports mismatches, do not flip the cutover flag") - Added link to new CHANGELOG.md * docs/memory-plugins/testing-your-plugin.md - New TestMyPlugin_IDIsIdempotencyKey example: write same id twice, assert single row + updated content - "What the harness does NOT cover" expanded with two new operational gates: backfill twice → no double; verify-mode reports zero mismatches * docs/memory-plugins/pinecone-example/README.md - Wire-mapping table updated: id (caller-supplied) → Pinecone vector id (upsert); id (omitted) → plugin-generated UUID - Production-hardening checklist gained an idempotency-key item * docs/memory-plugins/CHANGELOG.md (new) - Captures the four fixup PRs in one place with severity-ordered summary, plugin-author action items, and remaining open follow-ups (#289, #291, #293) for transparency No code changes. Docs-only PR.	2026-05-04 09:08:28 -07:00
Hongming Wang	6fc328ef44	Merge pull request #2747 from Molecule-AI/fix/memory-v2-c2-backfill-verify Memory v2 fixup C2: backfill -verify mode (parity check)	2026-05-04 16:08:27 +00:00
Hongming Wang	bb3212ad37	Merge branch 'staging' into fix/memory-v2-c2-backfill-verify	2026-05-04 09:08:21 -07:00
Hongming Wang	1986260603	Merge remote-tracking branch 'origin/fix/memory-v2-c1-backfill-idempotent' into docs/memory-v2-fixup-docs	2026-05-04 09:05:11 -07:00
Hongming Wang	d297e75fc9	Merge pull request #2746 from Molecule-AI/fix/memory-v2-i1-i4-small Memory v2 fixup I1+I4: expires_at validation + audit JSON marshal	2026-05-04 16:05:02 +00:00
Hongming Wang	3ae0513209	Merge pull request #2744 from Molecule-AI/fix/memory-v2-c1-backfill-idempotent Memory v2 fixup C1: backfill idempotency via MemoryWrite.id	2026-05-04 16:04:54 +00:00
Hongming Wang	4b6373861c	Memory v2 fixup C2: backfill -verify mode (parity check) Self-review missed deliverable from PR-7's task spec. Operators had no way to confirm a -apply produced equivalent search results to the legacy agent_memories direct queries; this PR ships that. Usage: memory-backfill -verify # 50-workspace random sample memory-backfill -verify -verify-sample=200 # bigger sample memory-backfill -verify -workspace=<uuid> # one specific workspace Algorithm: 1. Pick N random workspaces (or use -workspace if specified) 2. For each: query agent_memories direct, query plugin search via the workspace's readable namespace list 3. Multiset-compare contents: every legacy row must have a matching plugin row. Plugin having MORE rows is OK (team-shared content may be visible from sibling workspaces). 4. Print mismatches with content excerpt; non-zero mismatches/errors yields a non-zero exit so CI can gate cutover. Sql: - Sampling uses ORDER BY random() LIMIT N (TABLESAMPLE has surprising distribution at small populations). - Filters out status='removed' workspaces. Test coverage: * pickWorkspaceSample: single-ws short-circuit, random sampling, query error, scan error * queryLegacyMemories: happy path, error path * verifyParity: - all match → 1 match, 0 mismatch - missing-from-plugin → 1 mismatch with content excerpt - plugin-extra rows → 1 match (legacy is subset of plugin) - legacy query error → 1 error counter - resolver error → 1 error counter - plugin search error → 1 error counter - no readable namespaces + empty legacy → match - no readable namespaces + non-empty legacy → mismatch - pickSample error → propagated up * CLI: -verify+-apply rejected as mutually exclusive; -verify alone is a valid mode Note: namespaceResolverAdapter bridges *namespace.Resolver to the verify package's verifyResolver interface so verify.go has zero dependency on the namespace package — keeps test stubs minimal.	2026-05-04 09:01:31 -07:00
Hongming Wang	3886e8fb9f	Merge pull request #2745 from Molecule-AI/fix/harness-stub-auth-headers-1arg fix(harness): stub platform_auth with *args lambdas (#2743 fallout)	2026-05-04 15:58:24 +00:00
Hongming Wang	d48693144b	Memory v2 fixup I1+I4: expires_at validation + audit JSON marshal Two small Important findings from self-review, bundled because both are <20 line changes touching the same file. I1: expires_at silent drop - mcp_tools_memory_v2.go:130 had `if t, err := ...; err == nil { ... }` which dropped malformed timestamps without telling the agent. Agent passes `expires_at: "tomorrow"`, gets a 200, and the memory has no TTL. - Now returns a clear error: "invalid expires_at: must be RFC3339" - Test renamed: TestCommitMemoryV2_BadExpiresIsIgnored (which codified the bug) → TestCommitMemoryV2_BadExpiresReturnsError (which pins the fix). I4: audit log JSON via Sprintf-%q - auditOrgWrite was building activity_logs.metadata via fmt.Sprintf with %q. Go-quoted strings happen to coincide with JSON-quoted for ASCII (and today's values are pure ASCII: UUID + hex digest) so the bug was latent. - Replaced with json.Marshal of map[string]string. Same wire shape today, but won't silently produce invalid JSON if metadata grows to include arbitrary content snippets. - New test TestAuditOrgWrite_MetadataIsValidJSON uses a custom sqlmock.Argument matcher (jsonValidMatcher) that fails the test if the metadata column isn't parseable JSON. The test runs auditOrgWrite with a content string containing quotes, backslashes, and a control byte — values where %q would diverge from JSON-quote. Both pre-existing tests (TestCommitMemoryV2_AuditsOrgWrites etc.) remain green.	2026-05-04 08:57:58 -07:00
Hongming Wang	1b207b214d	fix(harness): stub platform_auth with *args lambdas (#2743 fallout) PR #2743 (multi-workspace MCP PR-2) made auth_headers accept an optional ``workspace_id`` arg and self_source_headers stayed 1-arg-required. The peer-discovery-404 harness replay stubbed both with 0-arg lambdas, so the helper call inside the replay raised: TypeError: <lambda>() takes 0 positional arguments but 1 was given …and the diagnostic captured by the replay was the TypeError text, not the platform-404 string the assertion grep'd for. Caught by PR-2737 (auto-promote staging→main) — the replay went red right after #2743 merged into staging. Switching both stubs to ``*args, **kwargs`` makes them tolerant of both the legacy 0-arg call shape AND the new 1-arg-with-workspace call shape, so neither the harness nor the in-tree unit tests need to know which version of the runtime helpers ran the call. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:55:42 -07:00
Hongming Wang	1e97fb9a16	Memory v2 fixup C1: backfill idempotency via MemoryWrite.id Self-review (post-merge) flagged that the backfill claimed to be idempotent on re-run but actually duplicates every row because the plugin's INSERT uses gen_random_uuid() and ignores any id passed in. Fix is contract-level: extend MemoryWrite with an optional `id` idempotency key. When supplied, the plugin MUST treat the write as upsert keyed on this id; when omitted, the plugin generates a fresh UUID (production agent commits keep working unchanged). Changes: * docs/api-protocol/memory-plugin-v1.yaml: add id field with description that flags it as idempotency key * internal/memory/contract/contract.go: add ID to MemoryWrite struct, update memory_write_minimal golden vector * internal/memory/pgplugin/store.go: split CommitMemory into two paths — upsert when body.ID set (INSERT ... ON CONFLICT (id) DO UPDATE), plain INSERT otherwise * cmd/memory-backfill/main.go: pass agent_memories.id to MemoryWrite, fix the false comment about 409 deduplication New tests: * pgplugin: TestCommitMemory_WithIDUpserts pins the upsert SQL is used when id is set; TestCommitMemory_UpsertScanError covers the error branch * backfill: TestBackfill_PassesSourceUUIDAsIdempotencyKey pins the forwarding behavior; TestBackfill_RerunIsIdempotent simulates a retry and asserts both runs pass the same uuid (plugin upsert is what makes this safe) Why this matters: operators retrying a failed backfill (which they will — networks fail, transactions abort) would otherwise create N duplicates per memory. The duplicates aren't visible until search results show obvious dupes — debugging that under prod load is bad. Production agent commits are unaffected: they leave id empty, the plugin generates a fresh UUID via gen_random_uuid(), zero behavior change for the hot path.	2026-05-04 08:54:13 -07:00
Hongming Wang	7cffff844b	Merge pull request #2743 from Molecule-AI/feat/mcp-multi-workspace-pr2 feat(mcp): cross-workspace delegation routing (multi-ws PR-2)	2026-05-04 15:43:20 +00:00
Hongming Wang	4a0d7cd545	Merge branch 'staging' into feat/mcp-multi-workspace-pr2	2026-05-04 08:37:20 -07:00
Hongming Wang	35b3ea598a	test: fix WORKSPACE_ID assert to match module attr (CI portability) CI's pytest harness pre-sets WORKSPACE_ID=test in the env before test collection, so a2a_client's module-level WORKSPACE_ID (captured at import time, line 24) holds "test" — but the local fixture's monkeypatch.setenv("WORKSPACE_ID", ...) only affects the ENV value seen on later os.environ reads, NOT the already-bound module attribute. Assert against a2a_client.WORKSPACE_ID directly so the test is portable across local + CI runs without monkey-patching the module itself (which a future test reload might undo). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:35:48 -07:00
Hongming Wang	1161b97faf	feat(mcp): cross-workspace delegation routing (multi-ws PR-2) PR-2 of the multi-workspace external-agent stack. PR-1 (#2739) landed per-workspace auth + heartbeat + inbox. This PR threads ``source_workspace_id`` through the A2A client + tool surface so an agent registered against multiple workspaces can list peers across all of them and delegate from a specific source. Changes ------- * ``a2a_client``: ``discover_peer``, ``send_a2a_message``, ``get_peers_with_diagnostic``, and ``enrich_peer_metadata`` now accept ``source_workspace_id``. Routing uses it for both the X-Workspace-ID header and (transitively, via ``auth_headers(src)``) the bearer token. Defaults to module-level WORKSPACE_ID for back-compat. * ``a2a_client._peer_to_source``: a new lock-free cache mapping each discovered peer back to the source workspace whose registry surfaced it. ``tool_list_peers`` populates the cache on every call; ``tool_delegate_task`` consults it for auto-routing. * ``a2a_tools.tool_list_peers(source_workspace_id=None)``: when multiple workspaces are registered (MOLECULE_WORKSPACES) and no explicit source is passed, aggregates peers across every registered workspace and tags each entry with ``via: <src[:8]>``. Single-workspace mode is unchanged — no ``via:`` annotation, same output shape. * ``a2a_tools.tool_delegate_task`` and ``tool_delegate_task_async`` resolve source via ``source_workspace_id arg → _peer_to_source[target] → WORKSPACE_ID``. Agents almost never need to specify ``source_`` explicitly — call ``list_peers`` first and the cache handles the rest. ``tool_delegate_task_async`` idempotency key now includes the source workspace, so the same task delegated from two registered workspaces produces two distinct delegations (the right behavior — one per tenant audit trail). * ``platform_auth.list_registered_workspaces()``: new helper for the tool layer to enumerate the multi-ws registry. 
Lock-free reads matched by the existing single-writer-per-workspace contract from PR-1. * ``platform_auth.self_source_headers``: now passes ``workspace_id`` through to ``auth_headers`` — without this, a multi-workspace POST source-tagged with ``X-Workspace-ID=ws_b`` was authenticating with ws_a's token (or no token if MOLECULE_WORKSPACE_TOKEN unset). Latent PR-1 bug exposed by the new tool surface. * ``a2a_mcp_server`` tool dispatch passes ``source_workspace_id`` from the tool call arguments. * ``platform_tools.registry``: add ``source_workspace_id`` to the delegate_task, delegate_task_async, check_task_status, list_peers input schemas with copy explaining when to use it (rarely — the cache handles it). Tests (15 new, all passing) --------------------------- ``test_a2a_multi_workspace.py``: * TestDiscoverPeerSourceRouting (3): src arg drives header+token, fallback to module ws when omitted, invalid target short-circuits before any HTTP attempt. * TestSendA2AMessageSourceRouting (1): X-Workspace-ID source header + Authorization bearer both come from the source arg via the patched self_source_headers chain. * TestGetPeersSourceRouting (1): URL path AND headers use the source workspace id. * TestToolListPeersAggregation (4): aggregates across multiple registered workspaces, tags origin, leaves single-workspace path unchanged, explicit src arg overrides aggregation, diagnostic joining when every workspace returns empty. * TestToolDelegateTaskAutoRouting (3): cache-driven auto-route, explicit override beats cache, single-workspace fallback to module WORKSPACE_ID. * TestListRegisteredWorkspaces (3): registry enumeration helper. Plus ``tests/snapshots/a2a_instructions_mcp.txt`` regenerated to absorb the new ``source_workspace_id`` schema entries. Back-compat ----------- Every change defaults ``source_workspace_id=None``; legacy single-workspace operators (no MOLECULE_WORKSPACES) see identical behavior — same URLs, same headers, same tool output. 
The 24 PR-1 tests + 125 existing A2A tests all still pass. Out of scope (PR-3) ------------------- Memory namespacing per registered workspace lands after the new memory system v2 PR (#2740) settles in production. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:32:24 -07:00
Hongming Wang	059962a0a3	Merge pull request #2742 from Molecule-AI/feat/memory-v2-pr11-e2e-swap Memory v2 PR-11: E2E test — flat-plugin swap proves contract works	2026-05-04 15:29:56 +00:00
Hongming Wang	b07575c710	Merge branch 'staging' into feat/memory-v2-pr11-e2e-swap	2026-05-04 08:24:26 -07:00
Hongming Wang	586fa5f84e	Merge pull request #2741 from Molecule-AI/feat/memory-v2-pr10-docs Memory v2 PR-10: operator docs for writing a custom memory plugin	2026-05-04 15:20:35 +00:00
Hongming Wang	b937415e1e	Memory v2 PR-11: E2E test — flat-plugin swap proves contract works Final implementation PR. Builds on PR-1..10 (all merged or queued). Proves the central design property of the plugin contract: ANY plugin satisfying the v1 OpenAPI spec works as a drop-in replacement for the built-in postgres plugin. If this test fails after a refactor, the contract has drifted in a way that breaks ecosystem plugins. What ships: * internal/memory/e2e/swap_test.go — five E2E tests against a deliberately minimal "flat-memory" stub plugin (~50 LOC, single map, zero capabilities) * MCPHandler.Dispatch — small exported wrapper around dispatch so out-of-package E2E tests can drive tools by name without duplicating the whole MCP RPC stack E2E coverage: * TestE2E_FlatPluginRoundTrip: full lifecycle - list_writable_namespaces returns 3 entries - commit_memory_v2 writes through plugin - search_memory finds it back - commit_summary writes a summary - forget_memory deletes - search after forget excludes the deleted memory * TestE2E_LegacyShimRoutesThroughFlatPlugin: PR-6 shim wired up - Legacy commit_memory(scope=LOCAL) ends up in plugin storage - Legacy recall_memory finds it back through plugin search - Response shapes preserved (scope:LOCAL stays scope:LOCAL) * TestE2E_OrgMemoriesDelimiterWrap: prompt-injection mitigation - Org-namespace memory committed - Audit INSERT into activity_logs verified - Search returns content with [MEMORY id=... scope=ORG ns=...] 
prefix applied * TestE2E_StubPluginCapabilitiesAreEmpty: capability negotiation - Stub plugin reports zero capabilities - Client.SupportsCapability returns false for FTS, embedding - Confirms graceful degradation when plugin doesn't support a feature * TestE2E_PluginUnreachable_AgentSeesClearError: failure surface - Plugin URL pointing at bogus port - commit_memory_v2 returns informative error - No nil-pointer dereference; error message is actionable The flat plugin is intentionally minimal — it has no namespaces table distinct from memory records, no FTS, no semantic search, no TTL. The test proves operators can drop in a 50-line plugin and the agent behavior is identical (modulo capability-gated features).	2026-05-04 08:20:35 -07:00
Hongming Wang	0f46c7eefe	Merge pull request #2739 from Molecule-AI/feat/mcp-multi-workspace-pr1 mcp: support multi-workspace external-agent registration (PR-1 of stack)	2026-05-04 15:19:03 +00:00
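
Illustration for 4b6373861c (backfill -verify): the multiset comparison in step 3 of that commit's algorithm can be sketched roughly as below. This is a Python sketch under stated assumptions, not the repository's Go code; the function name `find_missing` and the sample rows are hypothetical. It shows only the rule "every legacy row must have a matching plugin row; extra plugin rows are OK".

```python
from collections import Counter


def find_missing(legacy_rows, plugin_rows):
    """Return legacy contents with no matching plugin row (multiset semantics).

    Hypothetical helper: names and shapes are illustrative only.
    """
    legacy = Counter(legacy_rows)
    plugin = Counter(plugin_rows)
    # Counter subtraction keeps only positive counts, so rows that exist
    # only on the plugin side (e.g. team-shared content) are ignored.
    missing = legacy - plugin
    return sorted(missing.elements())


if __name__ == "__main__":
    # Two legacy copies of "a" but only one on the plugin side: one mismatch.
    mismatches = find_missing(["a", "a", "b"], ["a", "b", "c"])
    print(mismatches)
    # A non-zero mismatch count would map to a non-zero exit code in a CLI.
    raise SystemExit(1 if mismatches else 0)
```

A plain set comparison would miss duplicated legacy rows, which is presumably why the commit specifies a multiset compare.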

4144 Commits
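
Illustration for 4b35d25d86 (decouple agent-card readiness from adapter.setup()): a rough sketch of the JSON-RPC error envelope that commit describes for misconfigured boots. The code -32603 and the message "agent not configured" come from the commit message; the function name `not_configured_response`, the plain-string shape of `error.data`, and the null id for malformed or batch bodies are assumptions for illustration, not the contents of workspace/not_configured_handler.py.

```python
import json


def not_configured_response(raw_body, reason):
    """Build a JSON-RPC 2.0 error envelope for an unconfigured agent.

    Hypothetical sketch: the real handler's signature and data shape may differ.
    """
    try:
        parsed = json.loads(raw_body)
    except (ValueError, TypeError):
        # Malformed body: fall back to a null id instead of failing.
        parsed = None
    # Echo the id only for a single request object; a batch (list) or any
    # other JSON value gets a null id in this sketch.
    request_id = parsed.get("id") if isinstance(parsed, dict) else None
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {
            "code": -32603,
            "message": "agent not configured",
            "data": reason,  # assumed: setup() failure detail as a string
        },
    }


if __name__ == "__main__":
    body = b'{"jsonrpc": "2.0", "id": 7, "method": "message/send"}'
    print(json.dumps(not_configured_response(body, "OPENAI_API_KEY is not set")))
```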