molecule-core/docs/memory-plugins/testing-your-plugin.md
hongming 8019231a16
chore(go-module): #1760 rename Go module to git.moleculesai.app/molecule-ai/molecule-core/workspace-server (#1816)
CTO-bypass merge 2026-05-24: #1760 Go module rename to git.moleculesai.app path
2026-05-24 23:37:18 +00:00

Testing Your Memory Plugin

Once you have a plugin implementing the v1 contract, you can validate it against the spec without booting workspace-server.

The contract test harness

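Workspace-server ships typed Go bindings and round-trip tests in workspace-server/internal/memory/contract/. The simplest way to gain confidence in your plugin's wire compatibility is to exercise it through those bindings. Because the packages sit under internal/, Go only allows them to be imported from code inside the workspace-server tree, so the test file has to live there (for example, in a checkout of molecule-core).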

A minimal contract suite:

package myplugin_test

import (
    "context"
    "testing"

    mclient "git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/memory/client"
    "git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/memory/contract"
)

func TestMyPlugin_FullRoundTrip(t *testing.T) {
    // Start your plugin somehow (subprocess, in-process, etc.)
    pluginURL := startMyPlugin(t)
    cl := mclient.New(mclient.Config{BaseURL: pluginURL})

    // 1. Health
    hr, err := cl.Boot(context.Background())
    if err != nil {
        t.Fatalf("Boot: %v", err)
    }
    if hr.Status != "ok" {
        t.Errorf("status = %q, want %q", hr.Status, "ok")
    }

    // 2. Namespace upsert
    if _, err := cl.UpsertNamespace(context.Background(), "workspace:test-1",
        contract.NamespaceUpsert{Kind: contract.NamespaceKindWorkspace}); err != nil {
        t.Fatalf("UpsertNamespace: %v", err)
    }

    // 3. Commit memory
    resp, err := cl.CommitMemory(context.Background(), "workspace:test-1",
        contract.MemoryWrite{
            Content: "hello",
            Kind:    contract.MemoryKindFact,
            Source:  contract.MemorySourceAgent,
        })
    if err != nil {
        t.Fatalf("CommitMemory: %v", err)
    }
    if resp.ID == "" {
        t.Errorf("plugin must return a non-empty memory id")
    }

    // 4. Search
    sresp, err := cl.Search(context.Background(), contract.SearchRequest{
        Namespaces: []string{"workspace:test-1"},
        Query:      "hello",
    })
    if err != nil {
        t.Fatalf("Search: %v", err)
    }
    if len(sresp.Memories) == 0 {
        t.Errorf("plugin returned no memories for the query we just wrote")
    }

    // 5. Forget
    if err := cl.ForgetMemory(context.Background(), resp.ID,
        contract.ForgetRequest{RequestedByNamespace: "workspace:test-1"}); err != nil {
        t.Errorf("ForgetMemory: %v", err)
    }
}

Testing idempotency

The contract requires that MemoryWrite.id, when supplied, behaves as an upsert key. The backfill CLI relies on this — without it, operator retries silently duplicate every memory.

func TestMyPlugin_IDIsIdempotencyKey(t *testing.T) {
    pluginURL := startMyPlugin(t)
    cl := mclient.New(mclient.Config{BaseURL: pluginURL})
    if _, err := cl.UpsertNamespace(context.Background(), "workspace:test-1",
        contract.NamespaceUpsert{Kind: contract.NamespaceKindWorkspace}); err != nil {
        t.Fatal(err)
    }

    fixedID := "11111111-2222-3333-4444-555555555555"

    // First write with a specific id.
    resp1, err := cl.CommitMemory(context.Background(), "workspace:test-1",
        contract.MemoryWrite{
            ID:      fixedID,
            Content: "first version",
            Kind:    contract.MemoryKindFact,
            Source:  contract.MemorySourceAgent,
        })
    if err != nil {
        t.Fatalf("first commit: %v", err)
    }
    if resp1.ID != fixedID {
        t.Errorf("plugin must echo the supplied id, got %q", resp1.ID)
    }

    // Second write with the same id — must update, not insert.
    if _, err := cl.CommitMemory(context.Background(), "workspace:test-1",
        contract.MemoryWrite{
            ID:      fixedID,
            Content: "second version (updated)",
            Kind:    contract.MemoryKindFact,
            Source:  contract.MemorySourceAgent,
        }); err != nil {
        t.Fatalf("second commit: %v", err)
    }

    // Search must return exactly one row, with the updated content.
    sresp, err := cl.Search(context.Background(), contract.SearchRequest{
        Namespaces: []string{"workspace:test-1"},
    })
    if err != nil {
        t.Fatalf("Search: %v", err)
    }
    matches := 0
    for _, m := range sresp.Memories {
        if m.ID == fixedID {
            matches++
            if m.Content != "second version (updated)" {
                t.Errorf("upsert didn't update content: got %q", m.Content)
            }
        }
    }
    if matches != 1 {
        t.Errorf("upsert produced %d rows for id=%s, want 1", matches, fixedID)
    }
}
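
On the plugin side, passing this test comes down to keying storage by the memory id. The following is a minimal sketch of that idea using an in-memory map; memoryStore and its methods are hypothetical names chosen for illustration, and a real plugin would apply the same rule in its own backend (for example, an insert-or-update keyed on id).

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"sync"
)

// memoryStore is a toy backend keyed by memory id (hypothetical, for illustration).
type memoryStore struct {
	mu   sync.Mutex
	rows map[string]string // id -> content
}

func newMemoryStore() *memoryStore {
	return &memoryStore{rows: make(map[string]string)}
}

// commit stores content under id and returns the id it used.
// A caller-supplied id overwrites any existing row (upsert);
// an empty id gets a freshly generated random one.
func (s *memoryStore) commit(id, content string) (string, error) {
	if id == "" {
		buf := make([]byte, 16)
		if _, err := rand.Read(buf); err != nil {
			return "", err
		}
		id = hex.EncodeToString(buf)
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.rows[id] = content
	return id, nil
}

// get returns the content stored under id, if any.
func (s *memoryStore) get(id string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	content, ok := s.rows[id]
	return content, ok
}

// count reports how many rows the store holds.
func (s *memoryStore) count() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.rows)
}
```

Because the id is the map key, a retried write with the same id can only replace the existing row, which is the behavior the backfill CLI depends on.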

What the harness does NOT cover

  • Capability accuracy: if you advertise the embedding capability, you must actually perform semantic search. The harness can't tell you whether ranking is meaningful, only that you don't crash.
  • TTL eviction: write a memory with expires_at 1 second in the future, sleep 2 seconds, search — assert the memory is gone.
  • Concurrency: hit your plugin with 100 parallel writes; assert no IDs collide.
  • Recovery: kill your plugin's storage backend, send a request, assert your plugin returns 503 (not 200 with stale data).
  • Backfill compatibility: run the operator backfill against your plugin twice in a row (memory-backfill -apply); assert the row count doesn't double. The idempotency test above verifies the unit contract; this checks the operational integration.
  • Verify-mode parity: after a backfill, run memory-backfill -verify; assert it reports zero mismatches against agent_memories.

Smoke test against workspace-server

Once unit-level wire tests pass, run a real workspace-server with your plugin URL:

DATABASE_URL=postgres://... \
MEMORY_PLUGIN_URL=http://localhost:9100 \
./workspace-server

Then ask an agent to call commit_memory_v2 and search_memory. If both round-trip cleanly, you're done.

For the full E2E flow (including the namespace resolver, MCP layer, and security perimeter), see PR-11's plugin-swap test.

Reporting bugs

If you find a contract ambiguity or missing edge case, file an issue against Molecule-AI/molecule-core referencing RFC #2728.