Memory v2 PR-10: operator docs for writing a custom memory plugin

Builds on merged PR-1..7 (PR-8 in queue). Pure docs; no code.

What ships:
  * docs/memory-plugins/README.md — contract overview, capability
    negotiation, deployment models, replacement workflow
  * docs/memory-plugins/testing-your-plugin.md — using the contract
    test harness to validate wire compatibility, what the harness
    DOES NOT cover (capability accuracy, TTL eviction, concurrency)
  * docs/memory-plugins/pinecone-example/README.md — worked example
    of a Pinecone-backed plugin: capability mapping (only embedding,
    no FTS), wire mapping (memory → vector + metadata), production-
    hardening checklist

Documentation strategy:
  * Lead with what workspace-server takes care of (security perimeter,
    redaction, ACL, GLOBAL audit, prompt-injection wrap) so plugin
    authors don't reimplement those layers
  * Show three deployment models (same machine / separate container /
    self-managed) so operators see their topology
  * Capability table makes it explicit what each capability gates so
    a plugin that supports only one (e.g. semantic search) is still
    a useful plugin
  * Pinecone example is honest: shows the skeleton, the wire mapping,
    and explicitly calls out what's MISSING from the sketch (batch
    commits, TTL janitor, circuit breaker, metrics)
This commit is contained in:
Hongming Wang 2026-05-04 08:17:03 -07:00
parent 9929f73e80
commit 8417bce50d
3 changed files with 361 additions and 0 deletions


@ -0,0 +1,135 @@
# Writing a Memory Plugin
This document is for operators and ecosystem authors who want to
replace the built-in postgres-backed memory plugin (the default
implementation that ships with workspace-server) with their own.
The contract was introduced by RFC #2728. The shipped binary is
`cmd/memory-plugin-postgres/`; reading its source is the fastest way
to see a complete reference implementation.
## What the contract is
The plugin is an HTTP server that workspace-server talks to via the
OpenAPI v1 spec at [`docs/api-protocol/memory-plugin-v1.yaml`](../api-protocol/memory-plugin-v1.yaml).
Seven endpoints:
| Endpoint | Method | Purpose |
|---|---|---|
| `/v1/health` | GET | Liveness probe + capability list |
| `/v1/namespaces/{name}` | PUT | Idempotent upsert |
| `/v1/namespaces/{name}` | PATCH | Update TTL or metadata |
| `/v1/namespaces/{name}` | DELETE | Remove namespace and its memories |
| `/v1/namespaces/{name}/memories` | POST | Write a memory |
| `/v1/search` | POST | Multi-namespace search |
| `/v1/memories/{id}` | DELETE | Forget a memory |
The wire types are defined in
`workspace-server/internal/memory/contract/contract.go`. Run-time
validation is built into the Go bindings via `Validate()` methods —
your plugin SHOULD perform equivalent validation.
## What workspace-server takes care of
You do **not** implement these in the plugin; workspace-server is the
security perimeter:
- **Secret redaction** (SAFE-T1201). All `content` you receive is
already scrubbed. Don't run additional redaction; it's pointless.
- **Namespace ACL**. workspace-server intersects the caller's
readable namespaces against the requested list before sending you
the search request. The list you receive is authoritative.
- **GLOBAL audit**. Org-namespace writes are recorded in
`activity_logs` server-side; you don't see them.
- **Prompt-injection wrap**. Org memories returned to agents get a
`[MEMORY id=... scope=ORG ns=...]:` prefix added at the
workspace-server layer. Your `content` field is plain text.
## What you implement
- Storage of `memory_namespaces` and `memory_records` (or whatever
shape you want — Pinecone vectors, an in-memory map, etc.)
- The 7 endpoints above with the request/response shapes the spec
defines
- `/v1/health` reporting your supported capabilities (see below)
- Idempotency on namespace upsert (PUT semantics, not POST)
## Capability negotiation
Your `/v1/health` response declares what features you support:
```json
{
  "status": "ok",
  "version": "1.0.0",
  "capabilities": ["embedding", "fts", "ttl", "pin", "propagation"]
}
```
| Capability | What it gates |
|---|---|
| `embedding` | Agents may ask for semantic search; you receive `embedding: [...]` in search bodies |
| `fts` | Agents may pass a query string; you decide how to match (FTS, ILIKE, regex) |
| `ttl` | Agents may set `expires_at`; you must not return expired rows |
| `pin` | Agents may set `pin: true`; you should rank pinned rows first |
| `propagation` | Agents may set `propagation: {...}`; you must store it as opaque JSON and return it on read |
Listing only a subset of capabilities is fine — workspace-server adapts
the MCP tool surface to match. E.g., a Pinecone-only plugin that lists
only `embedding` will silently ignore agents' `query` strings.
## Deployment models
Three common shapes:
1. **Same machine, different process**: workspace-server boots, then
`MEMORY_PLUGIN_URL=http://localhost:9100` points at your plugin
running on a unix socket or localhost port. This is what the
built-in postgres plugin does.
2. **Separate container**: deploy your plugin as its own service on
the private network. Set `MEMORY_PLUGIN_URL` to its DNS name.
3. **Self-managed**: customer-owned plugin running on customer-owned
infrastructure, accessed over a tunnel. Same env-var wiring.
The contract itself has **no auth** — the plugin must be reachable only
on a private network. workspace-server is the only sanctioned client.
## Replacing the built-in plugin
1. Apply [PR-7's backfill](../../workspace-server/cmd/memory-backfill/) to
copy `agent_memories` into your plugin's storage.
2. Stop workspace-server, point `MEMORY_PLUGIN_URL` at your plugin,
restart.
3. Existing data in the postgres plugin's tables is **not auto-dropped**;
   that's a deliberate safety property. Drop the tables manually once
   you're confident you won't switch back.
If you switch back later, the old postgres tables come back into use
(no data loss).
## Worked examples
- [`pinecone-example/`](pinecone-example/) — full Pinecone-backed plugin
- [`testing-your-plugin.md`](testing-your-plugin.md) — running the
contract test harness against your implementation
## When to write one vs. fork the default
Fork the default postgres plugin if:
- You want different SQL (Materialized views? Different vector index?)
- You want extra auth on top
- You want server-side metrics emission
Write a fresh plugin if:
- The storage backend is fundamentally different (vector DB, KV store,
in-memory, file-based)
- You're integrating an existing memory service (Letta, Mem0, etc.)
## See also
- RFC #2728 — design rationale
- [`cmd/memory-plugin-postgres/`](../../workspace-server/cmd/memory-plugin-postgres/) — reference implementation
- [`docs/api-protocol/memory-plugin-v1.yaml`](../api-protocol/memory-plugin-v1.yaml) — full OpenAPI spec


@ -0,0 +1,114 @@
# Pinecone-backed Memory Plugin (worked example)
A working sketch of a memory plugin that delegates storage to
[Pinecone](https://www.pinecone.io/) instead of postgres.
This is **example code, not a production binary**. It demonstrates
how to map the v1 contract onto a vector database. Operators who
want to ship this would harden auth, add retries, batch the
commit path, etc.
## Why Pinecone is interesting
The default postgres plugin's pgvector index works for ~10M memories
on a single node. Beyond that, semantic search becomes painful. A
managed vector database can handle 1B+ memories, but the trade-offs
are different:
- **Capabilities**: Pinecone is great at `embedding` (its core
feature) but has no first-class FTS. So the plugin reports
`["embedding"]` and ignores the `query` field.
- **TTL**: Pinecone supports per-vector metadata with deletion via
metadata filter — TTL becomes a periodic janitor task, not a
per-row property.
- **Cost**: per-vector billing, so the plugin should batch writes
and dedup before posting.
## Wire mapping
| Contract field | Pinecone shape |
|---|---|
| `namespace` | `namespace` (Pinecone's first-class concept) |
| `id` | `id` |
| `content` | `metadata.text` |
| `embedding` | `values` |
| `kind` / `source` / `pin` / `expires_at` | `metadata.{kind, source, pin, expires_at}` |
| `propagation` (opaque JSON) | `metadata.propagation` (also opaque) |
The contract's `expires_at` becomes a metadata field; a separate
janitor cron periodically queries `expires_at < now` and deletes.
## Skeleton
```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"os"

	"github.com/pinecone-io/go-pinecone/pinecone"
)

type pineconePlugin struct {
	client *pinecone.Client
	index  string
}

func main() {
	apiKey := os.Getenv("PINECONE_API_KEY")
	if apiKey == "" {
		log.Fatal("PINECONE_API_KEY required")
	}
	client, err := pinecone.NewClient(pinecone.NewClientParams{ApiKey: apiKey})
	if err != nil {
		log.Fatal(err)
	}
	p := &pineconePlugin{client: client, index: os.Getenv("PINECONE_INDEX")}
	http.HandleFunc("/v1/health", p.health)
	http.HandleFunc("/v1/search", p.search)
	// ... rest of the routes ...
	log.Fatal(http.ListenAndServe(":9100", nil))
}

func (p *pineconePlugin) health(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]interface{}{
		"status":       "ok",
		"version":      "1.0.0",
		"capabilities": []string{"embedding"}, // no FTS, no TTL out of the box
	})
}

func (p *pineconePlugin) search(w http.ResponseWriter, r *http.Request) {
	// Parse contract.SearchRequest
	// Build a Pinecone QueryByVectorValuesRequest with body.Embedding
	// For each Pinecone namespace in body.Namespaces, call Query
	// Map results to contract.Memory
	// ...
}
```
## What's missing from this sketch
A production-ready Pinecone plugin would add:
- **Batch commits**: bulk upsert N memories in a single Pinecone call
- **TTL janitor**: periodic deletion of expired vectors
- **Connection pooling**: keep one Pinecone client alive across requests
- **Retry + circuit breaker**: Pinecone occasionally returns 5xx
- **Metrics**: latency histograms per endpoint, write/read counters
But the mapping above is the load-bearing part — the rest is
operational hardening, not contract-specific.
## See also
- [Pinecone Go SDK docs](https://docs.pinecone.io/reference/go-sdk)
- [Memory plugin contract spec](../../api-protocol/memory-plugin-v1.yaml)
- [Default postgres plugin source](../../../workspace-server/cmd/memory-plugin-postgres/) — for comparison


@ -0,0 +1,112 @@
# Testing Your Memory Plugin
Once you have a plugin implementing the v1 contract, you can validate
it against the spec without booting workspace-server.
## The contract test harness
Workspace-server ships typed Go bindings + round-trip tests in
`workspace-server/internal/memory/contract/`. The simplest way to
gain confidence in your plugin's wire compatibility is to point those
tests at it.
A minimal contract suite:
```go
package myplugin_test
import (
	"context"
	"testing"

	mclient "github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/client"
	"github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/contract"
)

func TestMyPlugin_FullRoundTrip(t *testing.T) {
	// Start your plugin somehow (subprocess, in-process, etc.)
	pluginURL := startMyPlugin(t)
	cl := mclient.New(mclient.Config{BaseURL: pluginURL})

	// 1. Health
	hr, err := cl.Boot(context.Background())
	if err != nil {
		t.Fatalf("Boot: %v", err)
	}
	if hr.Status != "ok" {
		t.Errorf("status = %q", hr.Status)
	}

	// 2. Namespace upsert
	if _, err := cl.UpsertNamespace(context.Background(), "workspace:test-1",
		contract.NamespaceUpsert{Kind: contract.NamespaceKindWorkspace}); err != nil {
		t.Fatalf("UpsertNamespace: %v", err)
	}

	// 3. Commit memory
	resp, err := cl.CommitMemory(context.Background(), "workspace:test-1",
		contract.MemoryWrite{
			Content: "hello",
			Kind:    contract.MemoryKindFact,
			Source:  contract.MemorySourceAgent,
		})
	if err != nil {
		t.Fatalf("CommitMemory: %v", err)
	}
	if resp.ID == "" {
		t.Errorf("plugin must return a non-empty memory id")
	}

	// 4. Search
	sresp, err := cl.Search(context.Background(), contract.SearchRequest{
		Namespaces: []string{"workspace:test-1"},
		Query:      "hello",
	})
	if err != nil {
		t.Fatalf("Search: %v", err)
	}
	if len(sresp.Memories) == 0 {
		t.Errorf("plugin returned no memories for the query we just wrote")
	}

	// 5. Forget
	if err := cl.ForgetMemory(context.Background(), resp.ID,
		contract.ForgetRequest{RequestedByNamespace: "workspace:test-1"}); err != nil {
		t.Errorf("ForgetMemory: %v", err)
	}
}
```
## What the harness does NOT cover
The harness only proves wire compatibility. Cover the rest with your
own tests:
- **Capability accuracy**: if you list `embedding` you must actually
do semantic search. The harness can't tell you whether ranking is
meaningful — only that you don't crash.
- **TTL eviction**: write a memory with `expires_at` 1 second in the
future, sleep 2 seconds, search — assert the memory is gone.
- **Concurrency**: hit your plugin with 100 parallel writes; assert
no IDs collide.
- **Recovery**: kill your plugin's storage backend, send a request,
assert your plugin returns 503 (not 200 with stale data).
## Smoke test against workspace-server
Once unit-level wire tests pass, run a real workspace-server with your
plugin URL:
```bash
DATABASE_URL=postgres://... \
MEMORY_PLUGIN_URL=http://localhost:9100 \
./workspace-server
```
Then ask an agent to call `commit_memory_v2` and `search_memory`. If
both round-trip cleanly, you're done.
For the full E2E flow (including the namespace resolver, MCP layer,
and security perimeter), see [PR-11's plugin-swap test](../../workspace-server/test/e2e/memory_plugin_swap_test.go).
## Reporting bugs
If you find a contract ambiguity or missing edge case, file an issue
against `Molecule-AI/molecule-core` referencing RFC #2728.