Builds on merged PR-1..7 (PR-8 in queue). Pure docs; no code.
What ships:
* docs/memory-plugins/README.md — contract overview, capability
negotiation, deployment models, replacement workflow
* docs/memory-plugins/testing-your-plugin.md — using the contract
test harness to validate wire compatibility, what the harness
DOES NOT cover (capability accuracy, TTL eviction, concurrency)
* docs/memory-plugins/pinecone-example/README.md — worked example
of a Pinecone-backed plugin: capability mapping (only embedding,
no FTS), wire mapping (memory → vector + metadata), production-
hardening checklist
Documentation strategy:
* Lead with what workspace-server takes care of (security perimeter,
redaction, ACL, GLOBAL audit, prompt-injection wrap) so plugin
authors don't reimplement those layers
* Show three deployment models (same machine / separate container /
self-managed) so operators see their topology
* Capability table makes it explicit what each capability gates so
a plugin that supports only one (e.g. semantic search) is still
a useful plugin
* Pinecone example is honest: shows the skeleton, the wire mapping,
and explicitly calls out what's MISSING from the sketch (batch
commits, TTL janitor, circuit breaker, metrics)
# Pinecone-backed Memory Plugin (worked example)
A working sketch of a memory plugin that delegates storage to Pinecone instead of postgres.
This is example code, not a production binary. It demonstrates how to map the v1 contract onto a vector database. Operators who want to ship this would harden auth, add retries, batch the commit path, etc.
## Why Pinecone is interesting
The default postgres plugin's pgvector index works for ~10M memories on a single node. Beyond that, semantic search becomes painful. A managed vector database can handle 1B+ memories, but the trade-offs are different:
- Capabilities: Pinecone is great at embedding (its core feature) but has no first-class FTS. So the plugin reports `["embedding"]` and ignores the `query` field.
- TTL: Pinecone supports per-vector metadata with deletion via metadata filter — TTL becomes a periodic janitor task, not a per-row property.
- Cost: per-vector billing, so the plugin should batch writes and dedup before posting.
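The commit-path batching can be sketched as plain Go. This is an illustrative helper, not part of the contract: the `Memory` struct here is a stand-in with only the fields the helper needs, and the batch size is arbitrary.

```go
package main

import "fmt"

// Memory is a simplified stand-in for the contract's memory record.
type Memory struct {
	ID      string
	Content string
}

// dedupAndBatch drops duplicate IDs (last write wins) and splits the
// survivors into batches of at most batchSize for bulk upserts, so the
// plugin pays for each vector once per flush instead of once per call.
func dedupAndBatch(memories []Memory, batchSize int) [][]Memory {
	seen := make(map[string]int) // id -> index in deduped
	var deduped []Memory
	for _, m := range memories {
		if i, ok := seen[m.ID]; ok {
			deduped[i] = m // keep the latest version of a duplicated id
			continue
		}
		seen[m.ID] = len(deduped)
		deduped = append(deduped, m)
	}
	var batches [][]Memory
	for start := 0; start < len(deduped); start += batchSize {
		end := start + batchSize
		if end > len(deduped) {
			end = len(deduped)
		}
		batches = append(batches, deduped[start:end])
	}
	return batches
}

func main() {
	ms := []Memory{{ID: "a"}, {ID: "b"}, {ID: "a"}, {ID: "c"}}
	batches := dedupAndBatch(ms, 2)
	fmt.Println(len(batches)) // 3 unique ids, batch size 2 -> prints 2
}
```

Last-write-wins dedup matches upsert semantics: if the same id is committed twice in one flush window, only the newest version needs to reach Pinecone.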
## Wire mapping

| Contract field | Pinecone shape |
|---|---|
| `namespace` | `namespace` (Pinecone's first-class concept) |
| `id` | `id` |
| `content` | `metadata.text` |
| `embedding` | `values` |
| `kind` / `source` / `pin` / `expires_at` | `metadata.{kind, source, pin, expires_at}` |
| `propagation` (opaque JSON) | `metadata.propagation` (also opaque) |
The contract's `expires_at` becomes a metadata field; a separate
janitor cron periodically queries `expires_at < now` and deletes.
## Skeleton

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"os"

	"github.com/pinecone-io/go-pinecone/pinecone"
)

type pineconePlugin struct {
	client *pinecone.Client
	index  string
}

func main() {
	apiKey := os.Getenv("PINECONE_API_KEY")
	if apiKey == "" {
		log.Fatal("PINECONE_API_KEY required")
	}
	client, err := pinecone.NewClient(pinecone.NewClientParams{ApiKey: apiKey})
	if err != nil {
		log.Fatal(err)
	}
	p := &pineconePlugin{client: client, index: os.Getenv("PINECONE_INDEX")}

	http.HandleFunc("/v1/health", p.health)
	http.HandleFunc("/v1/search", p.search)
	// ... rest of the routes ...
	log.Fatal(http.ListenAndServe(":9100", nil))
}

func (p *pineconePlugin) health(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]interface{}{
		"status":       "ok",
		"version":      "1.0.0",
		"capabilities": []string{"embedding"}, // no FTS, no TTL out of the box
	})
}

func (p *pineconePlugin) search(w http.ResponseWriter, r *http.Request) {
	// Parse contract.SearchRequest from the body.
	// Build a Pinecone QueryByVectorValuesRequest from body.Embedding.
	// For each namespace in body.Namespaces, call Query.
	// Map the results back to contract.Memory.
	http.Error(w, "not implemented in this sketch", http.StatusNotImplemented)
}
```
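The `search` stub above boils down to: decode the contract request, query each namespace, merge. The decode step can be sketched on its own. The `SearchRequest` field names here are inferred from the wire mapping and the stub's comments (`body.Embedding`, `body.Namespaces`), not copied from the real `contract` package, and the default limit is illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SearchRequest approximates the contract's search body; the real
// field names live in the contract package and may differ.
type SearchRequest struct {
	Namespaces []string  `json:"namespaces"`
	Query      string    `json:"query"`     // ignored: this plugin has no FTS
	Embedding  []float32 `json:"embedding"` // used for vector search
	Limit      int       `json:"limit"`
}

// decodeSearch parses and validates a search body. Since the plugin
// only reports the embedding capability, a request without an
// embedding cannot be served and is rejected up front.
func decodeSearch(body []byte) (*SearchRequest, error) {
	var req SearchRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return nil, fmt.Errorf("decode search request: %w", err)
	}
	if len(req.Embedding) == 0 {
		return nil, fmt.Errorf("embedding required: this plugin has no FTS")
	}
	if req.Limit <= 0 {
		req.Limit = 10 // illustrative default
	}
	return &req, nil
}

func main() {
	req, err := decodeSearch([]byte(`{"namespaces":["team-a"],"embedding":[0.1,0.2]}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Limit) // defaulted, prints 10
}
```

Rejecting embedding-less requests here is belt-and-braces: workspace-server should already route by capability, but a plugin should not trust that it will.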
## What's missing from this sketch
A production-ready Pinecone plugin would add:
- Batch commits: bulk upsert N memories in a single Pinecone call
- TTL janitor: periodic deletion of expired vectors
- Connection pooling: keep one Pinecone client alive across requests
- Retry + circuit breaker: Pinecone occasionally returns 5xx
- Metrics: latency histograms per endpoint, write/read counters
But the mapping above is the load-bearing part — the rest is operational hardening, not contract-specific.