From 354bb5a4d493f153b017714d86ae2ab054ff1346 Mon Sep 17 00:00:00 2001 From: hongming Date: Sat, 23 May 2026 14:27:59 -0700 Subject: [PATCH 1/3] =?UTF-8?q?feat(memory-plugin):=20#1733=20A0=20?= =?UTF-8?q?=E2=80=94=20isolate=20v2=20plugin=20tables=20under=20`memory=5F?= =?UTF-8?q?plugin`=20schema?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Memory v2 plugin sidecar already auto-starts on tenant boot (per entrypoint-tenant.sh, since 2026-05-14) when `MEMORY_V2_CUTOVER=true`, which CP sets on every tenant since 2026-05-05 (controlplane ec2.go:2232). Plugin tables today land in the tenant Postgres's `public` schema, sharing the namespace with platform-tenant tables — a soft SSOT violation flagged in #1733. CTO decision 2026-05-23: same Postgres instance, separate schema. Changes: - New `cmd/memory-plugin-postgres/migrations/000_schema_bootstrap.up.sql` + matching `.down.sql`. Migration filenames sort alphabetically (per `main.go:runMigrationsFromEmbed → sort.Strings`), so 000 runs before the existing 001_memory_v2 which assumes the schema exists in search_path. - `entrypoint-tenant.sh`: when defaulting `MEMORY_PLUGIN_DATABASE_URL` from the tenant's `DATABASE_URL`, append `search_path=memory_plugin` (handles both `?` and `&` URL forms). When the operator explicitly sets `MEMORY_PLUGIN_DATABASE_URL` (separate DB entirely), they keep full control — no rewriting. - `migrations_embed_test.go`: relax the per-file assertion (was: every *.up.sql must contain `CREATE TABLE`; now: every file must contain a recognized DDL keyword AND at least one file declares CREATE TABLE). Required because the new 000 file is schema-only. Stage A — verified locally against `pgvector/pgvector:pg16` (the same image CP launches on tenant EC2): - Plugin binary boots, embedded migrations apply in order `000_schema_bootstrap → 001_memory_v2`. - Tables land in `memory_plugin` schema; `public` stays empty. - End-to-end PUT namespace / POST memory / POST search round-trips through the new schema. - The entrypoint URL-mutation logic exercised on 3 cases: DATABASE_URL with `?`, without `?`, and operator-override (no rewrite). - `go test -short -count=1 ./...` green across all packages (workspace-server + plugin). Operational follow-ups (deliberately out of scope for this PR): - Tenants currently running with rows in `public.memory_*` need a one-time migration to `memory_plugin.memory_*`. Handled in #1733 Phase A2 (per-tenant backfill via `cmd/memory-backfill`); not gating this PR because no production tenant has rows in v2 yet (every tenant_workspace_server has been silently falling back to v1 SQL on `agent_memories` since 2026-05-05). - `CREATE EXTENSION vector` (in 001_memory_v2) implicitly lands in the first writable search_path schema — i.e. `memory_plugin` now. Dropping the schema cascade-drops the extension. If a future operator drops the schema intentionally, they must re-`CREATE EXTENSION`. Acceptable for a defensive operator action; flagged in `000_schema_bootstrap.down.sql`. Refs: #1733 (memory SSOT — Phase A0 of revised plan). Followed by A1 (workspace-server: kill v1 fallback, fail-fast on missing plugin) in a separate core PR. Co-Authored-By: Claude Opus 4.7 (1M context) --- .../migrations/000_schema_bootstrap.down.sql | 9 ++++++++ .../migrations/000_schema_bootstrap.up.sql | 20 +++++++++++++++++ .../migrations_embed_test.go | 22 +++++++++++++++++-- workspace-server/entrypoint-tenant.sh | 17 +++++++++++++- 4 files changed, 65 insertions(+), 3 deletions(-) create mode 100644 workspace-server/cmd/memory-plugin-postgres/migrations/000_schema_bootstrap.down.sql create mode 100644 workspace-server/cmd/memory-plugin-postgres/migrations/000_schema_bootstrap.up.sql diff --git a/workspace-server/cmd/memory-plugin-postgres/migrations/000_schema_bootstrap.down.sql b/workspace-server/cmd/memory-plugin-postgres/migrations/000_schema_bootstrap.down.sql new file mode 100644 index 00000000..e4f3282c --- /dev/null +++ b/workspace-server/cmd/memory-plugin-postgres/migrations/000_schema_bootstrap.down.sql @@ -0,0 +1,9 @@ +-- Reverse of 000_schema_bootstrap.up.sql. +-- +-- Drops the dedicated schema and every plugin object inside it. Operator +-- runs this only when intentionally tearing down the v2 plugin store on +-- a shared tenant Postgres. Down migrations are NOT auto-applied by the +-- plugin (cmd/memory-plugin-postgres/main.go:runMigrations comment) — +-- this file is for manual operator-driven cleanup. + +DROP SCHEMA IF EXISTS memory_plugin CASCADE; diff --git a/workspace-server/cmd/memory-plugin-postgres/migrations/000_schema_bootstrap.up.sql b/workspace-server/cmd/memory-plugin-postgres/migrations/000_schema_bootstrap.up.sql new file mode 100644 index 00000000..974e01b9 --- /dev/null +++ b/workspace-server/cmd/memory-plugin-postgres/migrations/000_schema_bootstrap.up.sql @@ -0,0 +1,20 @@ +-- Create the dedicated schema this plugin owns when it shares a Postgres +-- with the tenant platform (the default deployment shape — see +-- entrypoint-tenant.sh which appends `search_path=memory_plugin` to +-- DATABASE_URL when the operator hasn't set MEMORY_PLUGIN_DATABASE_URL +-- explicitly). +-- +-- The schema name `memory_plugin` matches the search_path injected by +-- the entrypoint; sql.Open's URL controls *where* subsequent CREATE +-- TABLE lands (search_path) but does NOT auto-create the target schema, +-- so the plugin has to do it itself before 001_memory_v2 runs. +-- +-- Operators who point the plugin at a dedicated database (no shared +-- tenant tables) can leave this no-op: CREATE SCHEMA IF NOT EXISTS is +-- idempotent and a fresh DB happily owns a `memory_plugin` schema. +-- +-- Migration files are sorted alphabetically (cmd/memory-plugin-postgres/ +-- main.go:runMigrationsFromEmbed → sort.Strings), so 000_ runs before +-- 001_memory_v2 which assumes the schema already exists in search_path. + +CREATE SCHEMA IF NOT EXISTS memory_plugin; diff --git a/workspace-server/cmd/memory-plugin-postgres/migrations_embed_test.go b/workspace-server/cmd/memory-plugin-postgres/migrations_embed_test.go index f2f0b785..128129dc 100644 --- a/workspace-server/cmd/memory-plugin-postgres/migrations_embed_test.go +++ b/workspace-server/cmd/memory-plugin-postgres/migrations_embed_test.go @@ -24,6 +24,7 @@ func TestMigrationsEmbedded_ContainsCreateTable(t *testing.T) { } var seenUp bool + var seenCreateTable bool for _, e := range entries { if e.IsDir() || !strings.HasSuffix(e.Name(), ".up.sql") { continue @@ -34,13 +35,30 @@ func TestMigrationsEmbedded_ContainsCreateTable(t *testing.T) { t.Errorf("read embedded %q: %v", e.Name(), err) continue } - if !strings.Contains(string(data), "CREATE TABLE") { - t.Errorf("embedded %q has no CREATE TABLE — wrong file embedded?", e.Name()) + // Each file must contain at least one DDL statement we expect to see + // — guards against truncated / empty files that would silently embed. + if !containsAnyDDL(string(data)) { + t.Errorf("embedded %q contains no recognized DDL (CREATE TABLE/SCHEMA/EXTENSION/INDEX) — wrong or truncated file?", e.Name()) + } + if strings.Contains(string(data), "CREATE TABLE") { + seenCreateTable = true } } if !seenUp { t.Fatal("no *.up.sql in embedded migrations — runtime would have no schema to apply") } + if !seenCreateTable { + t.Fatal("no embedded migration declares CREATE TABLE — the v2 store would have no tables at runtime") + } +} + +func containsAnyDDL(sql string) bool { + for _, kw := range []string{"CREATE TABLE", "CREATE SCHEMA", "CREATE EXTENSION", "CREATE INDEX", "CREATE OR REPLACE", "ALTER TABLE"} { + if strings.Contains(sql, kw) { + return true + } + } + return false } // TestRunMigrationsFromEmbed_OrderingIsAlphabetic pins that we apply diff --git a/workspace-server/entrypoint-tenant.sh b/workspace-server/entrypoint-tenant.sh index 0f2d6dde..51352e12 100644 --- a/workspace-server/entrypoint-tenant.sh +++ b/workspace-server/entrypoint-tenant.sh @@ -38,7 +38,22 @@ if [ "$MEMORY_V2_CUTOVER" = "true" ] || [ -n "$MEMORY_PLUGIN_URL" ]; then memory_plugin_wanted=1 fi if [ -z "$MEMORY_PLUGIN_DISABLE" ] && [ -n "$memory_plugin_wanted" ] && [ -n "$DATABASE_URL" ]; then - : "${MEMORY_PLUGIN_DATABASE_URL:=$DATABASE_URL}" + # Schema isolation (issue #1733): when defaulting from the tenant + # DATABASE_URL we co-locate the plugin's tables under a dedicated + # `memory_plugin` schema so they never collide with platform-tenant + # tables in `public`. The plugin's 000_schema_bootstrap migration + # creates the schema; search_path here directs every subsequent CREATE + # TABLE / SELECT to land in it. + # + # Operators who explicitly set MEMORY_PLUGIN_DATABASE_URL (separate DB + # entirely) keep full control — search_path is only injected when we + # default from DATABASE_URL. + if [ -z "$MEMORY_PLUGIN_DATABASE_URL" ]; then + case "$DATABASE_URL" in + *\?*) MEMORY_PLUGIN_DATABASE_URL="${DATABASE_URL}&search_path=memory_plugin" ;; + *) MEMORY_PLUGIN_DATABASE_URL="${DATABASE_URL}?search_path=memory_plugin" ;; + esac + fi : "${MEMORY_PLUGIN_LISTEN_ADDR:=:9100}" export MEMORY_PLUGIN_DATABASE_URL MEMORY_PLUGIN_LISTEN_ADDR echo "memory-plugin: starting sidecar on $MEMORY_PLUGIN_LISTEN_ADDR" >&2 -- 2.52.0 From e03b3fb243c2ab7036c4f0ce5d1f6583e0196a6d Mon Sep 17 00:00:00 2001 From: hongming Date: Sat, 23 May 2026 16:53:43 -0700 Subject: [PATCH 2/3] =?UTF-8?q?review(memory-plugin):=20#1742=20=E2=80=94?= =?UTF-8?q?=20search=5Fpath=20fallback=20to=20public=20+=20per-file=20embe?= =?UTF-8?q?d=20test?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Addresses two findings from the #1742 review (PR-comment context: the extension-resolution finding was flagged as a production blocker for any tenant that previously installed pgvector in `public`): 1. entrypoint-tenant.sh — search_path now `memory_plugin,public` (was `memory_plugin`). Pgvector is schema-bound, not relocated by search_path. On tenants where pgvector was already installed into `public` by a prior boot or operator action, the previous single-schema search_path made `vector(1536)` references in 001_memory_v2.up.sql unresolvable — boot would die inside the 30s health gate. The `,public` fallback resolves `vector` via the tail schema while still landing every new CREATE TABLE in `memory_plugin` (SSOT preserved — search_path picks the FIRST writable schema for new objects). Verified locally with the exact failure case from the review: - Fresh tenant: CREATE EXTENSION vector → extension lands in `memory_plugin` (SSOT intent). - Pre-existing tenant (pgvector pre-installed in `public`): CREATE EXTENSION IF NOT EXISTS is a no-op, extension stays in `public`, vector(1536) resolves via fallback, plugin boots, memory tables still land in `memory_plugin`. 2. migrations_embed_test.go — replaced the global "at least one file has CREATE TABLE somewhere" assertion with explicit per-file pins: - 000_schema_bootstrap.up.sql MUST contain `CREATE SCHEMA`. - 001_memory_v2.up.sql MUST contain `CREATE TABLE` AND mention both `memory_records` and `memory_namespaces`. The global check let a future rewrite of 001 silently regress to schema-only as long as ANY embedded file declared a table. 3. Updated 000_schema_bootstrap.up.sql comment block to document the new search_path shape + the pgvector resolution rule, so a future reader doesn't have to re-derive why the fallback exists. Stage A: - Plugin tests still green (`go test -count=1 ./cmd/memory-plugin-postgres/...`). - Local Postgres-in-Docker round-trip exercising BOTH the pre-existing-public and fresh-tenant paths; logs + schema verification in the commit-discussion thread. Refs: #1742 (this PR), review-finding C1 (extension resolution) + C2 (test invariant weakened). Co-Authored-By: Claude Opus 4.7 (1M context) --- .../migrations/000_schema_bootstrap.up.sql | 15 ++++++++-- .../migrations_embed_test.go | 30 +++++++++++++++---- workspace-server/entrypoint-tenant.sh | 15 ++++++++-- 3 files changed, 50 insertions(+), 10 deletions(-) diff --git a/workspace-server/cmd/memory-plugin-postgres/migrations/000_schema_bootstrap.up.sql b/workspace-server/cmd/memory-plugin-postgres/migrations/000_schema_bootstrap.up.sql index 974e01b9..9c60588c 100644 --- a/workspace-server/cmd/memory-plugin-postgres/migrations/000_schema_bootstrap.up.sql +++ b/workspace-server/cmd/memory-plugin-postgres/migrations/000_schema_bootstrap.up.sql @@ -1,7 +1,7 @@ -- Create the dedicated schema this plugin owns when it shares a Postgres -- with the tenant platform (the default deployment shape — see --- entrypoint-tenant.sh which appends `search_path=memory_plugin` to --- DATABASE_URL when the operator hasn't set MEMORY_PLUGIN_DATABASE_URL +-- entrypoint-tenant.sh which appends `search_path=memory_plugin,public` +-- to DATABASE_URL when the operator hasn't set MEMORY_PLUGIN_DATABASE_URL -- explicitly). -- -- The schema name `memory_plugin` matches the search_path injected by @@ -9,6 +9,17 @@ -- TABLE lands (search_path) but does NOT auto-create the target schema, -- so the plugin has to do it itself before 001_memory_v2 runs. -- +-- About the `,public` fallback in search_path: pgvector ships as an +-- extension that lives in one schema. On fresh tenants 001_memory_v2's +-- `CREATE EXTENSION IF NOT EXISTS vector` installs it into the first +-- writable schema in search_path (memory_plugin — SSOT preserved). On +-- tenants where pgvector was already installed into `public` by a +-- prior boot, the IF NOT EXISTS is a no-op and the extension stays in +-- public; the `vector(1536)` type reference in 001 then resolves via +-- the public fallback. Without that fallback, 001 would die with +-- "type vector does not exist" on every pre-existing tenant (#1742 +-- review finding). +-- -- Operators who point the plugin at a dedicated database (no shared -- tenant tables) can leave this no-op: CREATE SCHEMA IF NOT EXISTS is -- idempotent and a fresh DB happily owns a `memory_plugin` schema. diff --git a/workspace-server/cmd/memory-plugin-postgres/migrations_embed_test.go b/workspace-server/cmd/memory-plugin-postgres/migrations_embed_test.go index 128129dc..341e694e 100644 --- a/workspace-server/cmd/memory-plugin-postgres/migrations_embed_test.go +++ b/workspace-server/cmd/memory-plugin-postgres/migrations_embed_test.go @@ -24,7 +24,6 @@ func TestMigrationsEmbedded_ContainsCreateTable(t *testing.T) { } var seenUp bool - var seenCreateTable bool for _, e := range entries { if e.IsDir() || !strings.HasSuffix(e.Name(), ".up.sql") { continue @@ -40,15 +39,34 @@ func TestMigrationsEmbedded_ContainsCreateTable(t *testing.T) { if !containsAnyDDL(string(data)) { t.Errorf("embedded %q contains no recognized DDL (CREATE TABLE/SCHEMA/EXTENSION/INDEX) — wrong or truncated file?", e.Name()) } - if strings.Contains(string(data), "CREATE TABLE") { - seenCreateTable = true - } } if !seenUp { t.Fatal("no *.up.sql in embedded migrations — runtime would have no schema to apply") } - if !seenCreateTable { - t.Fatal("no embedded migration declares CREATE TABLE — the v2 store would have no tables at runtime") + + // Per-file invariants (issue #1742 review finding). The previous global + // "at least one file has CREATE TABLE somewhere" check let a future + // rewrite of 001_memory_v2.up.sql silently regress to schema-only as + // long as any other file declared a table. Pin the load-bearing DDL + // per filename so a wrong-or-truncated 001 fails this test loudly. + assertFileContains(t, "000_schema_bootstrap.up.sql", "CREATE SCHEMA") + assertFileContains(t, "001_memory_v2.up.sql", "CREATE TABLE") + assertFileContains(t, "001_memory_v2.up.sql", "memory_records") + assertFileContains(t, "001_memory_v2.up.sql", "memory_namespaces") +} + +// assertFileContains fails the test if the embedded migration `name` +// either can't be read or doesn't contain `needle`. Pulled out so each +// per-file pin reads as one line. +func assertFileContains(t *testing.T, name, needle string) { + t.Helper() + data, err := migrationsFS.ReadFile("migrations/" + name) + if err != nil { + t.Errorf("required migration %q not embedded: %v", name, err) + return + } + if !strings.Contains(string(data), needle) { + t.Errorf("migration %q must contain %q — wrong or truncated file?", name, needle) } } diff --git a/workspace-server/entrypoint-tenant.sh b/workspace-server/entrypoint-tenant.sh index 51352e12..5b6087e9 100644 --- a/workspace-server/entrypoint-tenant.sh +++ b/workspace-server/entrypoint-tenant.sh @@ -45,13 +45,24 @@ if [ -z "$MEMORY_PLUGIN_DISABLE" ] && [ -n "$memory_plugin_wanted" ] && [ -n "$D # creates the schema; search_path here directs every subsequent CREATE # TABLE / SELECT to land in it. # + # The search_path includes `public` as a fallback so the `vector` type + # resolves regardless of which schema pgvector was installed into. + # Fresh tenants (no prior `CREATE EXTENSION vector`) install the + # extension into `memory_plugin` (first writable schema in the path), + # keeping the SSOT intent. Tenants where pgvector was already + # installed into `public` by a prior boot or operator action keep the + # extension where it is and resolve `vector(1536)` via the public + # fallback — without this fallback those tenants would crash the + # plugin boot with "type vector does not exist" once the migrations + # try to create memory_records (#1742 review finding). + # # Operators who explicitly set MEMORY_PLUGIN_DATABASE_URL (separate DB # entirely) keep full control — search_path is only injected when we # default from DATABASE_URL. if [ -z "$MEMORY_PLUGIN_DATABASE_URL" ]; then case "$DATABASE_URL" in - *\?*) MEMORY_PLUGIN_DATABASE_URL="${DATABASE_URL}&search_path=memory_plugin" ;; - *) MEMORY_PLUGIN_DATABASE_URL="${DATABASE_URL}?search_path=memory_plugin" ;; + *\?*) MEMORY_PLUGIN_DATABASE_URL="${DATABASE_URL}&search_path=memory_plugin,public" ;; + *) MEMORY_PLUGIN_DATABASE_URL="${DATABASE_URL}?search_path=memory_plugin,public" ;; esac fi : "${MEMORY_PLUGIN_LISTEN_ADDR:=:9100}" -- 2.52.0 From ea01341a9f424018f084f609f1cf9059e9e49761 Mon Sep 17 00:00:00 2001 From: hongming Date: Sat, 23 May 2026 19:42:00 -0700 Subject: [PATCH 3/3] ci: refire stale lint workflows on #1742 (post-persona-ack) -- 2.52.0