Some checks failed
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 13s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 11s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 12s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 15s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 27s
CI / Detect changes (pull_request) Successful in 20s
Retarget main PRs to staging / Retarget to staging (pull_request) Has been skipped
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 15s
E2E API Smoke Test / detect-changes (pull_request) Successful in 51s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 51s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 39s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 51s
Harness Replays / detect-changes (pull_request) Successful in 53s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 48s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 31s
Harness Replays / Harness Replays (pull_request) Failing after 1m18s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m19s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3m14s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6m1s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6m47s
CI / Python Lint & Test (pull_request) Successful in 8m16s
CI / Canvas (Next.js) (pull_request) Failing after 9m36s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 12m18s
Migrates the two Go modules under molecule-core off the dead
github.com/Molecule-AI/molecule-monorepo/... identity onto the vanity
host go.moleculesai.app. Also fixes the historical naming
inconsistency where the Gitea repo is molecule-core but the Go module
path said molecule-monorepo.
Module changes:
- workspace-server/go.mod:
github.com/Molecule-AI/molecule-monorepo/platform
-> go.moleculesai.app/core/platform
- tests/harness/cp-stub/go.mod:
github.com/Molecule-AI/molecule-monorepo/tests/harness/cp-stub
-> go.moleculesai.app/core/tests/harness/cp-stub
Surfaces touched:
- 174 *.go files (374 import lines) — every import under
workspace-server/ + tests/harness/cp-stub/
- 2 Dockerfiles (workspace-server/Dockerfile + Dockerfile.tenant) —
-ldflags strings updated in lockstep with the module rename so
buildinfo.GitSHA injection still resolves correctly
- README + docs + scripts + comment URLs to git.moleculesai.app form
- NEW workspace-server/internal/lint/import_path_lint_test.go —
structural lint gate rejecting future github.com/Molecule-AI/ or
Molecule-AI/molecule-monorepo references. Identical template to the
other migration PRs (plugin-gh-identity#3, molecule-cli#2,
molecule-controlplane#32).
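The lint gate itself isn't shown here; below is a minimal self-contained sketch of the check the description names (forbidden markers plus a per-file allowlist). The allowlisted file paths and the gh-identity module string are assumptions for illustration, not copied from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// forbidden markers the gate rejects anywhere in the tree.
var forbidden = []string{
	"github.com/Molecule-AI/",
	"Molecule-AI/molecule-monorepo",
}

// allowlist: file -> substrings that may legitimately co-occur with a
// forbidden marker (the pending gh-identity cross-repo dep). Paths and
// strings here are hypothetical.
var allowlist = map[string][]string{
	"go.mod":             {"molecule-ai-plugin-gh-identity"},
	"cmd/server/main.go": {"molecule-ai-plugin-gh-identity"},
}

// scan returns one violation per line that contains a forbidden
// marker, unless an allowlisted substring for that file also appears
// on the same line.
func scan(path, content string) []string {
	var violations []string
	for i, line := range strings.Split(content, "\n") {
		for _, marker := range forbidden {
			if !strings.Contains(line, marker) {
				continue
			}
			allowed := false
			for _, ok := range allowlist[path] {
				if strings.Contains(line, ok) {
					allowed = true
				}
			}
			if !allowed {
				violations = append(violations,
					fmt.Sprintf("%s:%d: forbidden ref %q", path, i+1, marker))
			}
		}
	}
	return violations
}

func main() {
	fmt.Println(len(scan("go.mod", "require github.com/Molecule-AI/molecule-ai-plugin-gh-identity v0.3.0"))) // 0: allowlisted
	fmt.Println(len(scan("README.md", "see github.com/Molecule-AI/molecule-core")))                          // 1: flagged
}
```

Keeping the allowlist keyed by file and substring is what lets the gate suppress the two known cross-repo references while still flagging any new legacy-path additions in the same files.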
Cross-repo dep allowlist (documented in lint gate):
workspace-server requires molecule-ai-plugin-gh-identity, whose own
vanity migration is PR molecule-ai-plugin-gh-identity#3. Until that PR
merges + a tag is cut at go.moleculesai.app/plugin/gh-identity, the
two locations referencing the legacy github.com path
(workspace-server/go.mod require, cmd/server/main.go import) remain
allowlisted. Follow-up PR drops the allowlist + updates both refs in
one shot once gh-identity is fully migrated.
Test plan:
- go build ./... clean for both modules
- go test ./... green except two pre-existing failures
(TestStartSweeper_RecordsMetricsOnSuccess flaky-on-suite,
TestLocalResolver_BubblesUpCopyFailure relies on read-only fs perms
but runs as root on operator host) — both reproduce identically on
baseline main pre-migration; NOT regressions of this PR
- Mutation-tested: lint gate fails on canaries in .go + .md;
allowlist correctly suppresses cross-repo dep references in go.mod
while still flagging unrelated additions
Open dependency:
- go.moleculesai.app responder must be deployed before fresh-clone
  external builds can resolve the vanity path. Existing CI / Docker
  builds ride the pinned go.sum + self-referential module path, so the
  responder is not on the critical path for those.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
197 lines
5.9 KiB
Go
package main

// verify.go — post-apply parity check.
//
// After a backfill -apply, run with -verify to confirm the migration
// actually produced equivalent data. Picks `SampleSize` random
// workspaces, queries agent_memories direct + plugin search via the
// caller's namespaces, and diffs the result sets by content.
//
// The diff is best-effort: pg's recent-first ordering and the plugin's
// internal ordering may differ, so we compare as multisets, not lists.
// Every legacy row must match a distinct plugin row by content,
// ignoring id since the backfill preserves it via the C1 idempotency
// key; extra plugin rows are tolerated (shared namespaces can surface
// sibling workspaces' memories).

import (
	"context"
	"database/sql"
	"fmt"
	"math/rand"
	"os"

	"go.moleculesai.app/core/platform/internal/memory/contract"
	"go.moleculesai.app/core/platform/internal/textutil"
)

// verifyConfig is the typed dependency bundle for verifyParity.
type verifyConfig struct {
	DB          *sql.DB
	Plugin      verifyPlugin
	Resolver    verifyResolver
	SampleSize  int
	WorkspaceID string // optional: limit to one workspace
	Rand        *rand.Rand
}

// verifyPlugin is the slice of memory-plugin client we call.
type verifyPlugin interface {
	Search(ctx context.Context, body contract.SearchRequest) (*contract.SearchResponse, error)
}

// verifyResolver mirrors namespace.Resolver. Same shape as
// backfillResolver but kept distinct so verify isn't tied to
// backfill's interface.
type verifyResolver interface {
	ReadableNamespaces(ctx context.Context, workspaceID string) ([]ResolvedNamespace, error)
}

// ResolvedNamespace is the minimum we need from the resolver — kept
// separate so the verify code doesn't depend on the namespace package
// (the live tests inject stubs, the binary uses an adapter).
type ResolvedNamespace struct {
	Name string
}

// verifyReport accumulates the per-workspace results.
type verifyReport struct {
	WorkspacesSampled int
	Matches           int
	Mismatches        int
	Errors            int
}

// verifyParity is the workhorse. Returns a report; the CLI converts
// any non-zero mismatches/errors into a non-zero exit so CI can gate
// the cutover.
func verifyParity(ctx context.Context, cfg verifyConfig, stdout *os.File) (*verifyReport, error) {
	report := &verifyReport{}
	rng := cfg.Rand
	if rng == nil {
		rng = rand.New(rand.NewSource(42)) //nolint:gosec // determinism > unpredictability for ops
	}

	wsIDs, err := pickWorkspaceSample(ctx, cfg.DB, cfg.WorkspaceID, cfg.SampleSize, rng)
	if err != nil {
		return report, fmt.Errorf("pick sample: %w", err)
	}

	for _, wsID := range wsIDs {
		report.WorkspacesSampled++
		legacy, err := queryLegacyMemories(ctx, cfg.DB, wsID)
		if err != nil {
			fmt.Fprintf(stdout, "[err] workspace=%s legacy query: %v\n", wsID, err)
			report.Errors++
			continue
		}
		readable, err := cfg.Resolver.ReadableNamespaces(ctx, wsID)
		if err != nil {
			fmt.Fprintf(stdout, "[err] workspace=%s resolve: %v\n", wsID, err)
			report.Errors++
			continue
		}
		nsList := make([]string, len(readable))
		for i, ns := range readable {
			nsList[i] = ns.Name
		}
		if len(nsList) == 0 {
			// No readable namespaces — empty plugin result expected.
			if len(legacy) == 0 {
				report.Matches++
			} else {
				fmt.Fprintf(stdout, "[mismatch] workspace=%s legacy=%d plugin=0 (no readable namespaces)\n", wsID, len(legacy))
				report.Mismatches++
			}
			continue
		}
		resp, err := cfg.Plugin.Search(ctx, contract.SearchRequest{Namespaces: nsList, Limit: 100})
		if err != nil {
			fmt.Fprintf(stdout, "[err] workspace=%s plugin search: %v\n", wsID, err)
			report.Errors++
			continue
		}
		pluginContents := make(map[string]int, len(resp.Memories))
		for _, m := range resp.Memories {
			pluginContents[m.Content]++
		}
		// Compare as multisets: each legacy content appears at least
		// once in plugin output. We deliberately tolerate plugin
		// having MORE rows (the namespace might include team-shared
		// memories from sibling workspaces that aren't in this
		// workspace's agent_memories rows).
		matched := true
		for _, c := range legacy {
			if pluginContents[c] == 0 {
				fmt.Fprintf(stdout, "[mismatch] workspace=%s missing-from-plugin content=%q\n", wsID, textutil.TruncateBytes(c, 80))
				matched = false
				break
			}
			pluginContents[c]--
		}
		if matched {
			report.Matches++
		} else {
			report.Mismatches++
		}
	}
	return report, nil
}

// pickWorkspaceSample returns up to N workspace UUIDs. If
// WorkspaceID is set, returns only that one. Otherwise selects N
// random workspaces from the workspaces table (TABLESAMPLE would be
// nicer, but SYSTEM/BERNOULLI sampling has surprising distribution
// properties for small populations; we just ORDER BY random() LIMIT).
func pickWorkspaceSample(ctx context.Context, db *sql.DB, workspaceID string, n int, _ *rand.Rand) ([]string, error) {
	if workspaceID != "" {
		return []string{workspaceID}, nil
	}
	rows, err := db.QueryContext(ctx, `
		SELECT id::text
		FROM workspaces
		WHERE status != 'removed'
		ORDER BY random()
		LIMIT $1
	`, n)
	if err != nil {
		return nil, err
	}
	defer rows.Close()
	out := make([]string, 0, n)
	for rows.Next() {
		var id string
		if err := rows.Scan(&id); err != nil {
			return nil, err
		}
		out = append(out, id)
	}
	return out, rows.Err()
}

// queryLegacyMemories pulls all agent_memories rows for a workspace
// (LOCAL + TEAM scopes — what the plugin search would return through
// the resolver's readable list, mapped via PR-6 shim semantics).
func queryLegacyMemories(ctx context.Context, db *sql.DB, workspaceID string) ([]string, error) {
	rows, err := db.QueryContext(ctx, `
		SELECT content
		FROM agent_memories
		WHERE workspace_id = $1
		ORDER BY created_at DESC
	`, workspaceID)
	if err != nil {
		return nil, err
	}
	defer rows.Close()
	out := []string{}
	for rows.Next() {
		var c string
		if err := rows.Scan(&c); err != nil {
			return nil, err
		}
		out = append(out, c)
	}
	return out, rows.Err()
}

// truncation moved to internal/textutil.TruncateBytes (#2962 SSOT).