6946cd12c5
623 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
17f1f30b3f |
fix(test): drain coalesceRestart goroutines before t.Cleanup (Class H, #170)
TestPooledWithEICTunnel_PreservesFnErr (and any sqlmock-using neighbour test) was at risk of inheriting stale INSERT calls from a previous test's coalesceRestart goroutine that survived its t.Cleanup boundary. The production callsite shape is `go h.RestartByID(...)` from a2a_proxy.go, a2a_proxy_helpers.go and main.go. When that goroutine's runRestartCycle panics, coalesceRestart's deferred recover swallows it to keep the platform process alive — but in tests, nothing waits for the goroutine to fully exit. If it's still draining LogActivity-shaped work after the test returns, those INSERTs land in the next test's sqlmock connection as kind=DELEGATION_FAILED / kind=WORKSPACE_PROVISION_FAILED, surfacing as "INSERT-not-expected". Fix: introduce drainCoalesceGoroutine(t, wsID, cycle) test helper that spawns coalesceRestart on a goroutine (matching production) and registers a t.Cleanup with sync.WaitGroup.Wait so the test can't declare itself done while a goroutine is still alive. Convert TestCoalesceRestart_PanicInCycleClearsState to use the helper (previously it called coalesceRestart synchronously, which never exercised the production goroutine-survival contract). Add TestCoalesceRestart_DrainHelperWaitsForGoroutineExit as the regression guard: cycle blocks 150ms then panics; the test asserts t.Run elapsed >= 150ms (proving the Wait barrier engaged) AND the deferred close ran (proving the panic-recovery defer chain executed) AND state.running was cleared. Verified the assertion is real by mutation-testing: removing t.Cleanup(wg.Wait) makes this test FAIL deterministically with elapsed <300µs. Per saved memory feedback_assert_exact_not_substring: the regression test asserts an exact-shape contract (elapsed >= blockFor) rather than a substring-in-output, so it discriminates between "drain works" and "drain skipped". Per Phase 3: 10/10 race-detector runs pass for all TestCoalesceRestart_* tests. Full ./internal/handlers/... suite green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
| be5fbb5ad3 |
fix(workspace-server): a2a-proxy preflight container check (closes #36)
Some checks failed
Retarget main PRs to staging / Retarget to staging (pull_request) Has been skipped
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 5s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Harness Replays / Harness Replays (pull_request) Failing after 56s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 1m25s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m25s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m37s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m38s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m46s
CI / Platform (Go) (pull_request) Successful in 2m44s
Same SSOT-divergence shape as #10 / fixed in #12, but on the a2a-proxy code path. The plugin handler was routed through `provisioner.RunningContainerName`; a2a-proxy was forwarding optimistically and only catching missing containers REACTIVELY via `maybeMarkContainerDead` after the network call timed out. Result on tenants whose agent containers had been recycled (e.g. post-EC2 replace from molecule-controlplane#20): canvas waits 2-30s for the network forward to fail before getting a 503, and the workspace-server logs only "ProxyA2A forward error" without the "container is dead" signal. This PR adds a proactive `Provisioner.IsRunning` check in `proxyA2ARequest` between `resolveAgentURL` and `dispatchA2A`, gated on the conditions where we know we're talking to a sibling Docker container we own (`h.provisioner != nil` AND `platformInDocker` AND the URL was rewritten to Docker-DNS form). Three outcomes via the SSOT helper: (true, nil) → forward as today (false, nil) → fast-503 with `error="workspace container not running — restart triggered"`, `restarting=true`, `preflight=true`, plus the same offline-flip + WORKSPACE_OFFLINE broadcast + async restart that `maybeMarkContainerDead` produces (true, err) → fall through to optimistic forward (matches IsRunning's "fail-soft as alive" contract — flaky daemon must not trigger a restart cascade) The `preflight=true` flag in the response distinguishes the proactive short-circuit from the reactive `maybeMarkContainerDead` path so canvas or downstream callers can render distinct messages later. * `internal/handlers/a2a_proxy.go` — preflight call site between resolveAgentURL and dispatchA2A; gated on `h.provisioner != nil && platformInDocker && url == http://<ContainerName(id)>:port`. * `internal/handlers/a2a_proxy_helpers.go` — `preflightContainerHealth` helper. Routes through `h.provisioner.IsRunning` (which itself wraps `RunningContainerName`). Identical offline-flip side-effects as `maybeMarkContainerDead` for the dead-container case. * `internal/handlers/a2a_proxy_preflight_test.go` — 4 tests: running → nil; not-running → structured 503 + sqlmock expectations on the offline-flip + structure_events insert; transient error → nil (fail-soft); AST gate pinning the SSOT routing (mirror of #12's gate). Mutation-tested: removing the `if running { return nil }` guard makes the production code fail to compile (unused var). A subtler mutation (replacing the !running branch with `return nil`) would make TestPreflight_ContainerNotRunning_StructuredFastFail fail at runtime with sqlmock's "expected DB call did not occur." Refs: molecule-core#36. Companion to #12 (issue #10). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
|||
|
|
d64641904f |
feat(workspace-server): mock runtime + mock-bigorg org template
Some checks failed
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 9s
CI / Python Lint & Test (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 12s
Harness Replays / Harness Replays (pull_request) Successful in 8s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 8s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m36s
cascade-list-drift-gate / check (pull_request) Successful in 5s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m30s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 1m39s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 2m50s
Retarget main PRs to staging / Retarget to staging (pull_request) Has been skipped
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 4m29s
CI / Detect changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
Adds a 'mock' runtime: virtual workspaces with no container, no EC2,
no LLM. Every A2A reply is synthesised from a small canned-variant
pool ('On it!', 'Got it, on it now.', etc.) deterministically seeded
by (workspace_id, request_id).
Built for funding-demo "200-workspace mock org" — renders an
enterprise-scale org chart on the canvas (CEO/VPs/Managers/ICs)
without burning real LLM credits or provisioning 200 EC2 instances.
Surfaces:
- workspace-server/internal/handlers/mock_runtime.go: A2A proxy
short-circuit, canned-reply pool, deterministic variant pick.
- workspace-server/internal/handlers/a2a_proxy.go: gate the
short-circuit before resolveAgentURL (mock has no URL).
- workspace-server/internal/handlers/org_import.go: skip Docker
provisioning for mock workspaces, set status='online' directly,
drop the per-sibling 2s pacing for mock children (collapses
a 200-workspace import from ~7min → ~1s).
- workspace-server/internal/handlers/runtime_registry.go: register
'mock' in the runtime allowlist (manifest + fallback set).
- workspace-server/internal/registry/healthsweep.go +
orphan_sweeper.go: skip mock workspaces in container-health and
stale-token sweeps (no container by design).
- workspace-server/internal/handlers/workspace_restart.go: mirror
the 'external' Restart no-op for mock.
- manifest.json: register the new
Molecule-AI/molecule-ai-org-template-mock-bigorg repo.
Tests: 5 new in mock_runtime_test.go covering happy-path, non-mock
regression guard, determinism, IsMockRuntime trim/case, JSON-RPC
id echo. All existing handler + registry tests still pass.
Local-verified: imported the 200-workspace template against a fresh
postgres+redis, confirmed all 200 land in 'online' and stay there
through the 30s health-sweep window, exercised A2A on CEO + VPs +
Managers + ICs and saw the variant pool rotate.
Org template lives at
Molecule-AI/molecule-ai-org-template-mock-bigorg (created today)
and is imported via the existing /org/import flow on the canvas
Template Palette.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
c1de2287fd |
fix(workspace-server): SSOT-route container check + 422 on external runtimes
Some checks failed
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 4m46s
CI / Detect changes (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 5s
Retarget main PRs to staging / Retarget to staging (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 4s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 53s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 44s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m21s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m28s
Harness Replays / Harness Replays (pull_request) Failing after 43s
CI / Platform (Go) (pull_request) Successful in 3m19s
Two coupled fixes for molecule-core#10 (plugin install 503 vs status=online split-state): 1. SSOT for "is this workspace's container running" — `findRunningContainer` in plugins.go used to carry its own copy of `cli.ContainerInspect`, which collapsed transient daemon errors into the same `""` return as a genuinely-stopped container. Healthsweep's `Provisioner.IsRunning` handled the same input correctly (defensive). Promote the inspect logic to `provisioner.RunningContainerName`, route both consumers through it. Transient errors get a distinct log line on the plugins side so triage doesn't confuse a flaky daemon with a stopped container. 2. Runtime-aware Install/Uninstall — `runtime='external'` workspaces have no local container; push-install via docker exec is meaningless. They pull plugins via the download endpoint instead (Phase 30.3). Without a guard they fell through to `findRunningContainer` and 503'd with a misleading "container not running." Add an early 422 with a hint pointing at the download endpoint. The two fixes are independent: (1) preserves correctness when the SSOT helper is later modified; (2) eliminates the persistent split-state on the 5 external persona-agent workspaces in this DB (and on tenant deployments hitting the same shape). * `internal/provisioner/provisioner.go` — new `RunningContainerName(ctx, cli, id) (string, error)` with three documented outcomes (running / stopped / transient). `Provisioner.IsRunning` now wraps it; behavior preserved. * `internal/handlers/plugins.go` — `findRunningContainer` shimmed onto `RunningContainerName`; new `isExternalRuntime(id)` predicate. * `internal/handlers/plugins_install.go` — Install + Uninstall reject external runtimes with 422 + hint, before the source-fetch step. * `internal/handlers/plugins_install_external_test.go` — 5 cases: external→422, uninstall-external→422, container-backed-falls-through, no-runtime-lookup-fails-open, lookup-error-fails-open. * `internal/handlers/plugins_findrunning_ssot_test.go` — two AST gates pin the SSOT routing so future PRs can't silently re-introduce the parallel impl. Mutation-tested: reverting either consumer to a direct `ContainerInspect` makes the gate fail. Refs: molecule-core#10 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
00cfe51df7 |
test(org_import): tighten sqlmock regex on lookupExistingChild (#2872 PR-B)
Some checks failed
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 5s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 41s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m23s
CI / Python Lint & Test (pull_request) Successful in 31s
CI / Canvas (Next.js) (pull_request) Successful in 52s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 40s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 40s
Harness Replays / Harness Replays (pull_request) Failing after 43s
CI / Platform (Go) (pull_request) Failing after 2m23s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 4m47s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 14m23s
The five `mock.ExpectQuery(\`SELECT id FROM workspaces\`)` sites used a loose substring regex that silent-passed three regression shapes #2872 called out: 1. `WHERE parent_id = $2` (drops `IS NOT DISTINCT FROM` — breaks NULL-parent root matching) 2. `WHERE name = $1` only (drops parent_id check entirely — hijacks siblings of the same name across different parents) 3. Drops `AND status != 'removed'` (blocks re-import after Collapse) Extracts a `lookupChildSQLRE` const that anchors all four load-bearing tokens (the SELECT/FROM, the name predicate, the IS NOT DISTINCT FROM predicate, and the status filter). All five ExpectQuery sites now use the same const so a future schema/predicate change fails one place. Mutation-tested per memory feedback_assert_exact_not_substring.md: - Replacing `IS NOT DISTINCT FROM` with `=` fails TestLookupExistingChild_NilParent_MatchesRoot. - Dropping `AND status != 'removed'` fails TestLookupExistingChild_Found_ReturnsIDAndTrue. Note: #2872 PR-A (AST gate strengthening) is already addressed inline — findWorkspacesInsertSQL + TestCreateWorkspaceTree_InsertUsesOnConflictDoNothing pin the ON CONFLICT DO NOTHING shape, which is a strictly stronger gate than the original lookup-before-insert ordering check. |
||
| 4b074f631b |
feat(provisioner): env-driven RegistryPrefix() for workspace template images (#6)
Some checks failed
pr-guards / disable-auto-merge-on-push (pull_request) Failing after 0s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 41s
Harness Replays / Harness Replays (pull_request) Failing after 30s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 5m7s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Failing after 3m8s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 14m4s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 14m36s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 14m30s
Block internal-flavored paths / Block forbidden paths (pull_request) Has been cancelled
CI / Python Lint & Test (pull_request) Has been cancelled
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Has been cancelled
CI / Canvas (Next.js) (pull_request) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Has been cancelled
CI / Detect changes (pull_request) Has been cancelled
Secret scan / Scan diff for credential-shaped strings (pull_request) Has been cancelled
E2E API Smoke Test / detect-changes (pull_request) Has been cancelled
Runtime PR-Built Compatibility / detect-changes (pull_request) Has been cancelled
Harness Replays / detect-changes (pull_request) Has been cancelled
Handlers Postgres Integration / detect-changes (pull_request) Has been cancelled
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Has been cancelled
CI / Shellcheck (E2E scripts) (pull_request) Has been cancelled
Add MOLECULE_IMAGE_REGISTRY env var to override the registry prefix used by all workspace-template image references. Defaults to ghcr.io/molecule-ai (unchanged for OSS users); set to an ECR URI in production tenants when mirroring to AWS. Why this matters: GitHub suspended the Molecule-AI org on 2026-05-06 with no warning. Production tenants kept running because they had images cached locally, but any tenant restart (AWS health event, redeploy, OS reboot) would have failed at `docker pull ghcr.io/molecule-ai/...` because GHCR returned 401. This change introduces the seam needed to point new pulls at a registry we control (AWS ECR) by flipping a single env var on Railway. Design (RFC: molecule-ai/internal#6): - New `RegistryPrefix()` function in `provisioner/registry.go` reads MOLECULE_IMAGE_REGISTRY, falls back to "ghcr.io/molecule-ai". - New `RuntimeImage(runtime)` returns the canonical ref using the prefix. - `RuntimeImages` map computed at init via `computeRuntimeImages()` so existing callers that range over it still work. - `DefaultImage` likewise computed via `RuntimeImage(defaultRuntime)`. - `handlers.TemplateImageRef()` switched from hardcoded format string to `provisioner.RegistryPrefix()`. - `runtime_image_pin.go::resolveRuntimeImage()` automatically inherits the prefix change because it reads from `provisioner.RuntimeImages[]` and only re-formats the tag suffix to a digest pin. Alternatives rejected (see RFC): - Multi-registry fallback chain (try ECR, fall back to GHCR): GHCR is locked from outbound for our org, so the fallback never works for us. Adds code complexity for no benefit. - Hardcoded ECR-only switch: couples production code to a specific deployment environment. OSS users self-hosting Molecule would need the upstream GHCR. - Self-hosted Harbor / registry-on-Hetzner: adds a component to operate. Not justified at 3-tenant scale; AWS ECR is mature and IAM-integrated. Auth — deliberately NOT changed in this commit: - For GHCR, the existing `ghcrAuthHeader()` reads GHCR_USER/GHCR_TOKEN. - For ECR, EC2 user-data installs `amazon-ecr-credential-helper` and adds a `credHelpers` entry in `~/.docker/config.json` so the daemon resolves ECR credentials via the EC2 instance role on every pull. The Go code needs no auth change. This keeps the diff minimal. Backwards compatibility: - Additive: env unset → identical behavior to today (GHCR). - Existing tests reference literal `ghcr.io/molecule-ai/...` strings; they continue to pass under the default prefix. - `RuntimeImages` map preserved for callers that iterate it. - No interface, schema, API, or migration version bump needed. Security review: - No untrusted input: MOLECULE_IMAGE_REGISTRY is set at deploy time (Railway env, EC2 user-data), not by users. - No expanded data collection or logging changes. - No new permissions: ECR pull permission is a future user-data + IAM role change, separate from this code change. - Worst-case: an attacker who already compromises Railway can swap the registry prefix to a malicious URI — same blast radius as compromising Railway today, no expansion. Tests: - 9 new unit tests in `registry_test.go` covering: default fallback, env override, empty env, all 9 known runtimes, unknown runtime, override-applies-to-all, computeRuntimeImages map population, env reflection, alphabetical ordering pin. - All existing provisioner + handlers tests continue to pass. - Mutation-tested mentally: deleting `if v := os.Getenv(...)` makes TestRegistryPrefix_RespectsEnv fail. Deleting `for _, r := range knownRuntimes` makes TestRuntimeImage_AllKnownRuntimes fail. The test suite would catch a regression of the original failure mode. Rollout plan: this PR is safe to merge with no env change. Production cutover happens by setting MOLECULE_IMAGE_REGISTRY on Railway after the AWS ECR mirror is populated (separate ops change, tracked in issue #6 phases 3b–3f). Tracking: - RFC: molecule-ai/internal#6 - Tasks: #97 (ECR setup), #98 (CP fallback) - Tech debt: runbooks/hetzner-rollout-tech-debt-2026-05-06.md item 7 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
|||
|
|
a33c879017 |
feat(messagestore): MessageStore interface + Postgres impl (RFC #2945 PR-D)
Closes #3026. Final piece of RFC #2945. ## What's new New package internal/messagestore/ holds: - MessageStore interface — single read-side contract operators implement to plug in alternative chat-history backends. - ChatMessage / ChatAttachment / ListOptions types — canonical data shapes returned by any impl, mirrors canvas's TS ChatMessage. - PostgresMessageStore — platform-default impl wrapping the activity_logs query + A2A-envelope parser ported in PR-C. Behavior is byte-identical to the pre-PR-D handler. ## What moves The activity_logs query, the parser (activityRowToChatMessages, extractRequestText, extractChatResponseText, extractFilesFromTask, etc.), and the internal-self-message predicate all migrate from internal/handlers/chat_history.go into the new package. handlers/ chat_history.go becomes a thin HTTP-shape adapter: parse query params → store.List(ctx, workspaceID, opts) → emit JSON Compile-time interface assertion in postgres_store.go catches future drift if the interface evolves and the impl falls behind. ## Why this PR OSS operators wanting to: - Tier hot/warm/cold storage (recent in Postgres, archival in S3) - Use a vector store with hybrid search (Pinecone, Weaviate) - Run an in-memory store for ephemeral test environments - Federate history across regions …had no extension point — they'd have to fork the handler. This PR makes that a constructor swap at router.go. ## Tests Parser-level (22 tests, MOVED to internal/messagestore/postgres_ store_test.go): every TS test case in canvas/src/components/tabs/chat/__tests__/historyHydration.test.ts has a Go counterpart. Timestamp preservation, user/agent extraction, internal-self filter, role decision (status=error vs agent-error prefix), v0/v1 file shapes, malformed JSON resilience. Handler-level (9 NEW tests in internal/handlers/chat_history_test.go): thin adapter coverage using a fake MessageStore. UUID validation, before_ts RFC3339 validation, default limit, max-limit clamp, invalid-limit fallback, before_ts passthrough, empty-array (not null) JSON shape, attachment shape preservation, store-error → 502 mapping. Compile-time interface conformance: PostgresMessageStore satisfies MessageStore, fakeStore (test fake) satisfies MessageStore. Mutation-tested. Removed UUID validation in the handler; confirmed TestChatHistoryHandler_RejectsNonUUIDWorkspaceID fires red (status 200 instead of 400, non-UUID reaches the store). Restored, all green. Full handlers + messagestore + router test runs green; full repo go test ./... green. ## SSOT decision ChatMessage / ChatAttachment / parser / DB query all live in internal/messagestore/ ONLY. handlers/chat_history.go imports the package and uses the types via messagestore.ChatMessage etc. — no re-declaration anywhere. ## Three weakest spots (hostile-reviewer self-pass) 1. The internal-self prefix list (Delegation results are ready...) is a package var in messagestore/postgres_store.go. A future impl that wants to override the predicate must reach into the package to use IsInternalSelfMessage or define its own. Acceptable: the predicate is part of the contract; if an impl wants different semantics it owns that decision explicitly. 2. ListOptions has Limit + BeforeTS + HasBefore; future paging needs (after_ts, peer_id filter, role filter) require additive struct field additions, which is a soft API break for any impl that handles ListOptions positionally. Mitigated by Go's struct-literal convention (named fields by default); also flagged in the interface comment for impl authors. 3. The handler does NOT log when a store returns an error — it just maps to 502. An impl that wants to surface its error class up the stack can't, today. If/when an impl needs that, the interface can add a typed-error contract in a follow-up. Today's coverage is sufficient: most ops issues land in the store impl's own logs. ## Security review - Untrusted input? Same as PR-C — agent-emitted JSON parsed defensively. New fakeStore in tests can't reach production. - Trust boundary? Same. Interface lives BEHIND wsAuth; impls only see workspace IDs already authenticated. - Auth/authz? Inherited from handler; the interface doesn't authenticate. - PII / secrets in logs? Documented in the interface contract: impls MUST NOT log full message bodies / attachment URIs. The Postgres impl logs nothing on the happy path. - Output sanitization? Same plain-text + opaque-URI surface as PR-C. Canvas validates attachment-URI schemes. No security-relevant changes beyond what /chat-history already exposes via PR-C. Considered, not skipped. ## Versioning / backwards compat - New internal package. Zero public API change. - Single caller site in router.go updated (one-line constructor change). NewChatHistoryHandler() → NewChatHistoryHandler(store). - No schema change, no migration. - Existing /chat-history endpoint unchanged on the wire — clients don't notice the refactor. ## Phasing This is the final RFC #2945 piece. Follow-ups parked: - PR-C-2 (canvas migration): swap canvas loadMessagesFromDB to call /chat-history instead of /activity. Independent of this PR; blocked only by canvas team's calendar. - Sample alternative impls (S3, in-memory) for OSS docs: separate PR when the first OSS consumer materializes; demonstration code untested against a real workload is anti-pattern. 🤖 Generated with [Claude Code](https://claude.com/claude-code) |
||
|
|
089be695a9 | Merge staging into rfc-2945-pr-c-chat-history | ||
|
|
dcc870a6b7 |
feat(workspace-server): server-side chat-history endpoint (RFC #2945 PR-C)
Closes the SSOT gap for chat-history hydration: today every consumer
(canvas TS) re-implements an A2A-envelope walk to map activity_logs
rows into rendered ChatMessage objects. This PR moves that walk into
the server.
## What's added
GET /workspaces/:id/chat-history?limit=N&before_ts=T
Returns:
{
"messages": [
{"id": "<uuid>", "role": "user"|"agent"|"system",
"content": "...", "attachments": [...], "timestamp": "<RFC3339>"}
],
"reached_end": false
}
Auth chain: same wsAuth as /workspaces/:id/activity (tenant ADMIN_TOKEN
+ X-Molecule-Org-Id). No new trust boundary.
Filter: a2a_receive rows with source_id IS NULL — same canvas-source
filter the canvas applies via /activity?type=a2a_receive&source=canvas,
centralized so future API consumers don't need to know it.
## What's mirrored from canvas TS
Direct port of canvas/src/components/tabs/chat/historyHydration.ts
+ message-parser.ts:
- extractRequestText / extractFilesFromUserMessage — user-side parts
walk through request_body.params.message.parts[]
- extractChatResponseText — agent-side response_body collector across
the four shapes (string, A2A JSON-RPC parts, older nested
parts.root.text, task artifacts) joined with "\n" (matches canvas
multi-source collector — claude-code emits multiple text parts;
hermes emits summary+artifacts)
- extractFilesFromResponse / extractFilesFromTask — file walk across
parts[] + artifacts[].parts[] + status.message.parts[] +
message.parts[]
- v0 hot path ({kind:"file", file:{...}}) AND v1 protobuf flat shape
({url, filename, mediaType}) both supported
- Role decision: status='error' OR text starts with "agent error"
(case-insensitive) → "system", else "agent"
- isInternalSelfMessage prefix filter (Delegation results are
ready...)
- Timestamp pinned to row.created_at (regression cover for
2026-04-25 bubble-collapse bug)
## Tests
22 unit tests in chat_history_test.go, every TS test case in
historyHydration.test.ts has a Go counterpart:
Timestamp preservation (3): user/agent pin to created_at, two-rows
produce two distinct timestamps.
User-message extraction (5): text-only, internal-self skip,
null body, attachments hydrated, attachments-only-when-text-empty,
internal-self suppresses even with attachments.
Agent-message extraction (4): result-string, status=error→system,
agent-error-prefix→system, response_body.parts attachments,
null body, no-text-no-files-no-bubble.
End-to-end (1): paired user+agent same timestamp.
Go-specific (5): malformed JSON returns empty (no panic), v1
protobuf flat shape extraction, task-artifacts extraction, older
nested root.text shape, basename helper edge cases.
isInternalSelfMessage predicate (1): prefix match, non-prefix non-
match, empty-text non-match.
Mutation-tested. Removed the role-promotion branch (status=error +
agent-error prefix → system); confirmed both
TestChatHistory_RoleSystemWhenStatusError and
TestChatHistory_RoleSystemWhenAgentErrorPrefix fire red. Restored.
Both green.
Full handlers test suite (4.3s) green; full repo `go test ./...` green.
## SSOT decision
Parsing logic lives in workspace-server/internal/handlers/chat_history.go
ONLY. Canvas keeps historyHydration.ts + message-parser.ts during the
transition because:
- PR-C-2 (follow-up): canvas loadMessagesFromDB swaps to new
endpoint. Today's canvas still calls /activity for backward
compatibility.
- The TS parsers are still load-bearing for LIVE message handling
(WebSocket A2A_RESPONSE events) until RFC #2945 PR-B-2 mirrors
the typed event payloads to canvas consumers.
Canvas's TS path will be deleted in a separate PR after a one-week
observation window confirms no live-message consumers depend on it.
## Security review
- Untrusted input? YES — request_body and response_body come from
agents (potentially OSS / third-party). Defensive: any malformed
JSON returns empty content + no attachments, no panic. Tested
via TestChatHistory_MalformedJSONInRequestBodyReturnsEmpty.
- Trust boundary? Same as today: agent → workspace-server.
No new boundary; reuses existing wsAuth middleware.
- Auth/authz? Inherits wsAuth chain. Cross-workspace access blocked
by existing TenantGuard middleware.
- PII / secrets in logs? None. The handler logs nothing on the
happy path; errors log 502 without body content.
- Output sanitization? ChatMessage.content is plain text returned
as-is; canvas already sanitizes via ReactMarkdown. Attachment
URIs are agent-provided (workspace: / platform-pending: /
https:); canvas's existing scheme allow-list still applies.
## Versioning / backwards compatibility
- New endpoint /chat-history. /activity unchanged.
- Canvas historyHydration.ts + message-parser.ts intact during
transition (will be removed in PR-C-2 follow-up).
- No public API consumer of /activity is broken — added route is
additive.
- No semver bump (server is internal versioning).
## Three weakest spots (hostile-reviewer self-pass)
1. extractRequestText returns ONLY parts[0].text. If a user message
contains multiple text parts (uncommon — canvas only ever emits
one), we lose later parts. Matches canvas exactly today, but a
future change that emits multi-text user messages needs both
parsers updated. Documented in code; covered by test if/when
added.
2. activityRowToChatMessages rebuilds ChatMessage IDs every call (no
caching). Each chat reload mints fresh UUIDs. This is fine because
canvas dedupes by (role, content, timestamp window) not id, but a
future API consumer that DID rely on id stability would break.
Documented in the ChatMessage struct comment.
3. The handler scopes to source_id IS NULL only (canvas-source rows).
A future "show all messages, including agent-to-agent" mode would
need a new endpoint or a parameter. Out of scope for PR-C; canvas's
/activity?source=canvas already enforces the same filter.
Closes #3017. Unblocks RFC #2945 PR-D (MessageStore interface) which
returns []ChatMessage typed values.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
656a02fae4 |
fix(textutil): SSOT for rune-safe string truncation, fix 3 audit-gap bugs
Closes #2962. ## Why Six per-package `truncate` helpers had drifted into independent re-implementations of the same idea. Three of them (delegation.go, memory/client/client.go, memory-backfill/verify.go) used `s[:max] + "…"` byte-slice form, which on a multi-byte codepoint at byte `max` produces invalid UTF-8 → Postgres `text`/`jsonb` rejects the INSERT silently → `delegation` / `activity_logs` row never lands → audit gap. Three other helpers (delegation_ledger.go #2962, agent_message_writer.go #2959, scheduler.go #2026) had each been fixed in isolation with three slightly different rune-safe shapes — confirming this is a class of bug, not a single instance. ## What New package `internal/textutil` with three rune-safe functions: - `TruncateBytes(s, maxBytes)` — byte-cap, "…" marker. Used by 5 callers writing into byte-bounded columns / log lines. - `TruncateBytesNoMarker(s, maxBytes)` — byte-cap, no marker. Used by delegation_ledger.go where the storage already conveys "preview" and an extra ellipsis would push the result over the column cap. - `TruncateRunes(s, maxRunes)` — rune-cap, "…" marker. Used by agent_message_writer.go where the cap is in display chars (UI summary), not bytes. All three guarantee `utf8.ValidString(out)` for any `utf8.ValidString(in)`. Inputs already invalid go through `sanitizeUTF8` at the call site boundary (scheduler.go preserved this defense-in-depth). ## Migration map | Old | New | Behavior change | |---|---|---| | `delegation_ledger.truncatePreview` | `textutil.TruncateBytesNoMarker(s, 4096)` | none | | `agent_message_writer.truncatePreviewRunes` | `textutil.TruncateRunes(s, n)` | none | | `scheduler.truncate` | `textutil.TruncateBytes(s, n)` | "..." → "…" (3 bytes either way; single-glyph display) | | `delegation.truncate` | `textutil.TruncateBytes(s, n)` | bug fix + ellipsis swap | | `memory/client.truncate` | `textutil.TruncateBytes(s, n)` | bug fix | | `memory-backfill.truncate` | `textutil.TruncateBytes(s, n)` | bug fix | Five separate `truncate*` helpers + their per-package tests removed. Net: 12 files / +427 / -255. ## Tests - `internal/textutil/truncate_test.go` — 27 table-test cases + 145 fuzz-invariant cases asserting `utf8.ValidString` and byte-cap invariants on every output. - `delegation_ledger_test.go TestLedgerInsert_TruncatesOversizedPreview` strengthened with `capValidUTF8Matcher` so the SQL-write argument is asserted to be valid UTF-8 + within cap (not just `AnyArg()`). Mutation-tested: replacing the SSOT call with byte-slice form makes this test fail loud. ## Compatibility - All callers internal; no external API surface change. - Ellipsis swap "..." → "…": same byte budget (3 bytes), single-glyph display. No alerting/grep on either marker in this codebase (verified). Canvas renders both correctly. - DB column widths unchanged (4096 / 80 / 200 / 256 / 300 — all preserved in the migrations). ## Security Fixes a silent INSERT-failure mode that hid `activity_logs` / `delegations` rows containing peer-controlled text. The class of input that triggered it (CJK, emoji, accented Latin) is normal user content, not malicious — but the symptom (audit gap) makes incident reconstruction harder. Helper is pure-function over `string`; no secrets / PII / auth handling involved. Untrusted input is handled identically to before, just rune-aligned now. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
c53155ec5f
|
Merge pull request #3014 from Molecule-AI/test/cross-table-atomicity-integ-149-followup
test(chat-uploads): integration test for cross-table atomicity (#149 follow-up) |
||
|
|
7a39a08837 |
test(chat-uploads): integration test for cross-table atomicity (#149 follow-up)
Adds two real-Postgres tests under //go:build integration: - TestIntegration_PollUpload_AtomicRollback_AcrossBothTables exercises the helpers in the same Tx shape uploadPollMode does (PutBatchTx + LogActivityTx + Rollback) and asserts COUNT(*)=0 on BOTH pending_uploads AND activity_logs after the rollback. Failure injection: NUL byte in `summary` triggers lib/pq protocol rejection on the second activity insert — same trick the existing PutBatch AtomicRollback test uses. - TestIntegration_PollUpload_HappyPath_AcrossBothTables is the positive counterpart — Commit lands N rows in both tables. Coverage rationale (post-PR-3010 review): - sqlmock unit test (TestPollUpload_AtomicRollbackOnActivityInsertFailure) proved the handler calls Begin/Exec/Exec-fail/Rollback in order. - Existing PutBatch integration test proved Postgres honors rollback for pending_uploads alone. - New tests close the cross-table gap: prove LogActivityTx + PutBatchTx + real Postgres MVCC compose correctly under rollback. A regression that made LogActivityTx silently route through db.DB instead of the passed tx would still pass the sqlmock test (the Begin/Commit/Rollback shape would look right) but would fail this integration test (the activity_logs row would survive the rollback). Verified locally: postgres:15-alpine + all migrations applied, both tests pass in 0.1s. Skips cleanly without INTEGRATION_DB_URL — CI already runs this file via the Handlers Postgres Integration job. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
ff21bbb876 | Merge staging into rfc-2872-workspaces-uniq-toctou to clear BEHIND | ||
|
|
da3cb4c098 |
fix(workspace-server): close TOCTOU race on workspaces(parent_id, name) (#2872 Critical 1)
## Bug
`/org/import` had no per-tenant mutex, advisory lock, or DB-level
uniqueness on (parent_id, name). The pattern was lookup-then-insert:
existingID, existing, err := h.lookupExistingChild(...) // SELECT
if existing { return /* skip */ }
db.DB.ExecContext(ctx, `INSERT INTO workspaces ...`) // INSERT
Two concurrent admin POSTs (rapid double-click in canvas, retry-after-
timeout, two operators on the same template) both saw "not found" in
the SELECT and both INSERT'd the same (parent_id, name).
Captured impact: tenant-hongming accumulated 72 stale child workspaces
in 4 days from repeated org-template spawns of the same template
(see #2857 phase 4 sweeper for the cleanup; #2872 for the prevention RFC).
## Fix
Two-layer fix — DB-level backstop AND application-level happy path:
1. **Migration** `20260506000000_workspaces_unique_parent_name.up.sql`
```sql
CREATE UNIQUE INDEX CONCURRENTLY IF NOT EXISTS workspaces_parent_name_uniq
ON workspaces (
COALESCE(parent_id, '00000000-0000-0000-0000-000000000000'::uuid),
name
)
WHERE status != 'removed';
```
* COALESCE(parent_id, sentinel) collapses NULLs so root workspaces
also collide pairwise.
* `WHERE status != 'removed'` lets a tombstoned row be replaced
by a same-named re-import (preserves existing org-import semantics).
* CONCURRENTLY avoids ACCESS EXCLUSIVE on production tenants under
live traffic; IF NOT EXISTS makes the migration resumable.
* Down migration drops CONCURRENTLY symmetrically.
2. **`org_import.go` swap**
Replace lookup-then-insert with `INSERT ... ON CONFLICT DO NOTHING
RETURNING id`. On the skip path (RETURNING returns 0 rows →
sql.ErrNoRows), re-select the existing id to recurse children:
INSERT INTO workspaces (...) VALUES (...)
ON CONFLICT (COALESCE(parent_id, ...), name)
WHERE status != 'removed'
DO NOTHING
RETURNING id;
The ON CONFLICT target predicate matches the partial-index predicate
exactly — required for Postgres to consider the index applicable.
Existing `lookupExistingChild` helper kept (still used on the skip
path); semantics unchanged.
## Test coverage
* AST gate refreshed to assert the workspaces INSERT contains the
ON CONFLICT pattern (`onConflictDoNothingRE`) instead of the now-obsolete
"lookup-before-insert" ordering. Per behavior-based gating
(memory: feedback_behavior_based_ast_gates.md), the new gate pins
the actual TOCTOU-resolution behavior.
* Companion `TestGate_FailsWhenInsertOmitsOnConflict` proves the gate
catches the bug shape on synthetic source.
* All existing `lookupExistingChild` unit tests (no-rows, found,
nil-parent, DB error, wrapped no-rows) still pass — helper is
unchanged and still load-bearing on the skip path.
* Live Postgres E2E coverage runs via the existing
"Handlers Postgres Integration" CI job, which applies migrations
to a real PG and exercises the INSERT path.
## Why ship the migration + swap together (not stacked)
The migration alone provides a DB-level backstop, but without the
handler swap a UNIQUE-violation surfaces as a 500 to the user. The
handler swap alone has no enforceable target until the migration
applies. Shipped together they give graceful skip + atomic backstop.
Migration is CONCURRENTLY + IF NOT EXISTS, safe to apply even on
tenants where the sweeper (#2860) hasn't run yet — the index just
declines to build until conflicting rows are reconciled.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
b759548822 |
fix(chat-uploads): activity rows commit atomically with PutBatch
Closes #149. uploadPollMode for poll-mode chat uploads previously committed N pending_uploads rows in one Tx (PutBatch), then wrote N activity_logs rows individually outside any Tx. A per-row failure on activity row K left rows 1..K-1 committed and pending_uploads orphaned until the 24h TTL — not data-loss because the platform's fetcher handled the half-state cleanly, but the user never saw file K in the canvas and the inconsistency surfaced as an "uploaded but invisible" complaint class. Thread one Tx through PutBatchTx + N × LogActivityTx + Commit so all or none commit. Broadcasts are deferred until after Commit — emitting an ACTIVITY_LOGGED event for a row that ends up rolled back would paint a ghost message into the canvas's optimistic UI. A new LogActivityTx returns a commitHook the caller invokes post-Commit; the existing fire-and-forget LogActivity is unchanged for the 4 other production callers (a2a_proxy_helpers + activity.go report path). Storage interface gains PutBatchTx; PostgresStorage.PutBatch is refactored to share the validation + insert path. inMemStorage and fakeSweepStorage delegate or no-op for PutBatchTx (the in-mem fake can't model Tx state — DB-level atomicity is verified by the existing real-Postgres integration test for PutBatch + the new unit test asserting the Go handler calls Rollback on activity-insert failure). Tests: - TestPollUpload_AtomicRollbackOnActivityInsertFailure pins the new contract via sqlmock — second activity insert errors → Rollback expected, Commit must NOT be called. - TestLogActivityTx_DefersBroadcastUntilCommitHook + _InsertError_NoHook_NoBroadcast + _NilTx_Errors cover the new API. - TestPutBatchTx_HappyPath / _EmptyItems / _ValidationFails / _PerRowErrorPropagates cover Tx-aware storage layer. - 7 existing TestPollUpload_* tests updated to mock Begin + Commit (or Begin + Rollback for failure paths) since the handler now opens a Tx around PutBatch + activity inserts. All workspace-server tests pass; integration tag also clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
19df43e3da
|
Merge pull request #2993 from Molecule-AI/rfc-2945-pr-b-1-migrate-bare-event-strings
refactor(events): migrate 18 producers to typed EventType constants (RFC #2945 PR-B-1) |
||
|
|
f39b595a9c |
fix(workspace files API): EIC parity for ListFiles + DeleteFile (closes #2999 PR-A)
## User-visible bug Canvas Files tab returns "0 files / No config files yet" for every SaaS workspace, every root (/configs, /home, /workspace, /plugins). Reported by user (canvas screenshot, hongming.moleculesai.app, Hongming Personal Brand Agent — claude-code, T4, online). ## Root cause `ListFiles` (templates.go) was missing the SSH-via-EIC branch that ReadFile (PR #2785) and WriteFile (PR #1702) already have. On SaaS, dockerCli is nil → findContainer returns "" → falls through to host-side resolveTemplateDir which only matches baked-in template names. For a user-named workspace it matches nothing, so the handler silently returns []fileEntry{}. DeleteFile had the same gap — right-click delete (introduced in PR-C of this issue) would silently no-op once #1 was fixed. ## Fix 1. Extracted shared EIC plumbing into `withEICTunnel` (closure-based, single SSOT for keypair → key push → tunnel → port-wait → cleanup). Refactored writeFileViaEIC + readFileViaEIC to use it. Added listFilesViaEIC + deleteFileViaEIC on the same scaffold. The `LogLevel=ERROR` shim from PR #2822 now lives in one `eicSSHSession.sshArgs()` helper instead of being duplicated per helper — the next time we need to tweak ssh options, one place. 2. Factored remote shell strings into pure functions (buildInstallShell / buildCatShell / buildRmShell / buildFindShell + parseFindOutput) so the wire shape can be pinned without booting a real EIC tunnel. 3. Refactored `resolveWorkspaceFilePath(runtime, root, relPath)` to honor `?root=`. New rule: `/configs` (or empty / unrecognized) → runtime managed-config dir via workspaceFilePathPrefix (preserves the v1 ReadFile/WriteFile behaviour where canvas's Config tab GETs/PUTs config.yaml without specifying a root and lands in the right per-runtime dir); `/home`, `/workspace`, `/plugins` → literal absolute path on the EC2 host. List/Read/Write/Delete now agree on what file a tree row points to — pre-fix List would say "/home contents" but Read/Write would route to /configs. 4. ListFiles + DeleteFile dispatch on instance_id != "" → EIC helper. Errors from the EIC path produce 500 (not silent fall-through to local-Docker, which would mask the failure as "0 files" — the exact user-visible symptom). 5. Added ?root= validation gate to WriteFile + DeleteFile so an out-of-allowlist root is rejected before the resolver runs. ## Test coverage - TestResolveWorkspaceFilePath_RuntimeIndirection — pins the /configs → runtime prefix translation per-runtime (hermes, claude-code, langgraph, external, unknown). Catches the regression where a future edit accidentally drops the runtime indirection. - TestResolveWorkspaceFilePath_LiteralRoots — pins /home, /workspace, /plugins as literal pass-through regardless of runtime. Catches the symmetric regression where the literal roots start getting rewritten to the runtime prefix (which would mean the FilesTab "/home" selector silently routes to /configs on hermes). - TestResolveWorkspaceRootPath — directory-only translation used by listFilesViaEIC, same indirection rules. - TestSSHArgs_HardenedFlags — pins the centralised ssh option set (LogLevel=ERROR + hardening). Catches drift in the one-place-where-ssh-flags-live. - TestEicSSHSessionSingleSourceForSSHFlags — behaviour-based AST gate (per memory). Counts s.sshArgs() callers (must be ≥4 — list/read/write/delete) and asserts LogLevel=ERROR appears exactly once in the source. Fires if anyone copy-pastes a raw ssh args slice instead of going through the helper. - TestBuildInstallShell / TestBuildCatShell / TestBuildRmShell / TestBuildFindShell — pure-function tests pinning the remote command shape. Catches regression like "rm -f silently becomes rm -rf" or "find loses node_modules pruning" without needing a real EC2. - TestBuildFindShell_DepthForwarding — catches a regression where the helper hard-codes a depth instead of using the caller's value. - TestParseFindOutput / TestParseFindOutput_EmptyInput — pin the TYPE|SIZE|REL parser. Empty-input case explicitly returns [] not nil so the JSON wire shape stays a list. - TestListFiles_EICDispatch_Success / Error — sqlmock-driven handler test. Verifies instance_id != "" routes to listFilesViaEIC and surfaces errors as 500 (does NOT silently fall through to local-Docker, which is the exact regression-mode of the original bug). - TestListFiles_EICBranch_NotTakenForSelfHosted — back-compat guard: instance_id == "" must NOT enter the EIC branch (would break self-hosted operators). - TestDeleteFile_EICDispatch_Success / Error — same shape for DeleteFile. - TestListFiles_RootValidation / TestDeleteFile_RootValidation — ?root=/etc must 400 before any DB query or EIC call. ## Verification - `go build ./...` clean - `go test ./...` clean (full workspace-server suite) - Will be live-verified against staging on hongming.moleculesai.app after merge: open Files tab → expect populated /home + /configs + /workspace listings (not "0 files"); right-click delete on /configs/old.yaml → expect file removed on the EC2 host. ## Three weakest spots (hostile self-review) 1. The LogLevel=ERROR drift gate counts source occurrences. A future refactor that intentionally moves the literal somewhere else (e.g. into a constant) would trigger a false positive. The gate's failure message points to the load-bearing constraint (must appear in sshArgs); operator can adjust. 2. `eicFileWriteTimeout` constant kept as an alias for back-compat with prior tests. Documented as intentional + safe to remove on the next pass. 3. The resolver tests pin the runtime → prefix map values (`/home/ubuntu/.hermes`, `/configs`, etc.). A future runtime addition that ships a new prefix needs the test updated. This is intentional — silent prefix changes orphan saved files, so a test failure on map edit IS the right signal. ## Follow-up (RFC #2312 subtask 2) Long-term the right fix is to drop EIC entirely and HTTP-forward to the workspace's own URL (RFC #2312). That's a substantially larger refactor across 5 surfaces (chat upload, files, templates, plugins, terminal) and out of scope for this bug-fix PR. Tracked separately under that RFC. Refs #2999. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
64e58fb390 |
test(memory-v2-e2e): update expectChainQueryRoot for new name column
PR #2990 root cause: the resolver SQL added `name` to the SELECT for DisplayName plumbing, but the e2e test's sqlmock fixture (expectChainQueryRoot at swap_test.go:216) still scripts the 3-column shape. Three e2e tests fail with: sql: expected 3 destination arguments in Scan, not 4 Fix: bump the fixture to 4 columns (id, name, parent_id, depth) and pass an empty name. The e2e tests don't assert on label rendering — they pin the namespace string flow ("workspace:root-1" etc), which is unchanged. Empty name is fine: ReadableNamespaces still emits the correct namespace strings; only DisplayName is empty. Caught by CI's Platform (Go) check on PR #2990 — would have been a silent missed-coverage case in the resolver_test.go run because that package doesn't import the e2e package. 🤖 Generated with [Claude Code](https://claude.com/claude-code) |
||
|
|
9ceda9d81f |
refactor(events): migrate 18 files to typed EventType constants (RFC #2945 PR-B-1)
Mechanical migration of bare event-name strings in BroadcastOnly / RecordAndBroadcast call sites to the typed constants from internal/events/types.go (RFC #2945 PR-B). Wire format unchanged (both shapes serialize to identical WSMessage.Event literals); pinned by TestAllEventTypes_IsSnapshot in #2965. Migrated (18 files, scope: handlers/, scheduler/, registry/, bundle/, channels/): - handlers/{approvals,a2a_proxy_helpers,a2a_queue,activity,agent, delegation,external_rotate,org_import,registry,workspace, workspace_bootstrap,workspace_crud,workspace_provision_shared, workspace_restart}.go - channels/manager.go (caught by hostile-reviewer pass — initial scope missed channels/, found via grep on the post-migration tree) - scheduler/scheduler.go - registry/provisiontimeout.go - bundle/importer.go Hostile self-review (3 weakest spots, addressed) ------------------------------------------------ 1. Missed call sites — initial scope omitted channels/. Post-migration `grep -rEn 'BroadcastOnly\([^,]+,[^,]*"[A-Z_]+"|RecordAndBroadcast\([^,]+,[^,]*"[A-Z_]+"' internal/` found 2 stragglers in channels/manager.go. Migrated. Final grep on the same pattern returns only the docstring example in types.go (intentional). 2. gofmt drift — auto-import injection produced non-canonical import ordering. `gofmt -w` applied ONLY to the 18 modified files (NOT the whole tree, to avoid sweeping unrelated pre-existing drift into this PR's diff). Three pre-existing un-gofmt'd files in handlers/ (a2a_proxy.go, a2a_proxy_test.go, a2a_queue_test.go) left as-is — they're unchanged by this PR and their drift predates it. 3. Wire format — paranoia check: do the constants serialize to the exact strings consumers (canvas TS, hermes plugin, anything parsing WSMessage.Event) expect? Yes. Pinned by the snapshot test. The migration is name-only; not a single character of wire output changes. Verified - go build ./... clean - go vet ./internal/... clean - gofmt -l on the 5 migrated package dirs: only pre-existing files - Full tests: handlers/, channels/, scheduler/, registry/, events/, bundle/ all green (5 ok, 0 fail) PR-B-2 (canvas TS mirror + cross-language parity gate) remains as the final piece of RFC #2945 PR-B. Tracked separately so this PR stays mechanical + reviewable. Refs RFC #2945, PR #2965 (PR-B types). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
b6310d7ebf |
fix(memory-v2): namespace dropdown labels use display names not UUID prefixes (#2988)
User feedback on the v2 Memory tab redesign: on a root workspace, the
namespace dropdown showed three indistinguishable entries:
Workspace (30ba7f0b)
Team (30ba7f0b) (team)
Org (30ba7f0b-b303-4a20-aefe-3a4a675b8aa4) (org)
For a root workspace, the resolver collapses workspace==team==org IDs
(resolver.go:113-122 derive() degenerate case). The previous
shortID(8)-truncated UUID label scheme made all three look identical
even though the three concepts (private / team-shared / org-wide)
remain semantically distinct.
## Backend — Resolver returns DisplayName
- SQL chain query now SELECTs workspaces.name (COALESCE → "" on NULL)
- chainNode carries .name through walk
- deriveNames() computes the display name for each namespace,
mirroring derive():
workspace: self.name
team: parent.name (or self.name if root — degenerate)
org: chain[end].name (root of tree)
- Namespace struct gets a new DisplayName field, omitempty wire-shape
## Backend — Handler renders label from DisplayName when present
- memories_v2.go:namespaceLabelWithName(name, kind, displayName) is
the new SSOT label generator. Falls back to the UUID-prefix shape
when displayName is empty so callers without name plumbing keep
working unchanged.
- namespacesToViews now plumbs Namespace.DisplayName into the label.
- Old namespaceLabel(name, kind) is preserved as a thin wrapper
around namespaceLabelWithName(_, _, "") for back-compat.
- Custom namespaces ignore displayName by design — operator-defined
suffixes ARE the chosen label; a name override would surprise.
## Frontend — drop redundant `(kind)` suffix
Pre-fix: "Team (mac laptop) (team)" — kind shown twice.
Post-fix: "Team (mac laptop)" — the prefix already conveys the kind.
## Test coverage
Resolver (3 new tests):
- DisplayName_Root: workspace name propagates to all 3 namespaces
- DisplayName_Child: workspace=self.name, team=parent.name, org=root.name
- DisplayName_EmptyOnNULL: COALESCE → "" → empty fallback
Handler (3 new tests):
- NamespaceLabelWithName_PrefersDisplayName: workspace/team/org/custom paths
- NamespaceLabelWithName_FallsBackToUUIDPrefix: empty displayName → legacy shape
- NamespacesToViews_PassesDisplayNameThrough: full integration on root case
Canvas: existing 30 tests still pass; suffix drop is rendering-only.
memories_v2.go function coverage: **14/14 = 100%**
- namespaceLabelWithName: 100%
- namespacesToViews: 100%
- (all 11 pre-existing functions stay at 100%)
## SSOT
The "what is this namespace called" question now has one source of
truth: namespace.Resolver.ReadableNamespaces sets DisplayName from the
canonical workspace.name column. The handler is a renderer; the
canvas is a consumer. No name-lookup logic duplicated across the
three layers.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
|
||
|
|
f1dc721eeb
|
Merge pull request #2964 from Molecule-AI/fix/delegation-ledger-utf8-truncate-2962
fix(delegation_ledger): rune-safe preview truncation (#2962) |
||
|
|
a5903af459 |
fix(delegation_ledger): rune-safe preview truncation (#2962)
The previous byte-slice form `s[:previewCap]` could split a multi-byte codepoint at byte 4096, producing invalid UTF-8. Postgres JSONB rejects the row → ledger insert silently fails → audit gap on dashboards while activity_logs continues to record the event. Walk the string by rune index and stop at the last boundary that fits inside the cap. ASCII-only strings still hit the cap exactly; CJK/emoji strings stop slightly under, never over. Mirrors the truncatePreviewRunes fix shipped for agent_message_writer in #2959. Followup: deduplicate into a shared helper once both have landed. Tests: 2 regression tests using utf8.ValidString — one with an all-3-byte rune string just over the cap, one with a single multi-byte rune sitting exactly on the boundary. Verified on the previous byte-slice impl: both new tests would fail (invalid UTF-8 + truncation past cap by 1 byte). |
||
|
|
5b78bea10d |
feat(events): typed EventType registry — single source of truth for WS event names (RFC #2945 PR-B)
Pre-RFC-#2945, every BroadcastOnly / RecordAndBroadcast call site
passed a bare string literal:
h.broadcaster.BroadcastOnly(workspaceID, "AGENT_MESSAGE", payload)
29 producers (Go, ~30 call sites in handlers/, scheduler/, registry/,
bundle/) and ~30 canvas consumers (TS store + listeners) duplicated
the same string with no shared definition. A producer renaming an
event silently broke every consumer — same drift class that produced
the reno-stars data-loss regression on the persistence side. PR-A
fixed the persistence-side SSOT (AgentMessageWriter); PR-B fixes the
event-name SSOT.
What this PR ships
internal/events/types.go
- EventType typed string + 29 named constants covering the full
taxonomy (chat / lifecycle / agent assignment / delegation /
task / approval / auth).
- Grouped semantically; new constants must be added here AND
mirrored in canvas/src/lib/ws-events.ts (parity gate landing
in PR-B-2 follow-up).
- AllEventTypes slice — authoritative list for the snapshot
test + the cross-language parity gate.
internal/events/types_test.go (3 tests)
- TestAllEventTypes_IsSnapshot: pins the canonical list. Adding
a new constant without updating AllEventTypes (or vice versa)
fails with a one-line diff.
- TestEventType_NoEmptyConstants: catches accidentally-empty
values (typo in types.go: const X EventType = ...).
- TestEventType_AllUppercaseSnakeCase: pins the wire format that
canvas TS switch statements assume (no kebab-case, no mixed
case, no leading/trailing/double underscores).
agent_message_writer.go (single migration)
- Demonstrates the constant-usage shape:
events.EventAgentMessage → "AGENT_MESSAGE"
- Other ~30 call sites stay on bare strings for now (this PR
narrow); the migration happens in PR-B-1 follow-up. Both
shapes (constant + bare string) co-exist on the wire — the
typed version is just the recommended path for new code.
Why ship this in stages
1. PR-B (this): types + tests + first migration → MERGEABLE NOW,
low risk.
2. PR-B-1 (follow-up): migrate the remaining ~30 call sites to
constants. Mechanical, low-risk.
3. PR-B-2 (follow-up): canvas/src/lib/ws-events.ts mirror + cross-
language parity gate. Touches both repos.
Per memory feedback_oss_design_philosophy.md (every refactor toward
OSS plugin shape) — this surface is now plugin-safe: external
implementations can import the events package and get the same
named taxonomy without copying strings.
Verified
- go vet ./internal/events/ clean
- go build ./... clean
- TestAllEventTypes_IsSnapshot + TestEventType_* all pass
- TestAgentMessageWriter_* (the only call site touched) still green
Refs RFC #2945, PR #2949 (PR-A SSOT), PR #2944 (reno-stars).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
07d09f3696
|
Merge pull request #2959 from Molecule-AI/rfc-2945-pr-a-followup-utf8-and-db-errors
fix(handlers): UTF-8-safe preview truncation + distinguish DB errors from not-found (PR-A followup) |
||
|
|
feef80423b
|
Merge pull request #2958 from Molecule-AI/fix/external-connect-templates-mcp-command
fix(external-connect): use molecule-mcp wrapper in Codex/OpenClaw templates (#2957) |
||
|
|
1e01083e55 |
fix(handlers): UTF-8-safe preview truncation + distinguish DB errors from not-found (RFC #2945 PR-A followup)
Self-review of PR #2949 surfaced two pre-existing defects that the SSOT consolidation inherited from the original /notify handler. Both are addressable in a small follow-up; shipping them as a separate PR keeps the consolidation and the bug-fix individually reviewable. Critical: byte-slice preview truncation produces invalid UTF-8 ------------------------------------------------------------- Pre-fix: if len(preview) > 80 { preview = preview[:80] + "…" } `len()` returns BYTES; `preview[:80]` slices on a byte boundary. For agent-authored chat in CJK / emoji / accented characters, byte 80 lands mid-codepoint → invalid UTF-8 → Postgres JSONB rejects → INSERT fails → activity_log row never written → message vanishes from chat history on the next reload. The persistence-failure log fires but operators have to grep to find it, and the user-visible regression mode is identical to reno-stars. Fix: extract `truncatePreviewRunes(s, maxRunes)` that walks the rune boundary using `for i := range s` (Go's range over string yields rune start indices). Cap at 80 RUNES not bytes — UI-friendly count, not storage count. Important: workspace-lookup error path swallows real DB errors -------------------------------------------------------------- Pre-fix: if err := w.db.QueryRowContext(...).Scan(&wsName); err != nil { return ErrWorkspaceNotFound } Conflates `sql.ErrNoRows` (legit not-found → caller 404) with real DB errors (connection drop, query timeout, pool exhaustion → caller should 503). During a Postgres outage every notify call surfaced as "workspace not found" — masking the actual incident in alerting and making the symptom indistinguishable from "you typed a bad workspace ID". Fix: distinguish via `errors.Is(err, sql.ErrNoRows)` and wrap non-not-found errors with `fmt.Errorf("agent_message: workspace lookup: %w", err)`. Callers' existing fallback path (return 500 / return error wrapped) handles the new shape correctly without any changes — verified by running existing TestNotify_* and TestMCPHandler_SendMessage_* tests. Tests added (3 new, 11 total writer tests) ------------------------------------------ - TestTruncatePreviewRunes_RuneBoundary: 8-case table — ASCII, CJK, exactly-at-max, emoji prefix. Asserts both correct visible output AND `utf8.ValidString` on every result so the bug shape (invalid UTF-8) can't recur. - TestAgentMessageWriter_Send_NonASCIIMessagePersists: end-to-end with a 200-rune CJK message (exceeds the 80-rune cap, would have hit the byte-slice bug). Pins the INSERT summary contains valid UTF-8 with exactly 80-rune body + ellipsis. - TestAgentMessageWriter_Send_DBErrorOnLookupReturnsWrapped: pins the DB-outage path returns a wrapped non-ErrWorkspaceNotFound error so alerting can distinguish 404 from 503. Verified via mock ExpectQuery returning a transient error. Verified -------- - `go vet ./internal/handlers/` clean - `go build ./...` clean - All 14 writer + caller tests pass (8 original + 3 new + AST gate + TestNotify_* + TestMCPHandler_SendMessage_* sibling tests) Per memory feedback_assert_exact_not_substring.md: every new test asserts boundary behavior directly (UTF-8 validity, exact rune count, errors.Is comparison) rather than substring-match in stringified output. Refs RFC #2945, PR #2949, PR #2944. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
eab36e217e |
fix(external-connect): use molecule-mcp wrapper in Codex/OpenClaw templates (#2957)
The External Connect modal's Codex and OpenClaw tabs were rendering this MCP server config: command = "python3" args = ["-m", "molecule_runtime.a2a_mcp_server"] That spawns the bare MCP dispatcher with no presence wiring. The ``molecule-mcp`` console-script wrapper (mcp_cli.main) is what calls ``POST /registry/register`` at startup and runs the 20s heartbeat thread alongside the MCP stdio loop. Without the wrapper, the canvas flips the workspace back to ``awaiting_agent`` (OFFLINE) within 60-90s — even while tools work — because nothing is heartbeating. Operator-side this looks like: the workspace is registered and tools work fine when invoked, but the canvas shows "offline" / "Restart" CTA, peer agents see the workspace as awaiting_agent in list_peers output, and inbound A2A delivery silently fails the readiness check. A new external-Codex operator (#2957) hit this and spent debugging time on what should have been a copy-paste install. Fix: switch both Codex and OpenClaw templates to ``command = "molecule-mcp"`` / ``args = []``, matching the universal MCP template that already handles this correctly. Inline comment in each template explains the wrapper-vs-bare-module tradeoff so a future template author doesn't regress to the shorter form. Hermes-channel intentionally still spawns the bare module — the hermes plugin owns the platform plugin path and runs its own register_platform/heartbeat code in-process; double-heartbeating would race. Universal/Codex/OpenClaw all need the wrapper. Regression gate: TestExternalMcpTemplates_UseMoleculeMcpWrapper asserts the three templates that must use the wrapper actually do, and explicitly fails on the old ``-m molecule_runtime.a2a_mcp_server`` shape. Verified the test FAILS on pre-fix source by stashing only external_connection.go and re-running. Source: molecule-core#2957 issue 1 (item 4 of the report — the ``(codex returned empty output)`` / opaque-canvas-error / stale- session items live in codex-channel-molecule and are tracked separately). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
decec9b9a1
|
Merge pull request #2956 from Molecule-AI/feat/memory-tab-v2-redesign
feat(memory): redesign Memory tab for v2 plugin |
||
|
|
f0f4d0e761 |
feat(memory): redesign Memory tab for v2 plugin
Replaces the v1 LOCAL/TEAM/GLOBAL tab trio (mapped to the deprecated
shared_context model) with a v2 plugin-driven UI. Without this,
canvas Memory tab was reading the frozen agent_memories table while
all post-cutover agent writes went to the plugin's memory_records —
the tab silently displayed stale data.
## Backend (workspace-server)
New routes under wsAuth, all behind the existing per-tenant token:
GET /workspaces/:id/v2/namespaces → readable + writable lists
GET /workspaces/:id/v2/memories → plugin search proxy
DELETE /workspaces/:id/v2/memories/:mid → plugin forget proxy
memories_v2.go — slim handler:
- Server-side ACL: every search request is intersected with the
resolver's readable-namespaces set (canvas-supplied namespace
that the workspace can't read returns [] not 403, matches v1
existence-non-inferring shape).
- Returns 503 with "set MEMORY_PLUGIN_URL" hint when plugin
isn't wired (canvas surfaces a banner).
- Maps plugin not_found → 404, other plugin errors → 502.
- View shaping: NamespaceView.label rendered server-side
("Workspace (abc-1234)", "Team (t-99)", "Org (acme)", custom)
so canvas doesn't parse namespace names. MemoryView surfaces
pin/expires_at/score/source_workspace_id from Propagation.
memories_v2_test.go — 100% line + 100% function coverage:
- 503 path on every endpoint when unwired
- Namespaces success + readable/writable error paths
- Search: empty intersection, full-path query/kind/limit
propagation, namespace=/no-namespace branches, propagation
map missing/wrong-type, intersect error, plugin error
- Forget: success, plugin not_found→404, other plugin
errors→502, missing memoryId→400
- Helpers: namespaceLabel for all 4 kinds + truncation,
parseLimit edge cases (default/0/negative/over-cap/non-num),
memoryToView field round-trip, indexOfColon, shortID
## Frontend (canvas)
MemoryInspectorPanel rewritten for v2:
- Drop LOCAL/TEAM/GLOBAL trio. Namespace dropdown driven by
GET /v2/namespaces.readable, "All namespaces" default.
- New per-row badges: kind (F/S/C), source (agent/runtime/user),
pin (📌), TTL countdown (⌛12h / "expired"), score% on
semantic search, source-workspace ⇡ws-pee for propagated.
- Drop Edit button — v2 plugin contract has no PATCH; the
model is forget + recommit. Forget stays.
- Plugin-unavailable banner with operator hint when /v2/*
returns 503.
- Bug fix surfaced by test: rollback-on-failed-delete order
of operations (loadEntries() called setError(null) AFTER
we set the failure message, wiping it). Reload first, then
set the error.
MemoryEditorDialog deleted — Add was POST /memories which v2
doesn't support from canvas (writes go via MCP). The legacy
Edit-flow tests go with it.
## Test results
Backend: `go test ./internal/handlers/` — all pass
Backend coverage on memories_v2.go: 100% lines, 100% functions
Canvas: `vitest run` — 91 files, 1273 tests pass (26 new)
Canvas coverage on MemoryInspectorPanel.tsx: 100% lines,
100% functions, 96.7% statements, 84.7% branches
(uncovered branches are defensive `?? fallback` for
contract-impossible kind/source values)
## Migration note
The legacy v1 GET/POST/PATCH/DELETE on /workspaces/:id/memories
remains in place for the back-compat MCP shim (mcp_tools_memory_v2's
legacy routing) and admin export/import. PR-9 (#283) drops
agent_memories along with the v1 endpoints once the cutover
verification window closes.
|
||
|
|
d99b3f2aec |
refactor(handlers): consolidate Notify + MCP send_message_to_user through AgentMessageWriter (RFC #2945 PR-A)
Pre-RFC-#2945 the broadcast + activity_log INSERT for "agent → user chat" was duplicated across two handlers — activity.go's Notify (HTTP /notify) and mcp_tools.go's toolSendMessageToUser (MCP tools/call). The duplication is exactly what produced the reno-stars production data-loss regression (PR #2944): the persistence-half fix landed for one handler and silently lagged for the other for months, dropping every long-form external-agent message on reload. PR #2944 added the missing INSERT to mcp_tools.go and a forward- looking AST gate. This PR removes the duplication at the source. What changes ------------ NEW: workspace-server/internal/handlers/agent_message_writer.go - AgentMessageWriter struct + NewAgentMessageWriter ctor. - Send(ctx, workspaceID, message, attachments) error: workspace lookup → broadcast WS AGENT_MESSAGE → INSERT activity_logs. - ErrWorkspaceNotFound for the lookup-miss path so callers can return 404 / JSON-RPC error cleanly. - Best-effort persistence: INSERT failure logs only, returns nil so the broadcast success isn't undone (matches previous behavior in both call sites — pinned by test). - Takes events.EventEmitter (interface) so tests can substitute a capturing fake without nil-panicking inside hub.Broadcast. UPDATED: activity.go:Notify - Replaced ~75 lines of inline broadcast+INSERT with a 12-line call to AgentMessageWriter.Send. - Attachment shape conversion (NotifyAttachment → AgentMessageAttachment) is local to the HTTP handler; the writer's API doesn't import the HTTP-binding-tagged type. UPDATED: mcp_tools.go:toolSendMessageToUser - Replaced ~40 lines (the post-#2944 broadcast+INSERT pair) with a 6-line call to the writer. - Attachments is nil today because the MCP tool args don't expose attachments yet. When the schema adds it, build the slice and pass through; the writer half is ready. Tests ----- agent_message_writer_test.go (8 tests, comprehensive): - TestAgentMessageWriter_Send_Success_NoAttachments — happy path, pins JSON `{"result":"hi"}`. - TestAgentMessageWriter_Send_Success_WithAttachments — pins file parts shape (kind=file, file.{uri,name,mimeType,size}). Uses a jsonMatcher that decodes + asserts via predicate (tolerant of map key ordering, exact on shape). - TestAgentMessageWriter_Send_WorkspaceNotFound — pins ErrWorkspaceNotFound + asserts NO broadcast NO INSERT. - TestAgentMessageWriter_Send_DBInsertFailureStillReturnsNil — pins best-effort persistence contract. - TestAgentMessageWriter_Send_PreviewTruncation — pins ≤80-char preview + ellipsis (Ryan's onboarding-friction report would have bloated activity_logs.summary by 2KB without this). - TestAgentMessageWriter_Send_BroadcastsAgentMessageEvent — pins WS event name + payload shape via capturingEmitter. - TestAgentMessageWriter_Send_OmitsAttachmentsKeyWhenEmpty — pins the "no key when nil" wire contract. The existing AST gate from #2944 (TestAgentMessageBroadcastsArePersisted) still holds: any future function emitting AGENT_MESSAGE without an INSERT fails the test. With the writer in place that's now redundant — both producers go through it — but the gate is cheap to keep as defense-in-depth. Verified: go vet clean; all writer + caller tests pass; existing TestNotify_* + TestMCPHandler_SendMessage_* + the AST gate all green. Refs RFC #2945, PR #2944. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
899c53550d |
test(mcp): comprehensive coverage for send_message_to_user persistence + AST gate (reno-stars followup)
Per user request: audit all similar tools + write comprehensive tests
including E2E for the persistence-of-AGENT_MESSAGE-broadcasts contract.
Audit (all BroadcastOnly call sites in workspace-server/internal/):
| Site | Event | Persisted? | Notes |
|---|---|:---:|---|
| a2a_proxy_helpers.go:275 | A2A_RESPONSE | ✓ | LogActivity above |
| activity.go:486 (Notify) | AGENT_MESSAGE | ✓ | INSERT line 535 |
| activity.go:701 (LogActivity) | ACTIVITY_LOGGED | ✓ | self-emits inside DB write |
| mcp_tools.go:341 (toolSendMessageToUser) | AGENT_MESSAGE | ✓ NEW (this PR) |
| registry.go:575 | TASK_UPDATED | N/A | transient progress, not chat |
| registry.go:596 | WORKSPACE_HEARTBEAT | N/A | infra ping, not chat |
Only one chat-bearing broadcast was missing persistence (the just-
fixed mcp bridge path). No other regressions found.
Tests added (4 new, total 5 send_message_to_user tests):
1. TestAgentMessageBroadcastsArePersisted — AST gate that walks every
non-test .go in the package, finds funcs that BroadcastOnly with
"AGENT_MESSAGE", asserts each ALSO contains an
"INSERT INTO activity_logs". Forward-looking regression block:
any future chat tool that broadcasts without persisting fails the
test with a clear file:func diagnostic. Mutation-tested locally:
removing the INSERT block from toolSendMessageToUser reliably
produces the expected failure.
2. TestMCPHandler_SendMessageToUser_DBErrorLogsAndStill200s — pins
the "best-effort persistence" contract. DB INSERT failures must
NOT abort the tool response (the WS broadcast already succeeded;
retrying would double-render in the live chat). Matches /notify.
3. TestMCPHandler_SendMessageToUser_ResponseBodyShape — pins the
exact `{"result": "<message>"}` JSON shape stored in
response_body. The canvas hydrater (extractResponseText in
historyHydration.ts) reads body.result; any drift here silently
breaks chat history without failing the INSERT. Per memory
feedback_assert_exact_not_substring.md, asserts the literal JSON
shape, not a substring.
4. TestMCPHandler_SendMessageToUser_PersistsToActivityLog (existing,
from previous commit) — pins INSERT shape with regex on
'a2a_receive' + 'notify' literals.
5. TestMCPHandler_SendMessageToUser_Blocked_WhenEnvNotSet (existing)
— env-gate aborts before DB.
Test fixture cleanup: newMCPHandler now uses newTestBroadcaster (real
ws.Hub) instead of events.NewBroadcaster(nil) — the latter nil-panics
inside hub.Broadcast on the AGENT_MESSAGE path. Same broadcaster
shape every other handler test uses.
E2E note: the AST gate is the strongest forward-looking guarantee.
A real-DB integration test would add value for CI but is largely
duplicative of the sqlmock contract tests above (sqlmock pins SQL
shape with much faster feedback). Left as a future enhancement when
the handlers Postgres-integration suite extends MCP coverage.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
cdfc9f743f |
fix(mcp): persist send_message_to_user pushes to activity_log (reno-stars data loss)
Reported on production tenant reno-stars: an external claude-code agent
(CEO Ryan PC workspace) sent a long-form message via send_message_to_user;
the user saw it live in the chat panel but it vanished after a refresh.
Confirmed via direct production query — the message is NOT in
activity_logs at all (only short test pings around it are persisted).
Root cause: there are TWO server-side handlers for send_message_to_user:
1. HTTP `/workspaces/:id/notify` (activity.go:Notify) — broadcasts WS
AND inserts a row into activity_logs. This is the path the
in-container runtime's tool_send_message_to_user calls.
2. MCP-bridge `tools/call name=send_message_to_user`
(mcp_tools.go:toolSendMessageToUser) — broadcasts WS only,
**never persisted**. This is the path EXTERNAL agents using
molecule-mcp's send_message_to_user tool route through.
The persistence fix landed for path 1 months ago but was never mirrored
on path 2. External agents — exactly the case in reno-stars/CEO Ryan PC
— have been silently losing every long-form notification on reload.
Fix: mirror the activity.go INSERT shape inside toolSendMessageToUser:
INSERT INTO activity_logs
(workspace_id, activity_type, method, summary, response_body, status)
VALUES ($1, 'a2a_receive', 'notify', $2, $3::jsonb, 'ok')
Same wire shape as /notify so the canvas's chat-history hydration
(`type=a2a_receive&source=canvas`) treats both writers identically.
Errors are log-only — broadcast already succeeded, persistence failure
shouldn't block the tool response (matches /notify behavior; downside
is the same data-loss-on-DB-error risk, surfaced via log.Printf).
Tests
-----
- `TestMCPHandler_SendMessageToUser_PersistsToActivityLog` — pins both
the workspace-name lookup AND the INSERT shape. Regex-matches
`'a2a_receive'` + `'notify'` literals so a future refactor that
changes activity_type or method breaks the test loud, not silently
re-introducing the data-loss bug.
- Updated newMCPHandler to use newTestBroadcaster() (real ws.Hub) —
events.NewBroadcaster(nil) crashes inside hub.Broadcast in the
send_message_to_user path. Same shape every other handler test uses.
Verified `go test ./internal/handlers/ -run TestMCPHandler_SendMessage`
green; full vet clean.
Refs reno-stars production incident 2026-05-05.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f3782662bd |
refactor(external-connect): embed help in agent paste, fix wrong docs hostname
Two related fixes to the Connect-External-Agent flow that the user flagged: the "Need help?" disclosure block in the modal is for the operator's eyes only — but the agent reading the pasted snippet has no access to that context. And the docs URL was pointing at a hostname that doesn't resolve. User-visible problems: 1. The agent doesn't see the install link, docs link, or the common- error/check pairs that the human pasted. When the agent fails to register or hits ConnectionRefused, it can't self-diagnose because the troubleshooting context lives in a separate UI block. 2. https://docs.molecule.ai → DNS NXDOMAIN. Every "Documentation" link in the modal was a dead link. ## Fixes ### Move help INTO the snippet (not a separate human-only UI block) Each of the 7 server-rendered templates in `workspace-server/internal/handlers/external_connection.go` now appends a `# Need help?` section with: install link, correct docs link, and the top common errors as `# • symptom — check` pairs. Templates updated: curl / channel (Claude Code) / mcp (Universal MCP) / python / hermes / codex / openclaw. Agents reading the paste now have the same diagnostic context the human did. ### Drop the duplicated UI block in the canvas modal `canvas/src/components/ExternalConnectModal.tsx`: - Removed the `TAB_HELP` per-tab metadata constant (152 lines). - Removed the `HelpBlock` component (62 lines). - Removed the `<HelpBlock help={TAB_HELP[tab]} />` render call. The snippet is now the single source of truth for tab-level help. ### Fix the wrong docs hostname The actual docs site is `doc.moleculesai.app` (singular `doc`, `.app` not `.ai`), confirmed by: - `package.json` description in `Molecule-AI/docs` repo → "Molecule AI documentation site — doc.moleculesai.app" - HTTP HEAD on the new URL → 200 for both `/docs/guides/mcp-server-setup` and `/docs/guides/external-agent-registration` - HTTP HEAD on old `docs.molecule.ai` → 000 (NXDOMAIN) All template docs URLs now point at `doc.moleculesai.app`. ## Verification - `go build ./...` clean - `go test ./internal/handlers/... -count=1` green - `pnpm test` → 1291/1291 pass (unchanged) - `tsc --noEmit` clean - 219 LOC removed (canvas duplicate UI), 69 LOC added (snippet help) - Net `-150 LOC` while gaining the agent-readable help ## Out of scope (deferred, captured in followups) - One blog post still has `canonical: "https://docs.molecule.ai/blog/..."` in `src/app/blog/2026-04-20-chrome-devtools-mcp/page.mdx` — separate blog-content fix. - Comment in `theme-provider.tsx` references `docs.moleculesai.app` (with `s`) — comment-only, not a runtime URL. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
cb70d3d437 |
docs: callout Python>=3.11 requirement on Universal MCP install snippet
User-reported friction: pip install molecule-ai-workspace-runtime on a 3.10 interpreter fails with "Could not find a version that satisfies the requirement (from versions: none)" — pip's requires_python filter silently drops the only available artifact before attempting install, so the error doesn't mention Python at all. Operators see "package missing", file a bug, and chase a phantom CDN/visibility issue. Two changes mirror the requirement at the two operator-touch surfaces: 1. workspace-server/internal/handlers/external_connection.go: the externalUniversalMcpTemplate snippet (rendered into the canvas Connect-External-Agent modal) now leads with a brief "Requires Python >= 3.11" block + diagnostic + upgrade paths. 2. docs/workspace-runtime-package.md: same callout at the top of the doc, before the Overview, so anyone landing here from search gets the answer immediately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
423d58d42c |
fix(org-import): polish — wrap-safe ErrNoRows, bounded lookup, godoc
Three small hardening passes from #2872's optional/important findings, batched into one polish PR: 1. errors.Is(err, sql.ErrNoRows) instead of err == sql.ErrNoRows. The bare equality breaks if any future caller wraps the error via fmt.Errorf("…: %w", err) — the no-rows happy path would fall through to the "real DB error" branch and abort the import. errors.Is unwraps. New test TestLookupExistingChild_WrappedNoRows_TreatedAsNotFound pins the fix; verified the test fails on the old `==` shape (build break on unused-import + assertion failure once import dropped). 2. Bounded 5s timeout on lookupExistingChild instead of context.Background(). The createWorkspaceTree call site runs in goroutines spawned from the /org/import handler, so plumbing the request context here would cascade-cancel into provisionWorkspaceAuto and abort in-flight EC2 provisioning if the client disconnected mid-import — that's the wrong tradeoff. A short bounded timeout protects the per-row SELECT against a wedged DB without taking the drop-everything-on-disconnect behaviour. The lookup is a single ~10ms query; 5s leaves 500x headroom for transient slow paths. 3. Godoc clarifications on the skip-path block. - /org/import is ADDITIVE-ONLY, never destructive. Children present in the existing tree but absent from the new template are preserved (no DELETE on diff). - Skip-path does NOT propagate updates to existing nodes — a re-import that adds an initial_memory or schedule to an existing workspace is silently dropped. Document the limitation so future operators know to delete-and-re-import or reach for a future /org/sync route. Verification: - go build ./... → clean - go test ./internal/handlers/... → all passing (TestLookup* + TestCreateWorkspaceTree* + TestClass1* + TestGate*) - 4 lookup tests + 1 new wrap-safety test → 5/5 PASS - Full handlers suite → green Refs molecule-core#2872 (Optional findings — wrap-safety + ctx, godoc clarifications for additive-only + skip-path-update-limitation) Out of scope (deferred): - PR-D partial unique index migration + ON CONFLICT — sequenced after Phase 4 cleanup verified clean per #2872 plan - PR-E full createWorkspaceTree integration test for partial-match — needs heavier sqlmock scaffolding for downstream workspaces_audit/canvas_layouts/secrets/channels INSERTs; follow-up Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
60afcd43c9 |
test(handlers): generic Class 1 leak AST gate (#2867 PR-A)
Adds class1_ast_gate_test.go — a per-package AST walk that fails the
build if any handler function INSERTs INTO workspaces inside a range
loop body without one of three escape hatches:
1. A call to a registered preflight helper (lookupExistingChild today;
extend preflightCallNames as new helpers are introduced).
2. An ON CONFLICT clause in the same SQL literal (idempotent UPSERT,
like registry.go).
3. An explicit `// class1-gate: idempotent-by-design` comment in the
function body (deliberately awkward — forces a code-review beat).
Why this is broader than the existing
TestCreateWorkspaceTree_CallsLookupBeforeInsert gate in
org_import_idempotency_test.go: that one is hard-coded to one function
in one file. This one walks every non-test .go file in the handlers
package and applies a structural rule independent of file/function
names. A future handler written from scratch in a new file would not
have been covered before — now it is.
Detection mechanism (per AST):
- Collect spans (Lbrace..Rbrace) of every RangeStmt body in each
function. Position-based instead of stack-based — ast.Inspect's
nil-callback ordering doesn't give per-node pop semantics, so a
naive push/pop stack silently miscounts. Position spans are
deterministic.
- Walk every BasicLit, regex-match `^\s*INSERT INTO workspaces\(`
(tightened from bytes.Index "INSERT INTO workspaces" so
workspaces_audit literals don't false-positive — same regex used
by the existing createWorkspaceTree gate).
- For each match: record insertLine, hasONCONFLICT, and the
innermost enclosing RangeStmt line (or 0 if not inside any range).
- Fail the function if INSERT is inside a range AND no preflight
AND no ON CONFLICT AND no allowlist annotation.
Self-tests (per `feedback_assert_exact_not_substring.md` —
verify gate fails on the bug shape before merging):
- TestClass1_GateFiresOnSyntheticBuggySource: synthetic source
where INSERT is inside `for _, child := range children` body
must trigger the gate's three guards (enclosingRangeLine!=0,
hasONCONFLICT=false, no preflight call).
- TestClass1_GateAllowsONCONFLICT: synthetic INSERT...ON CONFLICT
must NOT trigger the gate (idempotent UPSERT case).
- TestClass1_GateAllowsAllowlistAnnotation: function with
`// class1-gate: idempotent-by-design` must be skipped.
- TestClass1_NoUnpreflightedInsertInsideRange: production sweep
over every handler .go file. Currently passes because
org_import.go preflights, registry.go ON-CONFLICTs, and
workspace.go's Create has no INSERT inside a range body.
Verification:
- go test ./internal/handlers/... -run TestClass1_ -count=1
→ 4/4 PASS
- go test ./internal/handlers/... -count=1 → suite green
(no pre-existing test broken by the new file)
Refs molecule-core#2867 (PR-A Class 1 generic AST gate)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
83454e5efd |
feat(workspace-server): structured logging at provisioning boundaries
Adds internal/provlog with a single Event(name, fields) helper that emits JSON-tagged single-line records to the standard logger. Five boundary sites instrumented for #2867: provision.start — workspace_dispatchers.go (sync + async) provision.skip_existing — org_import.go idempotency hit provision.ec2_started — cp_provisioner.go after RunInstances provision.ec2_stopped — cp_provisioner.go after TerminateInstances ack restart.pre_stop — workspace_restart.go before Stop dispatch These pair with the existing human-prose log.Printf lines (kept). The new records are grep+jq friendly so a future log-aggregation pipeline can reconstruct per-workspace provision timelines without parsing the operator messages — this is the "and debug loggers so it dont happen again" half of the leak-prevention work. Tests: - provlog: emits evt-prefixed JSON, nil-tolerant, marshal-error fallback preserves event boundary, single-line output pinned. - handlers: provlog_emit_test.go pins three call-site contracts: provisionWorkspaceAutoSync emits provision.start with sync=true, stopForRestart emits restart.pre_stop with backend=cp on SaaS, and backend=none when both backends are nil. Field taxonomy is convenience for ops, not contract — payload can grow additively without breaking callers. Behavior gate is the event name + boundary location, per feedback_behavior_based_ast_gates.md. Refs #2867 (PR-D structured logging at provisioning boundaries) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
eec4ea2e7d |
chore: delete TeamHandler.Collapse + docs cleanup (closes #2864)
Multi-model retrospective review of #2856 (Phase 1 Expand removal) flagged that TeamHandler.Collapse is unreachable from the canvas UI: the "Collapse Team" button calls PATCH /workspaces/:id { collapsed } (visual flag toggle on canvas_layouts), NOT POST /workspaces/:id/collapse. The destructive POST route — which stops EC2s, marks children removed, and deletes layouts — has zero UI callers (verified via grep across canvas/, scripts/, and the MCP tool registry; only docs referenced it). Two semantically different operations had been sharing the word "Collapse": - Visual collapse (canvas) → PATCH { collapsed: true }. Hides children visually. Reversible. UI-only. - Destructive collapse (POST /collapse) → Stops + marks removed. Irreversible. No caller. Deleting the destructive one + its supporting machinery: - workspace-server/internal/handlers/team.go (entirely) - workspace-server/internal/handlers/team_test.go (entirely) - POST /collapse route + teamh init in router.go - findTemplateDirByName helper (zero non-test callers after Expand was deleted in #2856; package-private so no out-of-package consumers) - NewTeamHandler constructor (no callers after route removed) Plus stale doc references (the most dangerous was the MCP wrapper mapping in mcp-server-setup.md — anyone generating MCP tool wrappers from that table was wiring a 404): - docs/agent-runtime/team-expansion.md (deleted entirely — whole guide taught the deleted flow) - docs/api-reference.md (dropped two team.go rows) - docs/api-protocol/platform-api.md (dropped /expand + /collapse rows) - docs/architecture/molecule-technical-doc.md (dropped /expand + /collapse rows) - docs/guides/mcp-server-setup.md (dropped expand_team + collapse_team MCP wrapper mappings) - docs/glossary.md (dropped "(org template expand_team)" parenthetical) - docs/frontend/canvas.md (dropped broken link to deleted team-expansion.md) Kept: docs/architecture/backends.md mention of "TeamHandler.Expand (#2367) bypassed routing on Start" — correct historical context for the AST gate's existence, no live route reference. Visual-collapse path unaffected: canvas/src/components/ContextMenu.tsx:227 → api.patch — unchanged canvas/src/components/WorkspaceNode.tsx:128 → api.patch — unchanged go vet ./... clean. go test ./internal/handlers/ -count 1 — all green (4.3s, no regression). Net: -388/+10 = ~378 lines removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9f551319d2 |
feat(saas): close 4th default-tier site + lift org_import asymmetry + tests (#2910)
Multi-model retrospective review of #2901 found three Critical gaps: 1. (#2910 PR-B) template_import.go:79 wrote `tier: 3` hardcoded into generated config.yaml. On SaaS this defeated the T4 default at the create-handler layer — a config-less template import landed at T3 regardless of POST /workspaces' computed default. The 4th default-tier site #2901 missed. 2. (#2910 PR-A) #2901 claimed `go test ... all green` but added zero new tests. Existing structural-pin tests caught dispatch-layer drift but said nothing about tier-default drift. A future refactor that flips DefaultTier() to always return 3 would ship green. 3. (#2910 PR-E) org_import.go fallback returned T2 on self-hosted while workspace.go returned T3. Internally consistent ("bulk vs interactive defaults") but undocumented same-name-different-value drift. Fix: - TemplatesHandler.NewTemplatesHandler now takes `wh *WorkspaceHandler` (nil-tolerant for read-only callers). Import + ReplaceFiles compute tier via h.wh.DefaultTier() and pass it to generateDefaultConfig. generateDefaultConfig gets a `tier int` parameter (bounds-checked, invalid input falls back to T3). - org_import.go fallback lifts to h.workspace.DefaultTier() — single source of truth shared with Create + Templates so a future tier-default change sweeps every entry point at once. - New saas_default_tier_test.go pinning: TestIsSaaS_TrueWhenCPProvWired TestIsSaaS_FalseWhenOnlyDocker TestDefaultTier_SaaS_IsT4 TestDefaultTier_SelfHosted_IsT3 TestGenerateDefaultConfig_RespectsTierParam TestGenerateDefaultConfig_SelfHostedTierT3 TestGenerateDefaultConfig_OutOfRangeFallsBackToT3 - Existing template_import_test.go tests + chat_files_test.go + security_regression_test.go updated to thread the new tier param / wh constructor arg through their NewTemplatesHandler calls. Their pre-#2910 assertion of `tier: 3` is preserved (now passes because the test caller passes `3` explicitly), so no regression. go vet ./... clean. go test ./internal/handlers/ -count 1 — all green (4.2s). Deferred to separate follow-ups (per #2910 plan): - PR-C: MOLECULE_DEPLOYMENT_MODE explicit deployment-mode signal (closes the IsSaaS()=cpProv!=nil structural fragility) - PR-D: Host iptables IMDS block + IMDSv2 hop-limit (paired with molecule-controlplane EC2-IAM-scope audit) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
5334d60de4
|
Merge pull request #2898 from Molecule-AI/2867-workspaces-insert-allowlist
test(handlers): allowlist INSERT INTO workspaces sites (#2867 class 1) |
||
|
|
27db090d3d
|
Merge pull request #2907 from Molecule-AI/feat/poll-mode-chat-upload-phase5a
feat(poll-upload): phase 5a — atomic batch insert + acked-index + mime hardening |
||
|
|
0f25f6de97 |
test(handlers): allowlist INSERT INTO workspaces sites — close bulk-create regression class (#2867 class 1)
Adds TestINSERTworkspacesAllowlist: walks every non-test .go in this package, finds funcs containing an `INSERT INTO workspaces (` SQL literal, and pins the result against an explicit allowlist with the safety mechanism named per entry. New entries fail the build until a reviewer adds them — forcing the question "what makes this INSERT idempotent?" at PR-review time, not after the next bulk-create leak (the shape that produced 72 stale child workspaces in tenant-hongming over 4 days). Pairs with TestCreateWorkspaceTree_CallsLookupBeforeInsert (the behavior pin for the one bulk path today). Together: - this test catches "did a new function start inserting?" - that test catches "did the existing bulk path drop its idempotency check?" Both fire immediately when drift happens. Current allowlist (3 entries): - org_import.go:createWorkspaceTree → lookup-then-insert via lookupExistingChild (#2868 phase 3, also pinned by the sibling AST gate from #2895) - registry.go:Register → ON CONFLICT (id) DO UPDATE (idempotent by primary key — external workspace upsert) - workspace.go:Create → single-workspace POST /workspaces, server- generated UUID, no iteration Verified via mutation: dropping a synthetic tempBulkLeakTest with an unsafe loop+INSERT into the package fails the gate with a clear diagnostic pointing at the file + function. Restoring the tree returns the gate to green. Memory: feedback_assert_exact_not_substring.md (verify tightened test FAILS on bug shape) — mutation proof done locally. RFC #2867 class 1. Class 2 (Prometheus gauge for ec2_instance duplicates) + class 3 (structured logging on workspace create) are follow-up PRs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9991057ad1 |
feat(poll-upload): phase 5a — atomic batch insert + acked-index + mime hardening
Resolves four of six findings from the retrospective code review of Phases 1–4 (poll-mode chat upload). Bundled because every change is in the platform's pending_uploads layer or the multi-file handler that reads it. Findings resolved: 1. Important — Sweep query lacked an index for the acked-retention OR-arm. The Phase 1 partial indexes are both `WHERE acked_at IS NULL`, so the `(acked_at IS NOT NULL AND acked_at < retention)` half of the WHERE clause seq-scanned the table on every cycle. Add a complementary partial index on `acked_at WHERE acked_at IS NOT NULL` so both arms of the disjunction are index-covered. Disjoint from the existing two indexes (no row matches both predicates), so write amplification is bounded to ~one index entry per terminal-state row. 2. Important — uploadPollMode partial-failure left orphans. The previous per-file Put loop committed rows 1..K-1 and then errored on row K with no compensation, so a client retry would double-insert the survivors. Refactor the handler into three explicit phases (pre-validate + read-into-memory, single atomic PutBatch, per-file activity row) and add Storage.PutBatch with all-or-nothing transaction semantics. 3. FYI — pendinguploads.StartSweeperWithInterval was exported only for tests. Move it to lower-case startSweeperWithInterval and expose the test seam through pendinguploads/export_test.go (Go convention; the shim file is stripped from the production binary at build time). 4. Nit — multipart Content-Type was passed verbatim into pending_uploads rows and re-served on /content. Add safeMimetype which strips parameters, rejects CR/LF/control bytes, and coerces malformed shapes to application/octet-stream. The eventual GET /content response can no longer be header-split via a crafted Content-Type on the multipart. Comprehensive tests: - 10 PutBatch unit tests (sqlmock): happy path, empty input, all four pre-validation rejection paths, BeginTx error, per-row error + Rollback (no Commit), first-row error, Commit error. - 4 new PutBatch integration tests (real Postgres): all-rows-commit happy path with COUNT(*) verification, atomic-rollback no-leak via a NUL-byte filename that lib/pq rejects mid-batch, oversize short-circuit no-Tx, idx_pending_uploads_acked existence + partial predicate via pg_indexes (planner-shape-independent). - 3 new chat_files_poll tests: atomic rollback on second-file oversize, atomic rollback on PutBatch error, mimetype CRLF/NUL/parameter sanitization (8 sub-cases). The two remaining review findings (inbox_uploads.fetch_and_stage blocks the poll loop synchronously; two httpx Clients per row) are Python-side and ship in Phase 5b once this lands on staging. Test-only export pattern via export_test.go, atomic pre-validation discipline (validate before Tx), and behavior-based (not name-based) test assertions follow the standing project conventions. |
||
|
|
c79ba05ed5 |
test(pendinguploads): close cycleDone-vs-metric-record race in sweeper tests
TestStartSweeper_RecordsMetricsOnError flaked on every CI rerun under race detection: `error counter delta = 0, want 1`. Root cause is a race between two goroutines, not a bug in the production sweeper. The fake `fakeSweepStorage.Sweep` signals `cycleDone` from inside its deferred return — that happens BEFORE Sweep's return value is received by `sweepOnce`, which is what triggers the metric increment. On slow CI hosts the test goroutine wins the read after `waitForCycle` unblocks and BEFORE StartSweeper's goroutine has called `metrics.PendingUploadsSweepError`, so the asserted delta is 0 even though the metric WILL be 1 a few ms later. Adds a polling assert helper, `waitForMetricDelta`, that closes the race deterministically without timing-based sleeps: - TestStartSweeper_RecordsMetricsOnError uses waitForMetricDelta to wait for the error counter to settle at 1. - TestStartSweeper_RecordsMetricsOnSuccess uses it on the success counters (acked, expired) so the error-stayed-zero assertion reads after StartSweeper has fully processed the cycle. - waitForCycle keeps its current shape but documents the caveat in its comment so future tests don't repeat the assumption. Verified: `go test ./internal/pendinguploads/ -race -count 5` passes all 9 tests across 5 iterations cleanly. Per memory feedback_question_test_when_unexpected.md: the "delta=0, want=1" failure looked like a real production bug at first glance, but instrumented inspection showed the metric DOES increment, just AFTER the test's read. The fix is the test's wait shape, not the sweeper. Unblocks every PR currently broken by this flake (#2898 hit it on two consecutive CI runs; staging-merged PRs from earlier today (#2877/#2881/#2885/#2886) introduced the test). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
7644e82f2f |
feat(saas): default new workspaces to T4 on SaaS, T3 self-hosted
User reported every SaaS workspace defaults to T2 (Standard). Three sites quietly disagreed on the default: - canvas CreateWorkspaceDialog (line 126): isSaaS ? 4 : 3 ← only correct one - canvas EmptyState "Create blank": tier: 2 ← hardcoded - workspace.go POST /workspaces: tier = 3 ← not SaaS-aware - org_import.go createWorkspaceTree: tier = 2 (fallback)← not SaaS-aware So a user clicking "+ New Workspace" via the dialog got T4 on SaaS, but a user clicking "Create blank" on the empty canvas got T2, and an agent POSTing /workspaces directly got T3. Same tenant, three different tiers depending on entry point. Fix: 1. WorkspaceHandler.IsSaaS() and DefaultTier() helpers (workspace_dispatchers.go). IsSaaS() := h.cpProv != nil — single source of truth for "are we SaaS" across the file. DefaultTier() returns 4 on SaaS, 3 on self-hosted. SaaS rationale: each workspace runs on its own sibling EC2 so the per-workspace tier boundary is a Docker resource limit on the only container present — no neighbour to protect from. T4 matches the boundary. 2. workspace.go now defaults tier via h.DefaultTier() instead of hardcoded T3. 3. org_import.go fallback (when neither ws.tier nor defaults.tier set) becomes SaaS-aware: T4 on SaaS, T2 on self-hosted (preserve the existing safe-shared-Docker-daemon default for self-hosted org imports). 4. canvas EmptyState "Create blank" stops sending tier:2 in the body and lets the backend pick — single source of truth in the backend. Eliminates the third disagreement. Test plan: - go vet ./... clean - go test ./internal/handlers/ -count 1 — all green (4.3s) - npx tsc --noEmit on canvas — clean - Staging E2E (after deploy): create a fresh workspace via canvas empty-state on hongming.moleculesai.app, confirm tier=4 on the workspace details panel. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
48d1945269 |
test(org-import): tighten AST gate to discriminate workspaces vs lookalikes (#2872 Imp-1)
The previous TestCreateWorkspaceTree_CallsLookupBeforeInsert used
bytes.Index("INSERT INTO workspaces"), which prefix-matches
INSERT INTO workspaces_audit, INSERT INTO workspace_secrets, and
INSERT INTO workspace_channels. RFC #2872 cited this as a silent
false-pass mode: a future refactor that adds an audit-table INSERT
literal earlier in source than the real workspaces INSERT would
make the gate point at the wrong target.
Replaces the byte-search with a go/ast walk + a regex that requires
`\s*\(` after `workspaces` — distinguishes the real target from
prefix lookalikes.
Adds three discriminating tests:
- TestWorkspacesInsertRE_RejectsLookalikes — pins the regex against
9 sql shapes (real, raw-string-literal, audit-shadow, workspace_*
prefixes, canvas_layouts, UPDATE/SELECT, comments).
- TestGate_FailsWhenLookupAfterInsert — synthesizes Go source where
the lookup is positioned AFTER the workspaces INSERT, asserts the
helper returns lookupPos > insertPos (which the production gate
flags via t.Errorf). Proves the gate isn't vestigial.
- TestGate_IgnoresAuditTableShadow — synthesizes source with an
audit-table INSERT BEFORE the lookup + real INSERT, asserts the
tightened regex correctly walks past the shadow and finds the
real INSERT.
Also extracts findLookupAndWorkspacesInsertPos as a helper so the
gate logic can be exercised against synthetic source, not only
against the real org_import.go.
Memory: feedback_assert_exact_not_substring.md (verify tightened
test FAILS on old code) — TestGate_FailsWhenLookupAfterInsert is
the failing-on-bug-shape proof.
Closes the silent-false-pass mode of #2872 Important-1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
e50799bc29 |
test(rfc): poll-mode chat upload — phase 4 real-Postgres integration
Phase 4 closes out the rollout — strict-sqlmock unit tests pin which
SQL fires, but they cannot detect bugs that depend on the actual row
state after the SQL runs. Real-Postgres integration tests catch:
- the Sweep CTE depends on Postgres' make_interval function and
the table's CHECK constraints; sqlmock would happily accept a
hand-written SQL literal that Postgres rejects at runtime.
- the partial idx_pending_uploads_unacked index only catches a
wrong WHERE predicate at real-query-plan time.
- subtle predicate drift (e.g. a WHERE clause that filters by
acked_at IS NOT NULL but uses BETWEEN incorrectly).
Test cases:
- PutGetAckRoundTrip: the full happy path — Put, Get, MarkFetched,
Ack, idempotent re-Ack, Get-after-Ack returns ErrNotFound.
- Sweep_DeletesAckedAfterRetention: row not eligible at retention=1h
immediately after Ack; deleted at retention=0.
- Sweep_DeletesExpiredUnacked: backdated expires_at exercises the
unacked-and-expired branch of the WHERE clause.
- Sweep_DeletesBothCategoriesInOneCycle: three rows (acked, expired,
fresh); a single Sweep deletes the first two and leaves the third.
- PutEnforcesSizeCap: ErrTooLarge above MaxFileBytes.
- GetIgnoresExpiredAndAcked: Get filters predicate matches expected
row state in the table.
Run path:
- locally via the file-header docker incantation.
- CI runs on every PR/push that touches handlers/** OR migrations/**
(.github/workflows/handlers-postgres-integration.yml).
|
||
|
|
a327d207da |
feat(rfc): poll-mode chat upload — phase 3 GC sweep + observability
Phase 3 of the poll-mode chat upload rollout. Stack atop Phase 2.
The platform's pending_uploads table grows once-per-uploaded-file with
no built-in cleanup. Phase 1's hard TTL (expires_at default 24h) makes
expired rows un-fetchable but doesn't actually delete them; Phase 1's
ack stamps acked_at but leaves the row indefinitely. Without a sweep
the table grows unbounded across normal traffic.
This PR adds:
- `Storage.Sweep(ctx, ackRetention)` — a single round-trip CTE that
deletes acked rows past their retention window plus unacked rows
past expires_at. Returns `(acked, expired)` deletion counts so
Phase 3 dashboards can spot the stuck-fetch pattern (high expired,
low acked) vs healthy churn.
- `pendinguploads.StartSweeper(ctx, storage, ackRetention)` —
background goroutine that calls Sweep every 5 minutes (default).
Runs once immediately on startup so a platform restart cleans up
any rows that became eligible while we were down.
- Prometheus counters `molecule_pending_uploads_swept_total` with
`outcome={acked,expired,error}` labels. Wired into the existing
`/metrics` endpoint.
- Wired from cmd/server/main.go via supervised.RunWithRecover —
one transient panic doesn't take the platform down with it.
Defaults:
- SweepInterval = 5m (matches the dashboard refresh cadence)
- DefaultAckRetention = 1h (gives the workspace at-least-once retry
headroom in case it processed but failed to write the file before
crashing)
Test coverage: 100% on storage_test.go (extended with sweepSQL pin +
six Sweep test cases including negative-retention clamp + zero-retention
immediate-delete + DB error wrapping) and sweeper_test.go (ticker-driven
+ ctx-cancel + nil-storage + transient-error-doesn't-crash + metric
counter assertions).
Closes the third of four phases tracked on the parent RFC; phase 4 is
the staging E2E test.
|
||
|
|
c778b62202 |
feat(metrics): add molecule_phantom_busy_resets_total counter (#2865)
Closes #2865 (split-B of the #2669 root-cause stack). The phantom-busy sweep in workspace-server/internal/scheduler/scheduler.go already logs each row reset, but no aggregate metric surfaces "how often is this firing." A regression that causes high reset rates (e.g. controlplane#481's missing env vars, or future drift in the workspace runtime's task-lifecycle accounting) only surfaces when users complain. Fix: counter exposed at /metrics as molecule_phantom_busy_resets_total, incremented from sweepPhantomBusy after each row whose active_tasks was reset. Same shape as existing molecule_websocket_connections_active. Operator-side dashboard: alert when daily phantom-busy reset count > 0.5% of active workspaces. Today's steady-state is near-zero; any increase is a regression signal. Tests: - TestTrackPhantomBusyReset_IncrementsCounter - TestTrackPhantomBusyReset_RaceFreeUnderConcurrentWrites (50×200 concurrent writes; tests atomic invariant) - TestHandler_ExposesPhantomBusyResetsCounter (asserts HELP + TYPE + value lines in Prometheus text format) - TestHandler_PhantomBusyResetsZeroByDefault (fresh-process 0 contract — prevents a future refactor from accidentally dropping the metric from /metrics) Race-detector clean. Vet clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
243f9bc2b1
|
Merge pull request #2877 from Molecule-AI/feat/poll-mode-chat-upload-phase1
feat(rfc): poll-mode chat upload — phase 1 platform staging layer |