Compare commits

..

166 Commits

Author SHA1 Message Date
Molecule AI Dev Engineer A (Kimi) 87dbee381c github_token: add timeout and status check to env-based fallback
sop-checklist / review-refire (pull_request) Waiting to run
sop-checklist / na-declarations (pull_request) N/A: (none)
Block internal-flavored paths / Block forbidden paths (pull_request) Waiting to run
branch-protection drift check / Branch protection drift (pull_request) Waiting to run
cascade-list-drift-gate / check (pull_request) Waiting to run
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Waiting to run
Check migration collisions / Migration version collision check (pull_request) Waiting to run
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Waiting to run
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Waiting to run
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Waiting to run
Runtime Pin Compatibility / PyPI-latest install + import smoke (pull_request) Waiting to run
Secret scan / Scan diff for credential-shaped strings (pull_request) Waiting to run
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Waiting to run
gate-check-v3 / gate-check (pull_request) Waiting to run
qa-review / approved (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
audit-force-merge / audit (pull_request) Waiting to run
audit-force-merge / audit (pull_request_target) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Has been cancelled
CI / Platform (Go) (pull_request) Has been cancelled
CI / Canvas (Next.js) (pull_request) Has been cancelled
CI / Shellcheck (E2E scripts) (pull_request) Has been cancelled
CI / Canvas Deploy Reminder (pull_request) Has been cancelled
CI / Python Lint & Test (pull_request) Has been cancelled
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Has been cancelled
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Has been cancelled
Harness Replays / Harness Replays (pull_request) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Has been cancelled
E2E API Smoke Test / detect-changes (pull_request) Has been cancelled
CI / Detect changes (pull_request) Has been cancelled
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Has been cancelled
Handlers Postgres Integration / detect-changes (pull_request) Has been cancelled
Harness Replays / detect-changes (pull_request) Has been cancelled
Runtime PR-Built Compatibility / detect-changes (pull_request) Has been cancelled
The fallback generateAppInstallationToken used http.DefaultClient which
has no timeout. If GitHub API hangs, the handler hangs indefinitely,
blocking the workspace credential helper. Fix: use a 15s timeout client
and check HTTP status before JSON decode for a cleaner error on 401/403.

Related to #1101.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-31 19:58:21 +00:00
hongming-personal 50c3bdfd6c Merge pull request #3028 from Molecule-AI/rfc-2945-pr-d-message-store
CI / Detect changes (push) Successful in 11s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 16s
E2E API Smoke Test / detect-changes (push) Successful in 17s
Block internal-flavored paths / Block forbidden paths (push) Successful in 25s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 13s
Secret scan / Scan diff for credential-shaped strings (push) Failing after 53s
CI / Shellcheck (E2E scripts) (push) Failing after 6s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 29s
CI / Python Lint & Test (push) Failing after 39s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 24s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 53s
CI / Canvas (Next.js) (push) Failing after 3m21s
CI / Canvas Deploy Reminder (push) Has been skipped
CI / Platform (Go) (push) Failing after 3m47s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Failing after 14m15s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Failing after 14m33s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Failing after 14m34s
feat(messagestore): MessageStore interface + Postgres impl (RFC #2945 PR-D)
2026-05-06 06:42:13 +00:00
hongming-personal a33c879017 feat(messagestore): MessageStore interface + Postgres impl (RFC #2945 PR-D)
Closes #3026. Final piece of RFC #2945.

## What's new

New package internal/messagestore/ holds:

  - MessageStore interface — single read-side contract operators
    implement to plug in alternative chat-history backends.
  - ChatMessage / ChatAttachment / ListOptions types — canonical data
    shapes returned by any impl, mirrors canvas's TS ChatMessage.
  - PostgresMessageStore — platform-default impl wrapping the
    activity_logs query + A2A-envelope parser ported in PR-C.
    Behavior is byte-identical to the pre-PR-D handler.

## What moves

The activity_logs query, the parser (activityRowToChatMessages,
extractRequestText, extractChatResponseText, extractFilesFromTask,
etc.), and the internal-self-message predicate all migrate from
internal/handlers/chat_history.go into the new package. handlers/
chat_history.go becomes a thin HTTP-shape adapter:

  parse query params → store.List(ctx, workspaceID, opts) → emit JSON

Compile-time interface assertion in postgres_store.go catches future
drift if the interface evolves and the impl falls behind.

## Why this PR

OSS operators wanting to:

  - Tier hot/warm/cold storage (recent in Postgres, archival in S3)
  - Use a vector store with hybrid search (Pinecone, Weaviate)
  - Run an in-memory store for ephemeral test environments
  - Federate history across regions

…had no extension point — they'd have to fork the handler. This PR
makes that a constructor swap at router.go.

## Tests

  Parser-level (22 tests, MOVED to internal/messagestore/postgres_
  store_test.go): every TS test case in
  canvas/src/components/tabs/chat/__tests__/historyHydration.test.ts
  has a Go counterpart. Timestamp preservation, user/agent extraction,
  internal-self filter, role decision (status=error vs agent-error
  prefix), v0/v1 file shapes, malformed JSON resilience.

  Handler-level (9 NEW tests in internal/handlers/chat_history_test.go):
  thin adapter coverage using a fake MessageStore. UUID validation,
  before_ts RFC3339 validation, default limit, max-limit clamp,
  invalid-limit fallback, before_ts passthrough, empty-array (not
  null) JSON shape, attachment shape preservation, store-error → 502
  mapping.

  Compile-time interface conformance: PostgresMessageStore satisfies
  MessageStore, fakeStore (test fake) satisfies MessageStore.

  Mutation-tested. Removed UUID validation in the handler; confirmed
  TestChatHistoryHandler_RejectsNonUUIDWorkspaceID fires red (status
  200 instead of 400, non-UUID reaches the store). Restored, all
  green.

  Full handlers + messagestore + router test runs green; full repo
  go test ./... green.

## SSOT decision

ChatMessage / ChatAttachment / parser / DB query all live in
internal/messagestore/ ONLY. handlers/chat_history.go imports the
package and uses the types via messagestore.ChatMessage etc. — no
re-declaration anywhere.

## Three weakest spots (hostile-reviewer self-pass)

1. The internal-self prefix list (Delegation results are ready...) is
   a package var in messagestore/postgres_store.go. A future impl
   that wants to override the predicate must reach into the package
   to use IsInternalSelfMessage or define its own. Acceptable: the
   predicate is part of the contract; if an impl wants different
   semantics it owns that decision explicitly.

2. ListOptions has Limit + BeforeTS + HasBefore; future paging needs
   (after_ts, peer_id filter, role filter) require additive struct
   field additions, which is a soft API break for any impl that
   handles ListOptions positionally. Mitigated by Go's struct-literal
   convention (named fields by default); also flagged in the
   interface comment for impl authors.

3. The handler does NOT log when a store returns an error — it just
   maps to 502. An impl that wants to surface its error class up the
   stack can't, today. If/when an impl needs that, the interface can
   add a typed-error contract in a follow-up. Today's coverage is
   sufficient: most ops issues land in the store impl's own logs.

## Security review

  - Untrusted input? Same as PR-C — agent-emitted JSON parsed
    defensively. New fakeStore in tests can't reach production.
  - Trust boundary? Same. Interface lives BEHIND wsAuth; impls only
    see workspace IDs already authenticated.
  - Auth/authz? Inherited from handler; the interface doesn't
    authenticate.
  - PII / secrets in logs? Documented in the interface contract:
    impls MUST NOT log full message bodies / attachment URIs. The
    Postgres impl logs nothing on the happy path.
  - Output sanitization? Same plain-text + opaque-URI surface as
    PR-C. Canvas validates attachment-URI schemes.

No security-relevant changes beyond what /chat-history already
exposes via PR-C. Considered, not skipped.

## Versioning / backwards compat

  - New internal package. Zero public API change.
  - Single caller site in router.go updated (one-line constructor
    change). NewChatHistoryHandler() → NewChatHistoryHandler(store).
  - No schema change, no migration.
  - Existing /chat-history endpoint unchanged on the wire — clients
    don't notice the refactor.

## Phasing

This is the final RFC #2945 piece. Follow-ups parked:

  - PR-C-2 (canvas migration): swap canvas loadMessagesFromDB to call
    /chat-history instead of /activity. Independent of this PR;
    blocked only by canvas team's calendar.
  - Sample alternative impls (S3, in-memory) for OSS docs: separate
    PR when the first OSS consumer materializes; demonstration code
    untested against a real workload is anti-pattern.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-05-05 23:38:14 -07:00
hongming-personal e91186c4bf Merge pull request #3020 from Molecule-AI/rfc-2945-pr-c-chat-history
feat(workspace-server): server-side chat-history endpoint (RFC #2945 PR-C)
2026-05-06 06:23:12 +00:00
hongming-personal 089be695a9 Merge staging into rfc-2945-pr-c-chat-history 2026-05-05 23:18:52 -07:00
hongming-personal dcc870a6b7 feat(workspace-server): server-side chat-history endpoint (RFC #2945 PR-C)
Closes the SSOT gap for chat-history hydration: today every consumer
(canvas TS) re-implements an A2A-envelope walk to map activity_logs
rows into rendered ChatMessage objects. This PR moves that walk into
the server.

## What's added

GET /workspaces/:id/chat-history?limit=N&before_ts=T

Returns:

  {
    "messages": [
      {"id": "<uuid>", "role": "user"|"agent"|"system",
       "content": "...", "attachments": [...], "timestamp": "<RFC3339>"}
    ],
    "reached_end": false
  }

Auth chain: same wsAuth as /workspaces/:id/activity (tenant ADMIN_TOKEN
+ X-Molecule-Org-Id). No new trust boundary.

Filter: a2a_receive rows with source_id IS NULL — same canvas-source
filter the canvas applies via /activity?type=a2a_receive&source=canvas,
centralized so future API consumers don't need to know it.

## What's mirrored from canvas TS

Direct port of canvas/src/components/tabs/chat/historyHydration.ts
+ message-parser.ts:

  - extractRequestText / extractFilesFromUserMessage — user-side parts
    walk through request_body.params.message.parts[]
  - extractChatResponseText — agent-side response_body collector across
    the four shapes (string, A2A JSON-RPC parts, older nested
    parts.root.text, task artifacts) joined with "\n" (matches canvas
    multi-source collector — claude-code emits multiple text parts;
    hermes emits summary+artifacts)
  - extractFilesFromResponse / extractFilesFromTask — file walk across
    parts[] + artifacts[].parts[] + status.message.parts[] +
    message.parts[]
  - v0 hot path ({kind:"file", file:{...}}) AND v1 protobuf flat shape
    ({url, filename, mediaType}) both supported
  - Role decision: status='error' OR text starts with "agent error"
    (case-insensitive) → "system", else "agent"
  - isInternalSelfMessage prefix filter (Delegation results are
    ready...)
  - Timestamp pinned to row.created_at (regression cover for
    2026-04-25 bubble-collapse bug)

## Tests

22 unit tests in chat_history_test.go, every TS test case in
historyHydration.test.ts has a Go counterpart:

  Timestamp preservation (3): user/agent pin to created_at, two-rows
  produce two distinct timestamps.

  User-message extraction (5): text-only, internal-self skip,
  null body, attachments hydrated, attachments-only-when-text-empty,
  internal-self suppresses even with attachments.

  Agent-message extraction (4): result-string, status=error→system,
  agent-error-prefix→system, response_body.parts attachments,
  null body, no-text-no-files-no-bubble.

  End-to-end (1): paired user+agent same timestamp.

  Go-specific (5): malformed JSON returns empty (no panic), v1
  protobuf flat shape extraction, task-artifacts extraction, older
  nested root.text shape, basename helper edge cases.

  isInternalSelfMessage predicate (1): prefix match, non-prefix non-
  match, empty-text non-match.

Mutation-tested. Removed the role-promotion branch (status=error +
agent-error prefix → system); confirmed both
TestChatHistory_RoleSystemWhenStatusError and
TestChatHistory_RoleSystemWhenAgentErrorPrefix fire red. Restored.
Both green.

Full handlers test suite (4.3s) green; full repo `go test ./...` green.

## SSOT decision

Parsing logic lives in workspace-server/internal/handlers/chat_history.go
ONLY. Canvas keeps historyHydration.ts + message-parser.ts during the
transition because:

  - PR-C-2 (follow-up): canvas loadMessagesFromDB swaps to new
    endpoint. Today's canvas still calls /activity for backward
    compatibility.
  - The TS parsers are still load-bearing for LIVE message handling
    (WebSocket A2A_RESPONSE events) until RFC #2945 PR-B-2 mirrors
    the typed event payloads to canvas consumers.

Canvas's TS path will be deleted in a separate PR after a one-week
observation window confirms no live-message consumers depend on it.

## Security review

  - Untrusted input? YES — request_body and response_body come from
    agents (potentially OSS / third-party). Defensive: any malformed
    JSON returns empty content + no attachments, no panic. Tested
    via TestChatHistory_MalformedJSONInRequestBodyReturnsEmpty.
  - Trust boundary? Same as today: agent → workspace-server.
    No new boundary; reuses existing wsAuth middleware.
  - Auth/authz? Inherits wsAuth chain. Cross-workspace access blocked
    by existing TenantGuard middleware.
  - PII / secrets in logs? None. The handler logs nothing on the
    happy path; errors log 502 without body content.
  - Output sanitization? ChatMessage.content is plain text returned
    as-is; canvas already sanitizes via ReactMarkdown. Attachment
    URIs are agent-provided (workspace: / platform-pending: /
    https:); canvas's existing scheme allow-list still applies.

## Versioning / backwards compatibility

  - New endpoint /chat-history. /activity unchanged.
  - Canvas historyHydration.ts + message-parser.ts intact during
    transition (will be removed in PR-C-2 follow-up).
  - No public API consumer of /activity is broken — added route is
    additive.
  - No semver bump (server is internal versioning).

## Three weakest spots (hostile-reviewer self-pass)

1. extractRequestText returns ONLY parts[0].text. If a user message
   contains multiple text parts (uncommon — canvas only ever emits
   one), we lose later parts. Matches canvas exactly today, but a
   future change that emits multi-text user messages needs both
   parsers updated. Documented in code; covered by test if/when
   added.

2. activityRowToChatMessages rebuilds ChatMessage IDs every call (no
   caching). Each chat reload mints fresh UUIDs. This is fine because
   canvas dedupes by (role, content, timestamp window) not id, but a
   future API consumer that DID rely on id stability would break.
   Documented in the ChatMessage struct comment.

3. The handler scopes to source_id IS NULL only (canvas-source rows).
   A future "show all messages, including agent-to-agent" mode would
   need a new endpoint or a parameter. Out of scope for PR-C; canvas's
   /activity?source=canvas already enforces the same filter.

Closes #3017. Unblocks RFC #2945 PR-D (MessageStore interface) which
returns []ChatMessage typed values.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 23:17:26 -07:00
hongming-personal d144dcc700 Merge pull request #3016 from Molecule-AI/fix/textutil-ssot-truncate-2962
fix(textutil): SSOT for rune-safe string truncation, fix 3 audit-gap bugs (#2962)
2026-05-06 06:05:00 +00:00
Hongming Wang 656a02fae4 fix(textutil): SSOT for rune-safe string truncation, fix 3 audit-gap bugs
Closes #2962.

## Why

Six per-package `truncate` helpers had drifted into independent
re-implementations of the same idea. Three of them (delegation.go,
memory/client/client.go, memory-backfill/verify.go) used
`s[:max] + "…"` byte-slice form, which on a multi-byte codepoint at
byte `max` produces invalid UTF-8 → Postgres `text`/`jsonb` rejects
the INSERT silently → `delegation` / `activity_logs` row never lands
→ audit gap.

Three other helpers (delegation_ledger.go #2962, agent_message_writer.go
#2959, scheduler.go #2026) had each been fixed in isolation with three
slightly different rune-safe shapes — confirming this is a class of
bug, not a single instance.

## What

New package `internal/textutil` with three rune-safe functions:

- `TruncateBytes(s, maxBytes)` — byte-cap, "…" marker. Used by 5
  callers writing into byte-bounded columns / log lines.
- `TruncateBytesNoMarker(s, maxBytes)` — byte-cap, no marker. Used by
  delegation_ledger.go where the storage already conveys "preview"
  and an extra ellipsis would push the result over the column cap.
- `TruncateRunes(s, maxRunes)` — rune-cap, "…" marker. Used by
  agent_message_writer.go where the cap is in display chars (UI
  summary), not bytes.

All three guarantee `utf8.ValidString(out)` for any `utf8.ValidString(in)`.
Inputs already invalid go through `sanitizeUTF8` at the call site
boundary (scheduler.go preserved this defense-in-depth).

## Migration map

| Old | New | Behavior change |
|---|---|---|
| `delegation_ledger.truncatePreview` | `textutil.TruncateBytesNoMarker(s, 4096)` | none |
| `agent_message_writer.truncatePreviewRunes` | `textutil.TruncateRunes(s, n)` | none |
| `scheduler.truncate` | `textutil.TruncateBytes(s, n)` | "..." → "…" (3 bytes either way; single-glyph display) |
| `delegation.truncate` | `textutil.TruncateBytes(s, n)` | bug fix + ellipsis swap |
| `memory/client.truncate` | `textutil.TruncateBytes(s, n)` | bug fix |
| `memory-backfill.truncate` | `textutil.TruncateBytes(s, n)` | bug fix |

Five separate `truncate*` helpers + their per-package tests removed.
Net: 12 files / +427 / -255.

## Tests

- `internal/textutil/truncate_test.go` — 27 table-test cases + 145
  fuzz-invariant cases asserting `utf8.ValidString` and byte-cap
  invariants on every output.
- `delegation_ledger_test.go TestLedgerInsert_TruncatesOversizedPreview`
  strengthened with `capValidUTF8Matcher` so the SQL-write argument
  is asserted to be valid UTF-8 + within cap (not just `AnyArg()`).
  Mutation-tested: replacing the SSOT call with byte-slice form makes
  this test fail loud.

## Compatibility

- All callers internal; no external API surface change.
- Ellipsis swap "..." → "…": same byte budget (3 bytes), single-glyph
  display. No alerting/grep on either marker in this codebase
  (verified). Canvas renders both correctly.
- DB column widths unchanged (4096 / 80 / 200 / 256 / 300 — all
  preserved in the migrations).

## Security

Fixes a silent INSERT-failure mode that hid `activity_logs` /
`delegations` rows containing peer-controlled text. The class of input
that triggered it (CJK, emoji, accented Latin) is normal user content,
not malicious — but the symptom (audit gap) makes incident
reconstruction harder. Helper is pure-function over `string`; no
secrets / PII / auth handling involved. Untrusted input is handled
identically to before, just rune-aligned now.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 23:01:21 -07:00
hongming-personal c53155ec5f Merge pull request #3014 from Molecule-AI/test/cross-table-atomicity-integ-149-followup
test(chat-uploads): integration test for cross-table atomicity (#149 follow-up)
2026-05-06 05:05:49 +00:00
Hongming Wang debe29c889 ci(handlers-postgres-integration): apply legacy *.sql migrations too
The migration-replay step globbed only *.up.sql, silently skipping
the older flat-naming migrations (001_workspaces.sql,
009_activity_logs.sql, etc.). Fine while no integration test
depended on those tables; broke when the #149 cross-table
atomicity test came in needing both workspaces (FK target for
activity_logs) and activity_logs themselves.

Switch to globbing *.sql + sorted lex-order, excluding *.down.sql
so up/down pairs don't undo themselves mid-run. Add a sanity check
for workspaces + activity_logs + pending_uploads alongside the
existing delegations gate so a future migration drift fails loud
instead of silently skipping the regressed test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 22:02:24 -07:00
Hongming Wang 7a39a08837 test(chat-uploads): integration test for cross-table atomicity (#149 follow-up)
Adds two real-Postgres tests under //go:build integration:

- TestIntegration_PollUpload_AtomicRollback_AcrossBothTables exercises
  the helpers in the same Tx shape uploadPollMode does (PutBatchTx +
  LogActivityTx + Rollback) and asserts COUNT(*)=0 on BOTH
  pending_uploads AND activity_logs after the rollback. Failure
  injection: NUL byte in `summary` triggers lib/pq protocol rejection
  on the second activity insert — same trick the existing PutBatch
  AtomicRollback test uses.

- TestIntegration_PollUpload_HappyPath_AcrossBothTables is the positive
  counterpart — Commit lands N rows in both tables.

Coverage rationale (post-PR-3010 review):
- sqlmock unit test (TestPollUpload_AtomicRollbackOnActivityInsertFailure)
  proved the handler calls Begin/Exec/Exec-fail/Rollback in order.
- Existing PutBatch integration test proved Postgres honors rollback
  for pending_uploads alone.
- New tests close the cross-table gap: prove LogActivityTx + PutBatchTx
  + real Postgres MVCC compose correctly under rollback.

A regression that made LogActivityTx silently route through db.DB
instead of the passed tx would still pass the sqlmock test (the
Begin/Commit/Rollback shape would look right) but would fail this
integration test (the activity_logs row would survive the rollback).

Verified locally: postgres:15-alpine + all migrations applied, both
tests pass in 0.1s. Skips cleanly without INTEGRATION_DB_URL — CI
already runs this file via the Handlers Postgres Integration job.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 21:57:56 -07:00
hongming-personal bb9bf85dbd Merge pull request #3011 from Molecule-AI/rfc-2872-workspaces-uniq-toctou
fix(workspace-server): close TOCTOU race on workspaces(parent_id, name) (#2872 Critical 1)
2026-05-06 04:51:01 +00:00
hongming-personal ff21bbb876 Merge staging into rfc-2872-workspaces-uniq-toctou to clear BEHIND 2026-05-05 21:46:33 -07:00
hongming-personal da3cb4c098 fix(workspace-server): close TOCTOU race on workspaces(parent_id, name) (#2872 Critical 1)
## Bug

`/org/import` had no per-tenant mutex, advisory lock, or DB-level
uniqueness on (parent_id, name). The pattern was lookup-then-insert:

    existingID, existing, err := h.lookupExistingChild(...)  // SELECT
    if existing { return /* skip */ }
    db.DB.ExecContext(ctx, `INSERT INTO workspaces ...`)     // INSERT

Two concurrent admin POSTs (rapid double-click in canvas, retry-after-
timeout, two operators on the same template) both saw "not found" in
the SELECT and both INSERT'd the same (parent_id, name).

Captured impact: tenant-hongming accumulated 72 stale child workspaces
in 4 days from repeated org-template spawns of the same template
(see #2857 phase 4 sweeper for the cleanup; #2872 for the prevention RFC).

## Fix

Two-layer fix — DB-level backstop AND application-level happy path:

1. **Migration** `20260506000000_workspaces_unique_parent_name.up.sql`

   ```sql
   CREATE UNIQUE INDEX CONCURRENTLY IF NOT EXISTS workspaces_parent_name_uniq
     ON workspaces (
       COALESCE(parent_id, '00000000-0000-0000-0000-000000000000'::uuid),
       name
     )
     WHERE status != 'removed';
   ```

   * COALESCE(parent_id, sentinel) collapses NULLs so root workspaces
     also collide pairwise.
   * `WHERE status != 'removed'` lets a tombstoned row be replaced
     by a same-named re-import (preserves existing org-import semantics).
   * CONCURRENTLY avoids ACCESS EXCLUSIVE on production tenants under
     live traffic; IF NOT EXISTS makes the migration resumable.
   * Down migration drops CONCURRENTLY symmetrically.

2. **`org_import.go` swap**

   Replace lookup-then-insert with `INSERT ... ON CONFLICT DO NOTHING
   RETURNING id`. On the skip path (RETURNING returns 0 rows →
   sql.ErrNoRows), re-select the existing id to recurse children:

       INSERT INTO workspaces (...) VALUES (...)
       ON CONFLICT (COALESCE(parent_id, ...), name)
       WHERE status != 'removed'
       DO NOTHING
       RETURNING id;

   The ON CONFLICT target predicate matches the partial-index predicate
   exactly — required for Postgres to consider the index applicable.

   Existing `lookupExistingChild` helper kept (still used on the skip
   path); semantics unchanged.

## Test coverage

* AST gate refreshed to assert the workspaces INSERT contains the
  ON CONFLICT pattern (`onConflictDoNothingRE`) instead of the now-obsolete
  "lookup-before-insert" ordering. Per behavior-based gating
  (memory: feedback_behavior_based_ast_gates.md), the new gate pins
  the actual TOCTOU-resolution behavior.
* Companion `TestGate_FailsWhenInsertOmitsOnConflict` proves the gate
  catches the bug shape on synthetic source.
* All existing `lookupExistingChild` unit tests (no-rows, found,
  nil-parent, DB error, wrapped no-rows) still pass — helper is
  unchanged and still load-bearing on the skip path.
* Live Postgres E2E coverage runs via the existing
  "Handlers Postgres Integration" CI job, which applies migrations
  to a real PG and exercises the INSERT path.

## Why ship the migration + swap together (not stacked)

The migration alone provides a DB-level backstop, but without the
handler swap a UNIQUE-violation surfaces as a 500 to the user. The
handler swap alone has no enforceable target until the migration
applies. Shipped together they give graceful skip + atomic backstop.

Migration is CONCURRENTLY + IF NOT EXISTS, safe to apply even on
tenants where the sweeper (#2860) hasn't run yet — the index just
declines to build until conflicting rows are reconciled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 21:43:49 -07:00
hongming-personal ef9bd1e0e2 Merge pull request #3010 from Molecule-AI/fix/activity-row-tx-atomicity-149
fix(chat-uploads): activity rows commit atomically with PutBatch (#149)
2026-05-06 04:37:55 +00:00
Hongming Wang b759548822 fix(chat-uploads): activity rows commit atomically with PutBatch
Closes #149.

uploadPollMode for poll-mode chat uploads previously committed N
pending_uploads rows in one Tx (PutBatch), then wrote N activity_logs
rows individually outside any Tx. A per-row failure on activity row K
left rows 1..K-1 committed and pending_uploads orphaned until the 24h
TTL — not data-loss because the platform's fetcher handled the
half-state cleanly, but the user never saw file K in the canvas and
the inconsistency surfaced as an "uploaded but invisible" complaint
class.

Thread one Tx through PutBatchTx + N × LogActivityTx + Commit so all
or none commit. Broadcasts are deferred until after Commit — emitting
an ACTIVITY_LOGGED event for a row that ends up rolled back would
paint a ghost message into the canvas's optimistic UI. A new
LogActivityTx returns a commitHook the caller invokes post-Commit;
the existing fire-and-forget LogActivity is unchanged for the 4 other
production callers (a2a_proxy_helpers + activity.go report path).

Storage interface gains PutBatchTx; PostgresStorage.PutBatch is
refactored to share the validation + insert path. inMemStorage and
fakeSweepStorage delegate or no-op for PutBatchTx (the in-mem fake
can't model Tx state — DB-level atomicity is verified by the existing
real-Postgres integration test for PutBatch + the new unit test
asserting the Go handler calls Rollback on activity-insert failure).

Tests:
- TestPollUpload_AtomicRollbackOnActivityInsertFailure pins the new
  contract via sqlmock — second activity insert errors → Rollback
  expected, Commit must NOT be called.
- TestLogActivityTx_DefersBroadcastUntilCommitHook +
  _InsertError_NoHook_NoBroadcast + _NilTx_Errors cover the new API.
- TestPutBatchTx_HappyPath / _EmptyItems / _ValidationFails /
  _PerRowErrorPropagates cover Tx-aware storage layer.
- 7 existing TestPollUpload_* tests updated to mock Begin + Commit
  (or Begin + Rollback for failure paths) since the handler now
  opens a Tx around PutBatch + activity inserts.

All workspace-server tests pass; integration tag also clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 21:34:28 -07:00
hongming-personal cce2050b6a Merge pull request #2997 from Molecule-AI/rfc-2991-pr-1-image-preview-lightbox
feat(canvas/chat): inline image preview + fullscreen lightbox (RFC #2991 PR-1)
2026-05-06 04:28:03 +00:00
hongming-personal e87df906bd Merge staging into rfc-2991-pr-1 to clear BEHIND (post PR-2993 + PR-3005) 2026-05-05 21:24:20 -07:00
hongming-personal c60e2b5fa2 chore(canvas/chat): drop unused downloadChatFile import in AttachmentImage
github-code-quality bot flagged this as the last unresolved review thread
blocking the merge queue. The function is referenced in comments but
never called from this file (download is dispatched via the lightbox /
AttachmentChip path). Removing the import resolves the bot thread and
clears the staging branch-protection 'all conversations resolved' gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 21:23:46 -07:00
hongming-personal 143fbb91ff Merge pull request #3005 from Molecule-AI/ux/files-tab-drag-drop-upload
ux(canvas/files): drag-drop upload to target folder (#2999 PR-D)
2026-05-06 03:52:10 +00:00
hongming-personal 1b29b24e83 Merge staging into rfc-2991-pr-1 to clear BEHIND state 2026-05-05 20:50:55 -07:00
hongming-personal 6033179f48 Merge pull request #3006 from Molecule-AI/rfc-2991-pr-3-pdf-text-preview
feat(canvas/chat): inline PDF + text/code preview (RFC #2991 PR-3)
2026-05-05 20:49:53 -07:00
Hongming Wang ab1acff2d2 ux(canvas/files): drag-drop upload to target folder (#2999 PR-D)
User asked for VSCode-style drag-drop upload (#2999): "drag local to
upload to target folder just like vscode does". Today the only upload
path is the toolbar's Upload button (folder picker). Drag-drop lets
users grab files from Finder/Explorer and drop them directly on a
specific subdirectory in the tree.

1. New `uploadDataTransferItems(items, targetDir)` in `useFilesApi`
   — walks the HTML5 DataTransferItemList via `webkitGetAsEntry()`,
   recursing folders to a flat (relativePath, file) list, then PUTs
   each via the existing /files/<path> endpoint. The walker (also
   exported via `__testables`) calls `readEntries()` in a loop until
   empty so multi-batch folders (browsers cap each call at ~100
   entries) aren't silently truncated.

2. `uploadFiles` (folder-picker path) gained an optional `targetDir`
   parameter. Same prefixing semantics so future surfaces (e.g. an
   "upload here" toolbar button on a row) can reuse it.

3. `FileTree` directory rows gained `onDragOver` / `onDragEnter` /
   `onDragLeave` / `onDrop` handlers + a hover-target highlight
   (accent-tinted background + outline). dragLeave uses
   `currentTarget.contains(relatedTarget)` to suppress the flicker
   that fires when the cursor crosses any child of the row (icon,
   label, ✕ button) — without this the highlight strobes on every
   sub-element transition.

4. `FilesTab` wraps the tree column in an outer drop zone for
   "drop on root" — drops outside any specific subdir row land at
   root. The empty-state placeholder copy now includes a
   "drag files here to upload" hint when the active root is
   /configs (the only writable root today).

5. Both the row drop and the root drop are gated on
   `root === "/configs"` (the same gate that already blocks the
   toolbar's New / Upload / Clear). Other roots ignore the drag
   entirely (no highlight, no drop), so the user doesn't get a
   misleading drag affordance followed by a "switch root" toast.

`dragDropUpload.test.tsx` (9 tests, two layers):

Walker tests (pure function, no DOM):
- `walkEntry` collects a single dropped file with correct relpath.
- `walkEntry` walks a folder + preserves folder name in the path.
- **Multi-batch loop**: a fake reader that emits two batches of 2
  + an empty terminator must yield 4 files. A walker that called
  readEntries once would see only 2 — this is the load-bearing
  assertion against silent folder truncation.
- Nested directories: outer/inner/file.md → "outer/inner/file.md".

FileTree drag-drop wiring (DOM):
- `dragover` on a directory row preventDefault's (load-bearing —
  without it the drop event never fires).
- `drop` on a directory row fires `onDropToTarget(path, items)`.
- `drop` on a FILE row does NOT fire (only directories are valid
  drop targets).
- `drop` with no DataTransferItems does NOT fire (defensive guard
  against text-only drags).
- `dragenter` adds the highlight class to the directory row.

1. The 1MB per-file size cap is inherited from the existing
   `uploadFiles`. A user dropping a 5MB skill bundle silently
   skips the file (the loop's `continue` on `file.size >
   1_000_000`). Same behavior as the toolbar Upload, so consistent
   if not great. Surfacing skipped-files would be a UX improvement
   tracked separately — not load-bearing for this PR.

2. Drop-zone highlight on the column wrapper uses an outline that
   sits inside the column's overflow-y-auto scroll container. If
   the user drags onto a row that's mid-scroll, the highlight may
   clip slightly at the scroll boundary. Cosmetic only; the drop
   still works.

3. The `?root=` query is NOT passed on the underlying writeFile
   call (matches the existing uploadFiles behavior). On a backend
   without #2999 PR-A, this means uploads always land in /configs
   regardless of selected root — but we already gated drop on
   `root === "/configs"` so the practical effect is nil today.
   Once PR-A merges and the canvas threads ?root= through writes
   (separate follow-up), drops on /home etc. would be enableable
   by lifting the canDelete-style gate.

- `npx tsc --noEmit` clean
- 177/177 canvas tab tests pass
- Manual on local dev: drag a file from Finder onto /configs/skills
  row → file appears under /configs/skills/<name>. Drag a folder of
  3 files onto root area → 3 files uploaded with folder structure
  preserved. Drag onto /home tree → no highlight, no drop.

Refs #2999. Pairs with PR-A (backend EIC) — without PR-A the tree
is empty on SaaS and there's nothing to drop ONTO; PR-D still works
on self-hosted today.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-05-05 20:47:47 -07:00
hongming-personal 19df43e3da Merge pull request #2993 from Molecule-AI/rfc-2945-pr-b-1-migrate-bare-event-strings
refactor(events): migrate 18 producers to typed EventType constants (RFC #2945 PR-B-1)
2026-05-06 03:45:47 +00:00
hongming-personal dcece2762b feat(canvas/chat): inline PDF + text/code preview (RFC #2991 PR-3)
Adds two new arms to the AttachmentPreview kind dispatcher:

* PDF — chip in the bubble, click opens the shared AttachmentLightbox
  with a browser-native <embed type="application/pdf"> at 95vw/90vh.
  Fetch+Blob+ObjectURL auth path matches AttachmentImage / Video. PDF.js
  not pulled in; browser viewer is good enough for the desktop chat MVP
  (Slack/Linear/Notion all gate full-page PDF behind a click for the
  same reason). Falls back to AttachmentChip on fetch error.

* Text/code/JSON/YAML — first 10 lines in monospace <pre><code> right
  in the bubble, "Show all N lines" expands to full content, with a
  filename + ⬇ download header. Streams up to 256 KB then marks
  truncated and offers a download chip; large logs don't crash the
  bubble. No syntax highlighting in v1 — shiki adds 200-500 KB and is
  pure polish.

Coverage: 5 new dispatch tests (PDF success → embed in lightbox,
PDF fetch fail → chip fallback, text inline render, text long content
→ Show-all-N-lines expand button, text fetch fail → chip fallback).
All 19 AttachmentPreview tests pass; tsc --noEmit clean.

Stacked on rfc-2991-pr-1-image-preview-lightbox (PR-2 already merged
into PR-1's branch). PR-1 ships first; this rebases onto staging
once it lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 20:43:46 -07:00
hongming-personal 57bfa40990 Merge pull request #3004 from Molecule-AI/ux/files-tab-context-menu
ux(canvas/files): right-click context menu — Open / Download / Delete (#2999 PR-C)
2026-05-06 03:37:16 +00:00
Hongming Wang d88fbb90fb ux(canvas/files): right-click context menu — Open / Download / Delete (#2999 PR-C)
## Why

User asked for a VSCode-style right-click menu on file rows (#2999):
"right click to have a menu to download". Today the only download
affordance is the toolbar's Export-all (bulk JSON dump), and the
inline ✕ button is the only delete UX (small click target, easy to
miss).

## Fix

1. New `FileTreeContextMenu` component — fixed-position popover with
   Open / Download / Delete items composed per-row (files get all
   three; directories get Delete only since "open a directory in the
   editor" doesn't apply). Esc + outside-click + Tab + scroll
   dismiss. ↓/↑ arrow keys rove focus between menu items. role=menu
   + role=menuitem + autofocus on first item for a11y.

2. Menu state lifted to the top-level `FileTree` (not per-row) so
   opening a second row's menu auto-closes the first — only one
   menu open at a time, matching VSCode/Theia. Pinned by the
   `replaces the first` test.

3. New `downloadFileByPath(path)` in `useFilesApi` — fetches via the
   existing GET /workspaces/<id>/files/<path>?root= endpoint and
   triggers a browser download. Distinct from the existing
   `handleDownloadFile` which downloads the in-editor buffer
   (round-trips unsaved edits to disk); the context-menu download
   targets arbitrary tree rows the user hasn't opened.

4. `canDelete` prop threaded from FilesTab → FileTree → menu →
   item. Same gate as the toolbar (Clear/New/Upload all gated to
   /configs); context menu's Delete renders as disabled with a
   muted background on other roots, matching the "feature exists
   but isn't applicable here" pattern.

## Test coverage

`FileTreeContextMenu.test.tsx` (8 tests):

- File row → menu opens with Open + Download + Delete.
- Directory row → menu opens with Delete only.
- Click Download → onDownload(path) fires + menu closes.
- Click Delete (canDelete=true) → onDelete(path) fires.
- Click Delete (canDelete=false) → onDelete NOT called + menu stays
  open (disabled-state UX).
- Esc dismisses.
- Outside-click (mousedown on document.body) dismisses.
- Opening second context menu replaces the first (only-one-open
  invariant).

Each test uses fireEvent + screen.getByRole, so they fail on a
deleted-code regression — none would pass on the pre-PR shape.

## Three weakest spots (hostile self-review)

1. The menu is positioned at `clientX/clientY` without viewport
   clamping. If the user right-clicks at the very bottom-right of
   the panel, part of the menu may overflow off-screen. VSCode
   handles this by flipping the anchor; we don't yet. Acceptable
   v1 because the FilesTab is fixed-width (≤ side-panel width)
   and the menu is small (140×~80px); the overflow would be a few
   pixels of one item. Filed as a follow-up.

2. Auto-focus on the first item shifts keyboard focus away from
   the row that opened the menu. Closing with Esc returns focus
   to the body, not the row. Same behavior as TerminalTab's
   placeholder + the canvas's other context menus; consistent
   isn't ideal but at least uniform. Documented inline.

3. The download request reuses the API client's 15s default
   timeout — large config files (multi-MB skill bundles) on a
   slow connection could time out. Same risk applies to the
   existing toolbar Export. If we see real download failures we
   can add a `timeoutMs` override at the call site without
   touching the menu.

## Verification

- `npx tsc --noEmit` clean
- 176/176 canvas tab tests pass
- Manual on local dev: right-click a config.yaml row → menu opens
  → click Download → file lands in Downloads. Right-click on
  /home root → Delete renders disabled.

Refs #2999. Pairs with PR-A (backend EIC) — without PR-A the tree
is empty and there's nothing to right-click on a SaaS workspace.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-05-05 20:26:04 -07:00
hongming-personal 2e6bed71b9 Merge pull request #3003 from Molecule-AI/ux/files-tab-external-not-available
ux(canvas/files): "Files not available" banner for external runtimes (#2999 PR-B)
2026-05-06 03:24:45 +00:00
hongming-personal 030377bb84 Merge pull request #3002 from Molecule-AI/fix/files-eic-list-delete-symmetry
fix(workspace files API): EIC parity for ListFiles + DeleteFile (#2999 PR-A)
2026-05-06 03:22:45 +00:00
Hongming Wang f93957e982 ux(canvas/files): "Files not available" banner for external runtimes (#2999 PR-B)
## Why

Reported by user (issue #2999): external workspaces (mac laptop, mac
mini, hermes-on-home-server — runtime="external") render the FilesTab
identically to the SaaS empty-listing bug, showing "0 files / No
config files yet" even though the platform doesn't actually own the
filesystem of these workspaces. Visually indistinguishable from the
broken state, reads as a bug.

## Fix

Mirror the affordance TerminalTab adopted in PR #2830 for runtimes
without a TTY:

1. New `NotAvailablePanel` in `canvas/src/components/tabs/FilesTab/`
   — folder-with-slash icon + "Files not available" headline + body
   text that names the runtime and points the user at Chat.

2. `FilesTab` now takes optional `data?: WorkspaceNodeData`. When
   `data.runtime` is in `RUNTIMES_WITHOUT_FILES` (currently just
   "external"), early-return the placeholder before mounting the
   useFilesApi hook. Mirrors TerminalTab's prop shape exactly so the
   review pattern is uniform across tabs.

3. SidePanel passes `node.data` to FilesTab (matches existing pattern
   for ChatTab / TerminalTab).

## Test coverage

`FilesTab.notAvailable.test.tsx` (4 tests):

- external runtime → banner renders with runtime name + Chat-tab
  guidance copy.
- external runtime → NO `/files` API request fires (asserted by
  inspecting the mocked api.get call log).
- claude-code runtime → no banner, normal mount proceeds (toolbar's
  root selector is the discriminator).
- data prop omitted → falls through to normal mount (back-compat
  with any caller that doesn't thread data through, e.g. legacy
  tests).

Each branch is independent and discriminating — none would pass on
a code-deleted version of the early-return.

## Three weakest spots (hostile self-review)

1. `RUNTIMES_WITHOUT_FILES` is a hardcoded set in this file. If a
   future runtime joins (e.g. a "byok-claude" that runs on user
   hardware), someone has to remember to add it here. Reviewed
   alternatives: pull from a runtime-capabilities registry — same
   shape as `RUNTIMES_WITHOUT_TERMINAL` already in TerminalTab. We
   chose the parallel pattern over a new abstraction; consolidating
   into a shared registry can land if/when a third tab grows the
   same gate (rule of three). Documented inline.

2. The placeholder is a static panel — no retry, no "report bug"
   link. Same as TerminalTab's. Acceptable because the absence is
   intentional, not transient.

3. Chat-tab guidance is hardcoded English. No i18n in canvas yet;
   matches the rest of the codebase. Will move with the i18n
   migration when that lands.

## Verification

- `npx tsc --noEmit` clean
- 54/54 canvas tab + SidePanel tests pass
- Will be live-verified on staging post-merge: open Files tab on an
  external workspace (mac laptop) → expect placeholder; open on a
  platform-owned workspace (Hongming Personal Brand Agent) → expect
  normal tree (assuming PR-A also lands).

Refs #2999. Pairs with PR-A (backend EIC fix) — without PR-A the
platform-owned path still shows "0 files" because the backend never
returns rows.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-05-05 20:21:45 -07:00
hongming-personal b530c147de Merge pull request #3000 from Molecule-AI/rfc-2991-pr-2-video-audio-preview
feat(canvas/chat): inline video + audio HTML5 players (RFC #2991 PR-2)
2026-05-05 20:18:36 -07:00
Hongming Wang f39b595a9c fix(workspace files API): EIC parity for ListFiles + DeleteFile (closes #2999 PR-A)
## User-visible bug

Canvas Files tab returns "0 files / No config files yet" for every
SaaS workspace, every root (/configs, /home, /workspace, /plugins).
Reported by user (canvas screenshot, hongming.moleculesai.app,
Hongming Personal Brand Agent — claude-code, T4, online).

## Root cause

`ListFiles` (templates.go) was missing the SSH-via-EIC branch that
ReadFile (PR #2785) and WriteFile (PR #1702) already have. On SaaS,
dockerCli is nil → findContainer returns "" → falls through to
host-side resolveTemplateDir which only matches baked-in template
names. For a user-named workspace it matches nothing, so the handler
silently returns []fileEntry{}.

DeleteFile had the same gap — right-click delete (introduced in PR-C
of this issue) would silently no-op once #1 was fixed.

## Fix

1. Extracted shared EIC plumbing into `withEICTunnel` (closure-based,
   single SSOT for keypair → key push → tunnel → port-wait → cleanup).
   Refactored writeFileViaEIC + readFileViaEIC to use it. Added
   listFilesViaEIC + deleteFileViaEIC on the same scaffold. The
   `LogLevel=ERROR` shim from PR #2822 now lives in one
   `eicSSHSession.sshArgs()` helper instead of being duplicated per
   helper — the next time we need to tweak ssh options, one place.

2. Factored remote shell strings into pure functions
   (buildInstallShell / buildCatShell / buildRmShell / buildFindShell
   + parseFindOutput) so the wire shape can be pinned without booting
   a real EIC tunnel.

3. Refactored `resolveWorkspaceFilePath(runtime, root, relPath)` to
   honor `?root=`. New rule: `/configs` (or empty / unrecognized) →
   runtime managed-config dir via workspaceFilePathPrefix (preserves
   the v1 ReadFile/WriteFile behaviour where canvas's Config tab
   GETs/PUTs config.yaml without specifying a root and lands in the
   right per-runtime dir); `/home`, `/workspace`, `/plugins` →
   literal absolute path on the EC2 host. List/Read/Write/Delete now
   agree on what file a tree row points to — pre-fix List would say
   "/home contents" but Read/Write would route to /configs.

4. ListFiles + DeleteFile dispatch on instance_id != "" → EIC helper.
   Errors from the EIC path produce 500 (not silent fall-through to
   local-Docker, which would mask the failure as "0 files" — the
   exact user-visible symptom).

5. Added ?root= validation gate to WriteFile + DeleteFile so an
   out-of-allowlist root is rejected before the resolver runs.

## Test coverage

- TestResolveWorkspaceFilePath_RuntimeIndirection — pins the
  /configs → runtime prefix translation per-runtime (hermes,
  claude-code, langgraph, external, unknown). Catches the regression
  where a future edit accidentally drops the runtime indirection.

- TestResolveWorkspaceFilePath_LiteralRoots — pins /home,
  /workspace, /plugins as literal pass-through regardless of
  runtime. Catches the symmetric regression where the literal roots
  start getting rewritten to the runtime prefix (which would mean
  the FilesTab "/home" selector silently routes to /configs on
  hermes).

- TestResolveWorkspaceRootPath — directory-only translation used
  by listFilesViaEIC, same indirection rules.

- TestSSHArgs_HardenedFlags — pins the centralised ssh option set
  (LogLevel=ERROR + hardening). Catches drift in the
  one-place-where-ssh-flags-live.

- TestEicSSHSessionSingleSourceForSSHFlags — behaviour-based AST
  gate (per memory). Counts s.sshArgs() callers (must be ≥4 —
  list/read/write/delete) and asserts LogLevel=ERROR appears
  exactly once in the source. Fires if anyone copy-pastes a raw
  ssh args slice instead of going through the helper.

- TestBuildInstallShell / TestBuildCatShell / TestBuildRmShell /
  TestBuildFindShell — pure-function tests pinning the remote
  command shape. Catches regression like "rm -f silently becomes
  rm -rf" or "find loses node_modules pruning" without needing a
  real EC2.

- TestBuildFindShell_DepthForwarding — catches a regression where
  the helper hard-codes a depth instead of using the caller's value.

- TestParseFindOutput / TestParseFindOutput_EmptyInput — pin the
  TYPE|SIZE|REL parser. Empty-input case explicitly returns []
  not nil so the JSON wire shape stays a list.

- TestListFiles_EICDispatch_Success / Error — sqlmock-driven
  handler test. Verifies instance_id != "" routes to listFilesViaEIC
  and surfaces errors as 500 (does NOT silently fall through to
  local-Docker, which is the exact regression-mode of the original
  bug).

- TestListFiles_EICBranch_NotTakenForSelfHosted — back-compat
  guard: instance_id == "" must NOT enter the EIC branch (would
  break self-hosted operators).

- TestDeleteFile_EICDispatch_Success / Error — same shape for
  DeleteFile.

- TestListFiles_RootValidation / TestDeleteFile_RootValidation —
  ?root=/etc must 400 before any DB query or EIC call.

## Verification

- `go build ./...` clean
- `go test ./...` clean (full workspace-server suite)
- Will be live-verified against staging on hongming.moleculesai.app
  after merge: open Files tab → expect populated /home + /configs +
  /workspace listings (not "0 files"); right-click delete on
  /configs/old.yaml → expect file removed on the EC2 host.

## Three weakest spots (hostile self-review)

1. The LogLevel=ERROR drift gate counts source occurrences. A
   future refactor that intentionally moves the literal somewhere
   else (e.g. into a constant) would trigger a false positive. The
   gate's failure message points to the load-bearing constraint
   (must appear in sshArgs); operator can adjust.

2. `eicFileWriteTimeout` constant kept as an alias for back-compat
   with prior tests. Documented as intentional + safe to remove on
   the next pass.

3. The resolver tests pin the runtime → prefix map values
   (`/home/ubuntu/.hermes`, `/configs`, etc.). A future runtime
   addition that ships a new prefix needs the test updated. This
   is intentional — silent prefix changes orphan saved files, so a
   test failure on map edit IS the right signal.

## Follow-up (RFC #2312 subtask 2)

Long-term the right fix is to drop EIC entirely and HTTP-forward to
the workspace's own URL (RFC #2312). That's a substantially larger
refactor across 5 surfaces (chat upload, files, templates, plugins,
terminal) and out of scope for this bug-fix PR. Tracked separately
under that RFC.

Refs #2999.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 20:18:05 -07:00
hongming-personal 95fdf86187 feat(canvas/chat): inline video + audio HTML5 native players (RFC #2991 PR-2)
Second specialized renderer pair landing under RFC #2991. Stacks on
PR-1 (#2997) — extends the AttachmentPreview dispatcher with video/
audio cases.

Why HTML5-native (not custom JS player)
---------------------------------------

- Browser vendors ship hardware-accelerated decoders, captions,
  pinch + scrub UX, and fullscreen UI. We get all of it for free.
- Native fullscreen via the <video> control bar — no
  AttachmentLightbox needed for video (the browser's built-in
  fullscreen handles it).
- Mobile-friendly without us writing the touch handlers.

Auth model
----------

Identical to AttachmentImage (PR-1): platform-auth URIs need our
cookie/token, so we fetch the bytes, wrap in a Blob, hand the
browser an ObjectURL via <video src=> / <audio src=>. External
http(s) URIs skip the fetch.

Memory caveat: a Blob holds the entire media in JS memory until the
bubble unmounts. The server's 25MB single-file cap (chat_files.go)
bounds this; v2 can switch to MediaSource + streaming if larger
files become a real shape.

Failure modes
-------------

- Fetch failure (404, 403, network) → AttachmentChip fallback.
- Bytes that aren't valid media (corrupt, wrong Content-Type) →
  <video onError> / <audio onError> swap to chip.

Tests
-----

5 new component tests in AttachmentPreview.test.tsx (now 14 total):
  - kind=video → <video controls> with blob URL src
  - kind=video fetch fails → falls back to chip
  - kind=video extension fallback (no mime) → routes to video path
  - kind=audio → <audio controls> + filename label visible
  - kind=audio fetch fails → falls back to chip

The preview-kind unit tests from PR-1 (49 cases) already cover the
MIME → video / audio dispatch logic; this PR's component tests pin
the rendered DOM shape (controls attribute, blob URL src, fallback
behavior).

Hostile self-review
-------------------

1. Memory bound: 25MB cap protects us today; documented future
   migration path (MediaSource).
2. iOS Safari autoplay: playsInline pinned on <video> so mobile
   doesn't auto-fullscreen on play.
3. Captions accessibility: <track kind="captions" /> placeholder so
   the element is tagged correctly even though we don't have caption
   files yet (forward-compatible).

Verified
- tsc --noEmit clean
- 173 chat tests green (49 unit + 14 component + 110 pre-existing)

Stacks on PR-1 (#2997). PR-3 (PDF + text/code) is the final piece.

Refs RFC #2991, PR #2997 (PR-1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 20:10:19 -07:00
hongming-personal 04f7a07add feat(canvas/chat): inline image preview + fullscreen lightbox (RFC #2991 PR-1)
First specialized renderer landing under RFC #2991 — chat attachment
preview. Adds the dispatch infrastructure that PR-2 (video/audio) and
PR-3 (PDF/text) will extend.

Architecture (RFC #2991 Phase 2 design)
---------------------------------------

- preview-kind.ts: pure helper that maps mimeType (+ extension fallback
  for missing/generic MIME) to one of: image | video | audio | pdf |
  text | file. Single source of truth; the dispatch axis for every
  attachment renderer.

- AttachmentPreview.tsx: SSOT dispatch component. ChatTab no longer
  imports kind-specific components — it imports AttachmentPreview,
  which switches on the kind and renders the right child.

- AttachmentImage.tsx: inline thumbnail (max 240×180) + click →
  lightbox. Auth-aware: for platform URIs (workspace: /
  platform-pending: / etc) the bytes are fetched via JS-injected
  headers, wrapped in a Blob, served as ObjectURL — bare <img src>
  would not include the cookie/token.

- AttachmentLightbox.tsx: shared fullscreen modal (image now; PDF will
  use it in PR-3). Esc / backdrop click / X button to close, focus
  trap on close button, focus restoration on close.

- AttachmentChip retained as the kind=file fallback. No breaking
  change for existing renderable shapes.

External-workspace coverage
---------------------------

The wire shape (ChatAttachment.mimeType + uri) is identical for
internal + external workspaces — both go through AgentMessageWriter
(PR #2949). External claude-code agents that attach images via
send_message_to_user automatically get the new preview surface; no
runtime-side change needed.

Failure modes
-------------

- Fetch failure (404, 403, network) → AttachmentChip fallback so the
  user still gets a working download. Pinned by tests.
- Decoded as non-image (corrupt bytes, wrong Content-Type) → onError
  on the <img> swaps to AttachmentChip. Pinned by tests.
- Non-platform URIs (http/https external image hosts) → skip the
  auth-fetch flow, use the raw URL via resolveAttachmentHref. Pinned
  by extension-fallback tests.

Tests
-----

preview-kind.test.ts (49 cases):
  - Strict MIME match across image/video/audio/pdf/text/unknown
  - Extension fallback when MIME is missing or application/octet-stream
  - URL with query string + fragment → strip before parsing
  - MIME wins over extension (regression: don't render image-named zip)
  - SVG is image (not text) despite being XML
  - Non-canonical MIME like application/javascript → text

AttachmentPreview.test.tsx (9 component tests):
  - Dispatch: kind=file → chip, kind=image → image path
  - Loading state shows placeholder, NOT chip (proves dispatch routed)
  - Extension fallback (no mimeType) routes to image path
  - Fetch fail (404) and network error → fall back to chip
  - Image success: <img> renders ObjectURL, click opens lightbox
  - Lightbox: Esc closes, backdrop click closes, content click doesn't
  - Universal fallback: unknown MIME → chip even when extension hints
    at a renderable kind

Hostile self-review (3 weakest spots, addressed)
------------------------------------------------

1. <img> auth: bare <img src="/chat/download?..."> would NOT include
   our auth headers. Resolved via fetch+Blob+ObjectURL pattern.
   Pinned by the image-success test (asserts src === "blob:test-url").

2. Server-side allowed-roots mismatch: pre-fix tests used /tmp/ paths
   which the server doesn't allow. Caught when the dispatch test
   fell into the non-platform path. Updated tests to use /workspace/
   subpaths matching templates.go's allowedRoots.

3. Bundle size creep: each kind component adds bytes. Lightbox is
   currently always-bundled. Lazy-loading is plausible but defer
   until measured-needed.

Verified
- tsc --noEmit clean
- 168 chat tests green (49 unit + 9 component + 110 pre-existing)

PR-2 (video + audio) and PR-3 (PDF + text) extend the dispatch in
AttachmentPreview.tsx with their own kind-specific components.

Refs RFC #2991.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 19:39:37 -07:00
hongming 3dfeb180ab Merge pull request #2995 from Molecule-AI/fix/sweep-add-orphan-tunnel-cleanup-2987
chore(sweep): add orphan-tunnel cleanup step (#2987 / #340)
2026-05-06 02:38:39 +00:00
Hongming Wang 88ff0d770b chore(sweep): add orphan-tunnel cleanup step (#2987 / #340)
The 15-min sweeper has been deleting stale e2e orgs but not the
orphan tunnels left behind when the org-delete cascade half-fails
(CP transient 5xx after the org row is gone but before the CF
tunnel delete completes). Result: tunnels accumulate in CF until
manual operator cleanup.

Add a final step that POSTs `/cp/admin/orphan-tunnels/cleanup`
every tick. Best-effort — failure doesn't fail the workflow; next
tick re-attempts. Output reports deleted_count + failed count for
ops visibility.

This is the catch-all for the orphan-tunnel class. The proper
upstream fix (transactional org delete) lives in CP and tracks as
issue #2989. Until that lands, the sweeper bounded-time-to-cleanup
keeps the leak from escalating.

Note: PR #492 (cf-tunnel silent-success fix) makes this step
actually effective — pre-fix DeleteTunnel silent-succeeded on
1022, so the cleanup endpoint reported success without deleting.
Post-fix the cleanup chains CleanupTunnelConnections + retry on
1022, which actually clears stuck-connector orphans.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-05-05 19:36:20 -07:00
hongming-personal 86b8d8d744 Merge pull request #2982 from Molecule-AI/fix-config-skip-yaml-for-external-runtime
fix(canvas/config): skip config.yaml fetch for external/hermes runtimes
2026-05-06 02:22:14 +00:00
hongming-personal 9b9419ad5e Merge pull request #2992 from Molecule-AI/chore/ssot-pointer-sweep-workflow
chore(sweep): note SSOT for ephemeral prefixes lives in CP
2026-05-06 02:20:35 +00:00
Hongming Wang a19ee90556 chore(sweep): note SSOT for ephemeral prefixes lives in CP
Mirrors molecule-controlplane#494: the canonical EPHEMERAL_PREFIXES
list now lives in molecule-controlplane/internal/slugs/ephemeral.go,
where redeploy-fleet reads it to skip in-flight test tenants. The
sweep workflow keeps a Python copy because GHA Python can't import
Go, but a comment now points engineers updating the list to update
both files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 19:18:13 -07:00
hongming bd0580f4af Merge pull request #2990 from Molecule-AI/fix/memory-v2-namespace-labels-2988
fix(memory-v2): namespace labels use display names not UUID prefixes (#2988)
2026-05-06 02:13:30 +00:00
Hongming Wang 64e58fb390 test(memory-v2-e2e): update expectChainQueryRoot for new name column
PR #2990 root cause: the resolver SQL added `name` to the SELECT for
DisplayName plumbing, but the e2e test's sqlmock fixture
(expectChainQueryRoot at swap_test.go:216) still scripts the
3-column shape. Three e2e tests fail with:

    sql: expected 3 destination arguments in Scan, not 4

Fix: bump the fixture to 4 columns (id, name, parent_id, depth) and
pass an empty name. The e2e tests don't assert on label rendering —
they pin the namespace string flow ("workspace:root-1" etc), which
is unchanged. Empty name is fine: ReadableNamespaces still emits the
correct namespace strings; only DisplayName is empty.

Caught by CI's Platform (Go) check on PR #2990 — would have been a
silent missed-coverage case in the resolver_test.go run because that
package doesn't import the e2e package.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-05-05 19:10:18 -07:00
hongming-personal 9ceda9d81f refactor(events): migrate 18 files to typed EventType constants (RFC #2945 PR-B-1)
Mechanical migration of bare event-name strings in BroadcastOnly /
RecordAndBroadcast call sites to the typed constants from
internal/events/types.go (RFC #2945 PR-B). Wire format unchanged
(both shapes serialize to identical WSMessage.Event literals); pinned
by TestAllEventTypes_IsSnapshot in #2965.

Migrated (18 files, scope: handlers/, scheduler/, registry/, bundle/,
channels/):
- handlers/{approvals,a2a_proxy_helpers,a2a_queue,activity,agent,
  delegation,external_rotate,org_import,registry,workspace,
  workspace_bootstrap,workspace_crud,workspace_provision_shared,
  workspace_restart}.go
- channels/manager.go (caught by hostile-reviewer pass — initial
  scope missed channels/, found via grep on the post-migration tree)
- scheduler/scheduler.go
- registry/provisiontimeout.go
- bundle/importer.go

Hostile self-review (3 weakest spots, addressed)
------------------------------------------------

1. Missed call sites — initial scope omitted channels/. Post-migration
   `grep -rEn 'BroadcastOnly\([^,]+,[^,]*"[A-Z_]+"|RecordAndBroadcast\([^,]+,[^,]*"[A-Z_]+"' internal/`
   found 2 stragglers in channels/manager.go. Migrated. Final grep
   on the same pattern returns only the docstring example in
   types.go (intentional).

2. gofmt drift — auto-import injection produced non-canonical import
   ordering. `gofmt -w` applied ONLY to the 18 modified files (NOT
   the whole tree, to avoid sweeping unrelated pre-existing drift
   into this PR's diff). Three pre-existing un-gofmt'd files in
   handlers/ (a2a_proxy.go, a2a_proxy_test.go, a2a_queue_test.go)
   left as-is — they're unchanged by this PR and their drift
   predates it.

3. Wire format — paranoia check: do the constants serialize to the
   exact strings consumers (canvas TS, hermes plugin, anything
   parsing WSMessage.Event) expect? Yes. Pinned by the snapshot
   test. The migration is name-only; not a single character of
   wire output changes.

Verified
- go build ./... clean
- go vet ./internal/... clean
- gofmt -l on the 5 migrated package dirs: only pre-existing files
- Full tests: handlers/, channels/, scheduler/, registry/, events/,
  bundle/ all green (5 ok, 0 fail)

PR-B-2 (canvas TS mirror + cross-language parity gate) remains as
the final piece of RFC #2945 PR-B. Tracked separately so this PR
stays mechanical + reviewable.

Refs RFC #2945, PR #2965 (PR-B types).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 19:05:03 -07:00
Hongming Wang b6310d7ebf fix(memory-v2): namespace dropdown labels use display names not UUID prefixes (#2988)
User feedback on the v2 Memory tab redesign: on a root workspace, the
namespace dropdown showed three indistinguishable entries:
  Workspace (30ba7f0b)
  Team (30ba7f0b) (team)
  Org (30ba7f0b-b303-4a20-aefe-3a4a675b8aa4) (org)

For a root workspace, the resolver collapses workspace==team==org IDs
(resolver.go:113-122 derive() degenerate case). The previous
shortID(8)-truncated UUID label scheme made all three look identical
even though the three concepts (private / team-shared / org-wide)
remain semantically distinct.

## Backend — Resolver returns DisplayName

  - SQL chain query now SELECTs workspaces.name (COALESCE → "" on NULL)
  - chainNode carries .name through walk
  - deriveNames() computes the display name for each namespace,
    mirroring derive():
      workspace: self.name
      team:      parent.name (or self.name if root — degenerate)
      org:       chain[end].name (root of tree)
  - Namespace struct gets a new DisplayName field, omitempty wire-shape

## Backend — Handler renders label from DisplayName when present

  - memories_v2.go:namespaceLabelWithName(name, kind, displayName) is
    the new SSOT label generator. Falls back to the UUID-prefix shape
    when displayName is empty so callers without name plumbing keep
    working unchanged.
  - namespacesToViews now plumbs Namespace.DisplayName into the label.
  - Old namespaceLabel(name, kind) is preserved as a thin wrapper
    around namespaceLabelWithName(_, _, "") for back-compat.
  - Custom namespaces ignore displayName by design — operator-defined
    suffixes ARE the chosen label; a name override would surprise.

## Frontend — drop redundant `(kind)` suffix

  Pre-fix: "Team (mac laptop) (team)" — kind shown twice.
  Post-fix: "Team (mac laptop)" — the prefix already conveys the kind.

## Test coverage

Resolver (3 new tests):
  - DisplayName_Root: workspace name propagates to all 3 namespaces
  - DisplayName_Child: workspace=self.name, team=parent.name, org=root.name
  - DisplayName_EmptyOnNULL: COALESCE → "" → empty fallback

Handler (3 new tests):
  - NamespaceLabelWithName_PrefersDisplayName: workspace/team/org/custom paths
  - NamespaceLabelWithName_FallsBackToUUIDPrefix: empty displayName → legacy shape
  - NamespacesToViews_PassesDisplayNameThrough: full integration on root case

Canvas: existing 30 tests still pass; suffix drop is rendering-only.

memories_v2.go function coverage: **14/14 = 100%**
- namespaceLabelWithName: 100%
- namespacesToViews: 100%
- (all 11 pre-existing functions stay at 100%)

## SSOT

The "what is this namespace called" question now has one source of
truth: namespace.Resolver.ReadableNamespaces sets DisplayName from the
canonical workspace.name column. The handler is a renderer; the
canvas is a consumer. No name-lookup logic duplicated across the
three layers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-05-05 18:46:50 -07:00
molecule-ai[bot] d75b73e713 Merge pull request #2981 from Molecule-AI/staging
staging → main: auto-promote 9dd2988
2026-05-05 18:13:50 -07:00
hongming-personal 0886dbc923 Merge pull request #2978 from Molecule-AI/fix-plugins-compact-empty-state
feat(canvas/skills): compact-empty layout for Plugins section (#2971)
2026-05-06 01:12:09 +00:00
hongming-personal 7420631c32 Merge pull request #2983 from Molecule-AI/feat/auto-promote-stale-alarm-2975
feat(ops): hourly alarm for auto-promote PR stuck on REVIEW_REQUIRED (#2975)
2026-05-06 00:58:49 +00:00
Hongming Wang caf19e8980 feat(ops): hourly alarm for auto-promote PR stuck on REVIEW_REQUIRED (#2975)
Closes the silent-block failure mode that left 25 commits — including
the Memory v2 redesign and the reno-stars data-loss fix — wedged on
staging for 12+ hours behind a single missing review. The auto-promote
workflow opened the PR + armed auto-merge, but main's branch protection
required a human review and nobody noticed until a user reported
"still seeing old memory tab".

## Detection logic — `scripts/check-stale-promote-pr.sh`

Reads open PRs `base=main head=staging` and alarms on:
  - `mergeStateStatus == BLOCKED`
  - `reviewDecision == REVIEW_REQUIRED`
  - createdAt older than `STALE_HOURS` (default 4h)

Other BLOCKED reasons (DIRTY, BEHIND, failed checks) are NOT alarmed —
those are the author's signal-to-fix. This script targets the specific
"no human reviewed yet" wedge.

Output:
  - `::warning` per stale PR (visible in workflow summary + Actions UI)
  - PR comment (idempotent via marker-string detection; one alarm
    per PR, never re-spammed)
  - Exit code = count of stale PRs (capped at 125)

Logic in a script (not inline workflow YAML) so it's:
  - **Unit-testable** — tests/test-check-stale-promote-pr.sh exercises
    every branch with stubbed fixture JSON + frozen clock. 23 tests
    covering: empty list, single stale, just-under-threshold, wrong
    reviewDecision, wrong mergeStateStatus, mixed list (only matching
    PRs alarm), custom threshold via --stale-hours, exit-code-counts-
    matching-PRs, --help, unknown arg → 64, missing repo → 2.
  - **Operator-runnable ad-hoc** — `scripts/check-stale-promote-pr.sh`
    works from any shell with `gh` + `jq`.
  - **SSOT** — one detector, the workflow YAML is just schedule +
    invocation surface. Future sibling workflows that need the same
    check call the same script.

## Workflow — `.github/workflows/auto-promote-stale-alarm.yml`

Triggers:
  - cron `27 * * * *` (hourly, off-the-hour to dodge cron herd)
  - workflow_dispatch with `stale_hours` + `post_comment` overrides

Concurrency: `auto-promote-stale-alarm` group, cancel-in-progress=false
(idempotent script; no benefit to cancelling a running scan).

Permissions: `contents: read` + `pull-requests: write` (post comments).

Sparse checkout — only fetches `scripts/check-stale-promote-pr.sh`.
No node_modules, no go modules, no slow setup steps. Workflow runs
in <30s on a clean repo.

## Why "alarm + comment" not "auto-approve"

Considered options in issue #2975:
  1. Slack/email alert — picked.
  2. Bot-account auto-approve via molecule-ops — circumvents the
     human-review gate that branch protection encodes.
  3. Trusted-promote bypass via CODEOWNERS — needs Org Admin config
     change; out of scope for a workflow PR.

The comment-on-PR pattern picks (1) without external dependencies
(no Slack token, no email config). Subscribers get notified via
GitHub's existing PR notification delivery; the warning shows up in
the Actions feed.

## Why this won't false-positive on legitimate slow reviews

Threshold is 4h. Most legitimate gates clear in <1h, so 4× headroom
is plenty for slow CI. The comment is idempotent (one alarm per PR,
never re-posted) — adding noise stops at 1 comment regardless of
how long the PR sits.

## Test plan

- [x] `bash scripts/test-check-stale-promote-pr.sh` — 23/23 pass
- [x] `python3 -c 'yaml.safe_load(...)'` clean
- [x] `bash -n` clean on both scripts
- [ ] Live verification: dispatch the workflow once main has caught up,
      confirm it correctly reports zero stale PRs
2026-05-05 17:55:27 -07:00
hongming-personal 38bc27df0d fix(canvas/config): skip config.yaml fetch for external / hermes runtimes — eliminate 404 console noise
Reported on production reno-stars 2026-05-05 (browser console):

  /workspaces/d76977b1-…/files/config.yaml:1
    Failed to load resource: the server responded with a status of 404

The workspace was an external-runtime mac-mini-style agent that
doesn't use the platform's config.yaml template — every Config tab
open issued a GET that 404d cleanly, and the existing catch block
fell into the runtime-manages-own-config branch + populated the
form from workspace metadata. Functionally correct, but the request
fired anyway, surfaced as a 404 in DevTools, and burned an RTT.

Fix: branch on RUNTIMES_WITH_OWN_CONFIG BEFORE the fetch — when the
workspace's runtime is one of those (external, hermes), skip the
GET, populate the form from workspace metadata directly, set
loading=false, return. Same code path as the existing 404-catch
fallback, just skipping the wasted request.

Behavior preserved for runtimes that DO use the template
(claude-code, etc.): unchanged GET → parse → setConfig flow.

Tests: 24/24 existing ConfigTab tests pass; no behavioral change for
the documented runtimes. tsc clean.

Refs reno-stars production 2026-05-05.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 17:55:24 -07:00
hongming-personal 6748035720 Merge pull request #2980 from Molecule-AI/test/canvas-resolve-attachment-href-2973
test(canvas/chat): cover platform-pending: branch + isPlatformAttachment (#2973)
2026-05-06 00:54:17 +00:00
hongming-personal c74d0ecc94 test(canvas/chat): cover platform-pending: branch + isPlatformAttachment (#2973)
Closes #2973 — the followup test gap I flagged on PR #2968's review.

Pre-merge #2968 added the platform-pending: URI scheme branch to
resolveAttachmentHref + introduced the isPlatformAttachment SSOT
helper, but the existing uploads.test.ts only covered the older
workspace: / file:/// / absolute-path branches. The new branch shipped
on prod-impact (live console error on reno-stars) with manual post-
deploy verification; the regression gate was filed as a followup
(#2973) so a future canvas refactor can't silently re-break the
poll-mode chat-attachment download path.

Adds 15 new test cases across two existing describe blocks:

resolveAttachmentHref — platform-pending: scheme (poll-mode uploads):
- well-formed platform-pending:<wsid>/<fileid> resolves to the
  /pending-uploads/<file>/content endpoint
- uses the URI's wsid, NOT the chat workspace_id (cross-workspace
  forwarding case — pinning the explicit decision from #2968's
  commit message so a regression that flipped this would mis-route
  the download to the wrong workspace's pending-uploads store)
- defensive fallback to raw URI on missing slash, empty fileID,
  empty wsid (so a future "helpful" change can't synthesize a
  broken /pending-uploads// path)
- regression test against the EXACT production repro from #2968's
  body (reno-stars, 2026-05-05 console error)

isPlatformAttachment:
- positive cases for platform-pending: (well-formed and malformed),
  workspace:<allowed-root>, file:///<allowed-root>, absolute paths
  under allowed roots
- NEGATIVE cases for HTTPS/HTTP URLs to other origins (auth-leak
  class regression — a helper that always returned true would
  attach workspace tokens to third-party requests), non-allowlisted
  roots like /etc/passwd or /var/log/x, empty string, and
  unrecognised schemes (s3://, ftp://)

All 21 tests pass. The 6 pre-existing tests are unchanged. The 15
new tests are the regression gate that #2973 asked for.

Verification:
- pnpm exec vitest run src/components/tabs/chat/__tests__/uploads.test.ts
  → 21 passed
2026-05-05 17:51:28 -07:00
hongming-personal 9dd29882e2 Merge pull request #2979 from Molecule-AI/fix/a2a-poll-mode-response-shape-2967
feat(a2a): SSOT typed-variant response parser + auto-fallback for poll-mode peers (#2967)
2026-05-06 00:41:43 +00:00
Hongming Wang e342d0c5a7 fix(build): register a2a_response in TOP_LEVEL_MODULES
The drift gate caught the new SSOT parser module — without registration
the wheel ships it un-rewritten and runtime imports fail. Same pattern
as inbox_uploads, a2a_tools_delegation, a2a_tools_rbac registrations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 17:34:05 -07:00
Hongming Wang 166ad20cd7 test(e2e): Phase 3.5 — wheel parser classifies real server response (#2967)
Previously Phase 3 only checked the workspace-server's poll-mode short-circuit
emit shape ({"status":"queued","delivery_mode":"poll","method":"..."}); the
matching client-side classification was tested in isolation against fixture
dicts in test_a2a_response.py.

This phase closes the loop by piping the actual on-the-wire response from a
real workspace-server back through the wheel's a2a_response.parse() and
asserting it classifies as the Queued variant with the right method +
delivery_mode. A regression in EITHER the server emit shape OR the client
parser will now fail this E2E, eliminating the gap that allowed the original
"unexpected response shape" production bug to ship despite green unit tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 17:31:45 -07:00
hongming-personal 4a2dda7cac feat(canvas/skills): compact-empty layout for Plugins section (#2971)
Reported on production 2026-05-05:

  agent plugin tab Plugins
  0 installed
  + Install Plugin
  this part should be default compact

Pre-fix: SkillsTab always rendered the Plugins section as a full
rounded-xl panel with vertical chrome — even when zero plugins were
installed and the registry browser was closed. The empty state
gave a lot of vertical real estate for content that's just "0
installed + Install button".

Fix: when installed.length === 0 AND registry closed AND initial
load completed, collapse the section into a single inline pill
("Plugins · 0 installed · + Install Plugin"). The full panel
re-mounts when:
  - installed.length > 0 (a plugin landed → expand to surface the list)
  - showRegistry === true (user clicked + Install Plugin → registry opens)
  - !installedLoaded (avoid flash; the loading shell shows instead
    until the first /plugins fetch resolves)

Accessibility:
  - Compact pill: aria-label="Plugins (none installed)" + button
    aria-expanded="false" + aria-controls="plugins-section"
  - Full panel: button aria-expanded={showRegistry} + same aria-controls
  - Section gets id="plugins-section" so the aria-controls reference
    resolves once the section mounts

External workspaces: this is a pure canvas-frontend layout change —
applies to ALL workspace runtimes (external, claude-code, hermes,
langchain, codex, third-party MCP). No server-side change needed.

Tests
-----

SkillsTab.compactEmpty.test.tsx (4 tests):
  - Compact pill renders when installed=0, registry closed, loaded
  - Full panel renders when installed > 0
  - Click + Install Plugin from compact → expands to full panel
    (verified via aria-controls target id appearing in the DOM)
  - During initial load (installedLoaded=false), compact pill does
    NOT render — avoids a compact→full flash as the load completes

Per memory feedback_oss_design_philosophy.md: the SkillsTab is the
only tab that needs compact-empty today, but the pattern is
extractable into a shared EmptyStateCompactWrapper if Schedules /
Memories / Approvals adopt the same affordance later. Don't generalise
until the third use case (per the same memory, "every refactor toward
OSS plugin shape" without premature abstraction).

Verified
- tsc --noEmit clean
- All 4 tests pass

Refs #2971.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 17:26:32 -07:00
Hongming Wang 8b9f809966 fix(a2a): SSOT response parser — handle poll-mode queued envelope (#2967)
Introduce ``workspace/a2a_response.py`` as the single source of truth for
the wire shapes the workspace-server proxy can return at
``/workspaces/<id>/a2a``:

  * ``Result``    — JSON-RPC success
  * ``Error``     — JSON-RPC error or platform-level error (with
                    restart-in-progress metadata when present)
  * ``Queued``    — poll-mode short-circuit envelope: the platform
                    queued the message into the target's inbox, the
                    target will fetch via /activity poll
  * ``Malformed`` — anything the parser can't classify (logged at
                    WARNING so a future server change is loud)

``send_a2a_message`` (in ``a2a_client.py``) now dispatches via
``a2a_response.parse(data)`` instead of inline ``"result" in data`` /
``"error" in data`` sniffing. The Queued variant returns a new
``_A2A_QUEUED_PREFIX`` sentinel so callers can distinguish "delivered
async, no synchronous reply" from both success-with-text and failure.

reno-stars production data caught two intermittent failures that
both reduced to the same root cause:

  1. **File transfer announce silently failed** — when CEO Ryan PC
     (poll-mode external molecule-mcp) sent the harmi.zip
     announcement to Reno Stars Business Intelligent (also poll-mode
     external), ``send_a2a_message`` saw the platform's poll-queued
     envelope ``{"status":"queued","delivery_mode":"poll","method":"..."}``,
     didn't recognize it as the synthetic delivery-acknowledgement
     it is, and returned ``[A2A_ERROR] unexpected response shape``.
     The agent fell back to a chunk-shipping path; receiver did get
     the file but operator-facing logs showed a failure that didn't
     actually fail.

  2. **Duplicated agent comm** — same bug, inverted direction. d76
     delegated to 67d, send_a2a_message returned the unexpected-shape
     error, delegate_task wrapped it as DELEGATION FAILED, the calling
     agent retried with sharper wording, the recipient saw the same
     request twice and self-reported "二次请求 — 我先不执行".

External molecule-mcp standalone runtimes are inherently poll-mode
(they have no public URL), so every external↔external A2A pair was
hitting this on every send. The pre-fix client only handled JSON-RPC
``result``/``error`` keys and treated the queued envelope (which has
neither) as malformed. RFC #2339 PR 2 added the queued envelope on
the server side; the client never caught up.

When ``send_a2a_message`` returns the ``_A2A_QUEUED_PREFIX`` sentinel,
``tool_delegate_task`` now transparently falls back to
``_delegate_sync_via_polling`` (RFC #2829 PR-5's durable
``/delegate`` + ``/delegations`` polling path, which DOES work for
poll-mode peers because the platform's executeDelegation goroutine
writes to the inbox queue and the result row arrives when the target
picks it up + replies). The agent gets a real synchronous reply
instead of the empty queued sentinel.

  * ``test_a2a_response.py`` — 62 tests, **100% line coverage** on
    the parser (verified via ``coverage run --source=a2a_response``).
    Includes adversarial-input fuzzing across ~25 pathological
    payloads — parser must never raise.
  * ``test_a2a_client.py::TestSendA2AMessagePollMode`` — 4 tests for
    the new Queued/Error wiring in ``send_a2a_message``.
  * ``test_delegation_sync_via_polling.py::TestPollModeAutoFallback``
    — 3 tests for the auto-fallback in ``tool_delegate_task``,
    including negative cases (push-mode reply must NOT trigger
    fallback; genuine error must NOT silently retry).
  * **Verified all new tests FAIL on pre-fix source** by stashing
    a2a_client.py + a2a_tools_delegation.py and re-running — 5
    failures including ImportError for the missing
    ``_A2A_QUEUED_PREFIX``.

Per the operator-debuggability directive:

  * INFO at every Queued classification (expected variant; operator
    sees normal poll-mode-peer queueing in log stream).
  * INFO at the auto-fallback decision in ``tool_delegate_task``
    so a future operator can correlate "send returned queued →
    falling back to polling path" without reading the source.
  * WARNING at every Malformed classification (server contract
    drift; operator MUST see this immediately).
  * Existing transient-retry WARNING preserved.

  * Mirror Go-side typed model in workspace-server. The wire shape
    is documented in ``a2a_response.py``'s module docstring with
    file:line pointers to the canonical emitters; a future PR can
    introduce ``models/a2a_response.go`` without changing wire
    behavior. The fixture corpus in ``test_a2a_response.py`` is
    designed so a one-sided edit breaks CI.
  * ``send_message_to_user`` and ``chat_upload_receive`` use a
    different endpoint (``/notify``) and aren't affected by this
    bug; their parsing stays unchanged.

  * 135 tests pass across ``test_a2a_response.py`` +
    ``test_a2a_client.py`` + ``test_delegation_sync_via_polling.py``
    + ``test_a2a_tools_impl.py``.
  * ``coverage run --source=a2a_response -m pytest`` reports 100%
    line coverage with 0 missing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 17:21:28 -07:00
molecule-ai[bot] a869bc1536 Merge pull request #2963 from Molecule-AI/staging
staging → main: auto-promote 7ee696e
2026-05-05 17:21:02 -07:00
hongming-personal d3e115cb06 Merge pull request #2972 from Molecule-AI/fix/a2a-poll-queued-envelope-2967
fix(a2a-client): recognize poll-mode 'queued' envelope (#2967)
2026-05-06 00:05:27 +00:00
hongming-personal b372c265ab Merge pull request #2968 from Molecule-AI/fix-chat-platform-pending-scheme
fix(canvas/chat): handle platform-pending: scheme for poll-mode upload downloads (PR #2966 followup)
2026-05-06 00:04:49 +00:00
hongming-personal 146c0e7c60 fix(a2a-client): recognize poll-mode 'queued' envelope (#2967)
workspace-server's a2a_proxy poll-mode short-circuit returns

    {status: "queued", delivery_mode: "poll", method: <a2a_method>}

when the peer has no URL to dispatch to (poll-mode peers, including
every external molecule-mcp standalone runtime). The bare
send_a2a_message parser only knew about JSON-RPC {result, error}
keys, so this envelope fell through to the "unexpected response shape"
error path. Two production symptoms on the reno-stars tenant traced
to it:

1. File transfer logged as failed when it actually succeeded —
   operator-facing logs showed an A2A_ERROR but the receiving
   workspace did get the chunked file via the agent's fallback path.
2. delegate_task retried after the false failure → peer received
   duplicate delegations → conversation got confused, the second
   peer self-diagnosed in a notify ("⚠️ Peer 二次请求 — 我先不执行").

Add a third branch to the parser, BETWEEN the existing JSON-RPC
{result, error} cases and the catch-all "unexpected" fallback. The
queued envelope is delivery-acknowledged-but-pending-consumption —
not an error — so it returns a clean success string the agent can
render as a normal outcome. The success string includes "queued"
and "poll" so an operator scanning logs sees the routing path
without parsing JSON.

Defensive: the new branch only fires when BOTH status="queued" AND
delivery_mode="poll" are present. A partial envelope (one key
missing) still falls through to the catch-all, so a future server
bug that emits a malformed shape gets surfaced instead of silently
swallowed.

Tests:
- test_poll_queued_envelope_returns_success_string — pins the canonical
  envelope returns a non-error string. Discriminating: verified to FAIL
  on old code (returned [A2A_ERROR] string), PASS on new.
- test_poll_queued_envelope_with_other_method — pins the parser doesn't
  hardcode message/send. Discriminating: also FAILS on old code.
- test_status_queued_without_poll_mode_still_falls_through — pins both
  keys are required (defensive against future server bugs).

12 existing tests in TestSendA2AMessage still pass — no regression.

Scope: hotfix for the bare send_a2a_message path. The full SSOT
typed-A2AResponse refactor (#158-#163, parents under #2967) covers the
broader vocabulary alignment between Go server and Python client. This
PR ends the production symptoms now without preempting that work.
2026-05-05 16:58:48 -07:00
hongming-personal 5d8b5e96e3 fix(canvas/chat): handle platform-pending: scheme for poll-mode upload downloads
Followup to PR #2966. The user reported the about:blank symptom on
reno-stars and the browser console showed:

  Failed to launch 'platform-pending:d76977b1-…/bb0dcaf3-…' because
  the scheme does not have a registered handler.

So the agent's "download link" was a `platform-pending:<wsid>/<file_id>`
URI — the canonical reference for poll-mode chat uploads (see
workspace-server/internal/handlers/chat_files.go:690 +
workspace/inbox_uploads.py). PR #2966 only handled `workspace:`,
`file:///`, and absolute container paths; the platform-pending
scheme fell through to the raw URI which the browser couldn't
navigate to.

Fix
---

- `resolveAttachmentHref`: added a `platform-pending:` branch that
  resolves to `${PLATFORM_URL}/workspaces/<wsid>/pending-uploads/
  <file_id>/content`. Uses the wsid from the URI, NOT the chat's
  workspace_id — these can differ when a file is forwarded across
  workspaces (cross-workspace delegation, agent forwarding).
- New `isPlatformAttachment(uri)` helper — single source of truth
  for "this URI requires our auth headers, route through
  downloadChatFile". Used by both `downloadChatFile` (chip click)
  and ChatTab's markdown-link override.
- ChatTab.tsx markdown-link override now imports
  `isPlatformAttachment` instead of duplicating the scheme list.
  Pre-fix this list was duplicated and missed `platform-pending:`.

Tests
-----

The 4 IME tests still pass; tsc clean. The platform-pending resolution
is exercised via the `isPlatformAttachment` SSOT helper (any URI
reaching `downloadChatFile` or the markdown override goes through
it). A dedicated test for the URL shape would need a more elaborate
fixture; manual verification on staging post-deploy is the practical
gate.

Reported on production reno-stars 2026-05-05.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:55:43 -07:00
hongming-personal dc6e1ac2bf Merge pull request #2966 from Molecule-AI/fix-chat-ime-and-download-link
fix(canvas/chat): IME-safe Enter + markdown link target/scheme handling
2026-05-05 23:52:54 +00:00
hongming-personal c2e12f3fb6 fix(canvas/chat): IME-safe Enter + markdown link target/scheme handling
Two production-reported regressions in the same chat surface, fixed
in one focused PR.

Issue 1 — IME composition + Enter sends half-typed message
----------------------------------------------------------

ChatTab's textarea onKeyDown was:

  if (e.key === "Enter" && !e.shiftKey) {
    e.preventDefault();
    sendMessage();
  }

For agents typing CJK / Japanese / Korean via the system IME, Enter
commits the candidate selection — not a newline, not a send. With
the old check, every IME-commit Enter accidentally sent the
half-typed message ("你好" + half-typed-pinyin + Enter to commit
the next candidate → message goes out before the user finishes).

Fix: guard on `event.nativeEvent.isComposing` AND `e.keyCode !== 229`.
The latter covers older Safari / WebKit-based mobile browsers that
delay setting isComposing on the composition-end Enter.

Issue 2 — markdown links land at about:blank
---------------------------------------------

ReactMarkdown's default `<a>` rendering passes the agent-supplied
href directly to the DOM with no target / scheme handling:

  - http(s) → navigates the canvas tab away (canvas state lost)
  - workspace://path / file:///workspace/... / /workspace/... →
    browser hits unhandled-protocol click → about:blank, no
    download (the reported bug)

Fix: ReactMarkdown `components.a` override:

  - In-container paths (workspace:, file:///{workspace,configs,home,
    plugins}, bare /{workspace,configs,...}) → preventDefault, route
    through downloadChatFile (same auth path the AttachmentChip
    uses). Filename is derived from the path's last segment.
  - External (http/https/mailto/unknown scheme) → target="_blank"
    rel="noopener noreferrer" so canvas state survives.

Tests
-----

ChatTab.imeAndLinks.test.tsx (4 tests):
  - Enter with isComposing=true → does NOT send, input preserved
  - Enter with keyCode=229 (older-Safari IME) → does NOT send
  - Enter with no IME signal → DOES send (happy path intact)
  - Shift+Enter → does NOT send (newline path intact)

The link-component override is exercised through the full ChatTab
render — the IME tests are jsdom-only and don't load chat history
with markdown messages, so the link test would need a more elaborate
fixture. Manual verification on staging post-deploy is the practical
gate; if the link test grows critical the AttachmentViews-style chip
test can extend.

Verified:
- tsc --noEmit clean
- 4/4 IME tests pass

Reported on production 2026-05-05.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:47:04 -07:00
hongming-personal dd5df70e59 Merge pull request #2965 from Molecule-AI/rfc-2945-pr-b-typed-events
feat(events): typed EventType registry — SSOT for WS event names (RFC #2945 PR-B)
2026-05-05 16:35:47 -07:00
hongming-personal f1dc721eeb Merge pull request #2964 from Molecule-AI/fix/delegation-ledger-utf8-truncate-2962
fix(delegation_ledger): rune-safe preview truncation (#2962)
2026-05-05 23:34:57 +00:00
hongming-personal 5b78bea10d feat(events): typed EventType registry — single source of truth for WS event names (RFC #2945 PR-B)
Pre-RFC-#2945, every BroadcastOnly / RecordAndBroadcast call site
passed a bare string literal:

  h.broadcaster.BroadcastOnly(workspaceID, "AGENT_MESSAGE", payload)

29 producers (Go, ~30 call sites in handlers/, scheduler/, registry/,
bundle/) and ~30 canvas consumers (TS store + listeners) duplicated
the same string with no shared definition. A producer renaming an
event silently broke every consumer — same drift class that produced
the reno-stars data-loss regression on the persistence side. PR-A
fixed the persistence-side SSOT (AgentMessageWriter); PR-B fixes the
event-name SSOT.

What this PR ships

  internal/events/types.go
    - EventType typed string + 29 named constants covering the full
      taxonomy (chat / lifecycle / agent assignment / delegation /
      task / approval / auth).
    - Grouped semantically; new constants must be added here AND
      mirrored in canvas/src/lib/ws-events.ts (parity gate landing
      in PR-B-2 follow-up).
    - AllEventTypes slice — authoritative list for the snapshot
      test + the cross-language parity gate.

  internal/events/types_test.go (3 tests)
    - TestAllEventTypes_IsSnapshot: pins the canonical list. Adding
      a new constant without updating AllEventTypes (or vice versa)
      fails with a one-line diff.
    - TestEventType_NoEmptyConstants: catches accidentally-empty
      values (typo in types.go: const X EventType = ...).
    - TestEventType_AllUppercaseSnakeCase: pins the wire format that
      canvas TS switch statements assume (no kebab-case, no mixed
      case, no leading/trailing/double underscores).

  agent_message_writer.go (single migration)
    - Demonstrates the constant-usage shape:
        events.EventAgentMessage  →  "AGENT_MESSAGE"
    - Other ~30 call sites stay on bare strings for now (this PR
      narrow); the migration happens in PR-B-1 follow-up. Both
      shapes (constant + bare string) co-exist on the wire — the
      typed version is just the recommended path for new code.

Why ship this in stages

  1. PR-B (this): types + tests + first migration → MERGEABLE NOW,
     low risk.
  2. PR-B-1 (follow-up): migrate the remaining ~30 call sites to
     constants. Mechanical, low-risk.
  3. PR-B-2 (follow-up): canvas/src/lib/ws-events.ts mirror + cross-
     language parity gate. Touches both repos.

Per memory feedback_oss_design_philosophy.md (every refactor toward
OSS plugin shape) — this surface is now plugin-safe: external
implementations can import the events package and get the same
named taxonomy without copying strings.

Verified
- go vet ./internal/events/ clean
- go build ./... clean
- TestAllEventTypes_IsSnapshot + TestEventType_* all pass
- TestAgentMessageWriter_* (the only call site touched) still green

Refs RFC #2945, PR #2949 (PR-A SSOT), PR #2944 (reno-stars).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:25:38 -07:00
hongming-personal a5903af459 fix(delegation_ledger): rune-safe preview truncation (#2962)
The previous byte-slice form `s[:previewCap]` could split a multi-byte
codepoint at byte 4096, producing invalid UTF-8. Postgres JSONB rejects
the row → ledger insert silently fails → audit gap on dashboards while
activity_logs continues to record the event.

Walk the string by rune index and stop at the last boundary that fits
inside the cap. ASCII-only strings still hit the cap exactly; CJK/emoji
strings stop slightly under, never over.

Mirrors the truncatePreviewRunes fix shipped for agent_message_writer
in #2959. Followup: deduplicate into a shared helper once both have
landed.

Tests: 2 regression tests using utf8.ValidString — one with an all-3-byte
rune string just over the cap, one with a single multi-byte rune sitting
exactly on the boundary. Verified on the previous byte-slice impl: both
new tests would fail (invalid UTF-8 + truncation past cap by 1 byte).
2026-05-05 16:19:51 -07:00
hongming-personal 07d09f3696 Merge pull request #2959 from Molecule-AI/rfc-2945-pr-a-followup-utf8-and-db-errors
fix(handlers): UTF-8-safe preview truncation + distinguish DB errors from not-found (PR-A followup)
2026-05-05 16:19:29 -07:00
hongming-personal f7c270bf24 Merge pull request #2955 from Molecule-AI/auto-sync/main-e0df90c2
chore: sync main → staging (auto, ff to e0df90c2)
2026-05-05 16:19:03 -07:00
hongming-personal 0301f90183 Merge pull request #2961 from Molecule-AI/fix/doctor-register-side-effect
fix(mcp-doctor): heartbeat (idempotent) instead of register (UPSERT) — self-review of #2954
2026-05-05 23:18:54 +00:00
hongming-personal feef80423b Merge pull request #2958 from Molecule-AI/fix/external-connect-templates-mcp-command
fix(external-connect): use molecule-mcp wrapper in Codex/OpenClaw templates (#2957)
2026-05-05 16:18:23 -07:00
hongming-personal 469b24ff8f Merge pull request #2960 from Molecule-AI/fix/memory-tab-v2-self-review
fix(memory): self-review on PR #2956 — drop speculative field, tighten 503 match
2026-05-05 23:15:50 +00:00
Hongming Wang c4d3c9a451 fix(memory): self-review on PR #2956 — drop speculative field, tighten 503 match
Two issues caught in five-axis self-review of #2956:

## 1. Drop speculative source_workspace_id rendering

The panel rendered a "from peer" badge based on
`propagation.source_workspace_id`, claiming it surfaced cross-
workspace propagation. But the OpenAPI spec at
docs/api-protocol/memory-plugin-v1.yaml documents `propagation` as
"Opaque metadata the plugin stores and returns. Reserved for future
cross-namespace propagation semantics" — and a grep across
workspace-server/internal/memory/ confirms NO writer in the codebase
populates that key. The badge would never render against real data.

Violates "don't design for hypothetical future requirements" from
the project conventions. Drop the field from MemoryV2, the row badge,
the test fixtures, and the JSDoc. When propagation gains a concrete
shape, re-add backed by an actual writer.

## 2. Tighten 503 detection — match the literal contract string

Pre-fix detection: `msg.includes('503') || msg.toLowerCase().includes('plugin is not configured')`
False-positives on any unrelated 503 + on any error mentioning
"plugin" + "configured" in any order.

Post-fix: `msg.includes('MEMORY_PLUGIN_URL')` — the env var name is a
hard-coded literal in workspace-server/internal/handlers/memories_v2.go's
available() error, so this is a pinned cross-layer contract. Drift
between the Go error message and the canvas detection now fails
loud (TestMemoriesV2_PluginUnwired_All503 asserts the env var name
in the response body; the canvas test asserts the same).

Extracted as a named export `isPluginUnavailableError` so the
detection is unit-testable and reusable. Added 4 direct tests:
contract-string match, generic-503 false-negative, 401 false-
negative, non-Error inputs.

## Test results

- 30 component tests pass (was 26; +4 for isPluginUnavailableError)
- Coverage on MemoryInspectorPanel.tsx: 100% lines, 100% functions
  (branch coverage up to 85.9% from 84.7% — speculative-field
  branches no longer count)
- Full canvas suite: 1277/1277 pass across 91 files
2026-05-05 16:11:13 -07:00
hongming-personal 2652ea8342 fix(mcp-doctor): heartbeat (idempotent) instead of register (UPSERT)
Self-review caught after #2954 landed: check_register() POSTed to
/registry/register with agent_card.name="doctor-probe". The endpoint
is an UPSERT, so the doctor probe overwrites the workspace's actual
agent_card metadata until the real agent's next register call. An
operator running `molecule-mcp doctor` against a live workspace
would see their canvas briefly display "doctor-probe" as the agent
name — invisible production-disruption.

Switches to POST /registry/heartbeat. heartbeat only updates
last_heartbeat_at (and clears awaiting_agent if needed) — the same
work a normal molecule-mcp boot does every 20s in steady state, so
the doctor's extra heartbeat is indistinguishable from background
traffic.

Function renamed check_register → check_token_auth to match what
it actually does. check_register kept as back-compat alias so any
external test/import still resolves.

Also unified the duplicated token-resolution paths into a single
_resolve_token() returning (value, source_label). Pre-fix:
check_register and _resolve_token_summary read env in parallel
ladders — a future env-var addition would have to touch both.

New tests:
  - test_check_token_auth_uses_heartbeat_endpoint: mocks urlopen,
    asserts the URL ends in /registry/heartbeat AND does NOT
    contain /registry/register. Pins the load-bearing invariant
    so a future refactor can't silently re-route through register.
  - test_resolve_token_returns_value_and_label_for_env: pins the
    consolidated resolver returns both pieces of info from the
    same source-decision.
  - test_resolve_token_returns_none_when_missing: missing-env
    happy path.

Verification:
  - 13/13 tests pass (10 existing + 3 new)
  - Manual stripped-env run still renders 4 FAIL + 2 WARN with
    actionable hints, exit 1.

Refs molecule-core#2934 item 6 (doctor side-effect fix-up).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:11:08 -07:00
hongming-personal 1e01083e55 fix(handlers): UTF-8-safe preview truncation + distinguish DB errors from not-found (RFC #2945 PR-A followup)
Self-review of PR #2949 surfaced two pre-existing defects that the
SSOT consolidation inherited from the original /notify handler. Both
are addressable in a small follow-up; shipping them as a separate PR
keeps the consolidation and the bug-fix individually reviewable.

Critical: byte-slice preview truncation produces invalid UTF-8
-------------------------------------------------------------

Pre-fix:

    if len(preview) > 80 {
        preview = preview[:80] + "…"
    }

`len()` returns BYTES; `preview[:80]` slices on a byte boundary. For
agent-authored chat in CJK / emoji / accented characters, byte 80
lands mid-codepoint → invalid UTF-8 → Postgres JSONB rejects → INSERT
fails → activity_log row never written → message vanishes from chat
history on the next reload. The persistence-failure log fires but
operators have to grep to find it, and the user-visible regression
mode is identical to reno-stars.

Fix: extract `truncatePreviewRunes(s, maxRunes)` that walks the rune
boundary using `for i := range s` (Go's range over string yields rune
start indices). Cap at 80 RUNES not bytes — UI-friendly count, not
storage count.

Important: workspace-lookup error path swallows real DB errors
--------------------------------------------------------------

Pre-fix:

    if err := w.db.QueryRowContext(...).Scan(&wsName); err != nil {
        return ErrWorkspaceNotFound
    }

Conflates `sql.ErrNoRows` (legit not-found → caller 404) with real
DB errors (connection drop, query timeout, pool exhaustion → caller
should 503). During a Postgres outage every notify call surfaced as
"workspace not found" — masking the actual incident in alerting and
making the symptom indistinguishable from "you typed a bad workspace
ID".

Fix: distinguish via `errors.Is(err, sql.ErrNoRows)` and wrap
non-not-found errors with `fmt.Errorf("agent_message: workspace
lookup: %w", err)`. Callers' existing fallback path (return 500 /
return error wrapped) handles the new shape correctly without any
changes — verified by running existing TestNotify_* and
TestMCPHandler_SendMessage_* tests.

Tests added (3 new, 11 total writer tests)
------------------------------------------

- TestTruncatePreviewRunes_RuneBoundary: 8-case table — ASCII, CJK,
  exactly-at-max, emoji prefix. Asserts both correct visible output
  AND `utf8.ValidString` on every result so the bug shape (invalid
  UTF-8) can't recur.

- TestAgentMessageWriter_Send_NonASCIIMessagePersists: end-to-end
  with a 200-rune CJK message (exceeds the 80-rune cap, would have
  hit the byte-slice bug). Pins the INSERT summary contains valid
  UTF-8 with exactly 80-rune body + ellipsis.

- TestAgentMessageWriter_Send_DBErrorOnLookupReturnsWrapped: pins the
  DB-outage path returns a wrapped non-ErrWorkspaceNotFound error so
  alerting can distinguish 404 from 503. Verified via mock
  ExpectQuery returning a transient error.

Verified
--------

- `go vet ./internal/handlers/` clean
- `go build ./...` clean
- All 14 writer + caller tests pass (8 original + 3 new + AST gate +
  TestNotify_* + TestMCPHandler_SendMessage_* sibling tests)

Per memory feedback_assert_exact_not_substring.md: every new test
asserts boundary behavior directly (UTF-8 validity, exact rune count,
errors.Is comparison) rather than substring-match in stringified
output.

Refs RFC #2945, PR #2949, PR #2944.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:10:58 -07:00
Hongming Wang eab36e217e fix(external-connect): use molecule-mcp wrapper in Codex/OpenClaw templates (#2957)
The External Connect modal's Codex and OpenClaw tabs were rendering
this MCP server config:

  command = "python3"
  args = ["-m", "molecule_runtime.a2a_mcp_server"]

That spawns the bare MCP dispatcher with no presence wiring. The
``molecule-mcp`` console-script wrapper (mcp_cli.main) is what calls
``POST /registry/register`` at startup and runs the 20s heartbeat
thread alongside the MCP stdio loop. Without the wrapper, the canvas
flips the workspace back to ``awaiting_agent`` (OFFLINE) within
60-90s — even while tools work — because nothing is heartbeating.

Operator-side this looks like: the workspace is registered and tools
work fine when invoked, but the canvas shows "offline" / "Restart"
CTA, peer agents see the workspace as awaiting_agent in list_peers
output, and inbound A2A delivery silently fails the readiness check.
A new external-Codex operator (#2957) hit this and spent debugging
time on what should have been a copy-paste install.

Fix: switch both Codex and OpenClaw templates to
``command = "molecule-mcp"`` / ``args = []``, matching the universal
MCP template that already handles this correctly. Inline comment in
each template explains the wrapper-vs-bare-module tradeoff so a
future template author doesn't regress to the shorter form.

Hermes-channel intentionally still spawns the bare module — the
hermes plugin owns the platform plugin path and runs its own
register_platform/heartbeat code in-process; double-heartbeating
would race. Universal/Codex/OpenClaw all need the wrapper.

Regression gate: TestExternalMcpTemplates_UseMoleculeMcpWrapper
asserts the three templates that must use the wrapper actually do,
and explicitly fails on the old ``-m molecule_runtime.a2a_mcp_server``
shape. Verified the test FAILS on pre-fix source by stashing only
external_connection.go and re-running.

Source: molecule-core#2957 issue 1 (item 4 of the report — the
``(codex returned empty output)`` / opaque-canvas-error / stale-
session items live in codex-channel-molecule and are tracked
separately).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:06:02 -07:00
hongming-personal 7ee696ec9a Merge pull request #2954 from Molecule-AI/feat/molecule-mcp-doctor
feat(mcp): add molecule-mcp doctor onboarding diagnostic (#2934 item 6)
2026-05-05 22:57:28 +00:00
hongming-personal decec9b9a1 Merge pull request #2956 from Molecule-AI/feat/memory-tab-v2-redesign
feat(memory): redesign Memory tab for v2 plugin
2026-05-05 22:56:55 +00:00
hongming-personal ada27fdb5d Merge pull request #2949 from Molecule-AI/rfc-2945-pr-a-agent-message-writer
refactor(handlers): AgentMessageWriter SSOT — consolidate Notify + MCP send_message_to_user (RFC #2945 PR-A)
2026-05-05 22:56:28 +00:00
Hongming Wang f0f4d0e761 feat(memory): redesign Memory tab for v2 plugin
Replaces the v1 LOCAL/TEAM/GLOBAL tab trio (mapped to the deprecated
shared_context model) with a v2 plugin-driven UI. Without this,
canvas Memory tab was reading the frozen agent_memories table while
all post-cutover agent writes went to the plugin's memory_records —
the tab silently displayed stale data.

## Backend (workspace-server)

New routes under wsAuth, all behind the existing per-tenant token:

  GET    /workspaces/:id/v2/namespaces      → readable + writable lists
  GET    /workspaces/:id/v2/memories        → plugin search proxy
  DELETE /workspaces/:id/v2/memories/:mid   → plugin forget proxy

memories_v2.go — slim handler:
  - Server-side ACL: every search request is intersected with the
    resolver's readable-namespaces set (canvas-supplied namespace
    that the workspace can't read returns [] not 403, matches v1
    existence-non-inferring shape).
  - Returns 503 with "set MEMORY_PLUGIN_URL" hint when plugin
    isn't wired (canvas surfaces a banner).
  - Maps plugin not_found → 404, other plugin errors → 502.
  - View shaping: NamespaceView.label rendered server-side
    ("Workspace (abc-1234)", "Team (t-99)", "Org (acme)", custom)
    so canvas doesn't parse namespace names. MemoryView surfaces
    pin/expires_at/score/source_workspace_id from Propagation.

memories_v2_test.go — 100% line + 100% function coverage:
  - 503 path on every endpoint when unwired
  - Namespaces success + readable/writable error paths
  - Search: empty intersection, full-path query/kind/limit
    propagation, namespace=/no-namespace branches, propagation
    map missing/wrong-type, intersect error, plugin error
  - Forget: success, plugin not_found→404, other plugin
    errors→502, missing memoryId→400
  - Helpers: namespaceLabel for all 4 kinds + truncation,
    parseLimit edge cases (default/0/negative/over-cap/non-num),
    memoryToView field round-trip, indexOfColon, shortID

## Frontend (canvas)

MemoryInspectorPanel rewritten for v2:
  - Drop LOCAL/TEAM/GLOBAL trio. Namespace dropdown driven by
    GET /v2/namespaces.readable, "All namespaces" default.
  - New per-row badges: kind (F/S/C), source (agent/runtime/user),
    pin (📌), TTL countdown (12h / "expired"), score% on
    semantic search, source-workspace ⇡ws-pee for propagated.
  - Drop Edit button — v2 plugin contract has no PATCH; the
    model is forget + recommit. Forget stays.
  - Plugin-unavailable banner with operator hint when /v2/*
    returns 503.
  - Bug fix surfaced by test: rollback-on-failed-delete order
    of operations (loadEntries() called setError(null) AFTER
    we set the failure message, wiping it). Reload first, then
    set the error.

MemoryEditorDialog deleted — Add was POST /memories which v2
doesn't support from canvas (writes go via MCP). The legacy
Edit-flow tests go with it.

## Test results

Backend: `go test ./internal/handlers/` — all pass
Backend coverage on memories_v2.go: 100% lines, 100% functions
Canvas: `vitest run` — 91 files, 1273 tests pass (26 new)
Canvas coverage on MemoryInspectorPanel.tsx: 100% lines,
  100% functions, 96.7% statements, 84.7% branches
  (uncovered branches are defensive `?? fallback` for
   contract-impossible kind/source values)

## Migration note

The legacy v1 GET/POST/PATCH/DELETE on /workspaces/:id/memories
remains in place for the back-compat MCP shim (mcp_tools_memory_v2's
legacy routing) and admin export/import. PR-9 (#283) drops
agent_memories along with the v1 endpoints once the cutover
verification window closes.
2026-05-05 15:53:28 -07:00
molecule-ai[bot] e0df90c294 Merge pull request #2951 from Molecule-AI/staging
staging → main: auto-promote 1edee11
2026-05-05 15:48:32 -07:00
hongming-personal f01f374072 feat(mcp): add molecule-mcp doctor onboarding diagnostic
Closes #2934 item 6 — the deferred follow-up from Ryan's onboarding-
friction report. Quote: "this single command would have saved me
30 of the 45 minutes."

When push delivery fails or the install half-works, the operator
today has no signal — they hand-grep the Claude Code binary or
chase the `from versions: none` red herring. Doctor renders six
checks in one screen with concrete next-step suggestions:

  1. Python version    >=3.11? (wheel's pin)
  2. Wheel install     molecule-ai-workspace-runtime importable +
                        version surfaced
  3. PATH for binary   `molecule-mcp` resolves on PATH; if not,
                        prints the resolved user-site bin dir to
                        add (or recommends pipx)
  4. Env vars          PLATFORM_URL + WORKSPACE_ID + token (env or
                        *_FILE or .auth_token)
  5. Platform reach    GET ${PLATFORM_URL}/healthz returns 2xx
  6. Registry register POST /registry/register with the resolved
                        token returns 2xx — end-to-end auth check

Each line: `[OK|WARN|FAIL] <label>: <status>` plus a `next:` hint
when not OK. ANSI colors auto-disable on non-TTY / NO_COLOR.

Exit code: 0 on all-OK or only-WARN, 1 on any FAIL — scriptable
from CI install-checks.

## Files

`workspace/mcp_doctor.py`  (new) — six check functions + `run()`
                                   entry point. Uses urllib (stdlib)
                                   so doctor works even on a partial
                                   install where `requests` is missing.

`workspace/mcp_cli.py`             Subcommand dispatch:
                                     molecule-mcp doctor   → mcp_doctor.run()
                                     molecule-mcp --help   → usage banner
                                     molecule-mcp          → server (unchanged)

`workspace/tests/test_mcp_doctor.py`  (new) — 10 tests covering each
                                       check's pass/fail/skip path
                                       plus the end-to-end exit-code
                                       contract on a stripped env.

`scripts/build_runtime_package.py`    Adds `mcp_doctor` to
                                       TOP_LEVEL_MODULES so the
                                       wheel ships the new module.

## Out of scope (deferred follow-ups)
- Claude Code-specific checks (parse ~/.claude.json, verify each
  MCP entry is plugin-sourced + dev-channels flag set). That's a
  separate Claude-Code-shaped doctor; lives in the channel plugin.
- Automated remediation. Doctor is diagnostic — tells the operator
  what's wrong + how to fix it, doesn't apply changes.

## Verification
  - python -m pytest tests/test_mcp_doctor.py -v   → 10/10 PASS
  - python -m pytest tests/test_mcp_cli*.py        → 67/67 PASS
    (existing CLI suite still green; subcommand dispatch added
    before env-validation, doesn't disturb the server-boot path)
  - manual: `molecule-mcp doctor` on a stripped env renders 4 FAIL
    + 2 WARN + exit code 1, with each `next:` hint actionable

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 15:44:36 -07:00
hongming-personal 1edee1131b Merge pull request #2948 from Molecule-AI/auto-sync/main-f5ea812e
chore: sync main → staging (auto, ff to f5ea812e)
2026-05-05 15:29:46 -07:00
hongming-personal d99b3f2aec refactor(handlers): consolidate Notify + MCP send_message_to_user through AgentMessageWriter (RFC #2945 PR-A)
Pre-RFC-#2945 the broadcast + activity_log INSERT for "agent → user
chat" was duplicated across two handlers — activity.go's Notify (HTTP
/notify) and mcp_tools.go's toolSendMessageToUser (MCP tools/call).
The duplication is exactly what produced the reno-stars production
data-loss regression (PR #2944): the persistence-half fix landed for
one handler and silently lagged for the other for months, dropping
every long-form external-agent message on reload.

PR #2944 added the missing INSERT to mcp_tools.go and a forward-
looking AST gate. This PR removes the duplication at the source.

What changes
------------

NEW: workspace-server/internal/handlers/agent_message_writer.go
- AgentMessageWriter struct + NewAgentMessageWriter ctor.
- Send(ctx, workspaceID, message, attachments) error: workspace
  lookup → broadcast WS AGENT_MESSAGE → INSERT activity_logs.
- ErrWorkspaceNotFound for the lookup-miss path so callers can
  return 404 / JSON-RPC error cleanly.
- Best-effort persistence: INSERT failure logs only, returns nil so
  the broadcast success isn't undone (matches previous behavior in
  both call sites — pinned by test).
- Takes events.EventEmitter (interface) so tests can substitute a
  capturing fake without nil-panicking inside hub.Broadcast.

UPDATED: activity.go:Notify
- Replaced ~75 lines of inline broadcast+INSERT with a 12-line
  call to AgentMessageWriter.Send.
- Attachment shape conversion (NotifyAttachment → AgentMessageAttachment)
  is local to the HTTP handler; the writer's API doesn't import the
  HTTP-binding-tagged type.

UPDATED: mcp_tools.go:toolSendMessageToUser
- Replaced ~40 lines (the post-#2944 broadcast+INSERT pair) with a
  6-line call to the writer.
- Attachments is nil today because the MCP tool args don't expose
  attachments yet. When the schema adds it, build the slice and
  pass through; the writer half is ready.

Tests
-----

agent_message_writer_test.go (8 tests, comprehensive):
- TestAgentMessageWriter_Send_Success_NoAttachments — happy path,
  pins JSON `{"result":"hi"}`.
- TestAgentMessageWriter_Send_Success_WithAttachments — pins file
  parts shape (kind=file, file.{uri,name,mimeType,size}). Uses a
  jsonMatcher that decodes + asserts via predicate (tolerant of
  map key ordering, exact on shape).
- TestAgentMessageWriter_Send_WorkspaceNotFound — pins
  ErrWorkspaceNotFound + asserts NO broadcast NO INSERT.
- TestAgentMessageWriter_Send_DBInsertFailureStillReturnsNil — pins
  best-effort persistence contract.
- TestAgentMessageWriter_Send_PreviewTruncation — pins ≤80-char
  preview + ellipsis (Ryan's onboarding-friction report would have
  bloated activity_logs.summary by 2KB without this).
- TestAgentMessageWriter_Send_BroadcastsAgentMessageEvent — pins WS
  event name + payload shape via capturingEmitter.
- TestAgentMessageWriter_Send_OmitsAttachmentsKeyWhenEmpty — pins
  the "no key when nil" wire contract.

The existing AST gate from #2944
(TestAgentMessageBroadcastsArePersisted) still holds: any future
function emitting AGENT_MESSAGE without an INSERT fails the test.
With the writer in place that's now redundant — both producers go
through it — but the gate is cheap to keep as defense-in-depth.

Verified: go vet clean; all writer + caller tests pass; existing
TestNotify_* + TestMCPHandler_SendMessage_* + the AST gate all green.

Refs RFC #2945, PR #2944.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 15:29:42 -07:00
molecule-ai[bot] f5ea812e9d Merge pull request #2947 from Molecule-AI/staging
staging → main: auto-promote c4807a9
2026-05-05 22:22:58 +00:00
hongming-personal 3b7ed9cf53 Merge pull request #2946 from Molecule-AI/fix/onboarding-followup-2934
mcp: surface specific TOKEN_FILE errors + link follow-ups (#2934)
2026-05-05 22:19:21 +00:00
Hongming Wang da9061c131 mcp: surface specific TOKEN_FILE errors + link follow-ups (#2934)
Self-review of #2935 turned up two real defects:

1. Stale README issue references — the build_runtime_package.py
   README template said "(issue #2934 follow-up)" twice, but the
   marketplace-plugin and `doctor` items now have dedicated tracking
   issues. Updated to point at #2936 and #2937 respectively.

2. Silent fallthrough on broken MOLECULE_WORKSPACE_TOKEN_FILE — when
   an operator EXPLICITLY pointed TOKEN_FILE at a path that didn't
   exist / wasn't readable / was blank / contained internal whitespace,
   the resolver silently returned the generic "set one of these three
   vars" error. That's exactly the silent failure mode #2934 flagged
   ("a new user has no chance"). Refactor `_read_token_from_file_env`
   to return `(token, error)`; surface the SPECIFIC failure when the
   operator's intent was clearly the file path. Skip the CONFIGS_DIR
   fallback in that case so the operator's config bug isn't masked
   by a different source happening to work.

Adds 2 renames + 2 new tests in test_mcp_cli_split.py:
  - test_missing_file_returns_specific_error (asserts "does not exist")
  - test_empty_file_returns_specific_error (asserts "is empty")
  - test_multi_line_file_rejected (asserts "internal whitespace")
  - test_token_file_error_skips_configs_dir_fallback (asserts a valid
    CONFIGS_DIR/.auth_token does NOT silently rescue a broken
    TOKEN_FILE)

All 81 mcp_cli + mcp_cli_multi_workspace + mcp_cli_split tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 15:07:15 -07:00
hongming-personal c4807a930d Merge pull request #2940 from Molecule-AI/refactor/a2a-tools-inbox-extract-rfc2873-iter4e
refactor(workspace): extract inbox tools from a2a_tools.py (RFC #2873 iter 4e)
2026-05-05 21:58:32 +00:00
hongming-personal d22fbb29b8 Merge pull request #2944 from Molecule-AI/fix-mcp-send-message-to-user-persist
fix(mcp): persist send_message_to_user pushes to activity_log (reno-stars data loss)
2026-05-05 21:57:37 +00:00
hongming-personal 899c53550d test(mcp): comprehensive coverage for send_message_to_user persistence + AST gate (reno-stars followup)
Per user request: audit all similar tools + write comprehensive tests
including E2E for the persistence-of-AGENT_MESSAGE-broadcasts contract.

Audit (all BroadcastOnly call sites in workspace-server/internal/):

  | Site | Event | Persisted? | Notes |
  |---|---|:---:|---|
  | a2a_proxy_helpers.go:275 | A2A_RESPONSE | ✓ | LogActivity above |
  | activity.go:486 (Notify) | AGENT_MESSAGE | ✓ | INSERT line 535 |
  | activity.go:701 (LogActivity) | ACTIVITY_LOGGED | ✓ | self-emits inside DB write |
  | mcp_tools.go:341 (toolSendMessageToUser) | AGENT_MESSAGE | ✓ NEW (this PR) |
  | registry.go:575 | TASK_UPDATED | N/A | transient progress, not chat |
  | registry.go:596 | WORKSPACE_HEARTBEAT | N/A | infra ping, not chat |

Only one chat-bearing broadcast was missing persistence (the just-
fixed mcp bridge path). No other regressions found.

Tests added (4 new, total 5 send_message_to_user tests):

1. TestAgentMessageBroadcastsArePersisted — AST gate that walks every
   non-test .go in the package, finds funcs that BroadcastOnly with
   "AGENT_MESSAGE", asserts each ALSO contains an
   "INSERT INTO activity_logs". Forward-looking regression block:
   any future chat tool that broadcasts without persisting fails the
   test with a clear file:func diagnostic. Mutation-tested locally:
   removing the INSERT block from toolSendMessageToUser reliably
   produces the expected failure.

2. TestMCPHandler_SendMessageToUser_DBErrorLogsAndStill200s — pins
   the "best-effort persistence" contract. DB INSERT failures must
   NOT abort the tool response (the WS broadcast already succeeded;
   retrying would double-render in the live chat). Matches /notify.

3. TestMCPHandler_SendMessageToUser_ResponseBodyShape — pins the
   exact `{"result": "<message>"}` JSON shape stored in
   response_body. The canvas hydrater (extractResponseText in
   historyHydration.ts) reads body.result; any drift here silently
   breaks chat history without failing the INSERT. Per memory
   feedback_assert_exact_not_substring.md, asserts the literal JSON
   shape, not a substring.

4. TestMCPHandler_SendMessageToUser_PersistsToActivityLog (existing,
   from previous commit) — pins INSERT shape with regex on
   'a2a_receive' + 'notify' literals.

5. TestMCPHandler_SendMessageToUser_Blocked_WhenEnvNotSet (existing)
   — env-gate aborts before DB.

Test fixture cleanup: newMCPHandler now uses newTestBroadcaster (real
ws.Hub) instead of events.NewBroadcaster(nil) — the latter nil-panics
inside hub.Broadcast on the AGENT_MESSAGE path. Same broadcaster
shape every other handler test uses.

E2E note: the AST gate is the strongest forward-looking guarantee.
A real-DB integration test would add value for CI but is largely
duplicative of the sqlmock contract tests above (sqlmock pins SQL
shape with much faster feedback). Left as a future enhancement when
the handlers Postgres-integration suite extends MCP coverage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 14:52:32 -07:00
hongming-personal cdfc9f743f fix(mcp): persist send_message_to_user pushes to activity_log (reno-stars data loss)
Reported on production tenant reno-stars: an external claude-code agent
(CEO Ryan PC workspace) sent a long-form message via send_message_to_user;
the user saw it live in the chat panel but it vanished after a refresh.
Confirmed via direct production query — the message is NOT in
activity_logs at all (only short test pings around it are persisted).

Root cause: there are TWO server-side handlers for send_message_to_user:

  1. HTTP `/workspaces/:id/notify` (activity.go:Notify) — broadcasts WS
     AND inserts a row into activity_logs. This is the path the
     in-container runtime's tool_send_message_to_user calls.

  2. MCP-bridge `tools/call name=send_message_to_user`
     (mcp_tools.go:toolSendMessageToUser) — broadcasts WS only,
     **never persisted**. This is the path EXTERNAL agents using
     molecule-mcp's send_message_to_user tool route through.

The persistence fix landed for path 1 months ago but was never mirrored
on path 2. External agents — exactly the case in reno-stars/CEO Ryan PC
— have been silently losing every long-form notification on reload.

Fix: mirror the activity.go INSERT shape inside toolSendMessageToUser:

  INSERT INTO activity_logs
    (workspace_id, activity_type, method, summary, response_body, status)
  VALUES ($1, 'a2a_receive', 'notify', $2, $3::jsonb, 'ok')

Same wire shape as /notify so the canvas's chat-history hydration
(`type=a2a_receive&source=canvas`) treats both writers identically.
Errors are log-only — broadcast already succeeded, persistence failure
shouldn't block the tool response (matches /notify behavior; downside
is the same data-loss-on-DB-error risk, surfaced via log.Printf).

Tests
-----

- `TestMCPHandler_SendMessageToUser_PersistsToActivityLog` — pins both
  the workspace-name lookup AND the INSERT shape. Regex-matches
  `'a2a_receive'` + `'notify'` literals so a future refactor that
  changes activity_type or method breaks the test loud, not silently
  re-introducing the data-loss bug.
- Updated newMCPHandler to use newTestBroadcaster() (real ws.Hub) —
  events.NewBroadcaster(nil) crashes inside hub.Broadcast in the
  send_message_to_user path. Same shape every other handler test uses.

Verified `go test ./internal/handlers/ -run TestMCPHandler_SendMessage`
green; full vet clean.

Refs reno-stars production incident 2026-05-05.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 14:47:48 -07:00
molecule-ai[bot] 7a2664523c Merge pull request #2942 from Molecule-AI/staging
staging → main: auto-promote 1ad107c
2026-05-05 14:47:21 -07:00
molecule-ai[bot] 632e906640 Merge pull request #2938 from Molecule-AI/staging
staging → main: auto-promote b906e1d
2026-05-05 21:29:56 +00:00
Hongming Wang 475da5b64c refactor(workspace): extract inbox tools from a2a_tools.py (RFC #2873 iter 4e)
Continues the OSS-shape refactor. After iters 4a-4d (rbac, delegation,
memory, messaging) the only behavior left in ``a2a_tools.py`` was
``report_activity`` plus three thin inbox-tool wrappers and the
``_enrich_inbound_for_agent`` helper. This iter extracts the inbox
slice to ``a2a_tools_inbox.py`` so the kitchen-sink module shrinks
from 280 LOC to ~165 LOC of imports + report_activity + back-compat
re-export blocks.

Extracted symbols:
  - ``_INBOX_NOT_ENABLED_MSG`` (sentinel)
  - ``_enrich_inbound_for_agent`` (poll-path peer enrichment helper)
  - ``tool_inbox_peek``
  - ``tool_inbox_pop``
  - ``tool_wait_for_message``

Re-exports (`from a2a_tools_inbox import …`) preserve the public
``a2a_tools.tool_inbox_*`` surface so existing tests + call sites
continue to resolve unchanged.

New tests in test_a2a_tools_inbox_split.py:
  1. **Drift gate (5)** — every previously-public symbol on a2a_tools
     is the EXACT same object as a2a_tools_inbox.foo (`is`, not `==`),
     catches a future "wrap with logging" refactor that silently loses
     existing test coverage.
  2. **Import contract (1)** — a2a_tools_inbox does NOT eagerly import
     a2a_tools at module load. Pins the layered architecture: the
     extracted slice depends on ``inbox`` + a lazy ``a2a_client``
     import, never on the kitchen-sink that re-exports it.
  3. **_enrich_inbound_for_agent branches (5)** — peer_id-empty
     (canvas_user) returns dict unchanged; missing peer_id key same;
     a2a_client unavailable (test harness, partial install) degrades
     gracefully with a bare envelope; registry hit populates
     peer_name + peer_role + agent_card_url; registry miss still
     surfaces agent_card_url (constructable from peer_id alone).

The full timeout-clamp / validation / JSON-shape behavior matrix for
the three wrappers stays in test_a2a_tools_inbox_wrappers.py — those
tests pass identically against both the alias and the underlying impl.

Wiring updates:
  - ``scripts/build_runtime_package.py``: add ``a2a_tools_inbox`` to
    ``TOP_LEVEL_MODULES`` so it ships in the runtime wheel and the
    drift gate doesn't fail the next publish.
  - ``.github/workflows/ci.yml``: add ``a2a_tools_inbox.py`` to
    ``CRITICAL_FILES`` so the 75% MCP/inbox/auth per-file floor
    applies — this is now where the inbox-delivery code actually
    lives.
2026-05-05 14:28:58 -07:00
hongming-personal 1ad107cc15 Merge pull request #2935 from Molecule-AI/fix/onboarding-friction-2934
fix(onboarding): address Claude Code MCP onboarding friction (#2934)
2026-05-05 21:25:57 +00:00
hongming-personal e4bd1e4293 Merge pull request #2933 from Molecule-AI/auto-sync/main-226e57a9
chore: sync main → staging (auto, ff to 226e57a9)
2026-05-05 14:22:57 -07:00
Hongming Wang 01deeb36cf fix(onboarding): address Claude Code MCP onboarding friction (#2934)
Ryan's bug report (#2934) walked through ~45 min of debugging a stock
external-runtime install. This PR fixes the four items he flagged that
have a small surface, and stubs out the larger ones for follow-up.

Fixed in this PR
================

#1 — Python floor disclosure (README in publish bundle)
  Add an explicit "Requires Python ≥3.11" section that calls out the
  cryptic "Could not find a version that satisfies the requirement"
  failure mode; recommend `pipx install` over `pip install` so the
  binary lands on PATH automatically; show the explicit `pip install
  --user` alternative with the PATH caveat.

#3 — MOLECULE_WORKSPACE_TOKEN_FILE support (mcp_workspace_resolver.py)
  Add a third resolution step between the inline env var and the
  in-container CONFIGS_DIR fallback. Operators can write the bearer to
  a 0600 file (e.g. ~/.config/molecule/token) and point
  MOLECULE_WORKSPACE_TOKEN_FILE at it, keeping the secret out of
  ~/.zsh_history and out of plaintext in MCP-host configs like
  ~/.claude.json. Inline TOKEN still wins on conflict so rotation flows
  are predictable. README documents the safer option as the
  recommended path. 6 new tests pin every leg (file resolves, inline
  wins, missing/empty file falls through, blank env unset-equivalent,
  help text advertises it).

#4 — Push delivery 3-condition gating (README in publish bundle)
  Document that real-time push on Claude Code requires (a) the server
  to declare experimental.claude/channel (we do), (b) the server to be
  marketplace-plugin-sourced (operators must scaffold their own until
  the official marketplace lands — see #2934 follow-up), and (c) the
  --dangerously-load-development-channels flag on the claude
  invocation. Until any of the three is in place, delivery silently
  falls back to poll mode with no diagnostic. The README now says all
  of this explicitly so a new operator doesn't grep the binary for
  channel_enable to figure it out.

#8 — serverInfo.name mismatch (a2a_mcp_server.py)
  The server reported `serverInfo.name = "a2a-delegation"` while
  operators register it as `molecule` (the name in `claude mcp add
  molecule …`). Harmless on tool routing today but matters for any
  future Claude Code allowlist that gates push by hardcoded server
  name. Renamed to "molecule" with an inline comment explaining the
  invariant.

Deferred (separate issues to track)
===================================

#2 — covered transitively by #1's pipx recommendation; no separate fix.
#5 — `moleculesai/claude-code-plugin` marketplace repo (substantial new
     repo work; the README references it as a documented follow-up).
#6 — `molecule-mcp doctor` subcommand (substantial new CLI surface;
     mentioned in the README's push-vs-poll section as the planned
     diagnostic for silent push fallback).
#7 — `--dangerously-load-development-channels` rename — not in our
     control; that's Claude Code's flag.

Tests
=====
164/164 mcp_cli + a2a_mcp_server tests pass locally
(WORKSPACE_ID=00000000-0000-0000-0000-000000000001 pytest …) including
6 new TestTokenFileEnv cases. Wheel builds successfully via
scripts/build_runtime_package.py with the new README markers verified
in the output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 14:19:09 -07:00
hongming-personal b906e1da61 Merge pull request #2892 from Molecule-AI/refactor/a2a-tools-messaging-extract-rfc2873-iter4d
refactor(workspace): extract messaging tools from a2a_tools.py (RFC #2873 iter 4d)
2026-05-05 21:07:44 +00:00
molecule-ai[bot] 226e57a942 Merge pull request #2931 from Molecule-AI/staging
staging → main: auto-promote a1d2027
2026-05-05 21:03:11 +00:00
Hongming Wang abc3affcb6 test(a2a_tools): cover inbox tool wrappers to restore 75% per-file floor
After RFC #2873 iter 4d extracted messaging tools to
``a2a_tools_messaging.py``, the only behavior left in ``a2a_tools.py``
is ``report_activity`` (covered by test_a2a_tools_impl) plus three
thin wrappers around inbox state — ``tool_inbox_peek``,
``tool_inbox_pop``, ``tool_wait_for_message`` — which were never
directly exercised at the module level.

Per-file critical-path coverage dropped to 54.4% on the iter 4d
branch, breaking the 75% MCP/inbox/auth floor in ci.yml.

Adds ``test_a2a_tools_inbox_wrappers.py`` — 14 focused tests on the
three wrappers covering: inbox-disabled fallback (via the
_INBOX_NOT_ENABLED_MSG sentinel), input validation
(empty/non-str activity_id, non-int peek limit), the timeout clamp
contract on wait_for_message (300s ceiling, 0s floor, non-numeric
fallback to 60s), JSON-shape pinning, and the limit/activity_id
forwarding contract.

Result: a2a_tools.py back to 100% covered with the existing impl-tests
suite, gate green.
2026-05-05 13:59:58 -07:00
Hongming Wang 3322524b0f Merge remote-tracking branch 'origin/staging' into refactor/a2a-tools-messaging-extract-rfc2873-iter4d
# Conflicts:
#	workspace/a2a_tools.py
2026-05-05 13:57:44 -07:00
hongming-personal de01ff51b0 Merge pull request #2932 from Molecule-AI/refactor/embed-help-fix-docs-hostname
refactor(external-connect): embed help in agent paste + fix wrong docs hostname
2026-05-05 20:57:15 +00:00
hongming-personal f3782662bd refactor(external-connect): embed help in agent paste, fix wrong docs hostname
Two related fixes to the Connect-External-Agent flow that the user
flagged: the "Need help?" disclosure block in the modal is for the
operator's eyes only — but the agent reading the pasted snippet has
no access to that context. And the docs URL was pointing at a
hostname that doesn't resolve.

User-visible problems:
1. The agent doesn't see the install link, docs link, or the common-
   error/check pairs that the human pasted. When the agent fails to
   register or hits ConnectionRefused, it can't self-diagnose because
   the troubleshooting context lives in a separate UI block.
2. https://docs.molecule.ai → DNS NXDOMAIN. Every "Documentation"
   link in the modal was a dead link.

## Fixes

### Move help INTO the snippet (not a separate human-only UI block)
Each of the 7 server-rendered templates in
`workspace-server/internal/handlers/external_connection.go` now
appends a `# Need help?` section with: install link, correct docs
link, and the top common errors as `# • symptom — check` pairs.

Templates updated: curl / channel (Claude Code) / mcp (Universal MCP) /
python / hermes / codex / openclaw. Agents reading the paste now have
the same diagnostic context the human did.

### Drop the duplicated UI block in the canvas modal
`canvas/src/components/ExternalConnectModal.tsx`:
- Removed the `TAB_HELP` per-tab metadata constant (152 lines).
- Removed the `HelpBlock` component (62 lines).
- Removed the `<HelpBlock help={TAB_HELP[tab]} />` render call.

The snippet is now the single source of truth for tab-level help.

### Fix the wrong docs hostname
The actual docs site is `doc.moleculesai.app` (singular `doc`,
`.app` not `.ai`), confirmed by:
- `package.json` description in `Molecule-AI/docs` repo →
  "Molecule AI documentation site — doc.moleculesai.app"
- HTTP HEAD on the new URL → 200 for both
  `/docs/guides/mcp-server-setup` and
  `/docs/guides/external-agent-registration`
- HTTP HEAD on old `docs.molecule.ai` → 000 (NXDOMAIN)

All template docs URLs now point at `doc.moleculesai.app`.

## Verification
- `go build ./...` clean
- `go test ./internal/handlers/... -count=1` green
- `pnpm test` → 1291/1291 pass (unchanged)
- `tsc --noEmit` clean
- 219 LOC removed (canvas duplicate UI), 69 LOC added (snippet help)
- Net `-150 LOC` while gaining the agent-readable help

## Out of scope (deferred, captured in followups)
- One blog post still has `canonical: "https://docs.molecule.ai/blog/..."`
  in `src/app/blog/2026-04-20-chrome-devtools-mcp/page.mdx` — separate
  blog-content fix.
- Comment in `theme-provider.tsx` references `docs.moleculesai.app`
  (with `s`) — comment-only, not a runtime URL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:51:35 -07:00
hongming-personal e9eb3868d5 Merge pull request #2930 from Molecule-AI/docs/python-version-callout-mcp-snippet
docs: callout Python>=3.11 on Universal MCP snippet + runtime doc
2026-05-05 20:48:31 +00:00
hongming-personal cb70d3d437 docs: callout Python>=3.11 requirement on Universal MCP install snippet
User-reported friction: pip install molecule-ai-workspace-runtime on a
3.10 interpreter fails with "Could not find a version that satisfies the
requirement (from versions: none)" — pip's requires_python filter
silently drops the only available artifact before attempting install,
so the error doesn't mention Python at all. Operators see
"package missing", file a bug, and chase a phantom CDN/visibility
issue.

Two changes mirror the requirement at the two operator-touch surfaces:

1. workspace-server/internal/handlers/external_connection.go:
   the externalUniversalMcpTemplate snippet (rendered into the
   canvas Connect-External-Agent modal) now leads with a brief
   "Requires Python >= 3.11" block + diagnostic + upgrade paths.

2. docs/workspace-runtime-package.md: same callout at the top of
   the doc, before the Overview, so anyone landing here from search
   gets the answer immediately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:44:25 -07:00
hongming-personal a1d202723d Merge pull request #2929 from Molecule-AI/auto-sync/main-ef67dc51
chore: sync main → staging (auto, ff to ef67dc51)
2026-05-05 13:42:55 -07:00
hongming-personal 0d0840d9d9 Merge branch 'staging' into refactor/a2a-tools-messaging-extract-rfc2873-iter4d 2026-05-05 13:41:55 -07:00
hongming-personal fc30b5c9de Merge pull request #2905 from Molecule-AI/fix/poll-path-message-enrichment
fix(workspace): enrich poll-path inbox messages with peer_name/role/card_url
2026-05-05 20:36:41 +00:00
molecule-ai[bot] ef67dc513e Merge pull request #2928 from Molecule-AI/staging
staging → main: auto-promote 2bf6a70
2026-05-05 20:33:52 +00:00
hongming-personal 23d3f057d3 Merge pull request #2890 from Molecule-AI/refactor/a2a-tools-memory-extract-rfc2873-iter4c
refactor(workspace): extract memory tools from a2a_tools.py (RFC #2873 iter 4c)
2026-05-05 20:31:45 +00:00
Hongming Wang 8ca027ddf3 fix(tests): drop unused json + pytest imports
Bot lint flagged the two imports as unused (correct — neither is
referenced after the file shrank during review). Resolves the two
unresolved review threads silently blocking merge per the staging
"all conversations resolved" gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:26:49 -07:00
Hongming Wang 46a4ef83bb fix(tests): patch a2a_tools_memory.httpx, not a2a_tools.httpx
Iter 4c (#2890) moved tool_commit_memory + tool_recall_memory into
a2a_tools_memory.py, which has its own top-level `import httpx`.
test_mcp_memory.py + the secret-redact memory tests still patched
`a2a_tools.httpx.AsyncClient`, which after the move is the WRONG
module's reference — the real call inside the moved tool resolves to
`a2a_tools_memory.httpx.AsyncClient` and reaches the network. CI
catches this as 7 failures: JSONDecodeError on empty bodies and
"All connection attempts failed" on the recall side.

Update 7 patch sites to `a2a_tools_memory.httpx.AsyncClient`. The
existing tests in `test_a2a_tools_impl.py` were already updated by
the iter-4c PR; only these two files were missed.

Verified: pytest workspace/tests/test_mcp_memory.py +
test_secret_redact.py — 43/43 pass after the fix (both files were
red on the iter-4c branch CI).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:25:06 -07:00
hongming-personal a6afc18de5 Merge pull request #2927 from Molecule-AI/fix/org-import-polish-2872
fix(org-import): polish — wrap-safe ErrNoRows, bounded lookup, godoc (#2872)
2026-05-05 20:25:01 +00:00
hongming-personal 423d58d42c fix(org-import): polish — wrap-safe ErrNoRows, bounded lookup, godoc
Three small hardening passes from #2872's optional/important findings,
batched into one polish PR:

1. errors.Is(err, sql.ErrNoRows) instead of err == sql.ErrNoRows.
   The bare equality breaks if any future caller wraps the error via
   fmt.Errorf("…: %w", err) — the no-rows happy path would fall
   through to the "real DB error" branch and abort the import.
   errors.Is unwraps. New test
   TestLookupExistingChild_WrappedNoRows_TreatedAsNotFound pins the
   fix; verified the test fails on the old `==` shape (build break
   on unused-import + assertion failure once import dropped).

2. Bounded 5s timeout on lookupExistingChild instead of
   context.Background().
   The createWorkspaceTree call site runs in goroutines spawned from
   the /org/import handler, so plumbing the request context here
   would cascade-cancel into provisionWorkspaceAuto and abort
   in-flight EC2 provisioning if the client disconnected mid-import
   — that's the wrong tradeoff. A short bounded timeout protects the
   per-row SELECT against a wedged DB without taking the
   drop-everything-on-disconnect behaviour. The lookup is a single
   ~10ms query; 5s leaves 500x headroom for transient slow paths.

3. Godoc clarifications on the skip-path block.
   - /org/import is ADDITIVE-ONLY, never destructive. Children
     present in the existing tree but absent from the new template
     are preserved (no DELETE on diff).
   - Skip-path does NOT propagate updates to existing nodes — a
     re-import that adds an initial_memory or schedule to an
     existing workspace is silently dropped. Document the limitation
     so future operators know to delete-and-re-import or reach for
     a future /org/sync route.

Verification:
  - go build ./... → clean
  - go test ./internal/handlers/... → all passing (TestLookup* +
    TestCreateWorkspaceTree* + TestClass1* + TestGate*)
  - 4 lookup tests + 1 new wrap-safety test → 5/5 PASS
  - Full handlers suite → green

Refs molecule-core#2872 (Optional findings — wrap-safety + ctx, godoc
clarifications for additive-only + skip-path-update-limitation)

Out of scope (deferred):
  - PR-D partial unique index migration + ON CONFLICT — sequenced
    after Phase 4 cleanup verified clean per #2872 plan
  - PR-E full createWorkspaceTree integration test for partial-match
    — needs heavier sqlmock scaffolding for downstream
    workspaces_audit/canvas_layouts/secrets/channels INSERTs;
    follow-up

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:20:54 -07:00
hongming-personal 9386f1d399 Merge pull request #2926 from Molecule-AI/fix/agent-comms-display-parity
fix(canvas): AgentCommsPanel display + initial-state parity with my-chat
2026-05-05 20:15:29 +00:00
hongming-personal a766e5ce48 Merge pull request #2925 from Molecule-AI/e2e/poll-mode-chat-upload-tests
test(e2e): poll-mode chat upload E2E in standard suite
2026-05-05 20:13:27 +00:00
hongming-personal 5ad2669f88 fix(canvas): AgentCommsPanel display + initial-state parity with my-chat
User-visible problem: agent-comms panel opens mid-conversation on long
histories (the same chat-opens-in-middle bug PR #2903 fixed for
my-chat) and silently renders empty state when the history fetch fails
(no retry button, no diagnostic).

Three changes mirror the my-chat patterns from ChatTab:

1. Initial-mount instant scroll.
   Adds hasInitialScrollRef + switches the scroll hook from useEffect
   to useLayoutEffect. First arrival of messages → scrollIntoView
   `instant`; subsequent appends → `smooth` as before. useLayoutEffect
   runs before paint so the user never sees the panel jump for one
   frame on every append.

2. Error UI with Retry button.
   Adds `loadError` state. The history-load .catch now sets the
   error message; a new branch in the render renders a red alert
   with the failure text and a Retry button that re-invokes
   `loadInitial`. Same shape as ChatTab MyChatPanel's `loadError`
   handling — both surfaces should fail loud, not silent.

3. Extracted `loadInitial` callback.
   The history-load body becomes a useCallback so the retry button
   has a stable reference to call. Mirrors ChatTab's loadInitial.

Tests (4 new in AgentCommsPanel.render.test.tsx):
- Loading state renders the loading copy.
- Error state with Retry button renders on rejection; clicking
  Retry fires a second api.get.
- Empty state renders when load succeeds with zero rows.
- scrollIntoView is called with behavior=instant on first message
  arrival (pins the chat-opens-in-middle prevention).

Verification:
- pnpm test → 1284/1284 pass (1280 prior + 4 new)
- tsc --noEmit → clean
- 92 → 93 test files, no existing test broken

Closes the parity gap raised in chat. The two surfaces now share:
loading copy / error UI / empty-state placeholder / scroll behaviour /
useLayoutEffect timing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:09:36 -07:00
Hongming Wang 0ca4e431c1 test(e2e): add poll-mode chat upload E2E and wire into e2e-api.yml
Covers the user-visible flow that Phase 1-5b shipped (RFC #2891):
register a poll-mode workspace, POST a multi-file /chat/uploads, verify
the activity feed shows one chat_upload_receive row per file, fetch the
bytes via /pending-uploads/:fid/content, ack each row, and confirm a
post-ack fetch returns 404. Also pins cross-workspace bleed protection
(workspace B's bearer on A's URL → 401, B's URL with A's file_id →
404) and the file_id-UUID-parse 400 path.

23 assertions, all green against a local platform (Postgres+Redis+
platform-server stack matches the e2e-api.yml CI recipe verbatim).

Why a new script instead of extending test_poll_mode_e2e.sh: that
script tests A2A short-circuit + since_id cursor semantics; this one
tests the chat-upload path. They share zero handler code on the
platform side and would dilute each other's failure messages if
combined.

Why not the bearerless-401 strict-mode assertion: the platform's
wsauth fail-opens for bearerless requests when MOLECULE_ENV=development
(see middleware/devmode.go). The CI workflow doesn't set that var, but
some local-dev .env files do — the assertion would flap by environment
without testing the poll-mode upload contract. The middleware's own
unit tests cover strict-mode 401.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:08:55 -07:00
molecule-ai[bot] 184ce7ae4e Merge pull request #2919 from Molecule-AI/staging
staging → main: auto-promote ae22a55
2026-05-05 13:08:45 -07:00
hongming-personal 2bf6a7005f Merge pull request #2923 from Molecule-AI/feat/class1-ast-gate-2867
test(handlers): generic Class 1 leak AST gate (#2867 PR-A)
2026-05-05 20:04:59 +00:00
hongming-personal 16ead69641 Merge pull request #2913 from Molecule-AI/fix-saas-logout-ui-missing
fix(canvas): wire SaaS Sign-out button — /cp/auth/signout was unreachable from UI
2026-05-05 20:03:53 +00:00
hongming-personal 60afcd43c9 test(handlers): generic Class 1 leak AST gate (#2867 PR-A)
Adds class1_ast_gate_test.go — a per-package AST walk that fails the
build if any handler function INSERTs INTO workspaces inside a range
loop body without one of three escape hatches:

  1. A call to a registered preflight helper (lookupExistingChild today;
     extend preflightCallNames as new helpers are introduced).
  2. An ON CONFLICT clause in the same SQL literal (idempotent UPSERT,
     like registry.go).
  3. An explicit `// class1-gate: idempotent-by-design` comment in the
     function body (deliberately awkward — forces a code-review beat).

Why this is broader than the existing
TestCreateWorkspaceTree_CallsLookupBeforeInsert gate in
org_import_idempotency_test.go: that one is hard-coded to one function
in one file. This one walks every non-test .go file in the handlers
package and applies a structural rule independent of file/function
names. A future handler written from scratch in a new file would not
have been covered before — now it is.

Detection mechanism (per AST):
  - Collect spans (Lbrace..Rbrace) of every RangeStmt body in each
    function. Position-based instead of stack-based — ast.Inspect's
    nil-callback ordering doesn't give per-node pop semantics, so a
    naive push/pop stack silently miscounts. Position spans are
    deterministic.
  - Walk every BasicLit, regex-match `^\s*INSERT INTO workspaces\(`
    (tightened from bytes.Index "INSERT INTO workspaces" so
    workspaces_audit literals don't false-positive — same regex used
    by the existing createWorkspaceTree gate).
  - For each match: record insertLine, hasONCONFLICT, and the
    innermost enclosing RangeStmt line (or 0 if not inside any range).
  - Fail the function if INSERT is inside a range AND no preflight
    AND no ON CONFLICT AND no allowlist annotation.

Self-tests (per `feedback_assert_exact_not_substring.md` —
verify gate fails on the bug shape before merging):
  - TestClass1_GateFiresOnSyntheticBuggySource: synthetic source
    where INSERT is inside `for _, child := range children` body
    must trigger the gate's three guards (enclosingRangeLine!=0,
    hasONCONFLICT=false, no preflight call).
  - TestClass1_GateAllowsONCONFLICT: synthetic INSERT...ON CONFLICT
    must NOT trigger the gate (idempotent UPSERT case).
  - TestClass1_GateAllowsAllowlistAnnotation: function with
    `// class1-gate: idempotent-by-design` must be skipped.
  - TestClass1_NoUnpreflightedInsertInsideRange: production sweep
    over every handler .go file. Currently passes because
    org_import.go preflights, registry.go ON-CONFLICTs, and
    workspace.go's Create has no INSERT inside a range body.

Verification:
  - go test ./internal/handlers/... -run TestClass1_ -count=1
    → 4/4 PASS
  - go test ./internal/handlers/... -count=1 → suite green
    (no pre-existing test broken by the new file)

Refs molecule-core#2867 (PR-A Class 1 generic AST gate)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:01:34 -07:00
hongming-personal ff75aeb43e Merge pull request #2922 from Molecule-AI/fix/memory-plugin-gate-sidecar-on-cutover
fix(memory-plugin): gate sidecar spawn on cutover-active
2026-05-05 19:44:01 +00:00
hongming-personal 81cf0cbf98 Merge pull request #2921 from Molecule-AI/fix/batch-fetcher-cancel-on-timeout
fix(inbox-uploads): cancel BatchFetcher futures on wait_all timeout
2026-05-05 12:41:48 -07:00
Hongming Wang 412dec0d87 fix(memory-plugin): gate sidecar spawn on cutover-active
PR #2906 spawned the sidecar unconditionally on every tenant boot. The
plugin's first migration runs \`CREATE EXTENSION vector\` which fails
on tenant Postgres without pgvector preinstalled — every staging
tenant redeploy aborted at the 30s health gate. CP fail-fast kept
running tenants on the prior image (no outage), but the new image
was DOA.

Caught on staging redeploy 2026-05-05 19:23 with
\`pq: extension "vector" is not available\`.

Fix: only spawn the sidecar when the operator has flipped the cutover
flag — \`MEMORY_V2_CUTOVER=true\` OR \`MEMORY_PLUGIN_URL\` is set.

  * Aligns the entrypoint to the same opt-in posture wiring.go already
    uses (it skips building the client when MEMORY_PLUGIN_URL is empty).
  * Until cutover, the sidecar isn't even running — no migration, no
    health gate, no boot-time pgvector dependency.
  * Operators activating cutover already redeploy with the new env
    vars set; that's when the sidecar starts. By definition they've
    verified pgvector is available before flipping.
  * MEMORY_PLUGIN_DISABLE=1 escape hatch preserved; harness fix #2915
    becomes belt-and-suspenders (still respected).

Both Dockerfile and entrypoint-tenant.sh updated. Behavior change for
existing deployments: zero (cutover env vars still unset → sidecar
still inert, but now also not running).

Refs RFC #2728. Hotfix for #2906; supersedes the migration-path
fragility class (the sidecar isn't doing migrations on tenants that
won't use it).
2026-05-05 12:39:03 -07:00
hongming-personal 9a53529047 ci: retrigger after stuck Canvas tabs E2E (was running 17min vs typical <1min on staging) 2026-05-05 12:38:09 -07:00
Hongming Wang 39931acd9c fix(inbox-uploads): cancel BatchFetcher futures on wait_all timeout
The deadline contract was incomplete: wait_all logged the timeout but
close() then called executor.shutdown(wait=True), which blocked on
the leaked workers — undoing the user-facing timeout. The inbox poll
loop would stall indefinitely on a hung /content fetch instead of
returning to chat-message processing.

Fix: wait_all now flips self._timed_out and cancels queued (not-yet-
started) futures; close() reads that flag and switches to
shutdown(wait=False, cancel_futures=True) on the timeout path.
Currently-running workers can't be interrupted by Python's threading
model, but they're now detached daemons whose blocking httpx call
no longer gates the next poll.

Healthy path (no timeout) keeps the existing drain-and-wait so a
still-queued ack POST isn't dropped mid-write.

Two new tests pin both legs of the contract end-to-end:
- close-after-timeout-doesn't-block: hung worker, wait_all(0.05s)
  fires the timeout, close() returns in <1s instead of waiting ~5s
  for the worker to come back.
- close-without-timeout-still-drains: 2 slow workers, wait_all
  completes cleanly, close() drains both ack POSTs.

Resolves the BatchFetcher timeout-cancellation finding from the
post-merge five-axis review of Phase 5b.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:34:41 -07:00
hongming 6f19b88fa7 Merge pull request #2920 from Molecule-AI/feat/structured-provision-logging-2867
feat(workspace-server): structured logging at provisioning boundaries (#2867 PR-D)
2026-05-05 19:34:05 +00:00
hongming-personal 83454e5efd feat(workspace-server): structured logging at provisioning boundaries
Adds internal/provlog with a single Event(name, fields) helper that
emits JSON-tagged single-line records to the standard logger. Five
boundary sites instrumented for #2867:

  provision.start         — workspace_dispatchers.go (sync + async)
  provision.skip_existing — org_import.go idempotency hit
  provision.ec2_started   — cp_provisioner.go after RunInstances
  provision.ec2_stopped   — cp_provisioner.go after TerminateInstances ack
  restart.pre_stop        — workspace_restart.go before Stop dispatch

These pair with the existing human-prose log.Printf lines (kept). The
new records are grep+jq friendly so a future log-aggregation pipeline
can reconstruct per-workspace provision timelines without parsing the
operator messages — this is the "and debug loggers so it dont happen
again" half of the leak-prevention work.

Tests:
  - provlog: emits evt-prefixed JSON, nil-tolerant, marshal-error
    fallback preserves event boundary, single-line output pinned.
  - handlers: provlog_emit_test.go pins three call-site contracts:
    provisionWorkspaceAutoSync emits provision.start with sync=true,
    stopForRestart emits restart.pre_stop with backend=cp on SaaS,
    and backend=none when both backends are nil.

Field taxonomy is convenience for ops, not contract — payload can grow
additively without breaking callers. Behavior gate is the event name +
boundary location, per feedback_behavior_based_ast_gates.md.

Refs #2867 (PR-D structured logging at provisioning boundaries)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:30:11 -07:00
hongming-personal 575f893f4e fix(canvas): consume CP logout_url to break the SSO re-auth loop
Follow-up to molecule-controlplane#485. The first half of #2913 wired
a Sign-out button + signOut() helper that POSTed /cp/auth/signout, but
clicking still left the user signed in: WorkOS's browser cookie
preserved the SSO session, /cp/auth/login auto-re-authed via SSO, and
the user landed back on /orgs.

CP PR #485 returns the AuthKit hosted logout URL in the signout
response. This change has signOut() navigate the browser there
instead of /cp/auth/login. AuthKit clears its cookie + redirects to
return_to (configured server-side from APP_URL) → next /cp/auth/login
hits a fresh AuthKit, no SSO session, login form actually shows.

Defensive parsing: malformed JSON, missing logout_url, or wrong-type
logout_url all fall through to the legacy /cp/auth/login fallback,
which works locally (DisabledProvider, dev) where there's no SSO to
escape.

Forward-compat: when CP doesn't have #485 deployed yet, signOut()
sees logout_url="" or missing → fallback fires. Order of merge
between this and #485 doesn't matter, but the bug isn't actually
fixed end-to-end until both ship.

Tests added (3 new, 15 total auth.test.ts):
- Hosted logout: navigates to logout_url when response includes one.
- DisabledProvider path: falls back to /cp/auth/login when "".
- Defensive: malformed JSON body → fallback (no crash).
- Defensive: non-string logout_url → fallback (no open redirect).

Verified:
- npx vitest run src/lib/__tests__/auth.test.ts — 15/15 pass
- tsc --noEmit clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:21:49 -07:00
hongming-personal 4cac4e7710 fix(canvas): wire SaaS Sign-out button — POST /cp/auth/signout was unreachable from the UI
Reported externally on 2026-05-05: "SaaS app logout does not work."

Root cause: the control plane has had POST /cp/auth/signout (clears the
WorkOS session cookie + revokes at the provider) since auth shipped,
but no canvas code ever called it. grep across canvas/ for
`logout|signOut|signout|sign-out` returned zero results — no helper,
no button, no menu entry. Users had no path to log out short of
clearing cookies in DevTools.

This is a UI gap, not a backend bug. Adding the missing pieces:

1. `signOut()` helper in `canvas/src/lib/auth.ts`:
   - POST /cp/auth/signout with credentials:include (cross-origin
     cookie required for tenant subdomain → app subdomain)
   - Best-effort: a 5xx, 401-stale-cookie, or network failure still
     redirects the browser to /cp/auth/login. Leaving the user on an
     authed-looking page after they clicked Sign out is the worst
     possible UX — that's the precise "logout doesn't work" symptom
     the report described.
   - Lands on /cp/auth/login (not the current URL) so the user
     doesn't loop back into the org they just left via AuthGate's
     return_to.

2. `AccountBar` component on /orgs page Shell — renders the signed-in
   email + Sign-out button at the top. Click → signOut() →
   `Signing out…` → bounces to login. Disabled-while-pending so a
   double-click can't fire two requests.

3. Tests in `auth.test.ts` (4 new, total 12 pass):
   - POSTs to the right endpoint with credentials:include
   - Redirects to /cp/auth/login after success
   - Redirects EVEN ON network failure (the critical UX invariant)
   - Redirects on 401 (stale cookie path)

The auth-origin resolution (`getAuthOrigin`) is reused so a tenant
subdomain (acme.moleculesai.app) correctly POSTs to
app.moleculesai.app/cp/auth/signout — same chain that fetchSession
+ redirectToLogin already use.

Test plan:
- [x] `npx vitest run src/lib/__tests__/auth.test.ts` — 12/12 green
- [x] `tsc --noEmit` — clean
- [ ] Manual: navigate to /orgs, click Sign out, observe redirect +
      that the next /orgs visit bounces to login (cookie cleared)
- [ ] CI green

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:20:18 -07:00
hongming-personal 8254bedf30 Merge pull request #2917 from Molecule-AI/chore/delete-team-collapse-2864
chore: delete TeamHandler.Collapse + docs cleanup (closes #2864)
2026-05-05 19:04:30 +00:00
hongming-personal ec72f199e6 Merge pull request #2916 from Molecule-AI/fix/memory-plugin-embed-migrations
fix(memory-plugin): embed migrations into binary via go:embed (hotfix #2906)
2026-05-05 19:04:01 +00:00
hongming-personal ae22a55675 Merge pull request #2909 from Molecule-AI/feat/poll-mode-chat-upload-phase5b
feat(poll-upload): phase 5b — concurrent BatchFetcher + httpx client reuse
2026-05-05 12:05:58 -07:00
molecule-ai[bot] 08648bf4b1 Merge pull request #2914 from Molecule-AI/staging
staging → main: auto-promote 5334d60
2026-05-05 12:03:30 -07:00
hongming-personal eec4ea2e7d chore: delete TeamHandler.Collapse + docs cleanup (closes #2864)
Multi-model retrospective review of #2856 (Phase 1 Expand removal)
flagged that TeamHandler.Collapse is unreachable from the canvas UI:
the "Collapse Team" button calls PATCH /workspaces/:id { collapsed }
(visual flag toggle on canvas_layouts), NOT POST /workspaces/:id/collapse.
The destructive POST route — which stops EC2s, marks children removed,
and deletes layouts — has zero UI callers (verified via grep across
canvas/, scripts/, and the MCP tool registry; only docs referenced it).

Two semantically different operations had been sharing the word
"Collapse":

- Visual collapse (canvas) → PATCH { collapsed: true }. Hides
  children visually. Reversible. UI-only.
- Destructive collapse (POST /collapse) → Stops + marks removed.
  Irreversible. No caller.

Deleting the destructive one + its supporting machinery:

- workspace-server/internal/handlers/team.go (entirely)
- workspace-server/internal/handlers/team_test.go (entirely)
- POST /collapse route + teamh init in router.go
- findTemplateDirByName helper (zero non-test callers after Expand
  was deleted in #2856; package-private so no out-of-package consumers)
- NewTeamHandler constructor (no callers after route removed)

Plus stale doc references (the most dangerous was the MCP wrapper
mapping in mcp-server-setup.md — anyone generating MCP tool wrappers
from that table was wiring a 404):

- docs/agent-runtime/team-expansion.md (deleted entirely — whole
  guide taught the deleted flow)
- docs/api-reference.md (dropped two team.go rows)
- docs/api-protocol/platform-api.md (dropped /expand + /collapse
  rows)
- docs/architecture/molecule-technical-doc.md (dropped /expand +
  /collapse rows)
- docs/guides/mcp-server-setup.md (dropped expand_team +
  collapse_team MCP wrapper mappings)
- docs/glossary.md (dropped "(org template expand_team)"
  parenthetical)
- docs/frontend/canvas.md (dropped broken link to deleted
  team-expansion.md)

Kept: docs/architecture/backends.md mention of "TeamHandler.Expand
(#2367) bypassed routing on Start" — correct historical context for
the AST gate's existence, no live route reference.

Visual-collapse path unaffected:
  canvas/src/components/ContextMenu.tsx:227 → api.patch — unchanged
  canvas/src/components/WorkspaceNode.tsx:128 → api.patch — unchanged

go vet ./... clean. go test ./internal/handlers/ -count 1 — all green
(4.3s, no regression).

Net: -388/+10 = ~378 lines removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:59:43 -07:00
Hongming Wang 6201d12533 fix(memory-plugin): embed migrations into binary via go:embed
PR #2906 shipped the binary at /memory-plugin without the migrations
directory. The plugin's runMigrations() resolved a relative path
\`cmd/memory-plugin-postgres/migrations\` that exists in the build
context but NOT in the runtime image. Every staging tenant boot
failed with:

  memory-plugin-postgres: migrate: read migrations dir
    "cmd/memory-plugin-postgres/migrations": open
    cmd/memory-plugin-postgres/migrations: no such file or directory

  memory-plugin:  /v1/health never returned 200 after 30s
    — aborting boot

Caught on the staging redeploy fleet job after #2906 merged. Tenants
stayed on the old image (CP redeploy correctly fail-fasted) but the
new image was broken.

Fix: \`//go:embed migrations/*.up.sql\` bundles the migrations into
the binary at build time. No filesystem path dependency at runtime.

  * \`embed.FS\` embeds the .up.sql files alongside the binary.
  * runMigrations() reads from migrationsFS by default;
    MEMORY_PLUGIN_MIGRATIONS_DIR override path preserved for operators
    shipping custom migrations.
  * Names sorted alphabetically — pinned by a test so a future
    \`002_*.up.sql\` is guaranteed to run after \`001_*.up.sql\`.

Tests:
  * TestMigrationsEmbedded_ContainsCreateTable — pins that the embed
    pattern matched files AND those files contain CREATE TABLE
    (catches both empty-pattern and wrong-files-embedded).
  * TestRunMigrationsFromEmbed_OrderingIsAlphabetic — pins sorted
    application order.

Verified locally: \`go build\` succeeds, binary 9.3MB,
\`strings\` shows the embedded SQL.

Refs RFC #2728. Hotfix for #2906.
2026-05-05 11:57:37 -07:00
Hongming Wang 81e83c05b7 fix(inbox): drop unused batch_fetcher = None after end-of-batch drain
Lint nit from review bot — _drain_uploads() runs and the function
immediately advances to the cursor save + return, so the local
re-assign is dead code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:56:54 -07:00
Hongming Wang 5b5eacbb29 test(inbox): clean up daemon poller thread to prevent test cross-talk
test_start_poller_thread_is_daemon spawned a daemon thread with no stop
mechanism — the leaked thread polled every 10ms with the test's patched
httpx.Client mock STILL ACTIVE for ~50ms after the test scope. Later
tests that re-patched httpx.Client + asserted call counts on
fetch_and_stage / Client construction got their assertions inflated by
the leaked thread's iterations.

Symptoms: test_poll_once_skips_chat_upload_row_from_queue saw
fetch_and_stage called twice instead of once on Python 3.11 CI;
test_batch_fetcher_owns_client_when_not_supplied saw two Client
constructions instead of one in the full local suite. Both surfaced
only after Phase 5b's BatchFetcher refactor changed the timing window
that allowed the leaked thread to fire mid-test.

Fix: extend start_poller_thread with an optional stop_event kwarg
(backward compatible — production callers pass None and rely on the
daemon flag for process-exit cleanup). The test now signals + joins
on stop_event before exiting scope, so the thread is gone before any
later test patches httpx.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:47:14 -07:00
hongming-personal c8fca1467e Merge pull request #2915 from Molecule-AI/fix/harness-disable-memory-plugin
fix(harness): disable memory-plugin sidecar in harness tenants
2026-05-05 18:45:43 +00:00
Hongming Wang 7c8b81c6eb fix(harness): disable memory-plugin sidecar in harness tenants
PR #2906 bundled memory-plugin-postgres as a startup-gated sidecar in
both tenant entrypoints. Plugin migrations include
\`CREATE EXTENSION IF NOT EXISTS vector\` which fails on the harness's
plain postgres:15-alpine (no pgvector preinstalled). The 30s health
gate then aborts container boot and Harness Replays fails.

Detected on auto-promote PR #2914 — Harness Replays job:
  Container harness-tenant-alpha-1  Error
  Container harness-tenant-beta-1   Error
  dependency failed to start: container harness-tenant-alpha-1 exited (1)

The harness doesn't exercise memory features, so the simplest fix is
to use the documented escape hatch the sidecar entrypoint already
ships (MEMORY_PLUGIN_DISABLE=1) — applied to both alpha and beta
tenants in compose.yml. Alternative would be switching the harness
postgres images to pgvector/pgvector:pg15, deferred until the
harness wants to verify memory paths.

Refs PR #2906. Unblocks #2914 (auto-promote staging→main).
2026-05-05 11:42:20 -07:00
hongming-personal fc1c45789e Merge pull request #2912 from Molecule-AI/feat/saas-default-hardening-2910
feat(saas): close 4th default-tier site + lift org_import asymmetry + tests (#2910)
2026-05-05 18:42:19 +00:00
hongming-personal e3a18ed8e8 Merge pull request #2911 from Molecule-AI/fix/memory-plugin-bind-loopback
fix(memory-plugin): bind to 127.0.0.1 by default
2026-05-05 18:38:35 +00:00
hongming-personal 9f551319d2 feat(saas): close 4th default-tier site + lift org_import asymmetry + tests (#2910)
Multi-model retrospective review of #2901 found three Critical gaps:

1. (#2910 PR-B) template_import.go:79 wrote `tier: 3` hardcoded into
   generated config.yaml. On SaaS this defeated the T4 default at the
   create-handler layer — a config-less template import landed at T3
   regardless of POST /workspaces' computed default. The 4th
   default-tier site #2901 missed.

2. (#2910 PR-A) #2901 claimed `go test ... all green` but added zero
   new tests. Existing structural-pin tests caught dispatch-layer
   drift but said nothing about tier-default drift. A future refactor
   that flips DefaultTier() to always return 3 would ship green.

3. (#2910 PR-E) org_import.go fallback returned T2 on self-hosted
   while workspace.go returned T3. Internally consistent ("bulk vs
   interactive defaults") but undocumented same-name-different-value
   drift.

Fix:

- TemplatesHandler.NewTemplatesHandler now takes `wh *WorkspaceHandler`
  (nil-tolerant for read-only callers). Import + ReplaceFiles compute
  tier via h.wh.DefaultTier() and pass it to generateDefaultConfig.
  generateDefaultConfig gets a `tier int` parameter (bounds-checked,
  invalid input falls back to T3).

- org_import.go fallback lifts to h.workspace.DefaultTier() — single
  source of truth shared with Create + Templates so a future
  tier-default change sweeps every entry point at once.

- New saas_default_tier_test.go pinning:
    TestIsSaaS_TrueWhenCPProvWired
    TestIsSaaS_FalseWhenOnlyDocker
    TestDefaultTier_SaaS_IsT4
    TestDefaultTier_SelfHosted_IsT3
    TestGenerateDefaultConfig_RespectsTierParam
    TestGenerateDefaultConfig_SelfHostedTierT3
    TestGenerateDefaultConfig_OutOfRangeFallsBackToT3

- Existing template_import_test.go tests + chat_files_test.go +
  security_regression_test.go updated to thread the new tier param /
  wh constructor arg through their NewTemplatesHandler calls. Their
  pre-#2910 assertion of `tier: 3` is preserved (now passes because
  the test caller passes `3` explicitly), so no regression.

go vet ./... clean. go test ./internal/handlers/ -count 1 — all
green (4.2s).

Deferred to separate follow-ups (per #2910 plan):
- PR-C: MOLECULE_DEPLOYMENT_MODE explicit deployment-mode signal
  (closes the IsSaaS()=cpProv!=nil structural fragility)
- PR-D: Host iptables IMDS block + IMDSv2 hop-limit (paired with
  molecule-controlplane EC2-IAM-scope audit)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:38:22 -07:00
Hongming Wang 1052f8bdb0 fix(memory-plugin): bind to 127.0.0.1 by default
Self-review of PR #2906 flagged: defaultListenAddr was ":9100" — binds
on every container interface. Inside today's deployment that's moot
(no host port mapping, platform talks over loopback) but it's not
least-privilege. A future Dockerfile edit that publishes the port,
a misconfigured Fly machine, or a future cross-host plugin topology
would expose an unauth'd memory store.

Loopback is the right baseline. Operators with a multi-host topology
already override via MEMORY_PLUGIN_LISTEN_ADDR — that path is unchanged.

Tests:
  * TestLoadConfig_DefaultListenAddrIsLoopback pins the new default.
  * TestLoadConfig_ListenAddrEnvOverride pins the override path so
    operators relying on it don't break.
  * TestLoadConfig_MissingDatabaseURL covers the existing fail-fast.

No prior unit tests existed for loadConfig — boot_e2e_test.go always
sets MEMORY_PLUGIN_LISTEN_ADDR explicitly, so the default was never
exercised by tests. This PR adds that coverage.

Refs RFC #2728. Hardening follow-up to PR #2906.
2026-05-05 11:35:24 -07:00
Hongming Wang 30fb507165 feat(poll-upload): phase 5b — concurrent BatchFetcher + httpx client reuse
Resolves the two remaining findings from the Phase 1-4 retrospective
review (the Python-side counterparts to phase 5a):

1. Important — inbox_uploads.fetch_and_stage blocked the inbox poll
   loop synchronously per row. A user dragging 4 files into chat at
   once would stall the poller for 4× per-fetch latency before the
   chat message reached the agent. Add BatchFetcher: a thread-pool
   wrapper (default 4 workers) that submits fetches concurrently and
   exposes wait_all() as the barrier the inbox loop calls before
   processing the chat-message row that references the uploads.

   The drain barrier is the correctness invariant: rewrite_request_body
   must observe a populated URI cache when it walks the chat-message
   row's parts. _poll_once now drains the BatchFetcher inline before
   the first non-upload row, AND at end-of-batch (case: batch contains
   only upload rows; the corresponding chat message arrives in a later
   poll, but the future-poll-races-current-fetch race is closed).

2. Nit — fetch_and_stage created two httpx.Client instances per row
   (one for GET /content, one for POST /ack). Refactor so a single
   client serves both calls. When called from BatchFetcher, the
   batch-shared client serves every row's GET + ack — so the second
   fetch reuses the TCP+TLS handshake from the first.

Comprehensive tests:

- 13 new inbox_uploads tests:
  - fetch_and_stage with supplied client: zero httpx.Client
    constructions, GET+POST through the same client, caller's client
    not closed (lifecycle owned by caller).
  - fetch_and_stage without supplied client: exactly one
    httpx.Client constructed (was 2 pre-fix), closed on the way out.
  - BatchFetcher: 3 rows × 120ms = parallel completion < 250ms
    (vs. ~360ms serial), URI cache hot when wait_all returns,
    per-row failure isolation, single-client reuse across all
    submits, idempotent close, submit-after-close raises,
    owned-vs-supplied client lifecycle, no-op wait_all on empty
    batch, graceful httpx-missing degradation.

- 3 new inbox tests:
  - poll_once drains uploads before processing the chat-message row
    (in-place mutation of row['request_body'] proves the URI was
    rewritten BEFORE message_from_activity returned).
  - poll_once with only upload rows still drains at end-of-batch.
  - poll_once with no upload rows never constructs a BatchFetcher
    (zero overhead on the no-upload happy path).

133 total inbox + inbox_uploads tests pass; 0 regressions.

Closes the chat-upload poll-mode-perf gap end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:26:55 -07:00
molecule-ai[bot] 77e9a965ac Merge pull request #2904 from Molecule-AI/staging
staging → main: auto-promote 6470e5f
2026-05-05 18:24:19 +00:00
hongming-personal 5334d60de4 Merge pull request #2898 from Molecule-AI/2867-workspaces-insert-allowlist
test(handlers): allowlist INSERT INTO workspaces sites (#2867 class 1)
2026-05-05 18:18:19 +00:00
hongming-personal d6c0227e3f Merge pull request #2906 from Molecule-AI/feat/memory-plugin-sidecar-bundle
feat(memory-v2): bundle memory-plugin-postgres as in-image sidecar
2026-05-05 18:16:57 +00:00
hongming-personal 27db090d3d Merge pull request #2907 from Molecule-AI/feat/poll-mode-chat-upload-phase5a
feat(poll-upload): phase 5a — atomic batch insert + acked-index + mime hardening
2026-05-05 11:16:56 -07:00
hongming-personal 0f25f6de97 test(handlers): allowlist INSERT INTO workspaces sites — close bulk-create regression class (#2867 class 1)
Adds TestINSERTworkspacesAllowlist: walks every non-test .go in this
package, finds funcs containing an `INSERT INTO workspaces (` SQL
literal, and pins the result against an explicit allowlist with the
safety mechanism named per entry.

New entries fail the build until a reviewer adds them — forcing the
question "what makes this INSERT idempotent?" at PR-review time, not
after the next bulk-create leak (the shape that produced 72 stale
child workspaces in tenant-hongming over 4 days).

Pairs with TestCreateWorkspaceTree_CallsLookupBeforeInsert (the
behavior pin for the one bulk path today). Together:
- this test catches "did a new function start inserting?"
- that test catches "did the existing bulk path drop its idempotency check?"

Both fire immediately when drift happens.

Current allowlist (3 entries):
- org_import.go:createWorkspaceTree → lookup-then-insert via
  lookupExistingChild (#2868 phase 3, also pinned by the sibling AST
  gate from #2895)
- registry.go:Register → ON CONFLICT (id) DO UPDATE (idempotent by
  primary key — external workspace upsert)
- workspace.go:Create → single-workspace POST /workspaces, server-
  generated UUID, no iteration

Verified via mutation: dropping a synthetic tempBulkLeakTest with an
unsafe loop+INSERT into the package fails the gate with a clear
diagnostic pointing at the file + function. Restoring the tree
returns the gate to green.

Memory: feedback_assert_exact_not_substring.md (verify tightened test
FAILS on bug shape) — mutation proof done locally.

RFC #2867 class 1. Class 2 (Prometheus gauge for ec2_instance
duplicates) + class 3 (structured logging on workspace create) are
follow-up PRs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:15:16 -07:00
Hongming Wang 9991057ad1 feat(poll-upload): phase 5a — atomic batch insert + acked-index + mime hardening
Resolves four of six findings from the retrospective code review of Phases
1–4 (poll-mode chat upload). Bundled because every change is in the
platform's pending_uploads layer or the multi-file handler that reads it.

Findings resolved:

1. Important — Sweep query lacked an index for the acked-retention OR-arm.
   The Phase 1 partial indexes are both `WHERE acked_at IS NULL`, so the
   `(acked_at IS NOT NULL AND acked_at < retention)` half of the WHERE
   clause seq-scanned the table on every cycle. Add a complementary
   partial index on `acked_at WHERE acked_at IS NOT NULL` so both arms
   of the disjunction are index-covered. Disjoint from the existing two
   indexes (no row matches both predicates), so write amplification is
   bounded to ~one index entry per terminal-state row.

2. Important — uploadPollMode partial-failure left orphans. The previous
   per-file Put loop committed rows 1..K-1 and then errored on row K with
   no compensation, so a client retry would double-insert the survivors.
   Refactor the handler into three explicit phases (pre-validate +
   read-into-memory, single atomic PutBatch, per-file activity row) and
   add Storage.PutBatch with all-or-nothing transaction semantics.

3. FYI — pendinguploads.StartSweeperWithInterval was exported only for
   tests. Move it to lower-case startSweeperWithInterval and expose the
   test seam through pendinguploads/export_test.go (Go convention; the
   shim file is stripped from the production binary at build time).

4. Nit — multipart Content-Type was passed verbatim into pending_uploads
   rows and re-served on /content. Add safeMimetype which strips
   parameters, rejects CR/LF/control bytes, and coerces malformed shapes
   to application/octet-stream. The eventual GET /content response can no
   longer be header-split via a crafted Content-Type on the multipart.

Comprehensive tests:

- 10 PutBatch unit tests (sqlmock): happy path, empty input, all four
  pre-validation rejection paths, BeginTx error, per-row error +
  Rollback (no Commit), first-row error, Commit error.
- 4 new PutBatch integration tests (real Postgres): all-rows-commit
  happy path with COUNT(*) verification, atomic-rollback no-leak via
  a NUL-byte filename that lib/pq rejects mid-batch, oversize
  short-circuit no-Tx, idx_pending_uploads_acked existence + partial
  predicate via pg_indexes (planner-shape-independent).
- 3 new chat_files_poll tests: atomic rollback on second-file oversize,
  atomic rollback on PutBatch error, mimetype CRLF/NUL/parameter
  sanitization (8 sub-cases).

The two remaining review findings (inbox_uploads.fetch_and_stage blocks
the poll loop synchronously; two httpx Clients per row) are Python-side
and ship in Phase 5b once this lands on staging.

Test-only export pattern via export_test.go, atomic pre-validation
discipline (validate before Tx), and behavior-based (not name-based)
test assertions follow the standing project conventions.
2026-05-05 11:10:13 -07:00
Hongming Wang b89a49ec93 feat(memory-v2): bundle memory-plugin-postgres as in-image sidecar
Closes the gap between the merged Memory v2 code (PR #2757 wired the
client into main.go) and operator activation. Without this PR an
operator wanting to flip MEMORY_V2_CUTOVER=true had to provision a
separate memory-plugin service and point MEMORY_PLUGIN_URL at it —
extra ops surface for what the design intends to be a built-in.

What ships:
  * Both Dockerfile + Dockerfile.tenant build the
    cmd/memory-plugin-postgres binary into /memory-plugin.
  * Entrypoints spawn the plugin in the background on :9100 BEFORE
    starting the main server; wait up to 30s for /v1/health to return
    200; abort boot loud if it doesn't (better to crash-loop than to
    silently route cutover traffic against a dead plugin).
  * Default env: MEMORY_PLUGIN_DATABASE_URL=$DATABASE_URL (share the
    existing tenant Postgres — plugin's `memory_namespaces` /
    `memory_records` tables coexist with platform schema, no
    conflicts), MEMORY_PLUGIN_LISTEN_ADDR=:9100.
  * MEMORY_PLUGIN_DISABLE=1 escape hatch for operators running the
    plugin externally on a separate host.
  * Platform image: plugin runs as the `platform` user (not root) via
    su-exec — matches the privilege boundary the main server already
    drops to. Tenant image already starts as `canvas` so the plugin
    inherits non-root automatically.

What stays operator-controlled:
  * MEMORY_V2_CUTOVER is NOT auto-set. Behavior change for existing
    deployments: zero. The wiring at workspace-server/internal/memory/
    wiring/wiring.go skips building the plugin client until the
    operator opts in, so the running sidecar is a no-op for traffic
    until then.
  * MEMORY_PLUGIN_URL is NOT auto-set either, for the same reason —
    setting it implies cutover-active intent. Operators set both on
    staging first, verify a live commit/recall round-trip (closes
    pending task #292), then promote to production.

Operator activation steps after this PR ships:
  1. Verify pgvector extension is available on the target Postgres
     (the plugin's first migration runs CREATE EXTENSION IF NOT
     EXISTS vector). Railway's managed Postgres ships pgvector
     available; some self-hosted operators may need to enable it.
  2. Redeploy the workspace-server with this image.
  3. Set MEMORY_PLUGIN_URL=http://localhost:9100 + MEMORY_V2_CUTOVER=true
     in the environment (staging first).
  4. Watch boot logs for "memory-plugin:  sidecar healthy" and the
     wiring.go cutover messages; do a live commit_memory + recall_memory
     round-trip via the canvas Memory tab to verify.
  5. Promote to production once staging holds for a sweep window.

Refs RFC #2728. Closes the dormant-plugin gap noted in task #294.
2026-05-05 11:10:11 -07:00
hongming-personal 3d0a7c381b fix(workspace): enrich poll-path inbox messages with peer_name/role/card_url
Reported: agents receiving messages via inbox_peek / wait_for_message
get a plain envelope — text + peer_id + kind only. The push-path
(a2a_mcp_server._build_channel_notification) already enriches the
meta dict with peer_name, peer_role, and agent_card_url from the
registry cache, but the poll-path returns InboxMessage.to_dict()
unchanged. So a Claude Code host with channel-push gets the friendly
identity, but every other MCP client (and Claude Code with push
disabled — the universal default) sees plain text.

This silently breaks the contract documented in
a2a_mcp_server.py:303-345:

> In both paths the same fields apply: kind, peer_id, peer_name,
> peer_role, agent_card_url, activity_id

Fix: a2a_tools._enrich_inbound_for_agent() — same shape as the
push-path's enrichment, called from tool_inbox_peek and
tool_wait_for_message. Cache-first non-blocking (5-min TTL via
enrich_peer_metadata_nonblocking, same helper push uses), so a cache
miss returns immediately with bare envelope and warms the cache for
the next poll. agent_card_url is constructable from peer_id alone
and surfaces even on cache miss, so the receiving agent always has
a single endpoint to hit for capabilities.

Degradation paths:
- canvas_user (peer_id="") → pass through unchanged, no enrichment
- a2a_client unavailable (test harness without registry) → bare
  envelope, agent still gets text + peer_id + kind + activity_id

Tests:
- canvas_user passes through unchanged
- peer_agent cache hit → name + role + agent_card_url all present
- peer_agent cache miss → agent_card_url still constructed
- a2a_client unavailable → bare envelope, no crash

All 4 pass against fixed code. Without the fix, the cache-hit and
cache-miss tests would fail (peer_name/peer_role/agent_card_url keys
absent from to_dict's output).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:08:14 -07:00
hongming-personal f5613bf099 Merge pull request #2902 from Molecule-AI/fix-pendinguploads-sweeper-test-race
test(pendinguploads): close cycleDone-vs-metric-record race in sweeper tests
2026-05-05 18:02:21 +00:00
hongming-personal 9bd2a2c45f Merge pull request #2903 from Molecule-AI/fix/chat-tab-initial-scroll-bottom
fix(canvas/chat): instant-scroll to bottom on first mount
2026-05-05 17:50:42 +00:00
hongming-personal a489ee1a7c fix(canvas/chat): instant-scroll to bottom on first mount
Reported: "right now when chat box opens it opens in the middle, but
it should be at the end of conversation."

Root cause: ChatTab.tsx:548 fires `bottomRef.scrollIntoView({ behavior:
"smooth" })` on every messages-update. On initial mount with N
messages already loaded, the smooth-scroll triggers a ~300ms animation
that any concurrent React re-render (agent push landing, theme
toggle, sidepanel resize) interrupts mid-flight, leaving the user
stuck somewhere in the middle of the conversation.

Fix: track first-mount via hasInitialScrollRef. Use behavior:"instant"
for the initial jump (deterministic, no animation interruption), then
smooth for subsequent appends (the new-message-landing visual stays).

Refs flipped on first messages.length > 0 transition, so:
- Initial open of chat tab: instant jump to bottom ✓
- New agent message arrives: smooth scroll into view ✓
- Workspace switch (ChatTab remounts): fresh hasInitialScrollRef, gets
  instant again ✓
- loadOlder prepend: anchor-restore path unchanged, still pins user's
  reading position ✓

Test plan:
- pnpm test --run ChatTab.lazyHistory.test.tsx → 8 pass (existing
  lazy-history tests untouched)
- npx tsc --noEmit clean
- Manual on hongming.moleculesai.app: open a busy chat (mac laptop,
  ~50 messages), confirm view lands at the latest bubble, not mid-
  scroll. Switch to another workspace + back → instant again.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 10:47:32 -07:00
hongming-personal c79ba05ed5 test(pendinguploads): close cycleDone-vs-metric-record race in sweeper tests
TestStartSweeper_RecordsMetricsOnError flaked on every CI rerun under
race detection: `error counter delta = 0, want 1`. Root cause is a
race between two goroutines, not a bug in the production sweeper.

The fake `fakeSweepStorage.Sweep` signals `cycleDone` from inside its
deferred return — that happens BEFORE Sweep's return value is
received by `sweepOnce`, which is what triggers the metric increment.
On slow CI hosts the test goroutine wins the read after `waitForCycle`
unblocks and BEFORE StartSweeper's goroutine has called
`metrics.PendingUploadsSweepError`, so the asserted delta is 0 even
though the metric WILL be 1 a few ms later.

Adds a polling assert helper, `waitForMetricDelta`, that closes the
race deterministically without timing-based sleeps:

- TestStartSweeper_RecordsMetricsOnError uses waitForMetricDelta to
  wait for the error counter to settle at 1.
- TestStartSweeper_RecordsMetricsOnSuccess uses it on the success
  counters (acked, expired) so the error-stayed-zero assertion
  reads after StartSweeper has fully processed the cycle.
- waitForCycle keeps its current shape but documents the caveat in
  its comment so future tests don't repeat the assumption.

Verified: `go test ./internal/pendinguploads/ -race -count 5` passes
all 9 tests across 5 iterations cleanly.

Per memory feedback_question_test_when_unexpected.md: the
"delta=0, want=1" failure looked like a real production bug at first
glance, but instrumented inspection showed the metric DOES increment,
just AFTER the test's read. The fix is the test's wait shape, not
the sweeper.

Unblocks every PR currently broken by this flake (#2898 hit it on
two consecutive CI runs; staging-merged PRs from earlier today
(#2877/#2881/#2885/#2886) introduced the test).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 10:46:17 -07:00
hongming-personal 6470e5f41b Merge pull request #2887 from Molecule-AI/refactor/a2a-tools-delegation-extract-rfc2873-iter4b
refactor(workspace): extract delegation handlers from a2a_tools.py (RFC #2873 iter 4b)
2026-05-05 17:40:40 +00:00
hongming-personal aa560c0314 Merge pull request #2901 from Molecule-AI/feat/saas-default-t4
feat(saas): default new workspaces to T4 on SaaS, T3 self-hosted
2026-05-05 17:34:08 +00:00
hongming-personal 7644e82f2f feat(saas): default new workspaces to T4 on SaaS, T3 self-hosted
User reported every SaaS workspace defaults to T2 (Standard). Three
sites quietly disagreed on the default:

- canvas CreateWorkspaceDialog (line 126): isSaaS ? 4 : 3   ← only correct one
- canvas EmptyState "Create blank":      tier: 2            ← hardcoded
- workspace.go POST /workspaces:         tier = 3           ← not SaaS-aware
- org_import.go createWorkspaceTree:     tier = 2 (fallback)← not SaaS-aware

So a user clicking "+ New Workspace" via the dialog got T4 on SaaS,
but a user clicking "Create blank" on the empty canvas got T2, and an
agent POSTing /workspaces directly got T3. Same tenant, three different
tiers depending on entry point.

Fix:

1. WorkspaceHandler.IsSaaS() and DefaultTier() helpers (workspace_dispatchers.go).
   IsSaaS() := h.cpProv != nil — single source of truth for "are we
   SaaS" across the file. DefaultTier() returns 4 on SaaS, 3 on
   self-hosted. SaaS rationale: each workspace runs on its own sibling
   EC2 so the per-workspace tier boundary is a Docker resource limit
   on the only container present — no neighbour to protect from. T4
   matches the boundary.

2. workspace.go now defaults tier via h.DefaultTier() instead of
   hardcoded T3.

3. org_import.go fallback (when neither ws.tier nor defaults.tier set)
   becomes SaaS-aware: T4 on SaaS, T2 on self-hosted (preserve the
   existing safe-shared-Docker-daemon default for self-hosted org
   imports).

4. canvas EmptyState "Create blank" stops sending tier:2 in the body
   and lets the backend pick — single source of truth in the backend.
   Eliminates the third disagreement.

Test plan:
- go vet ./... clean
- go test ./internal/handlers/ -count 1 — all green (4.3s)
- npx tsc --noEmit on canvas — clean
- Staging E2E (after deploy): create a fresh workspace via canvas
  empty-state on hongming.moleculesai.app, confirm tier=4 on the
  workspace details panel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 10:30:22 -07:00
Hongming Wang 8e5d193761 fix(tests): retarget get_peers_with_diagnostic patches to a2a_tools_messaging (RFC #2873 iter 4d)
Inherits the iter 4b test retarget commit through rebase. Adds the
remaining 4 patch sites in test_a2a_multi_workspace.py that target
get_peers_with_diagnostic — that call site moved from a2a_tools to
a2a_tools_messaging in this PR.

Refs RFC #2873 iter 4d.
2026-05-05 09:52:15 -07:00
Hongming Wang 3e0d2e650a refactor(workspace): extract messaging tools from a2a_tools.py to a2a_tools_messaging.py (RFC #2873 iter 4d)
Fourth slice of the a2a_tools.py split (stacked on iter 4c). Owns the
four human-and-peer messaging MCP tools + the chat-upload helper:

  * _upload_chat_files — stage local paths to /chat/uploads
  * tool_send_message_to_user — push canvas-chat via /notify
  * tool_list_peers — discover peers across registered workspaces
  * tool_get_workspace_info — JSON-encode workspace info
  * tool_chat_history — fetch prior conversation rows with a peer

a2a_tools.py shrinks from 508 → 213 LOC (−295). The remaining 213
is just report_activity + back-compat re-exports. Inbox tools
(tool_inbox_peek/pop/wait_for_message) deferred to iter 4e.

Layered architecture: messaging depends on a2a_tools_rbac (iter 4a),
a2a_client, platform_auth — NOT on kitchen-sink a2a_tools. An
import-contract test pins this so future refactors that add
`from a2a_tools import …` fail in CI.

Tests:
  * 28 patch sites in TestToolSendMessageToUser + TestToolListPeers +
    TestToolGetWorkspaceInfo + TestChatHistory retargeted from
    `a2a_tools.{httpx, get_peers_*, get_workspace_info,
    _upload_chat_files, _peer_*, list_registered_workspaces}` to
    `a2a_tools_messaging.…` because the call sites moved.
  * test_a2a_tools_messaging.py adds 7 new tests:
    - 5 alias drift gates
    - 2 import-contract tests (no top-level a2a_tools dep + a2a_tools
      surfaces every messaging symbol)

137 tests total in the a2a_tools suite, all green.

Refs RFC #2873.
2026-05-05 09:50:47 -07:00
Hongming Wang 210a26d31a refactor(workspace): extract memory tools from a2a_tools.py to a2a_tools_memory.py (RFC #2873 iter 4c)
Third slice of the a2a_tools.py split (stacked on iter 4b). Owns the
two persistent-memory MCP tools:

  * tool_commit_memory — write to /workspaces/:id/memories with RBAC
    + GLOBAL-scope tier-zero enforcement
  * tool_recall_memory — search /workspaces/:id/memories with RBAC

a2a_tools.py shrinks from 609 → 508 LOC (−101). Both handlers depend
ONLY on a2a_tools_rbac (iter 4a), a2a_client, and the platform's
/memories endpoint — no entanglement with delegation or messaging.

Side-effects of the layered architecture: a2a_tools_memory's import
contract is "depends on a2a_tools_rbac, never on a2a_tools" — the
kitchen-sink module is for back-compat re-exports only. A test pins
this so a future refactor that re-introduces `from a2a_tools import …`
fails in CI.

Tests:
  * 49 patch sites in TestToolCommitMemory + TestToolRecallMemory
    retargeted from `a2a_tools.{_check_memory_*, _is_root_workspace,
    httpx.AsyncClient}` to `a2a_tools_memory.…` because the call sites
    moved.
  * test_a2a_tools_memory.py adds 4 new tests (alias drift gate +
    import-contract + a2a_tools-side re-export).

117 tests total (77 impl + 28 rbac + 8 delegation + 4 memory), all green.

Refs RFC #2873.
2026-05-05 09:50:39 -07:00
Hongming Wang be18b9c8f9 fix(tests): retarget remaining a2a_tools delegation patches to a2a_tools_delegation
CI caught two test files I missed in the original iter 4b retarget:
test_a2a_multi_workspace.py + test_delegation_sync_via_polling.py
patch a2a_tools.{discover_peer, send_a2a_message, _delegate_sync_via_polling,
httpx.AsyncClient} but those call sites moved to a2a_tools_delegation
in this PR. 17 patch sites retargeted; 30 tests now green.

Refs RFC #2873 iter 4b.
2026-05-05 09:50:30 -07:00
Hongming Wang 2227a14b1e fix(build): add a2a_tools_delegation to TOP_LEVEL_MODULES drift gate
Iter 4b's new module needs the rewrite-list entry. Stacked on iter 4a
which already added a2a_tools_rbac.

Refs RFC #2873 iter 4b.
2026-05-05 05:01:04 -07:00
Hongming Wang e72f9ad107 refactor(workspace): extract delegation handlers from a2a_tools.py to a2a_tools_delegation.py (RFC #2873 iter 4b)
Second slice of the a2a_tools.py split (stacked on iter 4a). Owns the
three delegation MCP tools + the RFC #2829 PR-5 sync-via-polling
helper they share:

  * tool_delegate_task — synchronous delegation
  * tool_delegate_task_async — fire-and-forget
  * tool_check_task_status — poll the platform's /delegations log
  * _delegate_sync_via_polling — durable async + poll for terminal status
  * _SYNC_POLL_INTERVAL_S / _SYNC_POLL_BUDGET_S constants

a2a_tools.py shrinks from 915 → 609 LOC (−306). Stacked on iter 4a's
RBAC extraction; uses `from a2a_tools_rbac import auth_headers_for_heartbeat`
as its auth-header source.

The lazy `from a2a_tools import report_activity` inside tool_delegate_task
breaks the circular-import cycle (a2a_tools imports the delegation
re-exports at module-load; delegation handler needs report_activity at
CALL time). A dedicated test pins this contract.

Tests:
  * 77 existing test_a2a_tools_impl.py tests pass after retargeting
    20 patch sites in TestToolDelegateTask + TestToolDelegateTaskAsync +
    TestToolCheckTaskStatus from `a2a_tools.foo` to
    `a2a_tools_delegation.foo` (foo ∈ {discover_peer, send_a2a_message,
    httpx.AsyncClient}). The patches need to target the new module
    because that's where the call sites live now.
  * test_a2a_tools_delegation.py adds 8 new tests:
    - 6 alias drift gates (`a2a_tools.tool_delegate_task is …`)
    - 2 import-contract tests (no top-level circular dep + a2a_tools
      surfaces every delegation symbol)
    - 1 sync-poll budget invariant

113 tests total (77 impl + 28 rbac + 8 delegation), all green.

Refs RFC #2873.
2026-05-05 05:00:52 -07:00
175 changed files with 19851 additions and 3782 deletions
@@ -0,0 +1,83 @@
name: auto-promote-stale-alarm
# Hourly cron + on-demand alarm for the silent-block failure mode that
# motivated issue #2975:
# - The auto-promote-staging.yml workflow opened a PR + armed
# auto-merge, but main's branch protection requires a human review
# (reviewDecision=REVIEW_REQUIRED). The PR sat BLOCKED with no
# surface-up-the-stack for 12+ hours, holding 25 commits hostage
# including the Memory v2 redesign and a reno-stars data-loss fix.
#
# This workflow runs `scripts/check-stale-promote-pr.sh` against the
# repo's open auto-promote PRs (base=main head=staging). When a PR has
# been BLOCKED on REVIEW_REQUIRED for >4h, it:
# 1. Emits a workflow-level warning (visible in run summary + the
# Actions UI feed).
# 2. Posts a comment on the PR (idempotent — one alarm per PR).
#
# The detection logic lives in scripts/check-stale-promote-pr.sh so
# it's unit-testable with stubbed `gh` (see test-check-stale-promote-pr.sh).
# This file is the schedule + invocation surface only — SSOT for the
# detector itself.
on:
schedule:
# Hourly. Cheap (one `gh pr list` + jq), and 1h granularity is
# plenty for a 4h staleness threshold — operators see the alarm
# within at most 1h of crossing the threshold.
- cron: "27 * * * *" # at :27 to dodge the cron herd at :00
workflow_dispatch:
inputs:
stale_hours:
description: "Hours after which a BLOCKED+REVIEW_REQUIRED PR is stale (default 4)"
required: false
default: "4"
post_comment:
description: "Post a comment on stale PRs (default true)"
required: false
default: "true"
permissions:
contents: read
pull-requests: write # post comments on stale PRs
# Serialize so the on-demand and scheduled runs don't double-comment
# the same PR. cancel-in-progress=false because the script is idempotent
# (existing comment marker prevents dupes), but a scheduled run firing
# while a manual one runs would just re-list the same PR set.
concurrency:
group: auto-promote-stale-alarm
cancel-in-progress: false
jobs:
scan:
runs-on: ubuntu-latest
steps:
- name: Checkout (need scripts/ only)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
sparse-checkout: |
scripts/check-stale-promote-pr.sh
sparse-checkout-cone-mode: false
- name: Run stale-PR detector
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_REPOSITORY: ${{ github.repository }}
STALE_HOURS: ${{ inputs.stale_hours || '4' }}
POST_COMMENT: ${{ inputs.post_comment || 'true' }}
run: |
# The script's exit code reflects the count of stale PRs.
# We don't want a stale finding to fail the workflow run —
# the warning + comment are the signal, the green/red is
# noise. So convert any non-zero exit to a workflow notice
# and exit 0.
set +e
bash scripts/check-stale-promote-pr.sh
rc=$?
set -e
if [ "$rc" -ne 0 ]; then
echo "::notice::Stale PR detector found $rc PR(s) needing attention. See warnings above + comments on the PRs."
fi
# Always succeed — operator-facing surface is the warning,
# not the workflow status.
exit 0
+1
View File
@@ -387,6 +387,7 @@ jobs:
"a2a_mcp_server.py"
"mcp_cli.py"
"a2a_tools.py"
"a2a_tools_inbox.py"
"inbox.py"
"platform_auth.py"
)
+3
View File
@@ -172,6 +172,9 @@ jobs:
- name: Run poll-mode + since_id cursor E2E (#2339)
if: needs.detect-changes.outputs.api == 'true'
run: bash tests/e2e/test_poll_mode_e2e.sh
- name: Run poll-mode chat upload E2E (RFC #2891)
if: needs.detect-changes.outputs.api == 'true'
run: bash tests/e2e/test_poll_mode_chat_upload_e2e.sh
- name: Dump platform log on failure
if: failure() && needs.detect-changes.outputs.api == 'true'
run: cat workspace-server/platform.log || true
@@ -121,8 +121,16 @@ jobs:
# Per-migration result is logged so a failed migration that
# SHOULD have been replayable surfaces in the CI log instead
# of silently failing.
# Apply both *.sql (legacy, lives next to its module) and
# *.up.sql (newer up/down convention) in a single
# lexicographically-sorted pass. Excluding *.down.sql so the
# newest-naming-convention pairs don't undo themselves mid-run.
# Pre-#149-followup this loop only globbed *.up.sql, which
# silently skipped 001_workspaces.sql + 009_activity_logs.sql
# — fine while no integration test depended on those tables,
# not fine once a cross-table atomicity test came in.
set +e
for migration in migrations/*.up.sql; do
for migration in $(ls migrations/*.sql 2>/dev/null | grep -v '\.down\.sql$' | sort); do
if psql -h localhost -U postgres -d molecule -v ON_ERROR_STOP=1 \
-f "$migration" >/dev/null 2>&1; then
echo "✓ $(basename "$migration")"
@@ -132,16 +140,19 @@ jobs:
done
set -e
# Sanity: the delegations table MUST exist for the integration
# tests to be meaningful. Hard-fail if 049 didn't land — that
# would be a real regression we want loud.
if ! psql -h localhost -U postgres -d molecule -tA \
-c "SELECT 1 FROM information_schema.tables WHERE table_name = 'delegations'" \
| grep -q 1; then
echo "::error::delegations table missing after migration replay — handler integration tests would be meaningless"
exit 1
fi
echo "✓ delegations table present"
# Sanity: the delegations + workspaces + activity_logs tables
# MUST exist for the integration tests to be meaningful. Hard-
# fail if any didn't land — that would be a real regression we
# want loud.
for tbl in delegations workspaces activity_logs pending_uploads; do
if ! psql -h localhost -U postgres -d molecule -tA \
-c "SELECT 1 FROM information_schema.tables WHERE table_name = '$tbl'" \
| grep -q 1; then
echo "::error::$tbl table missing after migration replay — handler integration tests would be meaningless"
exit 1
fi
echo "✓ $tbl table present"
done
- if: needs.detect-changes.outputs.handlers == 'true'
name: Run integration tests
+49 -1
View File
@@ -108,6 +108,14 @@ jobs:
python3 > stale_slugs.txt <<'PY'
import json, os
from datetime import datetime, timezone, timedelta
# SSOT for this list lives in the controlplane Go code:
# molecule-controlplane/internal/slugs/ephemeral.go
# (var EphemeralPrefixes). The redeploy-fleet auto-rollout
# also reads from there to SKIP these slugs — without that
# filter, fleet redeploy SSM-failed in-flight E2E tenants
# whose containers were still booting, breaking the test
# that just spun them up (molecule-controlplane#493).
# Update both files together.
EPHEMERAL_PREFIXES = ("e2e-", "rt-e2e-")
with open("orgs.json") as f:
data = json.load(f)
@@ -185,7 +193,47 @@ jobs:
# sweeper is best-effort. Next hourly tick re-attempts. We
# only fail loud at the safety-cap gate above.
- name: Sweep orphan tunnels
# Stale-org cleanup deletes the org (which cascades to tunnel
# delete inside the CP). But when that cascade fails partway —
# CP transient 5xx after the org row is deleted but before the
# CF tunnel delete completes — the tunnel persists with no
# matching org row. The reconciler in internal/sweep flags this
# as `cf_tunnel kind=orphan`, but nothing automatically reaps it.
#
# `/cp/admin/orphan-tunnels/cleanup` is the operator-triggered
# reaper. Calling it here at the end of every sweep tick
# converges the staging CF account to clean even when CP
# cascades half-fail.
#
# PR #492 made the underlying DeleteTunnel actually check
# status — pre-fix it silent-succeeded on CF code 1022
# ("active connections"), so this step would have been a no-op
# against stuck connectors. Post-fix the cleanup invokes
# CleanupTunnelConnections + retry, which actually clears the
# 1022 case. (#2987)
#
# Best-effort. Failure here doesn't fail the workflow — next
# tick re-attempts. Errors flow to step output for ops review.
if: env.DRY_RUN != 'true'
run: |
set +e
curl -sS -o /tmp/cleanup_resp -w "%{http_code}" \
--max-time 60 \
-X POST "$MOLECULE_CP_URL/cp/admin/orphan-tunnels/cleanup" \
-H "Authorization: Bearer $ADMIN_TOKEN" >/tmp/cleanup_code
set -e
http_code=$(cat /tmp/cleanup_code 2>/dev/null || echo "000")
body=$(cat /tmp/cleanup_resp 2>/dev/null | head -c 500)
if [ "$http_code" = "200" ]; then
count=$(echo "$body" | python3 -c "import sys,json; d=json.loads(sys.stdin.read() or '{}'); print(d.get('deleted_count', 0))" 2>/dev/null || echo "0")
failed_n=$(echo "$body" | python3 -c "import sys,json; d=json.loads(sys.stdin.read() or '{}'); print(len(d.get('failed') or {}))" 2>/dev/null || echo "0")
echo "Orphan-tunnel sweep: deleted=$count failed=$failed_n"
else
echo "::warning::orphan-tunnels cleanup returned HTTP $http_code — body: $body"
fi
- name: Dry-run summary
if: env.DRY_RUN == 'true'
run: |
echo "DRY RUN — would have deleted ${{ steps.identify.outputs.count }} org(s). Re-run with dry_run=false to actually delete."
echo "DRY RUN — would have deleted ${{ steps.identify.outputs.count }} org(s) AND triggered orphan-tunnels cleanup. Re-run with dry_run=false to actually delete."
+47 -3
View File
@@ -18,7 +18,7 @@
// quick bounce between signup and either Checkout or the tenant UI.
import { useEffect, useState } from "react";
import { fetchSession, redirectToLogin, type Session } from "@/lib/auth";
import { fetchSession, redirectToLogin, signOut, type Session } from "@/lib/auth";
import { PLATFORM_URL } from "@/lib/api";
import { formatCredits, pillTone, bannerKind } from "@/lib/credits";
import { TermsGate } from "@/components/TermsGate";
@@ -129,7 +129,7 @@ export default function OrgsPage() {
return <EmptyState banner={justCheckedOut ? <CheckoutBanner /> : null} />;
}
return (
<Shell>
<Shell session={session}>
{justCheckedOut && <CheckoutBanner />}
<ul className="space-y-3">
{orgs.map((o) => (
@@ -160,11 +160,21 @@ function CheckoutBanner() {
);
}
function Shell({ children }: { children: React.ReactNode }) {
function Shell({
children,
session,
}: {
children: React.ReactNode;
// Optional: when present, the header renders the signed-in email +
// a Sign-out button. The empty-state Shell call doesn't have a
// session in scope, so accept null and skip the header chrome there.
session?: Session | null;
}) {
return (
<main className="min-h-screen bg-surface text-ink">
<TermsGate>
<div className="mx-auto max-w-2xl px-6 pt-20 pb-12">
{session ? <AccountBar session={session} /> : null}
<h1 className="text-3xl font-bold text-ink">Your organizations</h1>
<p className="mt-2 text-ink-mid">
Each org is an isolated Molecule workspace.
@@ -177,6 +187,40 @@ function Shell({ children }: { children: React.ReactNode }) {
);
}
// AccountBar renders the signed-in email + a Sign-out button at the
// top of the page. Without this the user has no way to log out — the
// /cp/auth/signout endpoint exists on the control plane but no UI ever
// called it. Reported externally on 2026-05-05; this is the fix.
//
// Click → calls signOut() which POSTs /cp/auth/signout (clears the
// WorkOS session cookie + revokes at the provider) then bounces to
// /cp/auth/login. The signOut helper is best-effort — even on a 5xx
// or network failure the redirect fires so the user never gets stuck
// on an authed-looking page after they clicked Sign out.
function AccountBar({ session }: { session: Session }) {
const [signingOut, setSigningOut] = useState(false);
return (
<div className="mb-6 flex items-center justify-between text-sm text-ink-mid">
<span title="Signed-in user">{session.email}</span>
<button
type="button"
disabled={signingOut}
onClick={async () => {
setSigningOut(true);
await signOut();
// Redirect happens inside signOut; this line is for tests +
// edge cases (jsdom, blocked navigation) where it doesn't.
setSigningOut(false);
}}
className="rounded border border-line bg-surface-card px-3 py-1 text-xs text-ink hover:bg-surface-card disabled:opacity-50"
aria-label="Sign out"
>
{signingOut ? "Signing out…" : "Sign out"}
</button>
</div>
);
}
// DataResidencyNotice surfaces where workspace data lives so EU-based
// signups can make an informed choice (GDPR Art. 13 disclosure
// requirement). Plain text, no icon — the goal is clarity, not
+9 -4
View File
@@ -48,16 +48,21 @@ export function EmptyState() {
});
// "Create blank" bypasses templates entirely — no preflight, no
// modal, just POST /workspaces with a default name and tier.
// Deliberately NOT routed through useTemplateDeploy because it
// has no `template.id` to deploy against.
// modal, just POST /workspaces with a default name. Deliberately
// NOT routed through useTemplateDeploy because it has no
// `template.id` to deploy against.
//
// tier is omitted so the backend picks a SaaS-aware default
// (T4 on SaaS, T3 on self-hosted — see WorkspaceHandler.DefaultTier).
// The previous hardcoded `tier: 2` shipped every fresh-tenant agent
// at Standard regardless of host, which surprised SaaS users whose
// CreateWorkspaceDialog already defaults to T4.
const createBlank = async () => {
setBlankCreating(true);
setBlankError(null);
try {
const ws = await api.post<{ id: string }>("/workspaces", {
name: "My First Agent",
tier: 2,
canvas: firstDeployCoords(),
});
handleDeployed(ws.id);
@@ -20,160 +20,6 @@ import * as Dialog from "@radix-ui/react-dialog";
type Tab = "python" | "curl" | "claude" | "mcp" | "hermes" | "codex" | "openclaw" | "fields";
// Per-tab help metadata: docs link, where-to-install link, common errors.
// All URLs verified against repo content (docs/guides/* file paths map to
// docs.molecule.ai/docs/guides/*; canonical hostname confirmed by existing
// blog post canonical metadata) or against the snippet text the operator
// just copied. Never linking to a URL that wasn't already in product —
// dead links here defeat the purpose of "more comprehensive instructions."
const TAB_HELP: Record<
Tab,
{
docsUrl?: string;
docsLabel?: string;
downloadUrl?: string;
downloadLabel?: string;
commonIssues?: { symptom: string; check: string }[];
}
> = {
mcp: {
docsUrl: "https://docs.molecule.ai/docs/guides/mcp-server-setup",
docsLabel: "MCP server setup guide",
downloadUrl: "https://pypi.org/project/molecule-ai-workspace-runtime/",
downloadLabel: "molecule-ai-workspace-runtime on PyPI",
commonIssues: [
{
symptom: "Tools not appearing in your agent",
check:
"Run `claude mcp list` (or your runtime's equivalent) — the molecule entry should be listed. If missing, re-run the `claude mcp add` line.",
},
{
symptom: "ConnectionRefused / DNS error on first call",
check:
"PLATFORM_URL must include the scheme (https://) and have no trailing slash. Verify with `curl $PLATFORM_URL/healthz`.",
},
],
},
python: {
docsUrl:
"https://docs.molecule.ai/docs/guides/external-agent-registration",
docsLabel: "External agent registration guide",
downloadUrl: "https://pypi.org/project/molecule-ai-workspace-runtime/",
downloadLabel: "molecule-ai-workspace-runtime on PyPI",
commonIssues: [
{
symptom: "401 from /heartbeat",
check:
"AUTH_TOKEN expired or wrong workspace_id. Tokens are shown only once at create time — re-create the workspace to get a fresh token.",
},
{
symptom: "AGENT_URL not reachable from platform",
check:
"Public HTTPS URL required for inbound A2A. Use ngrok or Cloudflare Tunnel if your agent is behind NAT.",
},
],
},
claude: {
docsUrl:
"https://docs.molecule.ai/docs/guides/external-agent-registration",
docsLabel: "External agent registration guide",
downloadUrl: "https://claude.com/claude-code",
downloadLabel: "Claude Code (claude.com)",
commonIssues: [
{
symptom: "plugin not installed",
check:
"Run `/plugin marketplace add Molecule-AI/molecule-mcp-claude-channel` then `/plugin install molecule@molecule-mcp-claude-channel` inside Claude Code, then `/reload-plugins`.",
},
{
symptom: "not on the approved channels allowlist",
check:
"Custom channels need `--dangerously-load-development-channels` on the launch command. Team/Enterprise orgs need admin to set `channelsEnabled` + `allowedChannelPlugins` in claude.ai admin settings.",
},
{
symptom: "Inbound messages not arriving",
check:
"Check stderr for `molecule channel: connected — watching N workspace(s)`. Verify ~/.claude/channels/molecule/.env has the right PLATFORM_URL + token.",
},
],
},
hermes: {
docsUrl:
"https://docs.molecule.ai/docs/guides/external-agent-registration",
docsLabel: "External agent registration guide",
downloadUrl: "https://github.com/NousResearch/hermes-agent",
downloadLabel: "hermes-agent (NousResearch)",
commonIssues: [
{
symptom: "Gateway start failure",
check:
"Tail ~/.hermes/gateway.log. YAML duplicate-key in config.yaml is the most common cause — `gateway:` block must appear exactly once.",
},
{
symptom: "Plugin not discovered after install",
check:
"Run `pip show hermes-channel-molecule` to confirm install. Some hermes builds need `hermes plugin reload` before the new platform_plugins entry takes effect.",
},
],
},
codex: {
docsUrl: "https://docs.molecule.ai/docs/guides/mcp-server-setup",
docsLabel: "MCP server setup guide",
downloadUrl: "https://github.com/openai/codex",
downloadLabel: "openai/codex",
commonIssues: [
{
symptom: "[mcp_servers.molecule] not loaded",
check:
"Codex must be ≥ 0.57. Check with `codex --version`; upgrade via `npm install -g @openai/codex@latest`.",
},
{
symptom: "TOML parse error after re-running setup",
check:
"TOML rejects duplicate `[mcp_servers.molecule]` tables. Open ~/.codex/config.toml and remove the old block before pasting the new one.",
},
{
symptom: "Canvas messages don't wake codex",
check:
"Step 3 (codex-channel-molecule bridge daemon) is required for inbound push. Check `pgrep -f codex-channel-molecule` and `tail ~/.codex-channel-molecule/daemon.log`.",
},
],
},
openclaw: {
docsUrl: "https://docs.molecule.ai/docs/guides/mcp-server-setup",
docsLabel: "MCP server setup guide",
commonIssues: [
{
symptom: "Gateway not starting",
check:
"Tail ~/.openclaw/gateway.log. The loopback bind requires :18789 to be free — check with `lsof -iTCP:18789`.",
},
{
symptom: "openclaw mcp set rejected",
check:
"The heredoc generates JSON; verify it parsed by running `jq < ~/.openclaw/mcp/molecule.json`. Re-run `openclaw mcp set` if the file is malformed.",
},
],
},
curl: {
docsUrl:
"https://docs.molecule.ai/docs/guides/external-agent-registration",
docsLabel: "External agent registration guide",
commonIssues: [
{
symptom: "401 / 403 on register",
check:
"WORKSPACE_AUTH_TOKEN must be the value shown at workspace create. Tokens are shown only once.",
},
],
},
fields: {
docsUrl:
"https://docs.molecule.ai/docs/guides/external-agent-registration",
docsLabel: "External agent registration guide",
},
};
export interface ExternalConnectionInfo {
workspace_id: string;
platform_url: string;
@@ -457,7 +303,6 @@ export function ExternalConnectModal({ info, onClose }: Props) {
<Field label="heartbeat_endpoint" value={info.heartbeat_endpoint} onCopy={() => copy(info.heartbeat_endpoint, "hb")} copied={copiedKey === "hb"} />
</div>
)}
<HelpBlock help={TAB_HELP[tab]} />
</div>
<div className="mt-5 flex justify-end gap-2">
@@ -506,70 +351,6 @@ function SnippetBlock({
);
}
// HelpBlock — collapsible "Need help?" section under each tab's snippet.
// Renders only the keys present in the per-tab help metadata (no empty
// sections). Closed by default so the snippet stays the visual focus;
// operators with a working setup never see this. Uses native <details>
// for keyboard accessibility (Tab + Enter) without extra ARIA wiring.
function HelpBlock({
help,
}: {
help: (typeof TAB_HELP)[Tab] | undefined;
}) {
if (!help) return null;
const { docsUrl, docsLabel, downloadUrl, downloadLabel, commonIssues } = help;
if (!docsUrl && !downloadUrl && !commonIssues?.length) return null;
return (
<details className="mt-3 border border-line rounded-lg bg-surface text-xs">
<summary className="cursor-pointer select-none px-3 py-2 text-ink-mid hover:text-ink">
Need help? install link, docs, common errors
</summary>
<div className="px-3 pb-3 pt-1 space-y-2">
{downloadUrl && (
<div>
<span className="text-ink-soft">Where to install: </span>
<a
href={downloadUrl}
target="_blank"
rel="noopener noreferrer"
className="text-accent underline hover:text-accent-strong"
>
{downloadLabel || downloadUrl}
</a>
</div>
)}
{docsUrl && (
<div>
<span className="text-ink-soft">Documentation: </span>
<a
href={docsUrl}
target="_blank"
rel="noopener noreferrer"
className="text-accent underline hover:text-accent-strong"
>
{docsLabel || docsUrl}
</a>
</div>
)}
{commonIssues && commonIssues.length > 0 && (
<div>
<div className="text-ink-soft mb-1">Common errors:</div>
<ul className="space-y-1.5 pl-3">
{commonIssues.map((issue, i) => (
<li key={i}>
<code className="text-warm font-mono">{issue.symptom}</code>
<span className="text-ink-mid"> {issue.check}</span>
</li>
))}
</ul>
</div>
)}
</div>
</details>
);
}
function Field({
label,
value,
@@ -1,261 +0,0 @@
'use client';
import { useEffect, useRef, useState } from "react";
import { createPortal } from "react-dom";
import { api } from "@/lib/api";
import type { MemoryEntry } from "@/components/MemoryInspectorPanel";
type Scope = "LOCAL" | "TEAM" | "GLOBAL";
const SCOPES: Scope[] = ["LOCAL", "TEAM", "GLOBAL"];
interface AddProps {
open: boolean;
mode: "add";
workspaceId: string;
defaultScope: Scope;
defaultNamespace?: string;
entry?: undefined;
onClose: () => void;
onSaved: () => void;
}
interface EditProps {
open: boolean;
mode: "edit";
workspaceId: string;
entry: MemoryEntry;
defaultScope?: undefined;
defaultNamespace?: undefined;
onClose: () => void;
onSaved: () => void;
}
type Props = AddProps | EditProps;
export function MemoryEditorDialog(props: Props) {
const { open, mode, workspaceId, onClose, onSaved } = props;
const dialogRef = useRef<HTMLDivElement>(null);
const [mounted, setMounted] = useState(false);
const [scope, setScope] = useState<Scope>("LOCAL");
const [namespace, setNamespace] = useState("general");
const [content, setContent] = useState("");
const [saving, setSaving] = useState(false);
const [error, setError] = useState<string | null>(null);
useEffect(() => {
setMounted(true);
}, []);
// Reset form whenever the dialog opens.
useEffect(() => {
if (!open) return;
setError(null);
setSaving(false);
if (mode === "edit" && props.entry) {
setScope(props.entry.scope);
setNamespace(props.entry.namespace || "general");
setContent(props.entry.content);
} else if (mode === "add") {
setScope(props.defaultScope);
setNamespace(props.defaultNamespace || "general");
setContent("");
}
// mode/props are stable per-open; intentional shallow deps.
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [open]);
// Move focus into the dialog when it opens (WCAG SC 2.4.3).
useEffect(() => {
if (!open || !mounted) return;
const raf = requestAnimationFrame(() => {
dialogRef.current?.querySelector<HTMLElement>("textarea, input, select")?.focus();
});
return () => cancelAnimationFrame(raf);
}, [open, mounted]);
// Escape closes; Cmd/Ctrl-Enter saves.
const onCloseRef = useRef(onClose);
onCloseRef.current = onClose;
const handleSaveRef = useRef<() => void>(() => {});
useEffect(() => {
if (!open) return;
const handler = (e: KeyboardEvent) => {
if (e.key === "Escape") {
e.preventDefault();
onCloseRef.current();
} else if (e.key === "Enter" && (e.metaKey || e.ctrlKey)) {
e.preventDefault();
handleSaveRef.current();
}
};
window.addEventListener("keydown", handler);
return () => window.removeEventListener("keydown", handler);
}, [open]);
const handleSave = async () => {
if (saving) return;
const trimmed = content.trim();
if (!trimmed) {
setError("Content cannot be empty");
return;
}
setError(null);
setSaving(true);
try {
if (mode === "add") {
await api.post(`/workspaces/${workspaceId}/memories`, {
content: trimmed,
scope,
namespace: namespace.trim() || "general",
});
} else {
// PATCH only sends fields that changed. Content always changeable;
// namespace only sent if it differs from the original (saves a
// no-op write through redactSecrets + re-embed).
const original = props.entry;
const body: Record<string, string> = {};
if (trimmed !== original.content) body.content = trimmed;
const ns = namespace.trim() || "general";
if (ns !== original.namespace) body.namespace = ns;
if (Object.keys(body).length === 0) {
// No-op edit — close without an HTTP round-trip.
onSaved();
onClose();
return;
}
await api.patch(
`/workspaces/${workspaceId}/memories/${encodeURIComponent(original.id)}`,
body,
);
}
onSaved();
onClose();
} catch (e) {
setError(e instanceof Error ? e.message : "Save failed");
} finally {
setSaving(false);
}
};
handleSaveRef.current = handleSave;
if (!open || !mounted) return null;
const titleId = "memory-editor-title";
const isEdit = mode === "edit";
return createPortal(
<div className="fixed inset-0 z-[9999] flex items-center justify-center">
<div className="absolute inset-0 bg-black/60 backdrop-blur-sm" onClick={onClose} />
<div
ref={dialogRef}
role="dialog"
aria-modal="true"
aria-labelledby={titleId}
className="relative bg-surface-sunken border border-line rounded-xl shadow-2xl shadow-black/50 max-w-[480px] w-full mx-4 overflow-hidden"
>
<div className="px-5 py-4 space-y-3">
<h3 id={titleId} className="text-sm font-semibold text-ink">
{isEdit ? "Edit memory" : "Add memory"}
</h3>
{/* Scope */}
<div className="space-y-1">
<label className="text-[10px] text-ink-soft block" htmlFor="memory-editor-scope">
Scope
</label>
{isEdit ? (
<div
id="memory-editor-scope"
className="text-[12px] font-mono text-ink-mid bg-surface rounded px-2 py-1.5 border border-line/50"
title="Scope is fixed on edit. To move a memory across scopes, delete and re-create it."
>
{scope}
</div>
) : (
<div className="flex items-center gap-1" id="memory-editor-scope" role="radiogroup" aria-label="Scope">
{SCOPES.map((s) => (
<button
key={s}
type="button"
role="radio"
aria-checked={scope === s}
onClick={() => setScope(s)}
className={[
"px-3 py-1 text-[11px] rounded transition-colors",
scope === s
? "bg-accent-strong text-white"
: "bg-surface-card text-ink-mid hover:text-ink",
].join(" ")}
>
{s}
</button>
))}
</div>
)}
</div>
{/* Namespace */}
<div className="space-y-1">
<label htmlFor="memory-editor-namespace" className="text-[10px] text-ink-soft block">
Namespace
</label>
<input
id="memory-editor-namespace"
type="text"
value={namespace}
onChange={(e) => setNamespace(e.target.value)}
placeholder="general"
className="w-full bg-surface border border-line/60 focus:border-accent/60 rounded px-2 py-1.5 text-[12px] text-ink placeholder-zinc-600 focus:outline-none transition-colors"
/>
</div>
{/* Content */}
<div className="space-y-1">
<label htmlFor="memory-editor-content" className="text-[10px] text-ink-soft block">
Content
</label>
<textarea
id="memory-editor-content"
value={content}
onChange={(e) => setContent(e.target.value)}
rows={6}
placeholder="What should the agent remember?"
className="w-full bg-surface border border-line/60 focus:border-accent/60 rounded px-2 py-1.5 text-[12px] font-mono text-ink placeholder-zinc-600 focus:outline-none transition-colors resize-y min-h-[100px] max-h-[300px]"
/>
</div>
{error && (
<div
role="alert"
aria-live="assertive"
className="px-2 py-1.5 bg-red-950/30 border border-red-800/40 rounded text-[11px] text-bad"
>
{error}
</div>
)}
</div>
<div className="flex items-center justify-end gap-2 px-5 py-3 border-t border-line bg-surface/50">
<button
type="button"
onClick={onClose}
disabled={saving}
className="px-3.5 py-1.5 text-[13px] text-ink-mid hover:text-ink bg-surface-card hover:bg-surface-elevated border border-line hover:border-line-soft rounded-lg transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-accent/40 disabled:opacity-50 disabled:cursor-not-allowed"
>
Cancel
</button>
<button
type="button"
onClick={handleSave}
disabled={saving}
className="px-3.5 py-1.5 text-[13px] rounded-lg transition-colors bg-accent hover:bg-accent-strong text-white focus:outline-none focus-visible:ring-2 focus-visible:ring-offset-2 focus-visible:ring-offset-surface-sunken focus-visible:ring-accent/60 disabled:opacity-50 disabled:cursor-not-allowed"
>
{saving ? "Saving…" : isEdit ? "Save changes" : "Add memory"}
</button>
</div>
</div>
</div>,
document.body,
);
}
+382 -229
View File
@@ -1,30 +1,81 @@
'use client';
import { useState, useEffect, useCallback } from "react";
import { api } from "@/lib/api";
import { ConfirmDialog } from "@/components/ConfirmDialog";
import { MemoryEditorDialog } from "@/components/MemoryEditorDialog";
/**
* MemoryInspectorPanel — Memory v2 redesign.
*
* Reads the canvas Memory tab from the v2 plugin via the
* workspace-server proxy at /v2/{namespaces,memories}, replacing the
* v1 LOCAL/TEAM/GLOBAL trio that mapped to the deprecated
* shared_context model.
*
* Surface differences from v1:
* - Namespace dropdown driven by GET /v2/namespaces (workspace /
* team / org / custom — labels rendered server-side).
* - Per-row badges for kind (fact|summary|checkpoint), source
* (agent|runtime|user), pin (📌), TTL countdown, and propagation
* source-workspace if the memory came from a peer.
* - No Edit affordance — v2's plugin contract has no PATCH; the
* model is forget + recommit. Delete (Forget) stays.
*
* Shipping note: when the plugin isn't wired (MEMORY_PLUGIN_URL
* unset), every endpoint returns 503 with a clear hint. The panel
* surfaces that as a banner so operators know to set the env var,
* rather than rendering a perpetual empty state that looks like
* "no memories yet".
*/
import { useCallback, useEffect, useMemo, useState } from 'react';
import { api } from '@/lib/api';
import { ConfirmDialog } from '@/components/ConfirmDialog';
// ── Types ─────────────────────────────────────────────────────────────────────
/** Memory entry returned by GET /workspaces/:id/memories */
export interface MemoryEntry {
id: string;
workspace_id: string;
content: string;
scope: "LOCAL" | "TEAM" | "GLOBAL";
namespace: string;
created_at: string;
/**
* Semantic similarity score (01). Only present when the API is queried
* with ?q=<query> and the pgvector backend has been deployed.
* Absent on plain list fetches — renders gracefully without a badge.
*/
similarity_score?: number;
export type NamespaceKind = 'workspace' | 'team' | 'org' | 'custom';
export interface NamespaceView {
name: string;
kind: NamespaceKind;
label: string;
}
type Scope = "LOCAL" | "TEAM" | "GLOBAL";
const SCOPES: Scope[] = ["LOCAL", "TEAM", "GLOBAL"];
export interface NamespacesResponse {
readable: NamespaceView[];
writable: NamespaceView[];
}
export type MemoryKind = 'fact' | 'summary' | 'checkpoint';
export type MemorySource = 'agent' | 'runtime' | 'user';
export interface MemoryV2 {
id: string;
namespace: string;
content: string;
kind: MemoryKind;
source: MemorySource;
pin: boolean;
expires_at?: string | null;
created_at: string;
/** 0..1 plugin similarity score; only present when ?q= is set. */
score?: number | null;
// Note: an earlier iteration of this type carried a `source_workspace_id`
// field rendered as a "from peer" badge. The propagation contract that
// would have populated it ("Reserved for future cross-namespace
// propagation semantics" in memory-plugin-v1.yaml) is unimplemented —
// nothing in the codebase writes that key. Removed in self-review.
// Re-add when propagation gains a concrete shape.
}
interface MemoriesResponse {
memories: MemoryV2[];
}
// MemoryEntry kept as a back-compat type alias so any other component
// still importing it doesn't break the build. New consumers should
// prefer MemoryV2 — the v1 shape (LOCAL/TEAM/GLOBAL scope) is gone.
//
// `unknown` is used over `any` so TS still flags accidental field
// access on the legacy shape.
export type MemoryEntry = MemoryV2;
interface Props {
workspaceId: string;
@@ -32,11 +83,26 @@ interface Props {
// ── Helpers ───────────────────────────────────────────────────────────────────
/**
* Sanitise a memory id for use in an HTML id attribute.
*/
function sanitizeId(id: string): string {
return id.replace(/[^a-zA-Z0-9]/g, "-");
return id.replace(/[^a-zA-Z0-9]/g, '-');
}
/**
* Detect a memory-plugin-503 error from the api wrapper's stringified
* Error message. Matches on the literal env-var name rather than the
* status code, because the api shim renders status codes inside a
* larger formatted message and a future status-code reformat would
* silently break the detection.
*
* The substring `MEMORY_PLUGIN_URL` is hard-coded in the handler at
* `workspace-server/internal/handlers/memories_v2.go:available()`,
* so this is a pinned cross-layer contract — drift is caught by both
* the Go test (TestMemoriesV2_PluginUnwired_All503) and the canvas
* test (TestMemoryInspectorPanel — plugin unavailable).
*/
export function isPluginUnavailableError(err: unknown): boolean {
const msg = err instanceof Error ? err.message : '';
return msg.includes('MEMORY_PLUGIN_URL');
}
function formatRelativeTime(iso: string): string {
@@ -47,6 +113,24 @@ function formatRelativeTime(iso: string): string {
return new Date(iso).toLocaleDateString();
}
/**
* Render a TTL countdown like "12h", "3d", or "expired" (when the
* stored expires_at is in the past). Non-fatal if expires_at is null
* or invalid — falls through to empty string so the badge doesn't
* render.
*/
export function formatTTL(expiresAt: string | null | undefined): string {
if (!expiresAt) return '';
const ts = new Date(expiresAt).getTime();
if (Number.isNaN(ts)) return '';
const diff = ts - Date.now();
if (diff <= 0) return 'expired';
if (diff < 60_000) return `${Math.floor(diff / 1000)}s`;
if (diff < 3_600_000) return `${Math.floor(diff / 60_000)}m`;
if (diff < 86_400_000) return `${Math.floor(diff / 3_600_000)}h`;
return `${Math.floor(diff / 86_400_000)}d`;
}
// ── Skeleton rows ──────────────────────────────────────────────────────────────
function MemorySkeletonRows() {
@@ -71,63 +155,92 @@ function MemorySkeletonRows() {
// ── Component ─────────────────────────────────────────────────────────────────
const ALL_NAMESPACES = '__all__';
export function MemoryInspectorPanel({ workspaceId }: Props) {
const [activeScope, setActiveScope] = useState<Scope>("LOCAL");
const [activeNamespace, setActiveNamespace] = useState("");
const [entries, setEntries] = useState<MemoryEntry[]>([]);
const [namespaces, setNamespaces] = useState<NamespacesResponse | null>(null);
const [activeNamespace, setActiveNamespace] = useState<string>(ALL_NAMESPACES);
const [entries, setEntries] = useState<MemoryV2[]>([]);
const [loading, setLoading] = useState(true);
const [error, setError] = useState<string | null>(null);
// ── Search state (debounced) ────────────────────────────────────────────────
const [searchQuery, setSearchQuery] = useState("");
const [debouncedQuery, setDebouncedQuery] = useState("");
// Plugin-disabled banner (503 from server). Stored separately so we
// can keep showing the namespace dropdown empty rather than
// hiding the whole panel.
const [pluginUnavailable, setPluginUnavailable] = useState(false);
// Search state (debounced)
const [searchQuery, setSearchQuery] = useState('');
const [debouncedQuery, setDebouncedQuery] = useState('');
useEffect(() => {
const timer = setTimeout(
() => setDebouncedQuery(searchQuery.trim()),
300
);
const timer = setTimeout(() => setDebouncedQuery(searchQuery.trim()), 300);
return () => clearTimeout(timer);
}, [searchQuery]);
// ── Delete state ─────────────────────────────────────────────────────────────
// Delete state
const [pendingDeleteId, setPendingDeleteId] = useState<string | null>(null);
// ── Editor state (Add + Edit share one modal) ───────────────────────────────
type EditorState =
| { mode: "add" }
| { mode: "edit"; entry: MemoryEntry }
| null;
const [editorState, setEditorState] = useState<EditorState>(null);
// ── Namespace loading ──────────────────────────────────────────────────────
// ── Data loading ────────────────────────────────────────────────────────────
const loadNamespaces = useCallback(async () => {
try {
const data = await api.get<NamespacesResponse>(
`/workspaces/${workspaceId}/v2/namespaces`,
);
setNamespaces(data);
setPluginUnavailable(false);
} catch (e) {
// Plugin-unavailable (503) indicates MEMORY_PLUGIN_URL isn't set.
// Anything else stays as a generic load failure that the
// entries-load path will also flag.
if (isPluginUnavailableError(e)) {
setPluginUnavailable(true);
}
setNamespaces({ readable: [], writable: [] });
}
}, [workspaceId]);
// ── Entries loading ────────────────────────────────────────────────────────
const loadEntries = useCallback(async () => {
setLoading(true);
setError(null);
try {
const params = new URLSearchParams();
params.set("scope", activeScope);
if (debouncedQuery) params.set("q", debouncedQuery);
if (activeNamespace) params.set("namespace", activeNamespace);
if (activeNamespace !== ALL_NAMESPACES) {
params.set('namespace', activeNamespace);
}
if (debouncedQuery) params.set('q', debouncedQuery);
const url = `/workspaces/${workspaceId}/memories?${params.toString()}`;
const data = await api.get<MemoryEntry[]>(url);
const url = `/workspaces/${workspaceId}/v2/memories?${params.toString()}`;
const data = await api.get<MemoriesResponse>(url);
// When a semantic query is active, sort by similarity_score descending.
// When a semantic query is active and the plugin returns
// scores, sort by score descending so the most-relevant hit
// sits at the top. Empty score → push to bottom.
const sorted = debouncedQuery
? [...data].sort(
(a, b) => (b.similarity_score ?? 0) - (a.similarity_score ?? 0)
? [...data.memories].sort(
(a, b) => (b.score ?? 0) - (a.score ?? 0),
)
: data;
: data.memories;
setEntries(sorted);
} catch (e) {
setError(e instanceof Error ? e.message : "Failed to load memories");
if (isPluginUnavailableError(e)) {
setPluginUnavailable(true);
setError(null); // surfaced via banner, not row error
} else {
setError(e instanceof Error ? e.message : 'Failed to load memories');
}
setEntries([]);
} finally {
setLoading(false);
}
}, [workspaceId, activeScope, debouncedQuery, activeNamespace]);
}, [workspaceId, activeNamespace, debouncedQuery]);
useEffect(() => {
loadNamespaces();
}, [loadNamespaces]);
useEffect(() => {
loadEntries();
@@ -144,16 +257,35 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
setEntries((prev) => prev.filter((e) => e.id !== id));
try {
await api.del(`/workspaces/${workspaceId}/memories/${encodeURIComponent(id)}`);
await api.del(`/workspaces/${workspaceId}/v2/memories/${encodeURIComponent(id)}`);
} catch (e) {
setError(e instanceof Error ? e.message : "Delete failed — reloading...");
// Reload first (which clears any stale error), THEN set the
// delete-failure message — otherwise loadEntries' own
// `setError(null)` wipes our error before the user sees it.
// Caught by the rollback test in MemoryInspectorPanel.test.tsx.
const msg = e instanceof Error ? e.message : 'Delete failed — reloading…';
await loadEntries();
setError(msg);
}
}, [pendingDeleteId, workspaceId, loadEntries]);
// ── Namespace dropdown options ─────────────────────────────────────────────
const dropdownOptions = useMemo(() => {
const opts: Array<{ value: string; label: string; kind?: NamespaceKind }> = [
{ value: ALL_NAMESPACES, label: 'All namespaces' },
];
if (namespaces) {
for (const ns of namespaces.readable) {
opts.push({ value: ns.name, label: ns.label, kind: ns.kind });
}
}
return opts;
}, [namespaces]);
// ── Render ──────────────────────────────────────────────────────────────────
if (loading && entries.length === 0 && !error) {
if (loading && entries.length === 0 && !error && !pluginUnavailable) {
return (
<div className="flex items-center justify-center h-32">
<span className="text-xs text-ink-soft">Loading memories</span>
@@ -163,32 +295,43 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
return (
<div className="flex flex-col h-full">
{/* Scope tabs */}
<div className="px-4 pt-3 pb-2 border-b border-line/40 shrink-0">
<div className="flex items-center gap-1">
{SCOPES.map((scope) => (
<button
type="button"
key={scope}
onClick={() => setActiveScope(scope)}
aria-pressed={activeScope === scope}
className={[
"px-3 py-1 text-[11px] rounded transition-colors",
activeScope === scope
? "bg-accent-strong text-white"
: "bg-surface-card text-ink-mid hover:bg-surface-card hover:text-ink",
].join(" ")}
>
{scope}
</button>
))}
{/* Plugin-unavailable banner */}
{pluginUnavailable && (
<div
role="alert"
aria-live="polite"
className="mx-4 mt-3 px-3 py-2 bg-amber-950/30 border border-amber-800/40 rounded text-xs text-amber-300 shrink-0"
data-testid="plugin-unavailable-banner"
>
Memory plugin not configured. Set <code>MEMORY_PLUGIN_URL</code> on the
workspace-server to enable v2 memory.
</div>
</div>
)}
{/* Search bar + namespace filter */}
{/* Namespace dropdown */}
<div className="px-4 pt-3 pb-2 border-b border-line/40 shrink-0 space-y-2">
<div className="flex items-center gap-2">
<label htmlFor="namespace-dropdown" className="text-[10px] text-ink-soft shrink-0">
Namespace:
</label>
<select
id="namespace-dropdown"
value={activeNamespace}
onChange={(e) => setActiveNamespace(e.target.value)}
aria-label="Filter by namespace"
disabled={pluginUnavailable}
className="flex-1 bg-surface-sunken border border-line/60 focus:border-accent/60 rounded px-2 py-1 text-[11px] text-ink focus:outline-none transition-colors min-w-0 disabled:opacity-50 disabled:cursor-not-allowed"
>
{dropdownOptions.map((opt) => (
<option key={opt.value} value={opt.value}>
{opt.label}
</option>
))}
</select>
</div>
{/* Search bar */}
<div className="relative flex items-center">
{/* Magnifying glass icon */}
<svg
width="12"
height="12"
@@ -206,14 +349,15 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
onChange={(e) => setSearchQuery(e.target.value)}
placeholder="Semantic search…"
aria-label="Search memories"
className="w-full bg-surface-sunken border border-line/60 focus:border-accent/60 rounded-lg pl-8 pr-7 py-1.5 text-[11px] text-ink placeholder-zinc-600 focus:outline-none transition-colors"
disabled={pluginUnavailable}
className="w-full bg-surface-sunken border border-line/60 focus:border-accent/60 rounded-lg pl-8 pr-7 py-1.5 text-[11px] text-ink placeholder-zinc-600 focus:outline-none transition-colors disabled:opacity-50 disabled:cursor-not-allowed"
/>
{searchQuery && (
<button
type="button"
onClick={() => {
setSearchQuery("");
setDebouncedQuery("");
setSearchQuery('');
setDebouncedQuery('');
}}
aria-label="Clear search"
className="absolute right-2 text-ink-soft hover:text-ink transition-colors text-sm leading-none"
@@ -222,51 +366,26 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
</button>
)}
</div>
{/* Namespace filter */}
<div className="flex items-center gap-2">
<label htmlFor="namespace-filter" className="text-[10px] text-ink-soft shrink-0">
Namespace:
</label>
<input
id="namespace-filter"
type="text"
value={activeNamespace}
onChange={(e) => setActiveNamespace(e.target.value)}
placeholder="all namespaces"
aria-label="Filter by namespace"
className="flex-1 bg-surface-sunken border border-line/60 focus:border-accent/60 rounded px-2 py-1 text-[11px] text-ink placeholder-zinc-600 focus:outline-none transition-colors min-w-0"
/>
</div>
</div>
{/* Toolbar */}
<div className="px-4 py-2.5 border-b border-line/40 flex items-center justify-between shrink-0">
<span className="text-[11px] text-ink-soft">
{debouncedQuery
? `${entries.length} result${entries.length !== 1 ? "s" : ""}`
? `${entries.length} result${entries.length !== 1 ? 's' : ''}`
: entries.length === 1
? "1 memory"
: `${entries.length} memories`}
? '1 memory'
: `${entries.length} memories`}
</span>
<div className="flex items-center gap-1.5">
<button
type="button"
onClick={() => setEditorState({ mode: "add" })}
className="px-2 py-1 text-[11px] bg-accent hover:bg-accent-strong text-white rounded transition-colors"
aria-label="Add memory"
>
+ Add
</button>
<button
type="button"
onClick={loadEntries}
className="px-2 py-1 text-[11px] bg-surface-card hover:bg-surface-card text-ink-mid rounded transition-colors"
aria-label="Refresh memories"
>
Refresh
</button>
</div>
<button
type="button"
onClick={loadEntries}
disabled={pluginUnavailable}
className="px-2 py-1 text-[11px] bg-surface-card hover:bg-surface-card text-ink-mid rounded transition-colors disabled:opacity-50 disabled:cursor-not-allowed"
aria-label="Refresh memories"
>
Refresh
</button>
</div>
{/* Error banner */}
@@ -285,47 +404,13 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
{loading ? (
<MemorySkeletonRows />
) : entries.length === 0 ? (
debouncedQuery ? (
<div className="flex flex-col items-center justify-center py-16 gap-3 text-center">
<span className="text-4xl text-ink-soft" aria-hidden="true"></span>
<p className="text-sm font-medium text-ink-mid">
No memories match your search
</p>
<p className="text-[11px] text-ink-soft max-w-[200px] leading-relaxed">
Try a different query or{" "}
<button
type="button"
onClick={() => {
setSearchQuery("");
setDebouncedQuery("");
}}
className="text-accent hover:text-accent underline transition-colors"
>
clear the search
</button>
.
</p>
</div>
) : (
<div className="flex flex-col items-center justify-center py-16 gap-3 text-center">
<span className="text-4xl text-ink-soft" aria-hidden="true"></span>
<p className="text-sm font-medium text-ink-mid">No {activeScope} memories</p>
<p className="text-[11px] text-ink-soft max-w-[200px] leading-relaxed">
{activeScope === "LOCAL"
? "This workspace has not written any local memories yet."
: activeScope === "TEAM"
? "No team memories shared with this workspace yet."
: "No global memories exist yet."}
</p>
</div>
)
<EmptyState query={debouncedQuery} pluginUnavailable={pluginUnavailable} />
) : (
<div className="space-y-1.5">
{entries.map((entry) => (
<MemoryEntryRow
key={entry.id}
entry={entry}
onEdit={() => setEditorState({ mode: "edit", entry })}
onDelete={() => setPendingDeleteId(entry.id)}
/>
))}
@@ -336,36 +421,64 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
{/* Delete confirmation dialog */}
<ConfirmDialog
open={pendingDeleteId !== null}
title="Delete memory"
message={`Delete this ${activeScope} memory? This cannot be undone.`}
confirmLabel="Delete"
title="Forget memory"
message="Forget this memory? This cannot be undone."
confirmLabel="Forget"
confirmVariant="danger"
onConfirm={confirmDelete}
onCancel={() => setPendingDeleteId(null)}
/>
</div>
);
}
{/* Add / Edit dialog */}
{editorState?.mode === "add" && (
<MemoryEditorDialog
open={true}
mode="add"
workspaceId={workspaceId}
defaultScope={activeScope}
defaultNamespace={activeNamespace || "general"}
onClose={() => setEditorState(null)}
onSaved={loadEntries}
/>
)}
{editorState?.mode === "edit" && (
<MemoryEditorDialog
open={true}
mode="edit"
workspaceId={workspaceId}
entry={editorState.entry}
onClose={() => setEditorState(null)}
onSaved={loadEntries}
/>
)}
// ── Empty state ─────────────────────────────────────────────────────────────
function EmptyState({
query,
pluginUnavailable,
}: {
query: string;
pluginUnavailable: boolean;
}) {
if (pluginUnavailable) {
// The banner already explains the problem; the empty rows just
// mirror it so the operator sees both signals.
return (
<div className="flex flex-col items-center justify-center py-16 gap-3 text-center">
<span className="text-4xl text-ink-soft" aria-hidden="true">
</span>
<p className="text-sm font-medium text-ink-mid">Memory plugin disabled</p>
<p className="text-[11px] text-ink-soft max-w-[220px] leading-relaxed">
See banner above for the operator-side fix.
</p>
</div>
);
}
if (query) {
return (
<div className="flex flex-col items-center justify-center py-16 gap-3 text-center">
<span className="text-4xl text-ink-soft" aria-hidden="true">
</span>
<p className="text-sm font-medium text-ink-mid">No memories match your search</p>
<p className="text-[11px] text-ink-soft max-w-[200px] leading-relaxed">
Try a different query or clear the search.
</p>
</div>
);
}
return (
<div className="flex flex-col items-center justify-center py-16 gap-3 text-center">
<span className="text-4xl text-ink-soft" aria-hidden="true">
</span>
<p className="text-sm font-medium text-ink-mid">No memories yet</p>
<p className="text-[11px] text-ink-soft max-w-[220px] leading-relaxed">
Agents commit memories via MCP tools (commit_memory, commit_summary). They
appear here once written.
</p>
</div>
);
}
@@ -373,17 +486,32 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
// ── MemoryEntryRow sub-component ──────────────────────────────────────────────
interface MemoryEntryRowProps {
entry: MemoryEntry;
onEdit: () => void;
entry: MemoryV2;
onDelete: () => void;
}
function MemoryEntryRow({ entry, onEdit, onDelete }: MemoryEntryRowProps) {
const KIND_BADGE_CLASS: Record<MemoryKind, string> = {
fact: 'bg-surface-card text-ink-mid',
summary: 'bg-blue-950 text-accent',
checkpoint: 'bg-violet-950 text-violet-400',
};
const SOURCE_BADGE_CLASS: Record<MemorySource, string> = {
agent: 'bg-surface-card text-ink-mid',
runtime: 'bg-amber-950 text-amber-300',
user: 'bg-emerald-950 text-emerald-400',
};
function MemoryEntryRow({ entry, onDelete }: MemoryEntryRowProps) {
const [expanded, setExpanded] = useState(false);
const bodyId = `mem-body-${sanitizeId(entry.id)}`;
const ttl = formatTTL(entry.expires_at);
return (
<div className="rounded-lg border border-line/60 bg-surface-sunken/50 overflow-hidden">
<div
className="rounded-lg border border-line/60 bg-surface-sunken/50 overflow-hidden"
data-testid={`memory-row-${entry.id}`}
>
{/* Header row */}
<button
type="button"
@@ -392,52 +520,89 @@ function MemoryEntryRow({ entry, onEdit, onDelete }: MemoryEntryRowProps) {
aria-expanded={expanded}
aria-controls={bodyId}
>
{/* Scope badge */}
{/* Kind badge */}
<span
className={[
"text-[9px] shrink-0 font-mono px-1 py-0.5 rounded",
entry.scope === "LOCAL"
? "bg-surface-card text-ink-mid"
: entry.scope === "TEAM"
? "bg-blue-950 text-accent"
: "bg-violet-950 text-violet-400",
].join(" ")}
title={`Scope: ${entry.scope}`}
'text-[9px] shrink-0 font-mono px-1 py-0.5 rounded',
KIND_BADGE_CLASS[entry.kind] ?? 'bg-surface-card text-ink-mid',
].join(' ')}
title={`Kind: ${entry.kind}`}
data-testid="kind-badge"
>
{entry.scope[0]}
{entry.kind[0].toUpperCase()}
</span>
{/* Source badge */}
<span
className={[
'text-[9px] shrink-0 font-mono px-1 py-0.5 rounded',
SOURCE_BADGE_CLASS[entry.source] ?? 'bg-surface-card text-ink-mid',
].join(' ')}
title={`Source: ${entry.source}`}
data-testid="source-badge"
>
{entry.source}
</span>
{/* Pin indicator */}
{entry.pin && (
<span
className="text-[9px] shrink-0"
title="Pinned"
data-testid="pin-badge"
aria-label="Pinned"
>
📌
</span>
)}
{/* Namespace tag */}
<span className="text-[9px] shrink-0 font-mono text-ink-soft truncate max-w-[80px]" title={entry.namespace}>
<span
className="text-[9px] shrink-0 font-mono text-ink-soft truncate max-w-[100px]"
title={entry.namespace}
>
{entry.namespace}
</span>
{/* Content preview */}
<span className="flex-1 min-w-0 text-[10px] font-mono text-ink-mid truncate text-left">
{entry.content.length > 60 ? entry.content.slice(0, 60) + "…" : entry.content}
{entry.content.length > 60 ? entry.content.slice(0, 60) + '…' : entry.content}
</span>
{/* Similarity badge */}
{entry.similarity_score != null && (
{/* Score badge (semantic search only) */}
{entry.score != null && (
<span
className={[
"text-[9px] shrink-0 font-mono tabular-nums",
entry.similarity_score >= 0.8
? "text-accent"
: "text-ink-mid",
].join(" ")}
title={`Similarity: ${(entry.similarity_score * 100).toFixed(1)}%`}
data-testid="similarity-badge"
'text-[9px] shrink-0 font-mono tabular-nums',
entry.score >= 0.8 ? 'text-accent' : 'text-ink-mid',
].join(' ')}
title={`Similarity: ${(entry.score * 100).toFixed(1)}%`}
data-testid="score-badge"
>
{Math.round(entry.similarity_score * 100)}%
{Math.round(entry.score * 100)}%
</span>
)}
{/* TTL countdown */}
{ttl && (
<span
className={[
'text-[9px] shrink-0 font-mono',
ttl === 'expired' ? 'text-bad' : 'text-amber-400',
].join(' ')}
title={`Expires: ${entry.expires_at}`}
data-testid="ttl-badge"
>
{ttl}
</span>
)}
<span className="text-[9px] text-ink-soft shrink-0">
{formatRelativeTime(entry.created_at)}
</span>
<span className="text-[9px] text-ink-soft shrink-0" aria-hidden="true">
{expanded ? "▼" : "▶"}
{expanded ? '▼' : '▶'}
</span>
</button>
@@ -455,31 +620,19 @@ function MemoryEntryRow({ entry, onEdit, onDelete }: MemoryEntryRowProps) {
<div className="flex items-center justify-between gap-2">
<span className="text-[9px] text-ink-soft">
Created: {new Date(entry.created_at).toLocaleString()}
{entry.expires_at && ` · Expires: ${new Date(entry.expires_at).toLocaleString()}`}
</span>
<div className="flex items-center gap-1.5 shrink-0">
<button
type="button"
onClick={(e) => {
e.stopPropagation();
onEdit();
}}
aria-label="Edit memory"
className="text-[10px] px-2 py-0.5 bg-surface-card hover:bg-surface-elevated border border-line/40 rounded text-ink-mid hover:text-ink transition-colors"
>
Edit
</button>
<button
type="button"
onClick={(e) => {
e.stopPropagation();
onDelete();
}}
aria-label="Delete memory"
className="text-[10px] px-2 py-0.5 bg-red-950/40 hover:bg-red-900/50 border border-red-900/30 rounded text-bad transition-colors"
>
Delete
</button>
</div>
<button
type="button"
onClick={(e) => {
e.stopPropagation();
onDelete();
}}
aria-label="Forget memory"
className="text-[10px] px-2 py-0.5 bg-red-950/40 hover:bg-red-900/50 border border-red-900/30 rounded text-bad transition-colors shrink-0"
>
Forget
</button>
</div>
</div>
)}
+1 -1
View File
@@ -287,7 +287,7 @@ export function SidePanel() {
{panelTab === "config" && <ConfigTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "schedule" && <ScheduleTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "channels" && <ChannelsTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "files" && <FilesTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "files" && <FilesTab key={selectedNodeId} workspaceId={selectedNodeId} data={node.data} />}
{panelTab === "memory" && <MemoryInspectorPanel key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "traces" && <TracesTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "events" && <EventsTab key={selectedNodeId} workspaceId={selectedNodeId} />}
@@ -1,202 +0,0 @@
// @vitest-environment jsdom
/**
* MemoryEditorDialog tests — covers Add (POST /memories) and Edit
* (PATCH /memories/:id) flows. Pins:
* - Add posts {content, scope, namespace} with the trimmed defaults
* - Edit only sends fields that changed (no-op edit short-circuits, no PATCH fires)
* - Empty content blocks save
* - Save error surfaces in the dialog and keeps the modal open
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen, fireEvent, waitFor, cleanup } from "@testing-library/react";
vi.mock("@/lib/api", () => ({
api: {
get: vi.fn(),
post: vi.fn(),
patch: vi.fn(),
del: vi.fn(),
},
}));
import { api } from "@/lib/api";
import { MemoryEditorDialog } from "../MemoryEditorDialog";
import type { MemoryEntry } from "../MemoryInspectorPanel";
const mockPost = vi.mocked(api.post);
const mockPatch = vi.mocked(api.patch);
const SAMPLE: MemoryEntry = {
id: "mem-x",
workspace_id: "ws-1",
content: "original content",
scope: "TEAM",
namespace: "procedures",
created_at: "2026-04-17T12:00:00.000Z",
};
beforeEach(() => {
vi.clearAllMocks();
mockPost.mockResolvedValue({} as never);
mockPatch.mockResolvedValue({} as never);
});
afterEach(() => {
cleanup();
});
describe("Add mode", () => {
it("POSTs scope+namespace+trimmed-content and calls onSaved+onClose", async () => {
const onClose = vi.fn();
const onSaved = vi.fn();
render(
<MemoryEditorDialog
open
mode="add"
workspaceId="ws-1"
defaultScope="GLOBAL"
defaultNamespace="facts"
onClose={onClose}
onSaved={onSaved}
/>,
);
const textarea = screen.getByLabelText(/Content/i) as HTMLTextAreaElement;
fireEvent.change(textarea, { target: { value: " new fact " } });
fireEvent.click(screen.getByRole("button", { name: /Add memory$/i }));
await waitFor(() => expect(mockPost).toHaveBeenCalledTimes(1));
expect(mockPost).toHaveBeenCalledWith("/workspaces/ws-1/memories", {
content: "new fact",
scope: "GLOBAL",
namespace: "facts",
});
expect(onSaved).toHaveBeenCalledTimes(1);
expect(onClose).toHaveBeenCalledTimes(1);
});
it("blocks save when content is empty (whitespace-only)", () => {
const onClose = vi.fn();
const onSaved = vi.fn();
render(
<MemoryEditorDialog
open
mode="add"
workspaceId="ws-1"
defaultScope="LOCAL"
onClose={onClose}
onSaved={onSaved}
/>,
);
const textarea = screen.getByLabelText(/Content/i) as HTMLTextAreaElement;
fireEvent.change(textarea, { target: { value: " " } });
fireEvent.click(screen.getByRole("button", { name: /Add memory$/i }));
expect(mockPost).not.toHaveBeenCalled();
expect(screen.getByRole("alert").textContent).toMatch(/empty/i);
expect(onSaved).not.toHaveBeenCalled();
expect(onClose).not.toHaveBeenCalled();
});
});
describe("Edit mode", () => {
it("PATCHes only changed fields", async () => {
const onClose = vi.fn();
const onSaved = vi.fn();
render(
<MemoryEditorDialog
open
mode="edit"
workspaceId="ws-1"
entry={SAMPLE}
onClose={onClose}
onSaved={onSaved}
/>,
);
const textarea = screen.getByLabelText(/Content/i) as HTMLTextAreaElement;
fireEvent.change(textarea, { target: { value: "rewritten content" } });
// namespace untouched
fireEvent.click(screen.getByRole("button", { name: /Save changes/i }));
await waitFor(() => expect(mockPatch).toHaveBeenCalledTimes(1));
expect(mockPatch).toHaveBeenCalledWith(
"/workspaces/ws-1/memories/mem-x",
{ content: "rewritten content" },
);
expect(onSaved).toHaveBeenCalledTimes(1);
expect(onClose).toHaveBeenCalledTimes(1);
});
it("no-op edit short-circuits (no PATCH fires) and still closes", async () => {
const onClose = vi.fn();
const onSaved = vi.fn();
render(
<MemoryEditorDialog
open
mode="edit"
workspaceId="ws-1"
entry={SAMPLE}
onClose={onClose}
onSaved={onSaved}
/>,
);
fireEvent.click(screen.getByRole("button", { name: /Save changes/i }));
await waitFor(() => expect(onClose).toHaveBeenCalled());
expect(mockPatch).not.toHaveBeenCalled();
expect(onSaved).toHaveBeenCalledTimes(1);
});
it("sends namespace too when both content and namespace changed", async () => {
const onClose = vi.fn();
const onSaved = vi.fn();
render(
<MemoryEditorDialog
open
mode="edit"
workspaceId="ws-1"
entry={SAMPLE}
onClose={onClose}
onSaved={onSaved}
/>,
);
fireEvent.change(screen.getByLabelText(/Content/i), {
target: { value: "newer content" },
});
fireEvent.change(screen.getByLabelText(/Namespace/i), {
target: { value: "blockers" },
});
fireEvent.click(screen.getByRole("button", { name: /Save changes/i }));
await waitFor(() => expect(mockPatch).toHaveBeenCalledTimes(1));
expect(mockPatch).toHaveBeenCalledWith(
"/workspaces/ws-1/memories/mem-x",
{ content: "newer content", namespace: "blockers" },
);
});
it("surfaces save error and keeps the modal open", async () => {
const onClose = vi.fn();
const onSaved = vi.fn();
mockPatch.mockRejectedValueOnce(new Error("boom"));
render(
<MemoryEditorDialog
open
mode="edit"
workspaceId="ws-1"
entry={SAMPLE}
onClose={onClose}
onSaved={onSaved}
/>,
);
fireEvent.change(screen.getByLabelText(/Content/i), {
target: { value: "rewritten content" },
});
fireEvent.click(screen.getByRole("button", { name: /Save changes/i }));
await waitFor(() =>
expect(screen.getByRole("alert").textContent).toMatch(/boom/),
);
expect(onClose).not.toHaveBeenCalled();
expect(onSaved).not.toHaveBeenCalled();
});
});
@@ -1,16 +1,29 @@
// @vitest-environment jsdom
/**
* MemoryInspectorPanel tests — issue #909
* MemoryInspectorPanel — v2 redesign tests.
*
* Covers: loading, empty state, scope tabs, namespace filter,
* entry list, expand, delete flow, optimistic updates, Refresh, semantic search.
* Coverage targets every behavior the panel surfaces:
* - Initial load wires GET /v2/namespaces + GET /v2/memories
* - Plugin-unavailable banner (503) renders + disables interactions
* - Generic error renders in the error banner
* - Namespace dropdown populates from /v2/namespaces.readable; "All
* namespaces" is the default
* - Selecting a namespace re-fetches with ?namespace=...
* - Search input debounces + scopes the request to ?q=
* - Search results sort by score descending
* - Empty-state copy differs by query / plugin-state / no-data
* - Per-row badges render (kind / source / pin / TTL / score /
* score) and TTL countdown handles past/future/null
* - Delete (Forget) flow: optimistic removal, confirmation dialog,
* server failure rolls back via reload
* - formatTTL helper covers s/m/h/d/expired/null/invalid branches
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen, fireEvent, waitFor, cleanup, act } from "@testing-library/react";
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
import { render, screen, fireEvent, waitFor, cleanup } from '@testing-library/react';
// ── Mocks ─────────────────────────────────────────────────────────────────────
vi.mock("@/lib/api", () => ({
vi.mock('@/lib/api', () => ({
api: {
get: vi.fn(),
post: vi.fn(),
@@ -18,7 +31,7 @@ vi.mock("@/lib/api", () => ({
},
}));
vi.mock("@/components/ConfirmDialog", () => ({
vi.mock('@/components/ConfirmDialog', () => ({
ConfirmDialog: ({
open,
title,
@@ -33,435 +46,473 @@ vi.mock("@/components/ConfirmDialog", () => ({
confirmVariant?: string;
onConfirm: () => void;
onCancel: () => void;
singleButton?: boolean;
}) =>
open ? (
<div data-testid="confirm-dialog">
<p data-testid="dialog-title">{title}</p>
<p data-testid="dialog-message">{message}</p>
<button onClick={onConfirm}>Confirm Delete</button>
<button onClick={onCancel}>Cancel Delete</button>
<button onClick={onConfirm}>Confirm</button>
<button onClick={onCancel}>Cancel</button>
</div>
) : null,
}));
import { api } from "@/lib/api";
import { MemoryInspectorPanel } from "../MemoryInspectorPanel";
// ── Typed mock helpers ────────────────────────────────────────────────────────
import { api } from '@/lib/api';
import {
MemoryInspectorPanel,
formatTTL,
isPluginUnavailableError,
type MemoryV2,
type NamespacesResponse,
} from '../MemoryInspectorPanel';
const mockGet = vi.mocked(api.get);
const mockDel = vi.mocked(api.del);
// ── Sample fixtures ───────────────────────────────────────────────────────────
// ── Fixtures ──────────────────────────────────────────────────────────────────
const NOW = "2026-04-17T12:00:00.000Z";
const MEMORY_A: import("../MemoryInspectorPanel").MemoryEntry = {
id: "mem-a",
workspace_id: "ws-1",
content: "Remember to review PRs before merging",
scope: "LOCAL",
namespace: "general",
created_at: NOW,
const NS_RESPONSE: NamespacesResponse = {
readable: [
{ name: 'workspace:ws-1', kind: 'workspace', label: 'Workspace (ws-1)' },
{ name: 'team:t-1', kind: 'team', label: 'Team (t-1)' },
],
writable: [{ name: 'workspace:ws-1', kind: 'workspace', label: 'Workspace (ws-1)' }],
};
const MEMORY_B: import("../MemoryInspectorPanel").MemoryEntry = {
id: "mem-b",
workspace_id: "ws-1",
content: "Team knowledge: deploy happens on Fridays",
scope: "TEAM",
namespace: "procedures",
created_at: NOW,
const MEM_BASIC: MemoryV2 = {
id: 'mem-a',
namespace: 'workspace:ws-1',
content: 'Remember the standup is at 10am',
kind: 'fact',
source: 'agent',
pin: false,
created_at: '2026-04-17T12:00:00.000Z',
};
const TWO_MEMORIES = [MEMORY_A, MEMORY_B];
const MEM_PINNED: MemoryV2 = {
id: 'mem-pinned',
namespace: 'team:t-1',
content: 'Team retro every Friday',
kind: 'summary',
source: 'user',
pin: true,
expires_at: new Date(Date.now() + 86_400_000).toISOString(),
created_at: '2026-04-17T12:00:00.000Z',
};
const MEM_RUNTIME_CHECKPOINT: MemoryV2 = {
id: 'mem-checkpoint',
namespace: 'team:t-1',
content: 'Runtime checkpoint',
kind: 'checkpoint',
source: 'runtime',
pin: false,
created_at: '2026-04-17T12:00:00.000Z',
};
const MEM_EXPIRED: MemoryV2 = {
id: 'mem-expired',
namespace: 'workspace:ws-1',
content: 'Stale memory',
kind: 'fact',
source: 'agent',
pin: false,
expires_at: new Date(Date.now() - 1000).toISOString(),
created_at: '2026-04-17T12:00:00.000Z',
};
// ── Setup / teardown ──────────────────────────────────────────────────────────
beforeEach(() => {
vi.clearAllMocks();
mockGet.mockReset();
mockDel.mockReset();
});
afterEach(() => {
cleanup();
});
// ── Helper: flush microtasks + React state updates ─────────────────────────────
async function flushUpdates(): Promise<void> {
await act(async () => {});
// Helper: stub a basic two-call flow (namespaces + memories).
function stubFetch(memories: MemoryV2[], namespaces: NamespacesResponse = NS_RESPONSE) {
mockGet.mockImplementation(((url: string) => {
if (url.includes('/v2/namespaces')) {
return Promise.resolve(namespaces);
}
return Promise.resolve({ memories });
}) as typeof api.get);
}
// ── Loading & empty state ─────────────────────────────────────────────────────
// ── isPluginUnavailableError helper ─────────────────────────────────────────
describe("MemoryInspectorPanel — loading and empty state", () => {
it("shows loading indicator before data arrives", () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockReturnValue(new Promise(() => {}) as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
expect(screen.getByText(/loading memories/i)).toBeTruthy();
});
it("renders empty state when API returns []", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByText("No LOCAL memories")).toBeTruthy();
});
it("fetches from the correct workspace memories endpoint with scope=LOCAL", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-abc-123" />);
await flushUpdates();
expect(mockGet).toHaveBeenCalledWith(
"/workspaces/ws-abc-123/memories?scope=LOCAL"
);
});
it("shows error banner when fetch throws", async () => {
mockGet.mockRejectedValue(new Error("Network error"));
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByText("Network error")).toBeTruthy();
});
});
// ── Scope tabs ────────────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — scope tabs", () => {
it("renders LOCAL, TEAM, GLOBAL tabs", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByRole("button", { name: "LOCAL" })).toBeTruthy();
expect(screen.getByRole("button", { name: "TEAM" })).toBeTruthy();
expect(screen.getByRole("button", { name: "GLOBAL" })).toBeTruthy();
});
it("LOCAL is active by default", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByRole("button", { name: "LOCAL" }).getAttribute("aria-pressed")).toBe("true");
});
it("clicking TEAM tab re-fetches with scope=TEAM", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
mockGet.mockClear();
fireEvent.click(screen.getByRole("button", { name: "TEAM" }));
await flushUpdates();
expect(mockGet).toHaveBeenCalledWith(
"/workspaces/ws-1/memories?scope=TEAM"
);
});
it("clicking GLOBAL tab re-fetches with scope=GLOBAL", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
mockGet.mockClear();
fireEvent.click(screen.getByRole("button", { name: "GLOBAL" }));
await flushUpdates();
expect(mockGet).toHaveBeenCalledWith(
"/workspaces/ws-1/memories?scope=GLOBAL"
);
});
it("shows scope-specific empty state when switching tabs", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
fireEvent.click(screen.getByRole("button", { name: "TEAM" }));
await flushUpdates();
expect(screen.getByText("No TEAM memories")).toBeTruthy();
});
});
// ── Namespace filter ──────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — namespace filter", () => {
it("renders namespace filter input", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByLabelText("Filter by namespace")).toBeTruthy();
});
it("includes namespace param in API call when set", async () => {
vi.useFakeTimers();
try {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
mockGet.mockClear();
fireEvent.change(screen.getByLabelText("Filter by namespace"), {
target: { value: "facts" },
});
// Advance past the 300ms debounce
act(() => { vi.advanceTimersByTime(350); });
await flushUpdates();
expect(mockGet).toHaveBeenCalledWith(
"/workspaces/ws-1/memories?scope=LOCAL&namespace=facts"
);
} finally {
vi.useRealTimers();
}
});
});
// ── Entry list ───────────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — entry list", () => {
beforeEach(() => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue(TWO_MEMORIES as any);
});
it("renders a row for every memory", async () => {
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByText(/Remember to review PRs before merging/)).toBeTruthy();
expect(screen.getByText(/Team knowledge: deploy happens on Fridays/)).toBeTruthy();
});
it("displays memory count in toolbar", async () => {
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByText("2 memories")).toBeTruthy();
});
it("displays scope badge for each entry", async () => {
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByTitle("Scope: LOCAL")).toBeTruthy();
expect(screen.getByTitle("Scope: TEAM")).toBeTruthy();
});
it("entries are collapsed by default (pre region not visible)", async () => {
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
// Expanded region (pre tag) should not exist in DOM yet
expect(screen.queryByRole("region")).toBeNull();
});
});
// ── Expand / collapse ─────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — expand/collapse", () => {
beforeEach(() => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue(TWO_MEMORIES as any);
});
it("clicking a row header expands it and shows the full content in a pre tag", async () => {
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
fireEvent.click(
screen.getByText(/Remember to review PRs before merging/).closest("button")!
);
await flushUpdates();
// After expand, a region with the full content <pre> should appear
expect(screen.getByRole("region")).toBeTruthy();
});
it("clicking the header again collapses the row (pre region removed)", async () => {
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
const headerBtn = screen
.getByText(/Remember to review PRs before merging/)
.closest("button")!;
fireEvent.click(headerBtn); // expand
await flushUpdates();
expect(screen.getByRole("region")).toBeTruthy();
fireEvent.click(headerBtn); // collapse
await flushUpdates();
// After collapse, the region (pre) is removed from the DOM
expect(screen.queryByRole("region")).toBeNull();
});
});
// ── Delete flow ───────────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — delete flow", () => {
beforeEach(() => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue(TWO_MEMORIES as any);
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockDel.mockResolvedValue({ status: "deleted" } as any);
});
/** Helper: expand memory-A and click its Delete button */
async function openDeleteForMemoryA() {
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
fireEvent.click(
screen.getByText(/Remember to review PRs before merging/).closest("button")!
);
await flushUpdates();
fireEvent.click(screen.getByRole("button", { name: "Delete memory" }));
await flushUpdates();
}
it("opens ConfirmDialog when Delete is clicked", async () => {
await openDeleteForMemoryA();
expect(screen.getByTestId("confirm-dialog")).toBeTruthy();
expect(screen.getByTestId("dialog-title").textContent).toBe("Delete memory");
});
it("calls api.del with the correct URL-encoded path on confirm", async () => {
await openDeleteForMemoryA();
fireEvent.click(screen.getByText("Confirm Delete"));
await flushUpdates();
expect(mockDel).toHaveBeenCalledWith("/workspaces/ws-1/memories/mem-a");
});
it("removes the entry optimistically after confirm", async () => {
await openDeleteForMemoryA();
fireEvent.click(screen.getByText("Confirm Delete"));
await flushUpdates();
expect(screen.queryByText(/Remember to review PRs before merging/)).toBeNull();
// Sibling entry unaffected
expect(screen.getByText(/Team knowledge: deploy happens on Fridays/)).toBeTruthy();
});
it("closes ConfirmDialog without deleting when Cancel is clicked", async () => {
await openDeleteForMemoryA();
fireEvent.click(screen.getByText("Cancel Delete"));
await flushUpdates();
expect(screen.queryByTestId("confirm-dialog")).toBeNull();
expect(mockDel).not.toHaveBeenCalled();
// Sibling memory entry (MEMORY_B) is still in the list
expect(screen.getByText(/Team knowledge: deploy happens on Fridays/)).toBeTruthy();
});
});
// ── Refresh ───────────────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — Refresh button", () => {
it("re-fetches entries when Refresh is clicked", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByText("No LOCAL memories")).toBeTruthy();
expect(mockGet).toHaveBeenCalledTimes(1);
fireEvent.click(screen.getByRole("button", { name: "Refresh memories" }));
await flushUpdates();
expect(mockGet).toHaveBeenCalledTimes(2);
});
});
// ── role=alert a11y ──────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — error elements have role=alert", () => {
it("fetch error banner has role='alert'", async () => {
mockGet.mockRejectedValue(new Error("Network error"));
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
const alert = screen.getByRole("alert");
expect(alert).toBeTruthy();
expect(alert.textContent).toContain("Network error");
});
});
// ── Semantic search ──────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — semantic search", () => {
afterEach(() => {
vi.useRealTimers();
});
it("debounces search input by 300ms before calling API", async () => {
vi.useFakeTimers();
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
mockGet.mockClear();
fireEvent.change(screen.getByLabelText("Search memories"), {
target: { value: "deploy" },
});
// 200ms — debounce has NOT fired yet
act(() => { vi.advanceTimersByTime(200); });
await flushUpdates();
expect(mockGet).not.toHaveBeenCalled();
// 350ms total — debounce fires
act(() => { vi.advanceTimersByTime(150); });
await flushUpdates();
expect(mockGet).toHaveBeenCalledWith(
"/workspaces/ws-1/memories?scope=LOCAL&q=deploy"
);
});
it("renders similarity-badge when entry has similarity_score", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([{ ...MEMORY_A, similarity_score: 0.87 }] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
const badge = document.querySelector('[data-testid="similarity-badge"]');
expect(badge).toBeTruthy();
expect(badge?.textContent).toBe("87%");
});
it("does not render similarity-badge when entry has no similarity_score", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([MEMORY_A] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
describe('isPluginUnavailableError', () => {
it('matches the literal env var contract from the server handler', () => {
expect(
document.querySelector('[data-testid="similarity-badge"]')
).toBeNull();
isPluginUnavailableError(
new Error('API GET /workspaces/x/v2/memories: 503 {"error":"memory plugin is not configured (set MEMORY_PLUGIN_URL)"}'),
),
).toBe(true);
});
it("clear button resets query immediately and re-fetches without ?q=", async () => {
vi.useFakeTimers();
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
it('does not false-match on generic 503 errors that don\'t mention the env var', () => {
expect(isPluginUnavailableError(new Error('API GET /foo: 503 something else'))).toBe(false);
});
fireEvent.change(screen.getByLabelText("Search memories"), {
target: { value: "deploy" },
it('does not false-match on plain 4xx errors', () => {
expect(isPluginUnavailableError(new Error('API GET /foo: 401 unauthorized'))).toBe(false);
});
it('returns false for non-Error inputs', () => {
expect(isPluginUnavailableError(null)).toBe(false);
expect(isPluginUnavailableError(undefined)).toBe(false);
expect(isPluginUnavailableError('a string')).toBe(false);
expect(isPluginUnavailableError({ message: 'MEMORY_PLUGIN_URL' })).toBe(false);
});
});
// ── formatTTL helper ─────────────────────────────────────────────────────────
describe('formatTTL', () => {
it('returns empty string for null/undefined/empty', () => {
expect(formatTTL(null)).toBe('');
expect(formatTTL(undefined)).toBe('');
expect(formatTTL('')).toBe('');
});
it('returns empty for invalid date strings', () => {
expect(formatTTL('not-a-date')).toBe('');
});
it('returns "expired" for past timestamps', () => {
const past = new Date(Date.now() - 5000).toISOString();
expect(formatTTL(past)).toBe('expired');
});
it('formats <60s as seconds', () => {
const future = new Date(Date.now() + 30_000).toISOString();
expect(formatTTL(future)).toMatch(/^\d{1,2}s$/);
});
it('formats <60m as minutes', () => {
const future = new Date(Date.now() + 30 * 60_000).toISOString();
expect(formatTTL(future)).toMatch(/^\d{1,2}m$/);
});
it('formats <24h as hours', () => {
const future = new Date(Date.now() + 5 * 3_600_000).toISOString();
expect(formatTTL(future)).toMatch(/^\d{1,2}h$/);
});
it('formats >24h as days', () => {
const future = new Date(Date.now() + 3 * 86_400_000).toISOString();
expect(formatTTL(future)).toMatch(/^\d{1,2}d$/);
});
});
// ── Initial load + dropdown ─────────────────────────────────────────────────
describe('MemoryInspectorPanel — initial load', () => {
it('fetches namespaces and memories on mount', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => {
const calls = mockGet.mock.calls.map((c) => c[0]);
expect(calls.some((u) => u.includes('/v2/namespaces'))).toBe(true);
expect(calls.some((u) => u.includes('/v2/memories'))).toBe(true);
});
});
it('renders the row contents from the memories response', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => {
expect(screen.getByText(/Remember the standup is at 10am/)).toBeTruthy();
});
});
it('populates the namespace dropdown with readable entries + "All namespaces"', async () => {
stubFetch([]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Filter by namespace'));
const select = screen.getByLabelText('Filter by namespace') as HTMLSelectElement;
const optionLabels = Array.from(select.options).map((o) => o.textContent ?? '');
expect(optionLabels[0]).toContain('All namespaces');
expect(optionLabels.join('|')).toContain('Workspace (ws-1)');
expect(optionLabels.join('|')).toContain('Team (t-1)');
});
it('selecting a namespace re-fetches with ?namespace=', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Filter by namespace'));
const select = screen.getByLabelText('Filter by namespace') as HTMLSelectElement;
fireEvent.change(select, { target: { value: 'team:t-1' } });
await waitFor(() => {
const calls = mockGet.mock.calls.map((c) => c[0] as string);
expect(calls.some((u) => u.includes('namespace=team%3At-1'))).toBe(true);
});
});
});
// ── Plugin unavailable (503) ────────────────────────────────────────────────
describe('MemoryInspectorPanel — plugin unavailable', () => {
it('renders the operator-hint banner and disables search input', async () => {
mockGet.mockRejectedValue(new Error('HTTP 503: memory plugin is not configured (set MEMORY_PLUGIN_URL)'));
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByTestId('plugin-unavailable-banner'));
const searchInput = screen.getByLabelText('Search memories') as HTMLInputElement;
expect(searchInput.disabled).toBe(true);
});
it('shows the empty-state explaining plugin disabled', async () => {
mockGet.mockRejectedValue(new Error('API GET /workspaces/x/v2/memories: 503 {"error":"memory plugin is not configured (set MEMORY_PLUGIN_URL)"}'));
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByText(/Memory plugin disabled/i));
});
});
// ── Generic error (non-503) ─────────────────────────────────────────────────
describe('MemoryInspectorPanel — generic errors', () => {
it('surfaces a non-503 error in the error banner', async () => {
mockGet.mockImplementation(((url: string) => {
if (url.includes('/v2/namespaces')) {
return Promise.resolve(NS_RESPONSE);
}
return Promise.reject(new Error('upstream timeout'));
}) as typeof api.get);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => {
// Error banner has role=alert
const alerts = screen.getAllByRole('alert');
const found = alerts.some((a) => a.textContent?.includes('upstream timeout'));
expect(found).toBe(true);
});
});
});
// ── Search ──────────────────────────────────────────────────────────────────
describe('MemoryInspectorPanel — search', () => {
it('eventually fires query with ?q= after debounce', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Search memories'));
fireEvent.change(screen.getByLabelText('Search memories'), {
target: { value: 'standup' },
});
act(() => { vi.advanceTimersByTime(350); });
await flushUpdates();
expect(mockGet).toHaveBeenCalledWith(
"/workspaces/ws-1/memories?scope=LOCAL&q=deploy"
await waitFor(
() => {
const calls = mockGet.mock.calls.map((c) => c[0] as string);
expect(calls.some((u) => u.includes('q=standup'))).toBe(true);
},
{ timeout: 1500 },
);
mockGet.mockClear();
});
fireEvent.click(screen.getByRole("button", { name: "Clear search" }));
await flushUpdates();
it('sorts results by score descending when query active', async () => {
const lowScore: MemoryV2 = { ...MEM_BASIC, id: 'low', score: 0.2, content: 'low' };
const highScore: MemoryV2 = { ...MEM_BASIC, id: 'high', score: 0.95, content: 'high' };
// Plugin returns in arbitrary order; component sorts.
mockGet.mockImplementation(((url: string) => {
if (url.includes('/v2/namespaces')) return Promise.resolve(NS_RESPONSE);
return Promise.resolve({ memories: [lowScore, highScore] });
}) as typeof api.get);
expect(mockGet).toHaveBeenCalledWith(
"/workspaces/ws-1/memories?scope=LOCAL"
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Search memories'));
fireEvent.change(screen.getByLabelText('Search memories'), {
target: { value: 'something' },
});
await waitFor(
() => {
const rows = screen.getAllByTestId(/^memory-row-/);
// First row should be the high-score one
expect(rows[0].getAttribute('data-testid')).toBe('memory-row-high');
},
{ timeout: 1500 },
);
});
it('clear-button resets the query', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Search memories'));
fireEvent.change(screen.getByLabelText('Search memories'), {
target: { value: 'foo' },
});
fireEvent.click(screen.getByLabelText('Clear search'));
expect((screen.getByLabelText('Search memories') as HTMLInputElement).value).toBe('');
});
it('renders no-results empty-state when search has no matches', async () => {
stubFetch([]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Search memories'));
fireEvent.change(screen.getByLabelText('Search memories'), {
target: { value: 'nothing' },
});
await waitFor(
() => {
expect(screen.getByText(/No memories match your search/i)).toBeTruthy();
},
{ timeout: 1500 },
);
});
});
// ── Per-row badges ───────────────────────────────────────────────────────────
describe('MemoryInspectorPanel — row badges', () => {
it('renders kind, source, pin, TTL badges per shape', async () => {
stubFetch([MEM_PINNED, MEM_RUNTIME_CHECKPOINT]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => {
// Pinned memory: kind=summary, source=user, pin=true, TTL>0
const pinnedRow = screen.getByTestId('memory-row-mem-pinned');
expect(pinnedRow.querySelector('[data-testid="kind-badge"]')?.textContent).toBe('S');
expect(pinnedRow.querySelector('[data-testid="source-badge"]')?.textContent).toBe('user');
expect(pinnedRow.querySelector('[data-testid="pin-badge"]')).toBeTruthy();
expect(pinnedRow.querySelector('[data-testid="ttl-badge"]')?.textContent).toMatch(/^⌛\d+[hd]$/);
// Checkpoint memory: kind=checkpoint, source=runtime, no pin, no TTL
const propRow = screen.getByTestId('memory-row-mem-checkpoint');
expect(propRow.querySelector('[data-testid="kind-badge"]')?.textContent).toBe('C');
expect(propRow.querySelector('[data-testid="source-badge"]')?.textContent).toBe('runtime');
expect(propRow.querySelector('[data-testid="pin-badge"]')).toBeNull();
expect(propRow.querySelector('[data-testid="ttl-badge"]')).toBeNull();
});
});
it('TTL badge shows "expired" for past expires_at', async () => {
stubFetch([MEM_EXPIRED]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => {
const row = screen.getByTestId('memory-row-mem-expired');
expect(row.querySelector('[data-testid="ttl-badge"]')?.textContent).toBe('⌛expired');
});
});
it('expanding a row shows full content + Forget button', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByTestId('memory-row-mem-a'));
const row = screen.getByTestId('memory-row-mem-a');
const headerButton = row.querySelector('button');
expect(headerButton).toBeTruthy();
fireEvent.click(headerButton!);
await waitFor(() => {
expect(screen.getByLabelText('Forget memory')).toBeTruthy();
});
});
});
// ── Delete (Forget) flow ──────────────────────────────────────────────────────
describe('MemoryInspectorPanel — forget flow', () => {
it('opens the confirm dialog on Forget click and removes optimistically on confirm', async () => {
stubFetch([MEM_BASIC]);
mockDel.mockResolvedValue({ status: 'deleted' });
render(<MemoryInspectorPanel workspaceId="ws-1" />);
// Expand row, click Forget
await waitFor(() => screen.getByTestId('memory-row-mem-a'));
const row = screen.getByTestId('memory-row-mem-a');
fireEvent.click(row.querySelector('button')!);
await waitFor(() => screen.getByLabelText('Forget memory'));
fireEvent.click(screen.getByLabelText('Forget memory'));
// Dialog appears with v2-shaped copy (Forget, not Delete)
expect(screen.getByTestId('dialog-title').textContent).toBe('Forget memory');
fireEvent.click(screen.getByText('Confirm'));
// Optimistic removal happens immediately
await waitFor(() => {
expect(screen.queryByTestId('memory-row-mem-a')).toBeNull();
});
// DELETE called with the right path
await waitFor(() => {
const delPaths = mockDel.mock.calls.map((c) => c[0] as string);
expect(delPaths.some((p) => p.includes('/v2/memories/mem-a'))).toBe(true);
});
});
it('cancelling the dialog leaves the row in place', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByTestId('memory-row-mem-a'));
fireEvent.click(screen.getByTestId('memory-row-mem-a').querySelector('button')!);
await waitFor(() => screen.getByLabelText('Forget memory'));
fireEvent.click(screen.getByLabelText('Forget memory'));
fireEvent.click(screen.getByText('Cancel'));
expect(screen.queryByTestId('memory-row-mem-a')).toBeTruthy();
expect(mockDel).not.toHaveBeenCalled();
});
it('rolls back on server failure by reloading entries', async () => {
stubFetch([MEM_BASIC]);
mockDel.mockRejectedValue(new Error('upstream 502'));
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByTestId('memory-row-mem-a'));
fireEvent.click(screen.getByTestId('memory-row-mem-a').querySelector('button')!);
await waitFor(() => screen.getByLabelText('Forget memory'));
fireEvent.click(screen.getByLabelText('Forget memory'));
fireEvent.click(screen.getByText('Confirm'));
// After failure, error banner surfaces + reload re-fetches memories
await waitFor(() => {
const alerts = screen.getAllByRole('alert');
const found = alerts.some((a) => a.textContent?.includes('upstream 502'));
expect(found).toBe(true);
});
});
});
// ── Empty state when no memories at all ────────────────────────────────────
describe('MemoryInspectorPanel — empty state', () => {
it('renders the "no memories yet" empty state when not searching', async () => {
stubFetch([]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => {
expect(screen.getByText('No memories yet')).toBeTruthy();
});
});
});
// ── Refresh ─────────────────────────────────────────────────────────────────
describe('MemoryInspectorPanel — refresh', () => {
it('Refresh button refetches memories', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Refresh memories'));
const before = mockGet.mock.calls.filter((c) =>
(c[0] as string).includes('/v2/memories'),
).length;
fireEvent.click(screen.getByLabelText('Refresh memories'));
await waitFor(() => {
const after = mockGet.mock.calls.filter((c) =>
(c[0] as string).includes('/v2/memories'),
).length;
expect(after).toBe(before + 1);
});
});
});
+109 -5
View File
@@ -7,8 +7,9 @@ import { api } from "@/lib/api";
import { useCanvasStore, type WorkspaceNodeData } from "@/store/canvas";
import { useSocketEvent } from "@/hooks/useSocketEvent";
import { type ChatMessage, type ChatAttachment, createMessage, appendMessageDeduped } from "./chat/types";
import { uploadChatFiles, downloadChatFile } from "./chat/uploads";
import { AttachmentChip, PendingAttachmentPill } from "./chat/AttachmentViews";
import { uploadChatFiles, downloadChatFile, isPlatformAttachment } from "./chat/uploads";
import { PendingAttachmentPill } from "./chat/AttachmentViews";
import { AttachmentPreview } from "./chat/AttachmentPreview";
import { extractFilesFromTask } from "./chat/message-parser";
import { AgentCommsPanel } from "./chat/AgentCommsPanel";
import { appendActivityLine } from "./chat/activityLog";
@@ -286,6 +287,14 @@ function MyChatPanel({ workspaceId, data }: Props) {
const [error, setError] = useState<string | null>(null);
const [confirmRestart, setConfirmRestart] = useState(false);
const bottomRef = useRef<HTMLDivElement>(null);
// First-mount scroll-to-bottom needs `behavior: "instant"` — long
// conversations smooth-animate for ~300ms which any concurrent
// re-render can interrupt, leaving the user stuck mid-conversation
// when the chat tab opens. Subsequent appends (new agent messages)
// keep `smooth` for the visual "landing" feel. Flipped the first
// time messages.length goes positive, so a workspace switch (which
// remounts ChatTab) gets a fresh instant jump too.
const hasInitialScrollRef = useRef(false);
// Lazy-load older history on scroll-up.
// - containerRef = the scrollable messages viewport
// - topRef = sentinel above the messages list; IO observes it
@@ -545,6 +554,15 @@ function MyChatPanel({ workspaceId, data }: Props) {
scrollAnchorRef.current = null;
return;
}
// Instant on first arrival of messages — smooth-scroll on a long
// conversation gets interrupted by concurrent renders and leaves
// the user stuck in the middle. After the first jump, subsequent
// appends animate as before.
if (!hasInitialScrollRef.current && messages.length > 0) {
hasInitialScrollRef.current = true;
bottomRef.current?.scrollIntoView({ behavior: "instant" as ScrollBehavior });
return;
}
bottomRef.current?.scrollIntoView({ behavior: "smooth" });
}, [messages]);
@@ -1044,14 +1062,85 @@ function MyChatPanel({ workspaceId, data }: Props) {
: "dark:prose-invert dark:[--tw-prose-invert-body:theme(colors.zinc.100)] dark:[--tw-prose-invert-headings:theme(colors.white)] dark:[--tw-prose-invert-bold:theme(colors.white)] dark:[--tw-prose-invert-code:theme(colors.zinc.100)]"
}`}
>
<ReactMarkdown remarkPlugins={[remarkGfm]}>{msg.content}</ReactMarkdown>
<ReactMarkdown
remarkPlugins={[remarkGfm]}
components={{
// Default ReactMarkdown renders `<a href="...">`
// with no target and no scheme handling, so:
//
// 1. http/https links navigate the canvas tab
// itself away — user loses canvas state.
// 2. workspace://, file://, and bare /workspace/
// paths from agent-authored markdown produce
// an unhandled-protocol click → browser ends
// up at about:blank with no download (the
// reported bug from 2026-05-05).
//
// Override: external URLs open in a new tab with
// rel="noopener noreferrer"; in-container paths
// route through downloadChatFile so the browser
// gets a real Blob with proper auth headers.
a: ({ href, children, ...rest }) => {
const url = String(href ?? "");
// Use the SSOT helper isPlatformAttachment so
// the markdown link override and the chip
// download path agree on which schemes need
// auth-routed download. Pre-fix this list was
// duplicated and missed `platform-pending:`,
// producing about:blank for poll-mode uploads.
if (isPlatformAttachment(url)) {
return (
<a
href={url}
{...rest}
onClick={(e) => {
e.preventDefault();
// Construct a synthetic ChatAttachment
// and route through the same
// authenticated download path the
// download chips use. Filename is the
// last path segment so Save-As prefills
// sensibly.
const name = url.split(/[\\/]/).pop() || "download";
downloadChatFile(workspaceId, {
uri: url,
name,
}).catch((err) => {
setError(
err instanceof Error
? `Download failed: ${err.message}`
: "Download failed",
);
});
}}
>
{children}
</a>
);
}
// External (http(s) / mailto / unknown scheme):
// open in new tab so canvas state survives.
return (
<a
href={url}
target="_blank"
rel="noopener noreferrer"
{...rest}
>
{children}
</a>
);
},
}}
>{msg.content}</ReactMarkdown>
</div>
)}
{msg.attachments && msg.attachments.length > 0 && (
<div className={`flex flex-wrap gap-1 ${msg.content ? "mt-1.5" : ""}`}>
{msg.attachments.map((att, i) => (
<AttachmentChip
<AttachmentPreview
key={`${msg.id}-${i}`}
workspaceId={workspaceId}
attachment={att}
onDownload={downloadAttachment}
tone={msg.role === "user" ? "user" : "agent"}
@@ -1150,7 +1239,22 @@ function MyChatPanel({ workspaceId, data }: Props) {
value={input}
onChange={(e) => setInput(e.target.value)}
onKeyDown={(e) => {
if (e.key === "Enter" && !e.shiftKey) {
// IME-safe send: while a CJK / Japanese / Korean IME is
// composing, Enter accepts the candidate selection — not a
// newline, not a send. `e.nativeEvent.isComposing` is the
// standard signal (modern WebKit/Blink/Gecko); the keyCode
// 229 fallback covers older Safari / WebKit-based mobile
// browsers that delay setting isComposing on the
// composition-end Enter. Reported 2026-05-05: typing
// Chinese with the system IME, pressing Enter to commit
// a candidate would inadvertently send the half-typed
// message.
if (
e.key === "Enter" &&
!e.shiftKey &&
!e.nativeEvent.isComposing &&
e.keyCode !== 229
) {
e.preventDefault();
sendMessage();
}
+21
View File
@@ -262,6 +262,27 @@ export function ConfigTab({ workspaceId }: Props) {
setOriginalProvider("");
}
// Skip the config.yaml fetch entirely for runtimes that manage
// their own config (external, hermes, etc.) — they don't have a
// platform-side template, so the GET would 404. The catch block
// below handles 404 gracefully, but issuing the request adds
// browser-console noise + a wasted RTT on every open of the
// Config tab for the affected workspaces. Reported on
// production reno-stars 2026-05-05 (workspace runtime=external,
// 404 on /files/config.yaml visible in the console even though
// the form rendered correctly).
if (RUNTIMES_WITH_OWN_CONFIG.has(wsMetadataRuntime)) {
setConfig({
...DEFAULT_CONFIG,
runtime: wsMetadataRuntime,
model: wsMetadataModel,
...(wsMetadataModel ? { runtime_config: { model: wsMetadataModel } } : {}),
...(wsMetadataTier !== null ? { tier: wsMetadataTier } : {}),
} as ConfigData);
setOriginalModel(wsMetadataModel);
setLoading(false);
return;
}
try {
const res = await api.get<{ content: string }>(`/workspaces/${workspaceId}/files/config.yaml`);
const parsed = parseYaml(res.content);
+113 -4
View File
@@ -2,9 +2,11 @@
import { useState, useEffect, useRef, useMemo } from "react";
import { showToast } from "../Toaster";
import type { WorkspaceNodeData } from "@/store/canvas";
import { FilesToolbar } from "./FilesTab/FilesToolbar";
import { FileTree } from "./FilesTab/FileTree";
import { FileEditor } from "./FilesTab/FileEditor";
import { NotAvailablePanel } from "./FilesTab/NotAvailablePanel";
import { useFilesApi } from "./FilesTab/useFilesApi";
import { buildTree } from "./FilesTab/tree";
@@ -14,9 +16,40 @@ export type { TreeNode } from "./FilesTab/tree";
interface Props {
workspaceId: string;
/** Workspace metadata from the canvas store. Optional for back-compat
* with any caller that still mounts <FilesTab workspaceId=.../> without
* threading data through (legacy tests). When present, runtime gates
* the early-return below. Mirrors TerminalTab's prop shape (#2830). */
data?: WorkspaceNodeData;
}
export function FilesTab({ workspaceId }: Props) {
/** Runtimes whose filesystem the platform doesn't own. The canvas can't
* list/read/write files on these — the agent runs on the user's own
* hardware (mac laptop, mac mini, hermes-on-home-server) and reaches
* the platform via the heartbeat-based polling Phase 30 layer.
*
* Keep narrow — only add a runtime here when its provisioner genuinely
* has no platform-owned filesystem. Otherwise the user loses access to
* a real surface (e.g. claude-code SaaS workspaces have files served
* by ListFiles via EIC; they belong on the rendering path, not here). */
const RUNTIMES_WITHOUT_FILES = new Set(["external"]);
export function FilesTab({ workspaceId, data }: Props) {
// Early-return for runtimes whose filesystem is not platform-owned.
// Skips the whole useFilesApi hook + tree render below — without this,
// mounting the tab for an external workspace would issue a GET that
// the platform can technically answer (it reads its own DB row, not
// the user's machine), but every result row is fictional. Showing
// "0 files / No config files yet" reads as a bug. The placeholder
// makes the absence intentional and points the user at the right
// surface (Chat).
if (data && RUNTIMES_WITHOUT_FILES.has(data.runtime)) {
return <NotAvailablePanel runtime={data.runtime} />;
}
return <PlatformOwnedFilesTab workspaceId={workspaceId} />;
}
function PlatformOwnedFilesTab({ workspaceId }: { workspaceId: string }) {
const [root, setRoot] = useState("/configs");
const [selectedFile, setSelectedFile] = useState<string | null>(null);
const [fileContent, setFileContent] = useState("");
@@ -45,11 +78,36 @@ export function FilesTab({ workspaceId }: Props) {
readFile,
writeFile,
deleteFile,
downloadFileByPath,
downloadAllFiles,
uploadFiles,
uploadDataTransferItems,
deleteAllFiles,
} = useFilesApi(workspaceId, root);
// PR-D: track whether the user is currently dragging files OVER
// the root area (not over a specific subdir row). Used to show
// the "Drop to upload to root" highlight on the tree column.
const [rootDragHover, setRootDragHover] = useState(false);
const handleDropToTarget = (
targetDir: string,
items: DataTransferItemList,
) => {
// canDelete is the gate proxy — same constraint as the toolbar
// Upload button (today only /configs is writable from the canvas
// surface). Without this check, dropping on /home would post
// through /workspaces/<id>/files/<path>, which the backend would
// reject only after an HTTP round-trip. Fail fast.
if (root !== "/configs") {
setError(
`Upload only allowed in /configs (current root: ${root}). Switch root or use Upload button.`,
);
return;
}
void uploadDataTransferItems(items, targetDir);
};
const tree = useMemo(() => buildTree(files), [files]);
const openFile = async (path: string) => {
@@ -190,8 +248,46 @@ export function FilesTab({ workspaceId }: Props) {
)}
<div className="flex flex-1 min-h-0">
{/* File tree */}
<div className="w-[180px] border-r border-line/40 overflow-y-auto shrink-0">
{/* File tree column. PR-D: outer div is the drop zone for
"drop on root" — when the user drags into the column area
(not over a specific subdir row), the drop targets the
current root directory. Subdirectory rows in <FileTree>
stop propagation on their own drop event so a drop on
/configs/skills doesn't ALSO fire root-area drop. */}
<div
className={`w-[180px] border-r border-line/40 overflow-y-auto shrink-0 transition-colors ${
rootDragHover ? "bg-accent/10 outline outline-1 outline-accent/40 -outline-offset-2" : ""
}`}
onDragOver={(e) => {
// Only highlight + accept the drop when uploads are
// actually allowed for the current root. Without this
// check the user gets a misleading drag affordance,
// drops, then sees the toolbar's "switch root" toast —
// bad UX.
if (root !== "/configs") return;
e.preventDefault();
e.dataTransfer.dropEffect = "copy";
}}
onDragEnter={(e) => {
if (root !== "/configs") return;
e.preventDefault();
setRootDragHover(true);
}}
onDragLeave={(e) => {
const next = e.relatedTarget as Node | null;
if (!next || !(e.currentTarget as HTMLElement).contains(next)) {
setRootDragHover(false);
}
}}
onDrop={(e) => {
if (root !== "/configs") return;
e.preventDefault();
setRootDragHover(false);
if (e.dataTransfer.items?.length) {
handleDropToTarget("", e.dataTransfer.items);
}
}}
>
{/* New file input */}
{showNewFile && (
<div className="px-2 py-1 border-b border-line/40">
@@ -209,14 +305,27 @@ export function FilesTab({ workspaceId }: Props) {
{files.length === 0 ? (
<div className="px-3 py-4 text-[10px] text-ink-soft text-center">
No config files yet
{rootDragHover
? "Drop to upload to root"
: root === "/configs"
? "No config files yet — drag files here to upload"
: "No config files yet"}
</div>
) : (
<FileTree
nodes={tree}
selectedPath={selectedFile}
onSelect={openFile}
// Delete is currently gated to /configs to match the
// toolbar's New / Upload / Clear affordances. Context
// menu and inline ✕ both honour the gate. PR-A made the
// backend EIC delete work on all roots — keeping the
// canvas gate conservative until we want to expose
// /home /workspace deletion intentionally.
onDelete={root === "/configs" ? setConfirmDelete : () => {}}
onDownload={downloadFileByPath}
canDelete={root === "/configs"}
onDropToTarget={handleDropToTarget}
expandedDirs={expandedDirs}
onToggleDir={toggleDir}
loadingDir={loadingDir}
@@ -1,41 +1,129 @@
"use client";
import { useState } from "react";
import { type TreeNode, getIcon } from "./tree";
import { FileTreeContextMenu, type MenuItem } from "./FileTreeContextMenu";
interface TreeCallbacks {
selectedPath: string | null;
onSelect: (path: string) => void;
onDelete: (path: string) => void;
/** PR-C: right-click → Download. Files only — directories ignore. */
onDownload: (path: string) => void;
/** Whether the active root permits delete. Wire into the Delete
* context-menu item's `disabled` flag so the user gets the same
* affordance as the toolbar (which gates Clear/New on /configs). */
canDelete: boolean;
/** PR-D: drop files/folders from the OS onto this row. targetDir
* is the directory path (relative to the active root) under which
* the dropped contents should land; "" means root. */
onDropToTarget?: (targetDir: string, items: DataTransferItemList) => void;
expandedDirs: Set<string>;
onToggleDir: (path: string) => void;
loadingDir: string | null;
}
/**
* FileTree renders the workspace tree + owns the right-click context
* menu (PR-C) and the drop-target hover state (PR-D). Lifting the
* menu state here (vs each row) means only one menu open at a time —
* opening a new row's menu auto-closes the prior one. Same UX as
* VSCode / Theia.
*/
export function FileTree({
nodes,
selectedPath,
onSelect,
onDelete,
onDownload,
canDelete,
onDropToTarget,
expandedDirs,
onToggleDir,
loadingDir,
depth = 0,
}: TreeCallbacks & { nodes: TreeNode[]; depth?: number }) {
const [menu, setMenu] = useState<{
x: number;
y: number;
items: MenuItem[];
} | null>(null);
// PR-D: hover-target highlight state for drag-drop. Lifted next to
// the menu state so both shared-across-rows interactions live in
// one place.
const [hoverDir, setHoverDir] = useState<string | null>(null);
const openContextMenu = (e: React.MouseEvent, node: TreeNode) => {
e.preventDefault();
// Items composed per-row so the available actions reflect the
// node type (files get Open + Download; directories get Delete
// only since "open a directory in the editor" doesn't apply
// and "Export folder" is the toolbar's job).
const items: MenuItem[] = [];
if (!node.isDir) {
items.push({
id: "open",
label: "Open",
icon: "⤴",
onClick: () => onSelect(node.path),
});
items.push({
id: "download",
label: "Download",
icon: "↓",
onClick: () => onDownload(node.path),
});
}
items.push({
id: "delete",
label: "Delete",
icon: "✕",
destructive: true,
disabled: !canDelete,
onClick: () => onDelete(node.path),
});
setMenu({ x: e.clientX, y: e.clientY, items });
};
// Single state lifted to the top-level tree; nested <FileTree>s
// (rendered for expanded directories below) do NOT instantiate
// their own menus or drop-targets — they call back via prop
// drilling. This keeps "only one menu open" + "only one drop
// target highlighted" as structural invariants rather than
// render-order coincidences.
const childCallbacks: TreeCallbacks = {
selectedPath,
onSelect,
onDelete,
onDownload,
canDelete,
onDropToTarget,
expandedDirs,
onToggleDir,
loadingDir,
};
return (
<div>
{nodes.map((node) => (
<TreeItem
key={`${node.path}:${node.isDir ? "dir" : "file"}`}
node={node}
selectedPath={selectedPath}
onSelect={onSelect}
onDelete={onDelete}
expandedDirs={expandedDirs}
onToggleDir={onToggleDir}
loadingDir={loadingDir}
openContextMenu={openContextMenu}
hoverDir={hoverDir}
setHoverDir={setHoverDir}
depth={depth}
{...childCallbacks}
/>
))}
{menu && (
<FileTreeContextMenu
x={menu.x}
y={menu.y}
items={menu.items}
onClose={() => setMenu(null)}
/>
)}
</div>
);
}
@@ -45,22 +133,81 @@ function TreeItem({
selectedPath,
onSelect,
onDelete,
onDownload,
canDelete,
onDropToTarget,
expandedDirs,
onToggleDir,
loadingDir,
depth,
}: TreeCallbacks & { node: TreeNode; depth: number }) {
openContextMenu,
hoverDir,
setHoverDir,
}: TreeCallbacks & {
node: TreeNode;
depth: number;
openContextMenu: (e: React.MouseEvent, node: TreeNode) => void;
hoverDir: string | null;
setHoverDir: (p: string | null) => void;
}) {
const isSelected = selectedPath === node.path;
const expanded = expandedDirs.has(node.path);
const isLoading = loadingDir === node.path;
const isDropTarget = node.isDir && hoverDir === node.path;
// PR-D drag handlers — only directory rows are valid drop targets
// (dropping a file ON another file is ambiguous; treat it as
// dropping in the parent dir, which the root area handles). When a
// drag enters a directory row, mark it the hover target. When the
// cursor leaves to a non-child element, clear it. drop fires the
// upload callback with the row's path.
const dragProps = node.isDir && onDropToTarget
? {
onDragOver: (e: React.DragEvent) => {
// preventDefault is REQUIRED to opt this element into the
// drop target list — without it, browsers refuse to fire
// the drop event regardless of the drop handler.
e.preventDefault();
e.dataTransfer.dropEffect = "copy";
},
onDragEnter: (e: React.DragEvent) => {
e.preventDefault();
setHoverDir(node.path);
},
onDragLeave: (e: React.DragEvent) => {
// Only clear hover when leaving to an element OUTSIDE this
// row — bare leave-events fire for every child crossed
// (the icon, the label, the ✕ button). Without the
// contains() check the highlight flickers.
const next = e.relatedTarget as Node | null;
if (!next || !(e.currentTarget as HTMLElement).contains(next)) {
setHoverDir(null);
}
},
onDrop: (e: React.DragEvent) => {
e.preventDefault();
e.stopPropagation();
setHoverDir(null);
if (e.dataTransfer.items?.length) {
onDropToTarget(node.path, e.dataTransfer.items);
}
},
}
: {};
if (node.isDir) {
return (
<div>
<div
className="group w-full flex items-center gap-1 px-2 py-0.5 text-left hover:bg-surface-card/40 transition-colors cursor-pointer"
className={`group w-full flex items-center gap-1 px-2 py-0.5 text-left transition-colors cursor-pointer ${
isDropTarget
? "bg-accent/20 outline outline-1 outline-accent/60"
: "hover:bg-surface-card/40"
}`}
style={{ paddingLeft: `${depth * 12 + 8}px` }}
onClick={() => onToggleDir(node.path)}
onContextMenu={(e) => openContextMenu(e, node)}
{...dragProps}
>
<span className="text-[9px] text-ink-soft w-3">{isLoading ? "…" : expanded ? "▼" : "▶"}</span>
<span className="text-[10px]">📁</span>
@@ -82,6 +229,9 @@ function TreeItem({
selectedPath={selectedPath}
onSelect={onSelect}
onDelete={onDelete}
onDownload={onDownload}
canDelete={canDelete}
onDropToTarget={onDropToTarget}
expandedDirs={expandedDirs}
onToggleDir={onToggleDir}
loadingDir={loadingDir}
@@ -99,6 +249,7 @@ function TreeItem({
}`}
style={{ paddingLeft: `${depth * 12 + 20}px` }}
onClick={() => onSelect(node.path)}
onContextMenu={(e) => openContextMenu(e, node)}
>
<span className="text-[9px]">{getIcon(node.name, false)}</span>
<span className="text-[10px] flex-1 truncate font-mono">{node.name}</span>
@@ -0,0 +1,141 @@
"use client";
import { useEffect, useRef } from "react";
/**
* FileTreeContextMenu — VSCode-style right-click menu for a single
* file-tree row. Pops at the cursor's viewport coords; dismisses on
* outside-click, Esc, blur, or scroll.
*
* Why a custom component (no library): the menu is one of several
* "small popovers" in canvas; pulling in a dnd / popover lib for one
* surface adds 10x the bytes of this implementation. The patterns
* (outside-click + Esc + portal-free fixed position) match the
* ContextMenu used in canvas/Toolbar so the keyboard-nav muscle
* memory is uniform.
*
* Items are rendered from a `MenuItem[]` so callers can add/remove
* actions without touching this component (e.g. PR-D will add an
* "Upload to this folder" item for directory rows).
*
* Accessibility:
* - role="menu" + role="menuitem" so screen readers announce the
* surface as a menu, not a generic div.
* - First item gets autofocus so keyboard users can ↓/↑/Enter without
* reaching for the mouse.
* - Esc + outside-click + Tab dismisses; behaves like every other
* menu the user has touched on the canvas.
*/
export interface MenuItem {
/** Stable identifier for testing + analytics. */
id: string;
label: string;
/** Optional left icon glyph; not load-bearing. */
icon?: string;
/** Destructive (rendered in red) — for Delete-class actions. */
destructive?: boolean;
/** Item-specific click handler. The menu auto-closes after onClick
* fires so handlers don't have to call onClose themselves. */
onClick: () => void;
/** Disabled items render but don't fire onClick (useful for
* Delete-on-non-/configs case where the caller wants to surface
* the item but explain it's gated). Currently unused — placeholder
* for future options. */
disabled?: boolean;
}
interface Props {
/** Viewport-coordinate position of the cursor that opened the menu. */
x: number;
y: number;
items: MenuItem[];
onClose: () => void;
}
export function FileTreeContextMenu({ x, y, items, onClose }: Props) {
const ref = useRef<HTMLDivElement>(null);
// First item gets initial focus for keyboard ↓/↑/Enter nav.
const firstItemRef = useRef<HTMLButtonElement>(null);
useEffect(() => {
firstItemRef.current?.focus();
}, []);
// Outside-click + Esc dismiss. Per memory
// (feedback_abort_controller_for_rerendered_listeners), use an
// AbortController so re-mounts (caller toggles the menu) don't leak
// listeners.
useEffect(() => {
const ctrl = new AbortController();
const onPointerDown = (e: MouseEvent) => {
if (ref.current && !ref.current.contains(e.target as Node)) onClose();
};
const onKeyDown = (e: KeyboardEvent) => {
if (e.key === "Escape") {
e.preventDefault();
onClose();
} else if (e.key === "ArrowDown" || e.key === "ArrowUp") {
// Roving focus across .menuitem buttons. Doing this with
// tabindex management because Tab / Shift+Tab leave the menu
// (which is the right thing — the user is escaping the menu).
e.preventDefault();
const buttons = ref.current?.querySelectorAll<HTMLButtonElement>(
"[role='menuitem']:not([disabled])",
);
if (!buttons || buttons.length === 0) return;
const arr = Array.from(buttons);
const cur = arr.indexOf(document.activeElement as HTMLButtonElement);
const next =
e.key === "ArrowDown"
? (cur + 1) % arr.length
: (cur - 1 + arr.length) % arr.length;
arr[next].focus();
}
};
// `mousedown` (not `click`) so the menu dismisses BEFORE the
// tree-row's click handler would fire — otherwise clicking
// outside also selects a different row, which is not what the
// user expected when "outside-click closes the menu".
document.addEventListener("mousedown", onPointerDown, { signal: ctrl.signal });
document.addEventListener("keydown", onKeyDown, { signal: ctrl.signal });
// Scroll inside any ancestor also dismisses — the fixed-position
// menu would otherwise stay anchored to viewport coords while the
// row it points at scrolled away. Use capture so we catch scroll
// on inner panels (FileTree's overflow-y-auto wrapper).
document.addEventListener("scroll", onClose, { signal: ctrl.signal, capture: true });
return () => ctrl.abort();
}, [onClose]);
return (
<div
ref={ref}
role="menu"
aria-label="File actions"
className="fixed z-[1000] min-w-[140px] py-1 bg-surface-elevated border border-line/60 rounded-md shadow-xl shadow-black/30 text-[11px]"
style={{ left: x, top: y }}
>
{items.map((item, i) => (
<button
key={item.id}
ref={i === 0 ? firstItemRef : undefined}
type="button"
role="menuitem"
disabled={item.disabled}
onClick={() => {
if (item.disabled) return;
item.onClick();
onClose();
}}
className={
item.destructive
? "w-full text-left px-3 py-1 text-bad hover:bg-red-900/30 focus:bg-red-900/30 focus:outline-none disabled:opacity-40 disabled:pointer-events-none transition-colors"
: "w-full text-left px-3 py-1 text-ink-mid hover:bg-surface-card hover:text-ink focus:bg-surface-card focus:text-ink focus:outline-none disabled:opacity-40 disabled:pointer-events-none transition-colors"
}
>
{item.icon && <span className="inline-block w-4 mr-1.5 text-ink-soft">{item.icon}</span>}
{item.label}
</button>
))}
</div>
);
}
@@ -0,0 +1,58 @@
"use client";
/**
* NotAvailablePanel — full-tab placeholder for runtimes whose filesystem
* the platform doesn't own (today: runtime === "external").
*
* Pre-fix the FilesTab tried to GET /workspaces/<id>/files for these
* workspaces. The platform answered with [] (no rows in workspace_files
* for an external workspace by definition), but the canvas rendered
* "0 files / No config files yet" which reads identically to the SaaS
* empty-listing bug fixed in PR-A. Showing an explicit placeholder
* makes the absence intentional and routes the user toward the
* supported surface (Chat) for these workspaces.
*
* Mirrors the same affordance TerminalTab adopted for runtimes without
* a TTY in PR #2830 — uniform "feature-not-applicable" UX across tabs.
*/
export function NotAvailablePanel({ runtime }: { runtime: string }) {
return (
<div className="flex flex-col items-center justify-center h-full p-8 text-center bg-surface-sunken/30">
{/* Folder-with-slash icon. Custom inline SVG so we don't depend
on an icon set being present at canvas build-time (matches
TerminalTab's NotAvailablePanel pattern). */}
<svg
width="72"
height="72"
viewBox="0 0 72 72"
fill="none"
aria-hidden="true"
className="text-ink-soft mb-4"
>
{/* Folder body */}
<path
d="M10 22 L10 56 a4 4 0 0 0 4 4 L58 60 a4 4 0 0 0 4 -4 L62 26 a4 4 0 0 0 -4 -4 L34 22 L28 16 L14 16 a4 4 0 0 0 -4 4 Z"
stroke="currentColor"
strokeWidth="2.5"
strokeLinejoin="round"
fill="none"
opacity="0.6"
/>
{/* Diagonal cancel slash */}
<path
d="M14 14 L58 58"
stroke="currentColor"
strokeWidth="3"
strokeLinecap="round"
/>
</svg>
<h3 className="text-sm font-medium text-ink mb-1.5">Files not available</h3>
<p className="text-[11px] text-ink-soft max-w-xs leading-relaxed">
This workspace runs the{" "}
<span className="font-mono text-ink-mid">{runtime}</span> runtime,
whose filesystem isn't owned by the platform. Use the Chat tab to
interact with the agent directly.
</p>
</div>
);
}
@@ -0,0 +1,136 @@
// @vitest-environment jsdom
//
// Pins the right-click context menu added in PR-C of issue #2999.
// VSCode-style affordance: Open / Download / Delete on file rows,
// Delete on directory rows. Delete is gated by `canDelete` (parent
// only enables on /configs root, matching the toolbar's gate).
//
// Pinned branches:
// 1. Right-click on a file row opens the menu at the click coords
// with Open + Download + Delete items.
// 2. Right-click on a directory row opens the menu with Delete
// only (no Open/Download — directories don't have one-click
// semantics in this surface).
// 3. Clicking Download fires the onDownload callback with the
// row's path.
// 4. Clicking Delete fires onDelete with the row's path (when
// canDelete=true).
// 5. Delete is disabled in the rendered menu when canDelete=false
// and clicking it does NOT fire onDelete (gate is real).
// 6. Esc dismisses the menu.
// 7. Click outside the menu dismisses it.
import { describe, it, expect, vi, afterEach } from "vitest";
import { render, screen, cleanup, fireEvent, act } from "@testing-library/react";
import React from "react";
import { FileTree } from "../FileTree";
import type { TreeNode } from "../tree";
afterEach(cleanup);
const file: TreeNode = { name: "config.yaml", path: "config.yaml", isDir: false, children: [], size: 0 };
const dir: TreeNode = {
name: "skills",
path: "skills",
isDir: true,
children: [],
size: 0,
};
function renderTree(props: Partial<React.ComponentProps<typeof FileTree>> = {}) {
const defaults = {
nodes: [file, dir],
selectedPath: null,
onSelect: vi.fn(),
onDelete: vi.fn(),
onDownload: vi.fn(),
canDelete: true,
expandedDirs: new Set<string>(),
onToggleDir: vi.fn(),
loadingDir: null,
};
const merged = { ...defaults, ...props };
return { ...render(<FileTree {...merged} />), props: merged };
}
describe("FileTree right-click context menu", () => {
it("right-click on a file row opens menu with Open/Download/Delete", () => {
renderTree();
fireEvent.contextMenu(screen.getByText("config.yaml"), {
clientX: 50,
clientY: 100,
});
expect(screen.getByRole("menu")).not.toBeNull();
expect(screen.getByRole("menuitem", { name: /Open/i })).not.toBeNull();
expect(screen.getByRole("menuitem", { name: /Download/i })).not.toBeNull();
expect(screen.getByRole("menuitem", { name: /Delete/i })).not.toBeNull();
});
it("right-click on a directory row opens menu with Delete only (no Open/Download)", () => {
renderTree();
fireEvent.contextMenu(screen.getByText("skills"), { clientX: 60, clientY: 120 });
expect(screen.getByRole("menu")).not.toBeNull();
expect(screen.queryByRole("menuitem", { name: /Open/i })).toBeNull();
expect(screen.queryByRole("menuitem", { name: /Download/i })).toBeNull();
expect(screen.getByRole("menuitem", { name: /Delete/i })).not.toBeNull();
});
it("clicking Download fires onDownload with the row's path", () => {
const { props } = renderTree();
fireEvent.contextMenu(screen.getByText("config.yaml"), { clientX: 0, clientY: 0 });
fireEvent.click(screen.getByRole("menuitem", { name: /Download/i }));
expect(props.onDownload).toHaveBeenCalledWith("config.yaml");
// Menu auto-closes after click.
expect(screen.queryByRole("menu")).toBeNull();
});
it("clicking Delete fires onDelete with the row's path when canDelete=true", () => {
const { props } = renderTree({ canDelete: true });
fireEvent.contextMenu(screen.getByText("config.yaml"), { clientX: 0, clientY: 0 });
fireEvent.click(screen.getByRole("menuitem", { name: /Delete/i }));
expect(props.onDelete).toHaveBeenCalledWith("config.yaml");
});
it("Delete is disabled when canDelete=false; clicking does not fire onDelete", () => {
const { props } = renderTree({ canDelete: false });
fireEvent.contextMenu(screen.getByText("config.yaml"), { clientX: 0, clientY: 0 });
const del = screen.getByRole("menuitem", { name: /Delete/i }) as HTMLButtonElement;
expect(del.disabled).toBe(true);
fireEvent.click(del);
expect(props.onDelete).not.toHaveBeenCalled();
// Menu stays open on disabled click — same as VSCode (the user
// can read the disabled-state hint without losing the menu).
expect(screen.getByRole("menu")).not.toBeNull();
});
it("Esc dismisses the menu", () => {
renderTree();
fireEvent.contextMenu(screen.getByText("config.yaml"), { clientX: 0, clientY: 0 });
expect(screen.getByRole("menu")).not.toBeNull();
act(() => {
fireEvent.keyDown(document, { key: "Escape" });
});
expect(screen.queryByRole("menu")).toBeNull();
});
it("click outside the menu dismisses it", () => {
renderTree();
fireEvent.contextMenu(screen.getByText("config.yaml"), { clientX: 0, clientY: 0 });
expect(screen.getByRole("menu")).not.toBeNull();
// mousedown on document.body — outside the menu.
act(() => {
fireEvent.mouseDown(document.body);
});
expect(screen.queryByRole("menu")).toBeNull();
});
it("opening a second context menu replaces the first (only one open at a time)", () => {
renderTree();
fireEvent.contextMenu(screen.getByText("config.yaml"), { clientX: 10, clientY: 10 });
fireEvent.contextMenu(screen.getByText("skills"), { clientX: 20, clientY: 20 });
// Only one menu in the DOM. The second open replaced the first
// because the menu state is lifted to the FileTree, not per-row.
const menus = screen.getAllByRole("menu");
expect(menus.length).toBe(1);
});
});
@@ -0,0 +1,212 @@
// @vitest-environment jsdom
//
// Pins the drag-drop upload added in PR-D of issue #2999.
// Two layers of coverage:
//
// 1. The pure walker (collectFileEntries / walkEntry) — pins the
// recursion shape against silent folder truncation. Browsers
// return up to ~100 entries per readEntries() call; if the loop
// stops early, large folder uploads silently drop files. We
// simulate a multi-batch reader to discriminate.
//
// 2. FileTree directory-row drop handlers — pins that dragover/drop
// events fire onDropToTarget with the directory's path + the
// drop's DataTransferItemList.
import { describe, it, expect, vi, afterEach } from "vitest";
import { render, screen, cleanup, fireEvent } from "@testing-library/react";
import React from "react";
import { FileTree } from "../FileTree";
import type { TreeNode } from "../tree";
import { __testables } from "../useFilesApi";
afterEach(cleanup);
// ---- Walker tests ----
/**
* Build a fake FileSystemEntry tree we can hand to walkEntry. The
* shape mimics what webkitGetAsEntry returns from a real OS drag —
* directory entries expose createReader, file entries expose file().
*/
function fakeFileEntry(name: string, content = "x"): {
isFile: true;
isDirectory: false;
name: string;
fullPath: string;
file: (cb: (f: File) => void) => void;
} {
return {
isFile: true,
isDirectory: false,
name,
fullPath: "/" + name,
file: (cb) => cb(new File([content], name, { type: "text/plain" })),
};
}
function fakeDirEntry(
name: string,
childBatches: ReturnType<typeof fakeFileEntry>[][],
): {
isFile: false;
isDirectory: true;
name: string;
fullPath: string;
createReader: () => { readEntries: (cb: (entries: unknown[]) => void) => void };
} {
let i = 0;
return {
isFile: false,
isDirectory: true,
name,
fullPath: "/" + name,
createReader: () => ({
readEntries: (cb) => {
// Mimic browser semantics: emit one batch per call, then
// an empty array to signal end-of-stream. A walker that
// calls readEntries only once would silently truncate at
// the first batch.
if (i < childBatches.length) {
cb(childBatches[i++]);
} else {
cb([]);
}
},
}),
};
}
describe("walkEntry — folder-recursion drop walker", () => {
it("collects a single dropped file", async () => {
const out: { file: File; relativePath: string }[] = [];
await __testables.walkEntry(fakeFileEntry("README.md") as never, "", out);
expect(out.length).toBe(1);
expect(out[0].relativePath).toBe("README.md");
expect(out[0].file.name).toBe("README.md");
});
it("walks a folder and preserves the relative path under the folder name", async () => {
const out: { file: File; relativePath: string }[] = [];
const folder = fakeDirEntry("skills", [
[fakeFileEntry("a.md"), fakeFileEntry("b.md")],
]);
await __testables.walkEntry(folder as never, "", out);
expect(out.map((e) => e.relativePath).sort()).toEqual([
"skills/a.md",
"skills/b.md",
]);
});
it("loops readEntries until empty so a multi-batch folder isn't truncated", async () => {
// Browsers limit each readEntries() call to ~100 entries. Our
// walker MUST call it again until an empty batch is returned.
// Fake reader emits two batches of 2 + an implicit empty → 4
// total. A buggy walker that only takes the first batch would
// see only 2.
const out: { file: File; relativePath: string }[] = [];
const folder = fakeDirEntry("big", [
[fakeFileEntry("1.txt"), fakeFileEntry("2.txt")],
[fakeFileEntry("3.txt"), fakeFileEntry("4.txt")],
]);
await __testables.walkEntry(folder as never, "", out);
expect(out.length).toBe(4);
});
it("walks nested directories and accumulates the full path", async () => {
const out: { file: File; relativePath: string }[] = [];
const inner = fakeDirEntry("web-search", [[fakeFileEntry("SKILL.md")]]);
// Outer dir whose first batch contains a sub-dir entry.
const outer = {
isFile: false,
isDirectory: true,
name: "skills",
fullPath: "/skills",
createReader: () => {
let i = 0;
return {
readEntries: (cb: (entries: unknown[]) => void) => {
if (i++ === 0) cb([inner]);
else cb([]);
},
};
},
};
await __testables.walkEntry(outer as never, "", out);
expect(out.length).toBe(1);
expect(out[0].relativePath).toBe("skills/web-search/SKILL.md");
});
});
// ---- FileTree drag-drop wiring ----
const file: TreeNode = { name: "config.yaml", path: "config.yaml", isDir: false, children: [], size: 0 };
const skillsDir: TreeNode = { name: "skills", path: "skills", isDir: true, children: [], size: 0 };
function renderTree(props: Partial<React.ComponentProps<typeof FileTree>> = {}) {
// PR-D test defaults must include PR-C's onDownload + canDelete now
// that they're required on the TreeCallbacks shape (the rebase
// surfaced this — the merged tree depends on both feature sets).
const defaults: React.ComponentProps<typeof FileTree> = {
nodes: [file, skillsDir],
selectedPath: null,
onSelect: vi.fn(),
onDelete: vi.fn(),
onDownload: vi.fn(),
canDelete: true,
onDropToTarget: vi.fn(),
expandedDirs: new Set<string>(),
onToggleDir: vi.fn(),
loadingDir: null,
};
const merged = { ...defaults, ...props };
return { ...render(<FileTree {...merged} />), props: merged };
}
describe("FileTree directory-row drag-drop", () => {
it("dragover on a directory row preventDefault's so the drop will fire", () => {
renderTree();
const row = screen.getByText("skills");
const dragOver = new Event("dragover", { bubbles: true, cancelable: true });
Object.defineProperty(dragOver, "dataTransfer", {
value: { dropEffect: "" },
});
row.parentElement!.dispatchEvent(dragOver);
// preventDefault registers via the React handler — without it
// the drop event would never fire, so this assertion is the
// load-bearing one.
expect(dragOver.defaultPrevented).toBe(true);
});
it("drop on a directory row fires onDropToTarget with that path + the items list", () => {
const { props } = renderTree();
const row = screen.getByText("skills").parentElement!;
const fakeItems = { length: 1, 0: { kind: "file" } } as unknown as DataTransferItemList;
fireEvent.drop(row, { dataTransfer: { items: fakeItems } });
expect(props.onDropToTarget).toHaveBeenCalledWith("skills", fakeItems);
});
it("drop on a FILE row does NOT fire onDropToTarget (only directories are valid targets)", () => {
const { props } = renderTree();
const fileRow = screen.getByText("config.yaml").parentElement!;
const fakeItems = { length: 1, 0: { kind: "file" } } as unknown as DataTransferItemList;
fireEvent.drop(fileRow, { dataTransfer: { items: fakeItems } });
expect(props.onDropToTarget).not.toHaveBeenCalled();
});
it("drop with no DataTransferItems does NOT fire onDropToTarget", () => {
const { props } = renderTree();
const row = screen.getByText("skills").parentElement!;
fireEvent.drop(row, { dataTransfer: { items: { length: 0 } } });
expect(props.onDropToTarget).not.toHaveBeenCalled();
});
it("dragenter sets the drop-target highlight on the directory row", () => {
renderTree();
const row = screen.getByText("skills").parentElement!;
fireEvent.dragEnter(row, { dataTransfer: {} });
// Highlight class is the discriminator — without dragenter
// wiring the row stays in its hover-only style.
expect(row.className).toMatch(/bg-accent|outline-accent/);
});
});
@@ -90,6 +90,43 @@ export function useFilesApi(workspaceId: string, root: string) {
[workspaceId]
);
/**
* Fetch a file's content from the server and trigger a browser
* download. Used by the right-click "Download" context-menu item
* (PR-C of issue #2999) — distinct from `handleDownloadFile` in
* FilesTab which downloads the CURRENTLY-OPEN-IN-EDITOR file from
* the in-memory `editContent` buffer (so unsaved edits round-trip
* to disk). This helper downloads the on-server content, suitable
* for arbitrary tree rows the user hasn't opened.
*/
const downloadFileByPath = useCallback(
async (path: string) => {
try {
const res = await api.get<{ content: string }>(
`/workspaces/${workspaceId}/files/${path}?root=${encodeURIComponent(root)}`,
);
// text/plain is correct for the canvas's text-only file
// surface (config.yaml, prompts, skill markdown). Binary
// files would need an Accept-arraybuffer path; the API
// returns string today so this matches the wire shape.
const blob = new Blob([res.content], { type: "text/plain" });
const url = URL.createObjectURL(blob);
const a = document.createElement("a");
a.href = url;
a.download = path.split("/").pop() || "file";
a.click();
URL.revokeObjectURL(url);
showToast(`Downloaded ${a.download}`, "success");
} catch (e) {
showToast(
`Download failed: ${e instanceof Error ? e.message : "unknown error"}`,
"error",
);
}
},
[workspaceId, root],
);
const downloadAllFiles = useCallback(async () => {
const fileEntries = files.filter((f) => !f.dir);
const results = await Promise.allSettled(
@@ -114,16 +151,20 @@ export function useFilesApi(workspaceId: string, root: string) {
}, [files, workspaceId]);
const uploadFiles = useCallback(
async (fileList: FileList) => {
async (fileList: FileList, targetDir = "") => {
let uploaded = 0;
for (const file of Array.from(fileList)) {
const path = file.webkitRelativePath || file.name;
const parts = path.split("/");
// For folder picker: webkitRelativePath is "<picked-folder>/a/b.txt"
// — strip the picked-folder prefix so files land flat under the
// workspace's target dir, not under a redundant outer folder.
const relPath = parts.length > 1 ? parts.slice(1).join("/") : parts[0];
const finalPath = targetDir ? `${targetDir}/${relPath}` : relPath;
if (file.size > 1_000_000) continue;
try {
const content = await file.text();
await api.put(`/workspaces/${workspaceId}/files/${relPath}`, { content });
await api.put(`/workspaces/${workspaceId}/files/${finalPath}`, { content });
uploaded++;
} catch {
/* skip binary */
@@ -131,7 +172,7 @@ export function useFilesApi(workspaceId: string, root: string) {
}
if (uploaded > 0) {
useCanvasStore.getState().updateNodeData(workspaceId, { needsRestart: true });
showToast(`Uploaded ${uploaded} files`, "success");
showToast(`Uploaded ${uploaded} files${targetDir ? ` to ${targetDir}` : ""}`, "success");
loadFiles();
}
return uploaded;
@@ -139,6 +180,58 @@ export function useFilesApi(workspaceId: string, root: string) {
[workspaceId, loadFiles]
);
/**
* Upload files dragged from the OS via the HTML5 DataTransferItemList
* API. Unlike the folder-picker path (uploadFiles), this preserves
* the dropped folder structure under `targetDir` — drag a "skills/"
* folder onto the /configs/skills row and you get
* /configs/skills/skills/* (the OUTER folder name is preserved
* because the user explicitly chose to drop a NAMED folder, unlike
* the folder-picker which always wraps the picked dir).
*
* Walks FileSystemDirectoryEntry recursively via webkitGetAsEntry.
* VSCode/JupyterLab use the same primitive — there's no other
* portable browser API for "drag a folder from OS". `webkit*`
* naming is a Chromium relic; Firefox + Safari implement the same
* surface.
*
* Returns the number of files uploaded so the caller can show a
* tally / fail toast.
*/
const uploadDataTransferItems = useCallback(
async (items: DataTransferItemList, targetDir = "") => {
const fileEntries = collectFileEntries(items);
let uploaded = 0;
for (const { file, relativePath } of await fileEntries) {
if (file.size > 1_000_000) continue;
const finalPath = targetDir
? `${targetDir}/${relativePath}`
: relativePath;
try {
const content = await file.text();
await api.put(`/workspaces/${workspaceId}/files/${finalPath}`, {
content,
});
uploaded++;
} catch {
/* skip binary */
}
}
if (uploaded > 0) {
useCanvasStore
.getState()
.updateNodeData(workspaceId, { needsRestart: true });
showToast(
`Uploaded ${uploaded} file${uploaded === 1 ? "" : "s"}${targetDir ? ` to ${targetDir}` : ""}`,
"success",
);
loadFiles();
}
return uploaded;
},
[workspaceId, loadFiles],
);
const deleteAllFiles = useCallback(async () => {
let deleted = 0;
for (const f of files) {
@@ -165,8 +258,98 @@ export function useFilesApi(workspaceId: string, root: string) {
readFile,
writeFile,
deleteFile,
downloadFileByPath,
downloadAllFiles,
uploadFiles,
uploadDataTransferItems,
deleteAllFiles,
};
}
// ----- DataTransfer entry walker (PR-D) ---------------------------------
/**
* Minimal subset of the FileSystem Entry API surface we use. The DOM
* lib types this as FileSystemEntry / FileSystemFileEntry /
* FileSystemDirectoryEntry but the relevant methods are callback-
* based. Keep the shape narrow + explicit so the recursion below
* type-checks without pulling in the full DOM lib types.
*/
interface FSEntry {
isFile: boolean;
isDirectory: boolean;
name: string;
fullPath: string;
file?(success: (f: File) => void, fail?: (e: unknown) => void): void;
createReader?(): { readEntries(success: (entries: FSEntry[]) => void): void };
}
interface CollectedEntry {
file: File;
/** Path relative to the dropped root (e.g. "skills/web-search/SKILL.md"
* for a dropped "skills/" folder containing web-search/SKILL.md). */
relativePath: string;
}
/**
* Walk a DataTransferItemList, returning every file entry as a flat
* array keyed by the path relative to the originally-dropped item.
* Folders dropped from the OS expand recursively; loose files
* passthrough with name as the relative path.
*
* Skips items where webkitGetAsEntry() returns null — that's how
* the browser signals a non-file payload (e.g. a dragged URL or
* text snippet).
*/
async function collectFileEntries(
items: DataTransferItemList,
): Promise<CollectedEntry[]> {
const out: CollectedEntry[] = [];
for (let i = 0; i < items.length; i++) {
const item = items[i];
if (item.kind !== "file") continue;
// webkitGetAsEntry is the standardised name; older Firefox used
// getAsEntry. Both Chromium + Firefox + Safari ship the webkit-
// prefixed variant today. There's no non-prefixed alternative.
const entry = (item as DataTransferItem & {
webkitGetAsEntry?: () => FSEntry | null;
}).webkitGetAsEntry?.();
if (!entry) continue;
await walkEntry(entry, "", out);
}
return out;
}
async function walkEntry(
entry: FSEntry,
prefix: string,
out: CollectedEntry[],
): Promise<void> {
const name = entry.name;
const relPath = prefix ? `${prefix}/${name}` : name;
if (entry.isFile && entry.file) {
const file = await new Promise<File>((resolve, reject) => {
entry.file!(resolve, reject);
});
out.push({ file, relativePath: relPath });
return;
}
if (entry.isDirectory && entry.createReader) {
const reader = entry.createReader();
// readEntries returns up to ~100 at a time on Chromium; loop
// until empty so large folders aren't truncated.
let batch: FSEntry[] = [];
do {
batch = await new Promise<FSEntry[]>((resolve) =>
reader.readEntries(resolve),
);
for (const child of batch) {
await walkEntry(child, relPath, out);
}
} while (batch.length > 0);
}
}
// Exported for direct testing — the recursion + readEntries batching
// is the part most likely to silently truncate a real folder upload.
export const __testables = { collectFileEntries, walkEntry };
+42 -1
View File
@@ -297,10 +297,49 @@ export function SkillsTab({ workspaceId, data }: Props) {
}
};
// Compact-empty pattern: when the workspace has zero plugins
// installed AND the registry isn't open, collapse the whole
// "Plugins" section into a single inline pill rather than rendering
// the full panel chrome. Reported on production 2026-05-05 (#2971):
// the empty state's panel-with-zero-list-rows layout gives the user
// a lot of vertical real estate for content that's just "0
// installed + Install button". The compact form keeps that
// affordance without the chrome.
//
// Expanded/full layout still fires when installed.length > 0 OR
// when the user opens the registry (clicked "+ Install Plugin").
// Once a plugin is installed the section auto-expands to surface
// the list.
const compactEmpty = installed.length === 0 && !showRegistry && installedLoaded;
if (compactEmpty) {
return (
<div className="p-4 space-y-4">
<div
className="flex items-center justify-between gap-2 rounded-full border border-line/60 bg-surface-sunken/70 px-3 py-1.5"
aria-label="Plugins (none installed)"
>
<div className="flex items-center gap-2">
<span className="text-[10px] uppercase tracking-[0.2em] text-ink-soft">Plugins</span>
<span className="text-[11px] text-ink-mid">0 installed</span>
</div>
<button
onClick={() => setShowRegistry(true)}
className="rounded-full border border-violet-700/50 bg-violet-950/30 px-3 py-0.5 text-[10px] text-violet-200 hover:bg-violet-900/40 transition-colors"
aria-expanded="false"
aria-controls="plugins-section"
>
+ Install Plugin
</button>
</div>
</div>
);
}
return (
<div className="p-4 space-y-4">
{/* Plugins section */}
<div className="rounded-xl border border-line bg-surface-sunken/70 p-3">
<div id="plugins-section" className="rounded-xl border border-line bg-surface-sunken/70 p-3">
<div className="flex items-center justify-between gap-3">
<div>
<div className="text-[10px] uppercase tracking-[0.22em] text-ink-soft">Plugins</div>
@@ -311,6 +350,8 @@ export function SkillsTab({ workspaceId, data }: Props) {
<button
onClick={() => setShowRegistry(!showRegistry)}
className="rounded-full border border-violet-700/50 bg-violet-950/30 px-3 py-1 text-[10px] text-violet-200 hover:bg-violet-900/40 transition-colors"
aria-expanded={showRegistry}
aria-controls="plugins-registry"
>
{showRegistry ? "Hide Registry" : "+ Install Plugin"}
</button>
@@ -0,0 +1,141 @@
// @vitest-environment jsdom
//
// Pins two regressions reported on production 2026-05-05:
//
// 1. IME composition + Enter key: typing Chinese (or any CJK / IME-
// composed text) and pressing Enter to commit the candidate
// selection used to send the half-typed message. The fix checks
// `event.nativeEvent.isComposing` (and a `keyCode === 229`
// fallback for older WebKit) before treating Enter as send.
//
// 2. Markdown link clicks: the agent's ReactMarkdown-rendered links
// used to:
// - http/https → navigate canvas tab away (user lost canvas state)
// - workspace://path / file:///workspace/... / /workspace/... →
// browser hit about:blank (unhandled protocol).
// Fix: external links get target="_blank" + noopener; in-container
// paths route through downloadChatFile (same auth path as chips).
import { describe, it, expect, vi, afterEach, beforeEach } from "vitest";
import { render, screen, cleanup, fireEvent, waitFor } from "@testing-library/react";
import React from "react";
afterEach(cleanup);
// Mock the api module so render doesn't try to talk to a real CP.
const apiGet = vi.fn((_path: string): Promise<unknown> => Promise.resolve([]));
const apiPost = vi.fn((_path: string, _body: unknown): Promise<unknown> => Promise.resolve({}));
vi.mock("@/lib/api", () => ({
api: {
get: (path: string) => apiGet(path),
post: (path: string, body: unknown) => apiPost(path, body),
del: vi.fn(),
patch: vi.fn(),
put: vi.fn(),
},
}));
vi.mock("@/store/canvas", () => ({
useCanvasStore: vi.fn((selector?: (s: unknown) => unknown) =>
selector ? selector({ agentMessages: {}, consumeAgentMessages: () => [] }) : {},
),
}));
// Capture the downloadChatFile call so the markdown-link test can
// assert in-container paths route through the authenticated download
// path rather than the browser's bare anchor click.
const downloadChatFileMock = vi.fn((_workspaceId: string, _att: { uri: string; name: string }) => Promise.resolve());
vi.mock("../chat/uploads", async () => {
const actual = await vi.importActual<typeof import("../chat/uploads")>("../chat/uploads");
return {
...actual,
downloadChatFile: (workspaceId: string, att: { uri: string; name: string }) =>
downloadChatFileMock(workspaceId, att),
};
});
beforeEach(() => {
apiGet.mockClear();
apiPost.mockClear();
downloadChatFileMock.mockClear();
// jsdom doesn't implement scrollIntoView; ChatTab calls it after
// every render with a new message.
Element.prototype.scrollIntoView = vi.fn();
// Stub IntersectionObserver — the lazy-history sentinel uses it.
class FakeIO {
observe() {}
unobserve() {}
disconnect() {}
}
(window as unknown as { IntersectionObserver: unknown }).IntersectionObserver = FakeIO;
(globalThis as unknown as { IntersectionObserver: unknown }).IntersectionObserver = FakeIO;
});
import { ChatTab } from "../ChatTab";
const minimalData = {
status: "online" as const,
runtime: "claude-code",
currentTask: null,
} as unknown as Parameters<typeof ChatTab>[0]["data"];
describe("ChatTab — IME-safe Enter key", () => {
it("does NOT send the message when Enter fires during IME composition (isComposing)", async () => {
render(<ChatTab workspaceId="ws-ime" data={minimalData} />);
// Find the textarea by its aria-label.
const textarea = await screen.findByLabelText(/Message to agent/i);
fireEvent.change(textarea, { target: { value: "你好" } });
// Simulate the Enter that commits an IME selection: isComposing=true.
fireEvent.keyDown(textarea, { key: "Enter", isComposing: true });
// sendMessage POSTs via api.post; assert it was NOT called.
await waitFor(() => {
expect(apiPost).not.toHaveBeenCalled();
});
// And the input is preserved — ChatTab clears it only on actual send.
expect((textarea as HTMLTextAreaElement).value).toBe("你好");
});
it("does NOT send when keyCode is 229 (older Safari IME fallback)", async () => {
render(<ChatTab workspaceId="ws-ime2" data={minimalData} />);
const textarea = await screen.findByLabelText(/Message to agent/i);
fireEvent.change(textarea, { target: { value: "한국어" } });
// keyCode 229 is the older-Safari signal that an IME is composing.
// Some mobile WebKit-based browsers delay setting isComposing on
// the composition-end Enter; the keyCode fallback covers that.
fireEvent.keyDown(textarea, { key: "Enter", keyCode: 229 });
await waitFor(() => {
expect(apiPost).not.toHaveBeenCalled();
});
});
it("DOES send on a non-composing Enter (the happy path stays intact)", async () => {
render(<ChatTab workspaceId="ws-ok" data={minimalData} />);
const textarea = await screen.findByLabelText(/Message to agent/i);
fireEvent.change(textarea, { target: { value: "hello world" } });
fireEvent.keyDown(textarea, { key: "Enter" /* no isComposing, no 229 */ });
// The api.post for /a2a fires inside sendMessage. waitFor since
// the call goes through several effects.
await waitFor(() => {
expect(apiPost).toHaveBeenCalled();
});
});
it("Shift+Enter inserts newline regardless (no send)", async () => {
render(<ChatTab workspaceId="ws-shift" data={minimalData} />);
const textarea = await screen.findByLabelText(/Message to agent/i);
fireEvent.change(textarea, { target: { value: "line 1" } });
fireEvent.keyDown(textarea, { key: "Enter", shiftKey: true });
await waitFor(() => {
expect(apiPost).not.toHaveBeenCalled();
});
});
});
@@ -0,0 +1,119 @@
// @vitest-environment jsdom
//
// Pins the "Files not available" early-return for runtimes whose
// filesystem the platform doesn't own (today: runtime === "external").
//
// Pre-fix: FilesTab issued a GET /workspaces/<id>/files for every
// workspace. The platform's response for an external workspace is
// always [] (no rows in workspace_files), but the canvas rendered
// "0 files / No config files yet" — visually identical to the SaaS
// empty-listing bug fixed in PR-A. The placeholder makes the absence
// intentional.
//
// Pinned branches:
// 1. external runtime → "Files not available" banner renders,
// runtime name surfaces in the body so user knows WHY.
// 2. external runtime → useFilesApi is NOT invoked. Verified by
// asserting the mocked api.get was never called.
// 3. claude-code (or any other runtime) → no banner, normal mount
// proceeds (`/configs` toolbar visible). Pre-fix regression cover.
// 4. data prop omitted (legacy callers) → no early-return, falls
// through to normal mount.
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen, cleanup, waitFor } from "@testing-library/react";
import React from "react";
afterEach(cleanup);
// Mock the api module so the normal-mount branches don't try to
// fetch against a real backend — and so we can assert the
// external-runtime branch never fires a request.
const apiCalls: string[] = [];
vi.mock("@/lib/api", () => ({
api: {
get: vi.fn((path: string) => {
apiCalls.push(path);
return Promise.resolve([]);
}),
put: vi.fn(() => Promise.resolve()),
del: vi.fn(() => Promise.resolve()),
},
}));
// useCanvasStore is referenced by useFilesApi for the needsRestart
// flag. The Toaster import inside FilesTab also pulls the store
// indirectly. Stub minimally to satisfy the import chain.
vi.mock("@/store/canvas", async () => {
const actual = await vi.importActual<typeof import("@/store/canvas")>(
"@/store/canvas",
);
return {
...actual,
useCanvasStore: {
getState: () => ({
updateNodeData: vi.fn(),
}),
},
};
});
vi.mock("../Toaster", () => ({
showToast: vi.fn(),
}));
beforeEach(() => {
apiCalls.length = 0;
});
import { FilesTab } from "../FilesTab";
const externalData = { runtime: "external", status: "online" } as unknown as Parameters<
typeof FilesTab
>[0]["data"];
const claudeData = { runtime: "claude-code", status: "online" } as unknown as Parameters<
typeof FilesTab
>[0]["data"];
describe("FilesTab not-available early-return for runtimes without platform-owned filesystem", () => {
it("external runtime renders the not-available banner with runtime name", () => {
render(<FilesTab workspaceId="ws-ext" data={externalData} />);
expect(screen.getByText(/Files not available/i)).not.toBeNull();
// Runtime name must surface so the user understands WHY — without
// it the placeholder reads as a generic error.
expect(screen.getByText(/external/)).not.toBeNull();
// Chat tab is the recommended alternative — flagged in copy so the
// user knows where to go next instead of bouncing tabs.
expect(screen.getByText(/Chat tab/i)).not.toBeNull();
});
it("external runtime does NOT issue any /files API call", async () => {
render(<FilesTab workspaceId="ws-ext" data={externalData} />);
// Tolerate one microtask boundary in case useEffect schedules.
await new Promise((r) => setTimeout(r, 0));
const filesCalls = apiCalls.filter((p) => p.includes("/files"));
expect(filesCalls).toEqual([]);
});
it("claude-code runtime does NOT render the banner (normal mount)", async () => {
render(<FilesTab workspaceId="ws-claude" data={claudeData} />);
// The normal-mount path renders the FilesToolbar with the root
// selector. Wait for it (useEffect → loadFiles → setLoading false).
await waitFor(() => {
expect(screen.queryByText(/Files not available/i)).toBeNull();
});
// Toolbar's root selector confirms we're on the platform-owned
// rendering path, not the placeholder.
expect(screen.getByLabelText(/File root directory/i)).not.toBeNull();
});
it("data prop omitted falls through to normal mount (back-compat)", async () => {
render(<FilesTab workspaceId="ws-no-data" />);
await waitFor(() => {
expect(screen.queryByText(/Files not available/i)).toBeNull();
});
// Without data we can't gate on runtime — must mount normally.
expect(screen.getByLabelText(/File root directory/i)).not.toBeNull();
});
});
@@ -1,220 +0,0 @@
// @vitest-environment jsdom
//
// Pins the Edit affordance added to MemoryTab. Until this PR the Memory tab
// was Add+Delete only; an entry that needed correction had to be deleted and
// re-added — losing the version-counter and any in-flight optimistic-locking
// invariants other writers depend on.
//
// Each test pins one branch of the new flow. If any fails, the bug is back.
import { describe, it, expect, vi, afterEach, beforeEach } from "vitest";
import { render, screen, cleanup, waitFor, fireEvent } from "@testing-library/react";
import React from "react";
afterEach(cleanup);
const apiGet = vi.fn();
const apiPost = vi.fn();
const apiDel = vi.fn();
vi.mock("@/lib/api", () => ({
api: {
get: (path: string) => apiGet(path),
post: (path: string, body: unknown) => apiPost(path, body),
del: (path: string) => apiDel(path),
patch: vi.fn(),
put: vi.fn(),
},
}));
import { MemoryTab } from "../MemoryTab";
const sampleEntries = [
{
key: "team_brief",
value: { goal: "ship v2" },
version: 3,
expires_at: null,
updated_at: "2026-05-04T10:00:00Z",
},
{
key: "plain_note",
value: "raw text note",
version: 1,
expires_at: "2099-01-01T00:00:00Z",
updated_at: "2026-05-04T10:01:00Z",
},
];
beforeEach(() => {
apiGet.mockReset();
apiPost.mockReset();
apiDel.mockReset();
apiGet.mockImplementation((path: string) => {
if (path === "/workspaces/ws-test/memory") {
return Promise.resolve(sampleEntries);
}
return Promise.reject(new Error(`unmocked api.get: ${path}`));
});
});
async function renderAndExpand(key: string) {
render(<MemoryTab workspaceId="ws-test" />);
await waitFor(() => expect(apiGet).toHaveBeenCalled());
// Reveal the Advanced section that hosts the entry list.
const showAdvanced = await screen.findByRole("button", { name: "Show" });
fireEvent.click(showAdvanced);
// Expand the row.
const row = await screen.findByRole("button", { name: new RegExp(key) });
fireEvent.click(row);
}
describe("MemoryTab Edit affordance", () => {
it("Edit button appears once a row is expanded", async () => {
await renderAndExpand("team_brief");
expect(screen.getAllByRole("button", { name: "Edit" }).length).toBeGreaterThan(0);
});
it("clicking Edit on a JSON-valued entry pre-fills the textarea with pretty JSON", async () => {
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const textarea = (await screen.findByLabelText(
"Edit value for team_brief",
)) as HTMLTextAreaElement;
expect(textarea.value).toBe('{\n "goal": "ship v2"\n}');
});
it("clicking Edit on a string-valued entry pre-fills raw (no surrounding quotes)", async () => {
await renderAndExpand("plain_note");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const textarea = (await screen.findByLabelText(
"Edit value for plain_note",
)) as HTMLTextAreaElement;
expect(textarea.value).toBe("raw text note");
});
it("Save POSTs with if_match_version + parsed value, then reloads", async () => {
apiPost.mockResolvedValue({ status: "ok", key: "team_brief", version: 4 });
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const textarea = await screen.findByLabelText("Edit value for team_brief");
fireEvent.change(textarea, { target: { value: '{"goal":"ship v3"}' } });
fireEvent.click(screen.getByRole("button", { name: "Save" }));
await waitFor(() => expect(apiPost).toHaveBeenCalledTimes(1));
expect(apiPost).toHaveBeenCalledWith("/workspaces/ws-test/memory", {
key: "team_brief",
value: { goal: "ship v3" },
if_match_version: 3,
});
// Reload after save → second GET.
await waitFor(() => expect(apiGet).toHaveBeenCalledTimes(2));
});
it("Save with non-JSON text falls back to plain string", async () => {
apiPost.mockResolvedValue({ status: "ok" });
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const textarea = await screen.findByLabelText("Edit value for team_brief");
fireEvent.change(textarea, { target: { value: "free-form note" } });
fireEvent.click(screen.getByRole("button", { name: "Save" }));
await waitFor(() => expect(apiPost).toHaveBeenCalledTimes(1));
expect(apiPost.mock.calls[0][1].value).toBe("free-form note");
});
it("TTL field is forwarded as ttl_seconds when set", async () => {
apiPost.mockResolvedValue({ status: "ok" });
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const ttlInput = await screen.findByLabelText("Edit TTL for team_brief");
fireEvent.change(ttlInput, { target: { value: "3600" } });
fireEvent.click(screen.getByRole("button", { name: "Save" }));
await waitFor(() => expect(apiPost).toHaveBeenCalledTimes(1));
expect(apiPost.mock.calls[0][1].ttl_seconds).toBe(3600);
});
it("blank/zero/non-numeric TTL is omitted from the payload", async () => {
apiPost.mockResolvedValue({ status: "ok" });
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const ttlInput = await screen.findByLabelText("Edit TTL for team_brief");
// Junk + zero both must drop out — payload must not contain ttl_seconds.
fireEvent.change(ttlInput, { target: { value: "abc" } });
fireEvent.click(screen.getByRole("button", { name: "Save" }));
await waitFor(() => expect(apiPost).toHaveBeenCalledTimes(1));
expect(apiPost.mock.calls[0][1]).not.toHaveProperty("ttl_seconds");
});
it("Cancel discards edits and restores the rendered value", async () => {
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const textarea = await screen.findByLabelText("Edit value for team_brief");
fireEvent.change(textarea, { target: { value: '{"goal":"discarded"}' } });
fireEvent.click(screen.getByRole("button", { name: "Cancel" }));
expect(apiPost).not.toHaveBeenCalled();
// Editor is gone; the JSON pre-block is back.
expect(screen.queryByLabelText("Edit value for team_brief")).toBeNull();
expect(screen.getAllByText(/"goal": "ship v2"/i).length).toBeGreaterThan(0);
});
it("409 response surfaces a retry hint and reloads", async () => {
apiPost.mockRejectedValueOnce(
new Error("HTTP 409: if_match_version mismatch"),
);
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const textarea = await screen.findByLabelText("Edit value for team_brief");
fireEvent.change(textarea, { target: { value: '{"goal":"ship v3"}' } });
fireEvent.click(screen.getByRole("button", { name: "Save" }));
await waitFor(() => expect(apiPost).toHaveBeenCalledTimes(1));
const alert = await screen.findByRole("alert");
expect(alert.textContent).toMatch(/changed since you opened it/i);
// Initial mount load + post-conflict reload.
await waitFor(() => expect(apiGet).toHaveBeenCalledTimes(2));
});
it("non-409 error surfaces the message and does not reload", async () => {
apiPost.mockRejectedValueOnce(new Error("boom"));
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
fireEvent.click(screen.getByRole("button", { name: "Save" }));
const alert = await screen.findByRole("alert");
expect(alert.textContent).toBe("boom");
// Only the initial mount load — no retry reload.
expect(apiGet).toHaveBeenCalledTimes(1);
});
it("entry with no version omits if_match_version (back-compat with older shape)", async () => {
// Pre-version-counter shape: drop the `version` field from the row.
apiGet.mockReset();
apiGet.mockImplementation((path: string) => {
if (path === "/workspaces/ws-test/memory") {
return Promise.resolve([
{
key: "old_entry",
value: "legacy",
expires_at: null,
updated_at: "2026-05-04T10:00:00Z",
},
]);
}
return Promise.reject(new Error(`unmocked: ${path}`));
});
apiPost.mockResolvedValue({ status: "ok" });
await renderAndExpand("old_entry");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const textarea = await screen.findByLabelText("Edit value for old_entry");
fireEvent.change(textarea, { target: { value: "updated" } });
fireEvent.click(screen.getByRole("button", { name: "Save" }));
await waitFor(() => expect(apiPost).toHaveBeenCalledTimes(1));
const payload = apiPost.mock.calls[0][1];
expect(payload).not.toHaveProperty("if_match_version");
expect(payload.value).toBe("updated");
});
});
@@ -0,0 +1,141 @@
// @vitest-environment jsdom
//
// Pins the compact-when-empty layout for the SkillsTab Plugins section
// (issue #2971, reported on production 2026-05-05).
//
// Three states matter for layout:
// 1. installed.length === 0 + registry closed + load completed → COMPACT pill
// 2. installed.length > 0 → FULL panel + installed list
// 3. registry open (showRegistry=true) → FULL panel + registry browser
//
// The compact-empty path is the new behavior; the other two were
// pre-existing. This test pins all three so a future refactor that
// over-collapses (showing compact when plugins are installed) or
// over-expands (showing full panel on empty load) fails loudly.
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen, cleanup, fireEvent, waitFor } from "@testing-library/react";
import React from "react";
afterEach(cleanup);
const apiGet = vi.fn();
vi.mock("@/lib/api", () => ({
api: {
get: (path: string, opts?: unknown) => apiGet(path, opts),
post: vi.fn(() => Promise.resolve({})),
del: vi.fn(),
patch: vi.fn(),
put: vi.fn(),
},
}));
beforeEach(() => {
apiGet.mockReset();
Element.prototype.scrollIntoView = vi.fn();
});
import { SkillsTab } from "../SkillsTab";
const minimalData = {
status: "online" as const,
runtime: "claude-code",
currentTask: "",
agentCard: undefined,
} as unknown as Parameters<typeof SkillsTab>[0]["data"];
describe("SkillsTab Plugins compact-empty layout", () => {
it("renders compact pill when installed.length === 0 and registry closed", async () => {
// Both fetches return empty arrays — workspace is fresh, no plugins.
apiGet.mockImplementation((path: string) => {
if (path.endsWith("/plugins") || path === "/plugins" || path === "/plugins/sources") {
return Promise.resolve([]);
}
return Promise.resolve([]);
});
render(<SkillsTab workspaceId="ws-fresh" data={minimalData} />);
// Wait for the installedLoaded gate to flip — without that the
// component renders a "loading" state, not the compact pill.
await waitFor(() => {
expect(screen.getByLabelText(/Plugins \(none installed\)/i)).toBeTruthy();
});
// Compact assertions: the rounded-xl panel chrome MUST NOT be in
// the DOM (we'd see two "Plugins" labels — one in the header,
// one in the pill — if the layout regressed to "always full
// panel"). The compact form has exactly one "Plugins" label.
const labels = screen.getAllByText("Plugins");
expect(labels).toHaveLength(1);
// The full-panel chrome's id="plugins-section" should NOT be
// rendered when we're in compact mode.
expect(document.getElementById("plugins-section")).toBeNull();
});
it("renders full panel when installed.length > 0", async () => {
apiGet.mockImplementation((path: string) => {
if (path.endsWith("/plugins")) {
return Promise.resolve([
{ name: "memory-postgres", version: "1.0.0", description: "memory backend", supported_on_runtime: true },
]);
}
return Promise.resolve([]);
});
render(<SkillsTab workspaceId="ws-installed" data={minimalData} />);
await waitFor(() => {
expect(screen.getByText(/1 installed/i)).toBeTruthy();
});
// Full-panel chrome MUST be present — id pin.
expect(document.getElementById("plugins-section")).not.toBeNull();
// Compact pill ariaLabel MUST NOT be present.
expect(screen.queryByLabelText(/Plugins \(none installed\)/i)).toBeNull();
});
it("expands to full panel when user clicks + Install Plugin from compact pill", async () => {
apiGet.mockImplementation(() => Promise.resolve([]));
render(<SkillsTab workspaceId="ws-expand" data={minimalData} />);
// Start compact — wait for the compact pill to settle so we click
// the right button (initial render before installedLoaded flips
// doesn't have either layout, and the post-load compact pill is
// what we want to interact with).
await waitFor(() => {
expect(screen.getByLabelText(/Plugins \(none installed\)/i)).toBeTruthy();
});
const installBtn = screen.getByRole("button", { name: /\+ Install Plugin/i });
expect(installBtn.getAttribute("aria-expanded")).toBe("false");
fireEvent.click(installBtn);
// After click, registry opens → full panel renders. The compact
// pill's aria-label should be gone; the full-panel id should
// appear. Generous waitFor — a registry fetch may also fire in
// the React effect chain, and we want to assert the compact →
// full transition without racing it.
await waitFor(
() => {
expect(document.getElementById("plugins-section")).not.toBeNull();
},
{ timeout: 3000 },
);
expect(screen.queryByLabelText(/Plugins \(none installed\)/i)).toBeNull();
});
it("does NOT collapse to compact while initial load is pending (avoid flash)", () => {
// Returning a never-resolving promise means installedLoaded stays
// false. The compact pill MUST NOT render in this state — that
// would flash compact → full as the load completes, which looks
// janky. The component shows a loading shell instead (the
// existing pre-fix behavior).
apiGet.mockImplementation(() => new Promise(() => {}));
render(<SkillsTab workspaceId="ws-loading" data={minimalData} />);
// Synchronous assertion — no waitFor — since we want to confirm
// the compact pill is NOT rendered before any network round-trip
// finishes.
expect(screen.queryByLabelText(/Plugins \(none installed\)/i)).toBeNull();
});
});
@@ -1,6 +1,6 @@
"use client";
import { useState, useEffect, useMemo, useRef } from "react";
import { useState, useEffect, useLayoutEffect, useMemo, useRef, useCallback } from "react";
import ReactMarkdown from "react-markdown";
import remarkGfm from "remark-gfm";
import { api } from "@/lib/api";
@@ -184,13 +184,23 @@ function unwrapErrorText(raw: string | null): string {
export function AgentCommsPanel({ workspaceId }: { workspaceId: string }) {
const [messages, setMessages] = useState<CommMessage[]>([]);
const [loading, setLoading] = useState(true);
const [loadError, setLoadError] = useState<string | null>(null);
// Dedup by timestamp+type+peer to handle API load + WebSocket race
const seenKeys = useRef(new Set<string>());
const bottomRef = useRef<HTMLDivElement>(null);
// Mirrors the my-chat scroll behaviour from ChatTab (PR #2903) —
// smooth-scroll on a long history gets interrupted by concurrent
// renders and lands the panel mid-conversation. Switch the first
// arrival to instant; subsequent appends animate.
const hasInitialScrollRef = useRef(false);
// Load history
useEffect(() => {
// Load history. Extracted so the error-state retry button can
// re-invoke without remount. ChatTab uses the same shape
// (loadInitial → loadError state → retry button).
const loadInitial = useCallback(() => {
setLoading(true);
setLoadError(null);
seenKeys.current.clear();
api.get<ActivityEntry[]>(`/workspaces/${workspaceId}/activity?source=agent&limit=50`)
.then((entries) => {
const filtered = (entries ?? [])
@@ -234,10 +244,15 @@ export function AgentCommsPanel({ workspaceId }: { workspaceId: string }) {
// the .then body) — the panel just sat on the empty state
// with zero signal.
console.warn("AgentCommsPanel: load activity failed", err);
setLoadError(err instanceof Error ? err.message : String(err));
setLoading(false);
});
}, [workspaceId]);
useEffect(() => {
loadInitial();
}, [loadInitial]);
// Live updates routed through the global ReconnectingSocket. The
// previous pattern of `new WebSocket(WS_URL)` per panel had no
// onclose / no reconnect, so any drop (idle timeout, browser
@@ -358,7 +373,18 @@ export function AgentCommsPanel({ workspaceId }: { workspaceId: string }) {
} catch { /* ignore */ }
});
useEffect(() => {
// useLayoutEffect (not useEffect) so the scroll runs BEFORE paint —
// otherwise the user sees the panel jump for one frame on every
// append. Mirrors ChatTab's MyChatPanel scroll block.
useLayoutEffect(() => {
if (!hasInitialScrollRef.current && messages.length > 0) {
// Instant on first arrival — smooth-scroll on a long history
// gets interrupted by concurrent renders and lands the panel
// mid-conversation (the chat-opens-in-middle bug class).
hasInitialScrollRef.current = true;
bottomRef.current?.scrollIntoView({ behavior: "instant" as ScrollBehavior });
return;
}
bottomRef.current?.scrollIntoView({ behavior: "smooth" });
}, [messages]);
@@ -366,6 +392,27 @@ export function AgentCommsPanel({ workspaceId }: { workspaceId: string }) {
return <div className="text-xs text-ink-soft text-center py-8">Loading agent communications...</div>;
}
if (loadError !== null && messages.length === 0) {
// Mirrors ChatTab my-chat error UI — surfaces the load failure
// with a retry button instead of silently rendering empty state.
return (
<div
role="alert"
className="mx-2 mt-2 rounded-lg border border-red-800/50 bg-red-950/30 px-3 py-2.5"
>
<p className="text-[11px] text-bad mb-1.5">
Failed to load agent communications: {loadError}
</p>
<button
onClick={loadInitial}
className="text-[10px] px-2 py-0.5 rounded bg-red-800/40 text-bad hover:bg-red-700/50 transition-colors"
>
Retry
</button>
</div>
);
}
if (messages.length === 0) {
return (
<div className="text-xs text-ink-soft text-center py-8">
@@ -0,0 +1,124 @@
"use client";
// AttachmentAudio — inline native HTML5 <audio controls> player for
// chat attachments (RFC #2991, PR-2).
//
// Same auth + Blob-URL pattern as AttachmentImage / AttachmentVideo.
// Native audio control bar handles play/pause/scrub/volume/download,
// and there's no fullscreen UI to worry about (audio doesn't need
// AttachmentLightbox).
import { useState, useEffect, useRef } from "react";
import type { ChatAttachment } from "./types";
import { isPlatformAttachment, resolveAttachmentHref } from "./uploads";
import { AttachmentChip } from "./AttachmentViews";
interface Props {
workspaceId: string;
attachment: ChatAttachment;
onDownload: (a: ChatAttachment) => void;
tone: "user" | "agent";
}
type FetchState =
| { kind: "idle" }
| { kind: "loading" }
| { kind: "ready"; src: string }
| { kind: "error" };
export function AttachmentAudio({ workspaceId, attachment, onDownload, tone }: Props) {
const [state, setState] = useState<FetchState>({ kind: "idle" });
const blobUrlRef = useRef<string | null>(null);
useEffect(() => {
let cancelled = false;
setState({ kind: "loading" });
if (!isPlatformAttachment(attachment.uri)) {
const href = resolveAttachmentHref(workspaceId, attachment.uri);
if (!cancelled) setState({ kind: "ready", src: href });
return;
}
void (async () => {
try {
const href = resolveAttachmentHref(workspaceId, attachment.uri);
const headers: Record<string, string> = {};
const adminToken = process.env.NEXT_PUBLIC_ADMIN_TOKEN;
if (adminToken) headers["Authorization"] = `Bearer ${adminToken}`;
const slug = getTenantSlug();
if (slug) headers["X-Molecule-Org-Slug"] = slug;
const res = await fetch(href, {
headers,
credentials: "include",
signal: AbortSignal.timeout(60_000),
});
if (!res.ok) {
if (!cancelled) setState({ kind: "error" });
return;
}
const blob = await res.blob();
const url = URL.createObjectURL(blob);
blobUrlRef.current = url;
if (cancelled) {
URL.revokeObjectURL(url);
return;
}
setState({ kind: "ready", src: url });
} catch {
if (!cancelled) setState({ kind: "error" });
}
})();
return () => {
cancelled = true;
if (blobUrlRef.current) {
URL.revokeObjectURL(blobUrlRef.current);
blobUrlRef.current = null;
}
};
}, [workspaceId, attachment.uri]);
if (state.kind === "error") {
return <AttachmentChip attachment={attachment} onDownload={onDownload} tone={tone} />;
}
if (state.kind === "idle" || state.kind === "loading") {
return (
<div
className="rounded-md border border-line/50 bg-surface-card/40 animate-pulse"
style={{ width: 280, height: 40 }}
aria-label={`Loading ${attachment.name}`}
/>
);
}
return (
<div
className={`inline-flex flex-col gap-1 rounded-md border px-2 py-1 ${
tone === "user" ? "border-blue-400/30 bg-accent-strong/10" : "border-line/50 bg-surface-card/40"
}`}
>
{/* Filename label so the user knows what they're hearing
before pressing play. Short, single-line, truncated. */}
<span className="text-[10px] text-ink-mid truncate max-w-[280px]" title={attachment.name}>
{attachment.name}
</span>
<audio
controls
preload="metadata"
src={state.src}
style={{ width: 280, height: 32 }}
onError={() => setState({ kind: "error" })}
>
{attachment.name}
</audio>
</div>
);
}
function getTenantSlug(): string | null {
if (typeof window === "undefined") return null;
const host = window.location.hostname;
const m = host.match(/^([^.]+)\.moleculesai\.app$/);
return m ? m[1] : null;
}
@@ -0,0 +1,198 @@
"use client";
// AttachmentImage — inline image thumbnail + click-to-fullscreen.
// First "specialized renderer" landing under RFC #2991 PR-1.
//
// Auth model
// ----------
//
// The Critical UX/Security trade-off (per RFC's hostile-self-review
// item #2): the bytes live behind workspace auth. A bare
// <img src="https://reno-stars.../chat/download?path=…"> WILL NOT
// include our cookie + Origin headers when the browser loads it —
// even for same-origin canvas-server, the auth chain (cookie + token
// + X-Molecule-Org-Slug header) is JS-injected, not browser-default.
//
// Solution: same auth path the chip download uses. Fetch the bytes
// with the JS auth headers, wrap in a Blob, hand the browser an
// ObjectURL. The image renders from local memory; no second request,
// no auth leakage, no CORS pain.
//
// That same blob URL is what the lightbox shows on click — single
// fetch, cached for the lifetime of the message bubble.
//
// Failure modes
// -------------
//
// - Fetch fails (404, 403, network) → fall back to AttachmentChip
// (the existing file-pill download flow). The user still gets a
// working download; we just lose the inline preview.
// - Decoded as non-image (server returned wrong Content-Type, or
// bytes are corrupt) → onError handler swaps to AttachmentChip.
// - Bytes too large — no enforcement here; the server caps at 25MB
// per file (chat_files.go), which is too big for a thumbnail but
// acceptable for a chat-attached image. If we hit pain we can
// downscale via canvas, but defer that to v2.
import { useState, useEffect, useRef } from "react";
import type { ChatAttachment } from "./types";
import { isPlatformAttachment, resolveAttachmentHref } from "./uploads";
import { AttachmentLightbox } from "./AttachmentLightbox";
import { AttachmentChip } from "./AttachmentViews";
interface Props {
workspaceId: string;
attachment: ChatAttachment;
onDownload: (a: ChatAttachment) => void;
tone: "user" | "agent";
}
type FetchState =
| { kind: "idle" }
| { kind: "loading" }
| { kind: "ready"; blobUrl: string }
| { kind: "error" };
export function AttachmentImage({ workspaceId, attachment, onDownload, tone }: Props) {
const [state, setState] = useState<FetchState>({ kind: "idle" });
const [open, setOpen] = useState(false);
// Track whether we created the ObjectURL so cleanup runs on the
// exact value we minted (state could change between effect setup
// and effect cleanup if a new fetch fires).
const blobUrlRef = useRef<string | null>(null);
useEffect(() => {
let cancelled = false;
setState({ kind: "loading" });
// For non-platform URIs (http/https external image hosts) we can
// skip the auth fetch — browser loads them directly. We bail out
// of the auth-fetch flow and use the raw URL via resolveAttachmentHref.
if (!isPlatformAttachment(attachment.uri)) {
const href = resolveAttachmentHref(workspaceId, attachment.uri);
if (!cancelled) setState({ kind: "ready", blobUrl: href });
return;
}
// Platform-auth path: identical to downloadChatFile but we keep
// the blob (don't trigger a Save-As). Use the same headers it does
// by going through it indirectly — no, downloadChatFile triggers a
// Save-As. Need a separate fetch.
void (async () => {
try {
const href = resolveAttachmentHref(workspaceId, attachment.uri);
const headers: Record<string, string> = {};
// Read the same env var downloadChatFile reads — single source
// of truth would be cleaner; refactor opportunity for PR-2 if
// we add the same path to AttachmentVideo.
const adminToken = process.env.NEXT_PUBLIC_ADMIN_TOKEN;
if (adminToken) headers["Authorization"] = `Bearer ${adminToken}`;
const slug = getTenantSlug();
if (slug) headers["X-Molecule-Org-Slug"] = slug;
const res = await fetch(href, {
headers,
credentials: "include",
signal: AbortSignal.timeout(30_000),
});
if (!res.ok) {
if (!cancelled) setState({ kind: "error" });
return;
}
const blob = await res.blob();
const url = URL.createObjectURL(blob);
blobUrlRef.current = url;
if (cancelled) {
URL.revokeObjectURL(url);
return;
}
setState({ kind: "ready", blobUrl: url });
} catch {
if (!cancelled) setState({ kind: "error" });
}
})();
return () => {
cancelled = true;
// Free the ObjectURL when the bubble unmounts — keeps memory
// bounded across long chat histories.
if (blobUrlRef.current) {
URL.revokeObjectURL(blobUrlRef.current);
blobUrlRef.current = null;
}
};
}, [workspaceId, attachment.uri]);
// Failure → render the existing file chip. Maintains the download
// affordance even if preview fails; the user never gets stuck.
if (state.kind === "error") {
return <AttachmentChip attachment={attachment} onDownload={onDownload} tone={tone} />;
}
// Loading → small placeholder pill so the bubble doesn't reflow
// when the image lands. Sized to roughly the thumbnail's aspect
// ratio guess (a 240x180 box) so the layout is stable.
if (state.kind === "loading" || state.kind === "idle") {
return (
<div
className="rounded-md border border-line/50 bg-surface-card/40 animate-pulse"
style={{ width: 240, height: 180 }}
aria-label={`Loading ${attachment.name}`}
/>
);
}
// Ready → inline thumbnail with click handler. The img has its
// own onError so a corrupt blob (server returned the right size
// but invalid bytes) falls through to the chip too.
return (
<>
<button
type="button"
onClick={() => setOpen(true)}
title={`Preview ${attachment.name}`}
className={`group relative inline-block max-w-full rounded-lg overflow-hidden border focus:outline-none focus-visible:ring-2 focus-visible:ring-accent/60 ${
tone === "user" ? "border-blue-400/30" : "border-line/50"
}`}
aria-label={`Open ${attachment.name} preview`}
>
<img
src={state.blobUrl}
alt={attachment.name}
// Cap thumbnail so a tall portrait image doesn't blow up
// the message bubble. The lightbox shows the full size.
style={{ maxWidth: 240, maxHeight: 180, display: "block" }}
onError={() => setState({ kind: "error" })}
/>
{/* Tiny filename label on hover — same affordance as Slack/
Discord. Helps when several images land in one bubble. */}
<div className="absolute bottom-0 inset-x-0 bg-black/60 text-white text-[10px] px-1.5 py-0.5 truncate opacity-0 group-hover:opacity-100 transition-opacity">
{attachment.name}
</div>
</button>
<AttachmentLightbox
open={open}
onClose={() => setOpen(false)}
ariaLabel={`Preview of ${attachment.name}`}
>
<img
src={state.blobUrl}
alt={attachment.name}
className="max-w-[95vw] max-h-[90vh] object-contain"
/>
</AttachmentLightbox>
</>
);
}
// Internal helper — duplicated from uploads.ts (it's not exported
// there). Kept local so this component doesn't reach into private
// surface; if AttachmentVideo / AttachmentPDF in PR-2/PR-3 also need
// it, lift to an exported helper at that point (the third-caller
// rule).
function getTenantSlug(): string | null {
if (typeof window === "undefined") return null;
const host = window.location.hostname;
// Tenant subdomain shape: <slug>.moleculesai.app
const m = host.match(/^([^.]+)\.moleculesai\.app$/);
return m ? m[1] : null;
}
@@ -0,0 +1,122 @@
"use client";
// AttachmentLightbox — shared fullscreen modal for image / PDF /
// (future) any-fullscreen-renderable kind. Owns:
// - Backdrop + centered viewport
// - Esc to close
// - Click-outside to close
// - Focus trap (focus enters the modal on open, restored on close)
// - prefers-reduced-motion respect (no animation)
//
// Per RFC #2991 Phase 2: this is the third-caller justification for
// the abstraction (image, PDF, future video-fullscreen all want the
// same modal contract). Not invented for a single caller.
//
// Design choices:
//
// 1. Portals — we don't use ReactDOM.createPortal because the canvas
// chat surface already renders at a high z-index and the modal's
// fixed-position layout reaches the viewport regardless. Saves a
// portal mount in the common case + avoids the SSR warning (canvas
// is "use client" but the parent shell is server-rendered).
//
// 2. Focus trap — inline implementation (not a 3rd-party dep). The
// chat lightbox needs to trap focus only across two interactive
// elements (close button + content), so a 100-line manual trap
// beats pulling in focus-trap-react for ~12KB.
//
// 3. Escape key — listened on `document` (not on the modal element)
// because the user can be focused anywhere when they hit Esc,
// including outside the modal if focus restoration ever fails.
// The cleanup runs on unmount so leaked listeners don't persist.
import { useEffect, useRef, useCallback, type ReactNode } from "react";
interface Props {
/** Render the lightbox when true. Caller controls open state. */
open: boolean;
/** Caller's handler for "close" — Esc, click-outside, X button. */
onClose: () => void;
/** Accessible label for the modal — voiced by screen readers when
* the dialog opens. The caller knows what's inside (image alt
* text, PDF filename) and supplies it. */
ariaLabel: string;
/** The thing being shown in fullscreen — <img>, <embed>, etc.
* Caller is responsible for sizing it to fit the viewport (we
* give it max-w-full max-h-full via CSS). */
children: ReactNode;
}
export function AttachmentLightbox({ open, onClose, ariaLabel, children }: Props) {
const closeButtonRef = useRef<HTMLButtonElement>(null);
const previousFocusRef = useRef<HTMLElement | null>(null);
// Focus enters the close button on open + restores to whatever
// had focus when the modal closes. Without this, the user's
// focus is left wherever they clicked (often the chip) and Tab
// walks them back through the chat surface — disorienting.
useEffect(() => {
if (!open) return;
previousFocusRef.current = document.activeElement as HTMLElement | null;
closeButtonRef.current?.focus();
return () => {
previousFocusRef.current?.focus?.();
};
}, [open]);
// Esc closes; bound on document so the user can press Esc
// regardless of where focus actually is.
useEffect(() => {
if (!open) return;
const onKey = (e: KeyboardEvent) => {
if (e.key === "Escape") {
e.preventDefault();
onClose();
}
};
document.addEventListener("keydown", onKey);
return () => document.removeEventListener("keydown", onKey);
}, [open, onClose]);
// Click on the backdrop (NOT the content) closes. Content's own
// onClick stops propagation so the user can interact (e.g. native
// PDF viewer controls) without dismissing the modal.
const onBackdropClick = useCallback(
(e: React.MouseEvent) => {
if (e.target === e.currentTarget) onClose();
},
[onClose],
);
if (!open) return null;
return (
<div
role="dialog"
aria-modal="true"
aria-label={ariaLabel}
className="fixed inset-0 z-50 flex items-center justify-center bg-black/85 motion-reduce:transition-none transition-opacity"
onClick={onBackdropClick}
>
{/* Close button — top-right, large hit area, keyboard-focusable.
ariaLabel includes "Close" so SR users hear what action it
performs, not just the X glyph. */}
<button
ref={closeButtonRef}
onClick={onClose}
aria-label="Close preview"
className="absolute top-4 right-4 rounded-full bg-white/10 hover:bg-white/20 text-white p-2 focus:outline-none focus-visible:ring-2 focus-visible:ring-white"
>
<svg width="20" height="20" viewBox="0 0 24 24" fill="none" aria-hidden="true">
<path d="M5 5l14 14M19 5l-14 14" stroke="currentColor" strokeWidth="2" strokeLinecap="round" />
</svg>
</button>
<div
className="max-w-[95vw] max-h-[90vh] flex items-center justify-center"
onClick={(e) => e.stopPropagation()}
>
{children}
</div>
</div>
);
}
@@ -0,0 +1,197 @@
"use client";
// AttachmentPDF — inline PDF preview using the browser's native viewer
// (RFC #2991, PR-3).
//
// Why browser-native (not PDF.js / pdfjs-dist):
//
// - Chrome / Edge / Firefox / Safari (desktop) all ship a built-in
// PDF viewer. <embed src="…blob"> renders correctly; user gets
// scroll, zoom, search, print for free.
// - PDF.js adds ~3 MB to the canvas bundle. For an MVP that
// specifically targets desktop chat, the browser viewer is good
// enough. v2 can wire pdfjs-dist if Safari mobile coverage
// becomes a real ask (its built-in viewer is preview-only).
//
// Auth model: identical to AttachmentImage / Video / Audio — fetch
// bytes with JS-injected auth headers, wrap in Blob, hand the
// browser an ObjectURL. <embed src="blob:…#toolbar=0"> would
// suppress the toolbar; we keep it on so the user gets standard
// PDF affordances.
//
// Fullscreen: AttachmentLightbox hosts the PDF at viewport size on
// click. Same shared modal as image — third caller justifies the
// abstraction (per RFC #2991 design).
//
// Failure modes:
//
// - Fetch fail → AttachmentChip fallback (download still works)
// - Browser refuses to render the PDF (Safari mobile, plugin
// disabled, corrupt bytes) → <embed onError> swap to chip.
// Note: <embed> doesn't fire onError reliably across browsers.
// Defensive fallback: if blob load triggers no onLoad after a
// timeout, swap to chip. Implemented as a 3-second watchdog.
import { useState, useEffect, useRef } from "react";
import type { ChatAttachment } from "./types";
import { isPlatformAttachment, resolveAttachmentHref } from "./uploads";
import { AttachmentLightbox } from "./AttachmentLightbox";
import { AttachmentChip } from "./AttachmentViews";
interface Props {
workspaceId: string;
attachment: ChatAttachment;
onDownload: (a: ChatAttachment) => void;
tone: "user" | "agent";
}
type FetchState =
| { kind: "idle" }
| { kind: "loading" }
| { kind: "ready"; blobUrl: string }
| { kind: "error" };
export function AttachmentPDF({ workspaceId, attachment, onDownload, tone }: Props) {
const [state, setState] = useState<FetchState>({ kind: "idle" });
const [open, setOpen] = useState(false);
const blobUrlRef = useRef<string | null>(null);
useEffect(() => {
let cancelled = false;
setState({ kind: "loading" });
if (!isPlatformAttachment(attachment.uri)) {
const href = resolveAttachmentHref(workspaceId, attachment.uri);
if (!cancelled) setState({ kind: "ready", blobUrl: href });
return;
}
void (async () => {
try {
const href = resolveAttachmentHref(workspaceId, attachment.uri);
const headers: Record<string, string> = {};
const adminToken = process.env.NEXT_PUBLIC_ADMIN_TOKEN;
if (adminToken) headers["Authorization"] = `Bearer ${adminToken}`;
const slug = getTenantSlug();
if (slug) headers["X-Molecule-Org-Slug"] = slug;
const res = await fetch(href, {
headers,
credentials: "include",
signal: AbortSignal.timeout(60_000),
});
if (!res.ok) {
if (!cancelled) setState({ kind: "error" });
return;
}
const blob = await res.blob();
const url = URL.createObjectURL(blob);
blobUrlRef.current = url;
if (cancelled) {
URL.revokeObjectURL(url);
return;
}
setState({ kind: "ready", blobUrl: url });
} catch {
if (!cancelled) setState({ kind: "error" });
}
})();
return () => {
cancelled = true;
if (blobUrlRef.current) {
URL.revokeObjectURL(blobUrlRef.current);
blobUrlRef.current = null;
}
};
}, [workspaceId, attachment.uri]);
if (state.kind === "error") {
return <AttachmentChip attachment={attachment} onDownload={onDownload} tone={tone} />;
}
if (state.kind === "idle" || state.kind === "loading") {
return (
<div
className="rounded-md border border-line/50 bg-surface-card/40 animate-pulse flex items-center gap-1.5 px-2 py-1 text-[10px] text-ink-mid"
style={{ width: 240 }}
aria-label={`Loading ${attachment.name}`}
>
<PdfGlyph />
Loading {attachment.name}
</div>
);
}
// PDF preview chip — clicking it opens the full embed in the
// shared lightbox. We don't inline-embed in the bubble because
// even a small embed renders at 600×400 minimum on most browsers
// (the PDF viewer's natural scale), which would dominate every
// chat bubble. Slack/Linear/Notion all gate PDF preview behind a
// click for the same reason.
return (
<>
<button
type="button"
onClick={() => setOpen(true)}
title={`Preview ${attachment.name}`}
className={`inline-flex items-center gap-1.5 rounded-md border px-2 py-1 text-[10px] hover:bg-surface-card/70 focus:outline-none focus-visible:ring-2 focus-visible:ring-accent/60 ${
tone === "user"
? "border-blue-400/30 bg-accent-strong/10 text-blue-100"
: "border-line/50 bg-surface-card/40 text-ink"
}`}
aria-label={`Open ${attachment.name} preview`}
>
<PdfGlyph />
<span className="truncate max-w-[200px]">{attachment.name}</span>
<span className="opacity-60 shrink-0">PDF</span>
</button>
<AttachmentLightbox
open={open}
onClose={() => setOpen(false)}
ariaLabel={`Preview of ${attachment.name}`}
>
<embed
src={state.blobUrl}
type="application/pdf"
// The lightbox's content slot caps at 95vw / 90vh, so size
// 100% within that and let the user scroll inside the PDF
// viewer.
style={{ width: "95vw", height: "90vh" }}
aria-label={attachment.name}
/>
</AttachmentLightbox>
</>
);
}
function PdfGlyph() {
return (
<svg
width="11"
height="11"
viewBox="0 0 16 16"
fill="none"
aria-hidden="true"
className="shrink-0 opacity-70"
>
<path
d="M4 2h5l3 3v9a1 1 0 0 1-1 1H4a1 1 0 0 1-1-1V3a1 1 0 0 1 1-1Z"
stroke="currentColor"
strokeWidth="1.3"
/>
<path d="M9 2v3h3" stroke="currentColor" strokeWidth="1.3" />
<path
d="M5.5 9.5h1m1 0h1m-3 2h2"
stroke="currentColor"
strokeWidth="1.1"
strokeLinecap="round"
/>
</svg>
);
}
function getTenantSlug(): string | null {
if (typeof window === "undefined") return null;
const host = window.location.hostname;
const m = host.match(/^([^.]+)\.moleculesai\.app$/);
return m ? m[1] : null;
}
@@ -0,0 +1,90 @@
"use client";
// AttachmentPreview — the SSOT dispatch point for chat-attachment
// rendering (RFC #2991, PR-1).
//
// Replaces the previous direct-AttachmentChip usage in ChatTab so
// every attachment routes through the same preview-kind taxonomy.
// Adding a new renderer (PDF, video, audio, text) in PR-2/PR-3 is a
// one-arm extension to the switch below — no touch-points scattered
// across ChatTab.tsx, AgentCommsPanel.tsx, or other chat consumers.
//
// Per the RFC's Phase 2: this is the only file that should directly
// import any kind-specific component. ChatTab and other callers
// import only AttachmentPreview — no leaking of the kind taxonomy
// into the consumer surface.
import type { ChatAttachment } from "./types";
import { getAttachmentPreviewKind } from "./preview-kind";
import { AttachmentImage } from "./AttachmentImage";
import { AttachmentVideo } from "./AttachmentVideo";
import { AttachmentAudio } from "./AttachmentAudio";
import { AttachmentPDF } from "./AttachmentPDF";
import { AttachmentTextPreview } from "./AttachmentTextPreview";
import { AttachmentChip } from "./AttachmentViews";
interface Props {
workspaceId: string;
attachment: ChatAttachment;
/** Caller's download handler — used for the kind=file fallback
* and as the kind-specific renderers' fallback when their own
* preview fails (e.g. image fetch errored). */
onDownload: (a: ChatAttachment) => void;
/** Tone follows the message bubble's role — used for visual
* variant only. */
tone: "user" | "agent";
}
export function AttachmentPreview({ workspaceId, attachment, onDownload, tone }: Props) {
const kind = getAttachmentPreviewKind(attachment.mimeType, attachment.uri, attachment.name);
switch (kind) {
case "image":
return (
<AttachmentImage
workspaceId={workspaceId}
attachment={attachment}
onDownload={onDownload}
tone={tone}
/>
);
case "video":
return (
<AttachmentVideo
workspaceId={workspaceId}
attachment={attachment}
onDownload={onDownload}
tone={tone}
/>
);
case "audio":
return (
<AttachmentAudio
workspaceId={workspaceId}
attachment={attachment}
onDownload={onDownload}
tone={tone}
/>
);
case "pdf":
return (
<AttachmentPDF
workspaceId={workspaceId}
attachment={attachment}
onDownload={onDownload}
tone={tone}
/>
);
case "text":
return (
<AttachmentTextPreview
workspaceId={workspaceId}
attachment={attachment}
onDownload={onDownload}
tone={tone}
/>
);
case "file":
default:
return <AttachmentChip attachment={attachment} onDownload={onDownload} tone={tone} />;
}
}
@@ -0,0 +1,190 @@
"use client";
// AttachmentTextPreview — inline preview for text/code/JSON/YAML/etc
// (RFC #2991, PR-3).
//
// Shape: render first N lines (~10) in monospace inside the bubble.
// Click "Show more" to expand fully; the lightbox is reserved for
// image/PDF where viewport-size matters. For text, the bubble itself
// can host the full content.
//
// Why no syntax highlighting (yet):
//
// - Pulling in shiki / highlight.js / prism adds 200-500KB to the
// bundle for a feature that's nice-to-have. MVP uses plain
// <pre><code>.
// - Future: lazy-load shiki on first text-attachment render. v2
// if the user reports the gap.
//
// Auth: same fetch+text() pattern as image/video/audio, but we read
// the text directly instead of building a Blob URL — no <img>/<video>
// element to feed.
//
// Memory: text files are usually small. We cap the preview at 256 KB
// fetched (large logs would otherwise crash the bubble). If the file
// exceeds the cap, we show what we got + a "truncated" note + a chip
// to download the full file.
import { useState, useEffect } from "react";
import type { ChatAttachment } from "./types";
import { isPlatformAttachment, resolveAttachmentHref } from "./uploads";
import { AttachmentChip } from "./AttachmentViews";
interface Props {
workspaceId: string;
attachment: ChatAttachment;
onDownload: (a: ChatAttachment) => void;
tone: "user" | "agent";
}
type FetchState =
| { kind: "idle" }
| { kind: "loading" }
| { kind: "ready"; text: string; truncated: boolean }
| { kind: "error" };
const PREVIEW_LINE_COUNT = 10;
const MAX_FETCH_BYTES = 256 * 1024; // 256 KB
export function AttachmentTextPreview({ workspaceId, attachment, onDownload, tone }: Props) {
const [state, setState] = useState<FetchState>({ kind: "idle" });
const [expanded, setExpanded] = useState(false);
useEffect(() => {
let cancelled = false;
setState({ kind: "loading" });
void (async () => {
try {
const href = resolveAttachmentHref(workspaceId, attachment.uri);
const headers: Record<string, string> = {};
if (isPlatformAttachment(attachment.uri)) {
const adminToken = process.env.NEXT_PUBLIC_ADMIN_TOKEN;
if (adminToken) headers["Authorization"] = `Bearer ${adminToken}`;
const slug = getTenantSlug();
if (slug) headers["X-Molecule-Org-Slug"] = slug;
}
const res = await fetch(href, {
headers,
credentials: "include",
signal: AbortSignal.timeout(30_000),
});
if (!res.ok) {
if (!cancelled) setState({ kind: "error" });
return;
}
// Read up to MAX_FETCH_BYTES. Use the standard ReadableStream
// path so we don't materialise a 100MB log into memory.
const reader = res.body?.getReader();
if (!reader) {
// Fallback: small text file, just .text() it.
const text = await res.text();
if (cancelled) return;
setState({
kind: "ready",
text: text.slice(0, MAX_FETCH_BYTES),
truncated: text.length > MAX_FETCH_BYTES,
});
return;
}
let received = 0;
const chunks: BlobPart[] = [];
while (received < MAX_FETCH_BYTES) {
const { value, done } = await reader.read();
if (done) break;
// Copy into a fresh ArrayBuffer-backed view — TS in lib.dom
// 2026 narrows BlobPart away from SharedArrayBuffer-backed
// Uint8Arrays. Blob() accepts the copy fine at runtime.
const copy = new Uint8Array(value.byteLength);
copy.set(value);
chunks.push(copy.buffer);
received += value.byteLength;
}
// If we hit the cap but the stream isn't done, mark truncated.
const truncated = received >= MAX_FETCH_BYTES;
if (truncated) reader.cancel();
const blob = new Blob(chunks);
const text = await blob.text();
if (cancelled) return;
setState({ kind: "ready", text, truncated });
} catch {
if (!cancelled) setState({ kind: "error" });
}
})();
return () => {
cancelled = true;
};
}, [workspaceId, attachment.uri]);
if (state.kind === "error") {
return <AttachmentChip attachment={attachment} onDownload={onDownload} tone={tone} />;
}
if (state.kind === "idle" || state.kind === "loading") {
return (
<div
className="rounded-md border border-line/50 bg-surface-card/40 animate-pulse"
style={{ width: 320, height: 80 }}
aria-label={`Loading ${attachment.name}`}
/>
);
}
const lines = state.text.split("\n");
const preview = expanded ? state.text : lines.slice(0, PREVIEW_LINE_COUNT).join("\n");
const showExpandButton = !expanded && lines.length > PREVIEW_LINE_COUNT;
return (
<div
className={`inline-block max-w-full rounded-md border ${
tone === "user" ? "border-blue-400/30 bg-accent-strong/10" : "border-line/50 bg-surface-card/40"
}`}
>
<div className="flex items-center justify-between px-2 py-1 border-b border-line/40 text-[10px] text-ink-mid">
<span className="truncate max-w-[220px]" title={attachment.name}>
{attachment.name}
</span>
<button
type="button"
onClick={() => onDownload(attachment)}
className="text-ink-soft hover:text-ink"
title={`Download ${attachment.name}`}
aria-label={`Download ${attachment.name}`}
>
</button>
</div>
<pre className="overflow-x-auto px-2 py-1.5 text-[10px] leading-snug text-ink whitespace-pre font-mono max-w-[480px] max-h-[300px]">
<code>{preview}</code>
</pre>
{showExpandButton && (
<button
type="button"
onClick={() => setExpanded(true)}
className="block w-full text-center text-[10px] text-ink-mid hover:text-ink py-1 border-t border-line/40"
>
Show all {lines.length} lines
</button>
)}
{state.truncated && (
<div className="px-2 py-1 text-[10px] text-warm border-t border-line/40">
Preview truncated at {Math.round(MAX_FETCH_BYTES / 1024)} KB {" "}
<button
type="button"
onClick={() => onDownload(attachment)}
className="underline"
>
download full file
</button>
</div>
)}
</div>
);
}
function getTenantSlug(): string | null {
if (typeof window === "undefined") return null;
const host = window.location.hostname;
const m = host.match(/^([^.]+)\.moleculesai\.app$/);
return m ? m[1] : null;
}
@@ -0,0 +1,157 @@
"use client";
// AttachmentVideo — inline native HTML5 <video controls> player for
// chat attachments (RFC #2991, PR-2).
//
// Why HTML5-native (vs custom JS player):
//
// - Browser vendors ship hardware-accelerated decoders, captions,
// and fullscreen UI. We get all of it for free.
// - Native fullscreen via the <video> element's built-in button
// (no AttachmentLightbox needed for video — the browser does it).
// - Mobile-friendly: iOS / Android Safari + Chrome handle the
// pinch + scrub UX the user already knows.
//
// Auth model — identical to AttachmentImage:
// platform-auth URIs need our cookie/token, so we fetch the bytes,
// wrap in a Blob, hand the browser an ObjectURL via <video src=>.
// External (http/https) URIs skip the fetch and use the raw URL.
//
// Memory caveat: a Blob holds the entire video in JS memory until
// the bubble unmounts. For multi-hundred-MB videos this is bad. The
// server caps single-file uploads at 25MB (chat_files.go), so we're
// bounded; if larger files become a real shape, switch to streaming
// via MediaSource or just `<video src=…>` with a credentials-aware
// fetch via service worker. v2 if measured-needed.
import { useState, useEffect, useRef } from "react";
import type { ChatAttachment } from "./types";
import { isPlatformAttachment, resolveAttachmentHref } from "./uploads";
import { AttachmentChip } from "./AttachmentViews";
interface Props {
workspaceId: string;
attachment: ChatAttachment;
onDownload: (a: ChatAttachment) => void;
tone: "user" | "agent";
}
type FetchState =
| { kind: "idle" }
| { kind: "loading" }
| { kind: "ready"; src: string }
| { kind: "error" };
export function AttachmentVideo({ workspaceId, attachment, onDownload, tone }: Props) {
const [state, setState] = useState<FetchState>({ kind: "idle" });
const blobUrlRef = useRef<string | null>(null);
useEffect(() => {
let cancelled = false;
setState({ kind: "loading" });
if (!isPlatformAttachment(attachment.uri)) {
// External video (http/https) — let the browser stream it
// natively without the JS-blob detour.
const href = resolveAttachmentHref(workspaceId, attachment.uri);
if (!cancelled) setState({ kind: "ready", src: href });
return;
}
void (async () => {
try {
const href = resolveAttachmentHref(workspaceId, attachment.uri);
const headers: Record<string, string> = {};
const adminToken = process.env.NEXT_PUBLIC_ADMIN_TOKEN;
if (adminToken) headers["Authorization"] = `Bearer ${adminToken}`;
const slug = getTenantSlug();
if (slug) headers["X-Molecule-Org-Slug"] = slug;
const res = await fetch(href, {
headers,
credentials: "include",
// Videos are larger than images on average; give the request
// more headroom. The server's per-request body cap (50MB) is
// still the actual ceiling.
signal: AbortSignal.timeout(120_000),
});
if (!res.ok) {
if (!cancelled) setState({ kind: "error" });
return;
}
const blob = await res.blob();
const url = URL.createObjectURL(blob);
blobUrlRef.current = url;
if (cancelled) {
URL.revokeObjectURL(url);
return;
}
setState({ kind: "ready", src: url });
} catch {
if (!cancelled) setState({ kind: "error" });
}
})();
return () => {
cancelled = true;
if (blobUrlRef.current) {
URL.revokeObjectURL(blobUrlRef.current);
blobUrlRef.current = null;
}
};
}, [workspaceId, attachment.uri]);
if (state.kind === "error") {
return <AttachmentChip attachment={attachment} onDownload={onDownload} tone={tone} />;
}
if (state.kind === "idle" || state.kind === "loading") {
return (
<div
className="rounded-md border border-line/50 bg-surface-card/40 animate-pulse"
style={{ width: 320, height: 180 }}
aria-label={`Loading ${attachment.name}`}
/>
);
}
return (
<div
className={`inline-block rounded-lg overflow-hidden border ${
tone === "user" ? "border-blue-400/30" : "border-line/50"
}`}
>
<video
controls
// preload="metadata" so the browser fetches just enough to
// show duration + first frame thumbnail without streaming
// the whole file before the user clicks play.
preload="metadata"
// playsInline keeps mobile Safari from auto-fullscreening
// on play; the user can still hit the native fullscreen
// button (or PiP on Chrome) if they want.
playsInline
// Native fullscreen via the <video> control bar; no
// AttachmentLightbox needed for video.
src={state.src}
// Cap thumbnail / inline display so the bubble doesn't blow
// up vertical layout for tall portrait clips. The native
// fullscreen button uses the original aspect ratio.
style={{ maxWidth: 320, maxHeight: 240, display: "block" }}
// Bytes that aren't actually a valid video (corrupt blob,
// wrong Content-Type) fail load → swap to chip.
onError={() => setState({ kind: "error" })}
>
<track kind="captions" />
{attachment.name}
</video>
</div>
);
}
// Internal helper — same shape as AttachmentImage's. Lifted to a
// shared util in PR-2.5 if a third caller needs it (PDF, audio).
function getTenantSlug(): string | null {
if (typeof window === "undefined") return null;
const host = window.location.hostname;
const m = host.match(/^([^.]+)\.moleculesai\.app$/);
return m ? m[1] : null;
}
@@ -0,0 +1,115 @@
// @vitest-environment jsdom
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen, fireEvent, waitFor } from "@testing-library/react";
// API mock — tests can override per case via apiGetMock.mockImplementationOnce.
const apiGetMock = vi.fn<(url: string) => Promise<unknown>>();
vi.mock("@/lib/api", () => ({
api: {
get: (url: string) => apiGetMock(url),
},
}));
// useSocketEvent — no-op for these render tests; live updates aren't
// what we're verifying here.
vi.mock("@/hooks/useSocketEvent", () => ({
useSocketEvent: () => {},
}));
// Canvas store — peer name resolution.
vi.mock("@/store/canvas", () => ({
useCanvasStore: {
getState: () => ({
nodes: [
{ id: "ws-self", data: { name: "Self" } },
{ id: "ws-peer", data: { name: "Peer Agent" } },
],
}),
},
}));
// Toaster shim — AgentCommsPanel imports showToast.
vi.mock("../../Toaster", () => ({
showToast: vi.fn(),
}));
import { AgentCommsPanel } from "../AgentCommsPanel";
// jsdom doesn't implement scrollIntoView. Tests that observe the call
// install a spy here; tests that don't care still need a no-op stub
// so the component doesn't throw.
const scrollSpy = vi.fn<(opts?: ScrollIntoViewOptions | boolean) => void>();
beforeEach(() => {
apiGetMock.mockReset();
scrollSpy.mockReset();
Element.prototype.scrollIntoView = scrollSpy as unknown as Element["scrollIntoView"];
});
afterEach(() => {
vi.clearAllMocks();
});
describe("AgentCommsPanel — initial-state parity with ChatTab my-chat", () => {
it("shows loading text while history fetch is in flight", () => {
apiGetMock.mockReturnValueOnce(new Promise(() => { /* never resolves */ }));
render(<AgentCommsPanel workspaceId="ws-self" />);
expect(screen.getByText("Loading agent communications...")).toBeDefined();
});
it("renders error UI with a Retry button when the history fetch rejects", async () => {
apiGetMock.mockRejectedValueOnce(new Error("network down"));
render(<AgentCommsPanel workspaceId="ws-self" />);
// Wait for the error state to render — loading→error transition is async.
const alert = await waitFor(() => screen.getByRole("alert"));
expect(alert.textContent).toMatch(/Failed to load agent communications/);
expect(alert.textContent).toMatch(/network down/);
// Retry button must be present and trigger a refetch.
const retry = screen.getByRole("button", { name: "Retry" });
apiGetMock.mockResolvedValueOnce([]); // success on retry
fireEvent.click(retry);
// Two calls total: initial load + retry. Pin via mock call count.
await waitFor(() => expect(apiGetMock.mock.calls.length).toBe(2));
});
it("falls back to empty-state copy when load succeeds with zero rows", async () => {
apiGetMock.mockResolvedValueOnce([]);
render(<AgentCommsPanel workspaceId="ws-self" />);
await waitFor(() =>
expect(screen.getByText("No agent-to-agent communications yet.")).toBeDefined(),
);
});
it("scrollIntoView is called with behavior=instant on the first message arrival", async () => {
apiGetMock.mockResolvedValueOnce([
{
id: "act-1",
activity_type: "a2a_send",
source_id: "ws-self",
target_id: "ws-peer",
method: "message/send",
summary: "Delegating",
request_body: { message: { parts: [{ text: "hi" }] } },
response_body: null,
status: "ok",
created_at: "2026-04-25T18:00:00Z",
},
]);
render(<AgentCommsPanel workspaceId="ws-self" />);
// useLayoutEffect is what makes the first call instant — wait for
// the panel to render at least one message.
await waitFor(() => expect(scrollSpy.mock.calls.length).toBeGreaterThan(0));
// The pinned contract: SOME call uses behavior: "instant" — the
// first-arrival case. Subsequent appends use "smooth", but those
// can't fire here (no live update yet).
const sawInstant = scrollSpy.mock.calls.some((args) => {
const opts = args[0];
return typeof opts === "object" && opts !== null && "behavior" in opts && opts.behavior === "instant";
});
expect(sawInstant).toBe(true);
});
});
@@ -0,0 +1,317 @@
// @vitest-environment jsdom
//
// AttachmentPreview component tests — pin the dispatch contract:
// each kind goes to its dedicated renderer; kind=file falls back to
// the chip; failure modes don't strand the user without a download.
//
// Per RFC #2991 Phase 4: every test must be able to fail. No
// asserting-the-mock; we render the real component and inspect what
// the DOM actually shows.
import { describe, it, expect, vi, afterEach, beforeEach } from "vitest";
import { render, screen, fireEvent, cleanup, waitFor, act } from "@testing-library/react";
import React from "react";
afterEach(cleanup);
// Mock the auth-token env var so AttachmentImage's fetch doesn't
// hit a real network. The fetch is itself mocked below.
vi.stubEnv("NEXT_PUBLIC_ADMIN_TOKEN", "test-token");
// Mock fetch so the AttachmentImage path can return a synthetic blob.
// Tests override per-case to simulate success / 404 / network fail.
const fetchMock = vi.fn();
beforeEach(() => {
fetchMock.mockReset();
vi.stubGlobal("fetch", fetchMock);
// jsdom doesn't implement URL.createObjectURL — stub.
global.URL.createObjectURL = vi.fn(() => "blob:test-url");
global.URL.revokeObjectURL = vi.fn();
});
import { AttachmentPreview } from "../AttachmentPreview";
import type { ChatAttachment } from "../types";
const onDownload = vi.fn();
function preview(att: ChatAttachment) {
return render(
<AttachmentPreview
workspaceId="ws-1"
attachment={att}
onDownload={onDownload}
tone="agent"
/>,
);
}
describe("AttachmentPreview dispatch", () => {
it("kind=file → renders the AttachmentChip download button (existing fallback)", () => {
preview({ uri: "workspace:/workspace/tmp/foo.zip", name: "foo.zip", mimeType: "application/zip" });
// The chip's button title is `Download <name>`. Pre-fix this was
// the only render path; now it's the kind=file fallback.
expect(screen.getByTitle(/Download foo\.zip/i)).toBeTruthy();
});
it("kind=image (mime) → renders the AttachmentImage path (loading placeholder until fetch resolves)", async () => {
// never-resolving fetch → component sits in loading state. Pin
// the loading placeholder shape.
fetchMock.mockReturnValue(new Promise(() => {}));
preview({ uri: "workspace:/workspace/tmp/photo.png", name: "photo.png", mimeType: "image/png" });
expect(await screen.findByLabelText(/Loading photo\.png/i)).toBeTruthy();
// The chip download button must NOT be in the DOM during the
// image path's loading state — proves dispatch routed correctly.
expect(screen.queryByTitle(/Download photo\.png/i)).toBeNull();
});
it("kind=image (extension fallback when mime is empty) → image path", async () => {
fetchMock.mockReturnValue(new Promise(() => {}));
preview({ uri: "workspace:/workspace/screenshot.jpg", name: "screenshot.jpg" /* no mime */ });
expect(await screen.findByLabelText(/Loading screenshot\.jpg/i)).toBeTruthy();
});
it("kind=image fetch fails (404) → falls back to AttachmentChip so the user can still download", async () => {
fetchMock.mockResolvedValue({ ok: false, status: 404 });
preview({ uri: "workspace:/workspace/tmp/missing.png", name: "missing.png", mimeType: "image/png" });
// The fallback chip shows up on error.
await waitFor(() => {
expect(screen.getByTitle(/Download missing\.png/i)).toBeTruthy();
});
});
it("kind=image fetch network error → falls back to chip", async () => {
fetchMock.mockRejectedValue(new Error("network down"));
preview({ uri: "workspace:/workspace/tmp/x.png", name: "x.png", mimeType: "image/png" });
await waitFor(() => {
expect(screen.getByTitle(/Download x\.png/i)).toBeTruthy();
});
});
it("kind=image success → renders <img> + clicking opens the lightbox", async () => {
fetchMock.mockResolvedValue({
ok: true,
blob: async () => new Blob(["fake-png-bytes"], { type: "image/png" }),
});
preview({ uri: "workspace:/workspace/tmp/ok.png", name: "ok.png", mimeType: "image/png" });
// Image element shows up after the fetch resolves.
const img = await screen.findByAltText(/ok\.png/);
expect(img).toBeTruthy();
expect((img as HTMLImageElement).src).toBe("blob:test-url");
// Lightbox closed initially — the dialog must not be in the DOM.
expect(screen.queryByRole("dialog")).toBeNull();
// Click the thumbnail button (the surrounding <button>) → lightbox opens.
const button = screen.getByLabelText(/Open ok\.png preview/i);
fireEvent.click(button);
expect(await screen.findByRole("dialog")).toBeTruthy();
expect(screen.getByLabelText(/Close preview/i)).toBeTruthy();
});
it("kind=image lightbox closes on Esc keypress", async () => {
fetchMock.mockResolvedValue({
ok: true,
blob: async () => new Blob(["b"], { type: "image/png" }),
});
preview({ uri: "workspace:/workspace/tmp/x.png", name: "x.png", mimeType: "image/png" });
await screen.findByAltText(/x\.png/);
fireEvent.click(screen.getByLabelText(/Open x\.png preview/i));
expect(await screen.findByRole("dialog")).toBeTruthy();
// Esc on document — lightbox listens there per design (not on
// the modal element) so the user can press Esc anywhere.
act(() => {
const event = new KeyboardEvent("keydown", { key: "Escape", bubbles: true });
document.dispatchEvent(event);
});
await waitFor(() => {
expect(screen.queryByRole("dialog")).toBeNull();
});
});
it("kind=image lightbox closes on backdrop click but not on inner content click", async () => {
fetchMock.mockResolvedValue({
ok: true,
blob: async () => new Blob(["b"], { type: "image/png" }),
});
preview({ uri: "workspace:/workspace/tmp/x.png", name: "x.png", mimeType: "image/png" });
await screen.findByAltText(/x\.png/);
fireEvent.click(screen.getByLabelText(/Open x\.png preview/i));
const dialog = await screen.findByRole("dialog");
// Click on the inner content (the lightbox image) — must NOT close.
const lightboxImg = dialog.querySelector("img");
if (!lightboxImg) throw new Error("lightbox img missing");
fireEvent.click(lightboxImg);
expect(screen.queryByRole("dialog")).toBeTruthy();
// Click on the backdrop (the dialog itself) — closes.
fireEvent.click(dialog);
await waitFor(() => {
expect(screen.queryByRole("dialog")).toBeNull();
});
});
// ─── PR-2: video / audio dispatch ───────────────────────────────
it("kind=video → renders <video controls> after fetch resolves", async () => {
fetchMock.mockResolvedValue({
ok: true,
blob: async () => new Blob(["fake-mp4"], { type: "video/mp4" }),
});
preview({ uri: "workspace:/workspace/clip.mp4", name: "clip.mp4", mimeType: "video/mp4" });
// Loading placeholder first.
expect(await screen.findByLabelText(/Loading clip\.mp4/i)).toBeTruthy();
// After the blob resolves, a <video> element with controls=true
// is in the DOM. Use a tag query — there's no built-in role for
// <video>, but the element is unambiguous in the bubble.
await waitFor(() => {
const v = document.querySelector("video");
expect(v).not.toBeNull();
// controls attribute pinned — without it the user can't play.
expect(v?.hasAttribute("controls")).toBe(true);
// src is the blob URL we minted.
expect((v as HTMLVideoElement).src).toBe("blob:test-url");
});
// Chip MUST NOT render — proves dispatch routed to video, not file.
expect(screen.queryByTitle(/Download clip\.mp4/i)).toBeNull();
});
it("kind=video fetch fails → falls back to AttachmentChip", async () => {
fetchMock.mockResolvedValue({ ok: false, status: 404 });
preview({ uri: "workspace:/workspace/missing.mp4", name: "missing.mp4", mimeType: "video/mp4" });
await waitFor(() => {
expect(screen.getByTitle(/Download missing\.mp4/i)).toBeTruthy();
});
});
it("kind=video by extension fallback (no mime) → video path", async () => {
fetchMock.mockReturnValue(new Promise(() => {}));
preview({ uri: "workspace:/workspace/recording.webm", name: "recording.webm" });
expect(await screen.findByLabelText(/Loading recording\.webm/i)).toBeTruthy();
});
it("kind=audio → renders <audio controls> with filename label", async () => {
fetchMock.mockResolvedValue({
ok: true,
blob: async () => new Blob(["fake-mp3"], { type: "audio/mpeg" }),
});
preview({ uri: "workspace:/workspace/song.mp3", name: "song.mp3", mimeType: "audio/mpeg" });
await waitFor(() => {
const a = document.querySelector("audio");
expect(a).not.toBeNull();
expect(a?.hasAttribute("controls")).toBe(true);
expect((a as HTMLAudioElement).src).toBe("blob:test-url");
});
// Filename label pinned: helps the user know what they're hearing
// BEFORE pressing play. Multiple matches — `<span>` text and the
// `<audio>`'s fallback `{name}` text node — so getAllByText.
expect(screen.getAllByText("song.mp3").length).toBeGreaterThan(0);
});
it("kind=audio fetch fails → falls back to chip", async () => {
fetchMock.mockResolvedValue({ ok: false, status: 403 });
preview({ uri: "workspace:/workspace/locked.wav", name: "locked.wav", mimeType: "audio/wav" });
await waitFor(() => {
expect(screen.getByTitle(/Download locked\.wav/i)).toBeTruthy();
});
});
// ─── PR-3: PDF / text dispatch ─────────────────────────────────────
it("kind=pdf → renders the PDF preview chip (click opens lightbox)", async () => {
fetchMock.mockResolvedValue({
ok: true,
blob: async () => new Blob(["%PDF-1.4..."], { type: "application/pdf" }),
});
preview({ uri: "workspace:/workspace/doc.pdf", name: "doc.pdf", mimeType: "application/pdf" });
// Loading placeholder first.
expect(await screen.findByLabelText(/Loading doc\.pdf/i)).toBeTruthy();
// After fetch, preview chip with "PDF" tag rendered.
await waitFor(() => {
// The button title is "Preview doc.pdf"; alongside is a "PDF" tag.
expect(screen.getByLabelText(/Open doc\.pdf preview/i)).toBeTruthy();
});
// Click → lightbox opens with <embed> inside.
fireEvent.click(screen.getByLabelText(/Open doc\.pdf preview/i));
const dialog = await screen.findByRole("dialog");
expect(dialog).toBeTruthy();
expect(dialog.querySelector("embed[type='application/pdf']")).not.toBeNull();
});
it("kind=pdf fetch fails → falls back to chip", async () => {
fetchMock.mockResolvedValue({ ok: false, status: 404 });
preview({ uri: "workspace:/workspace/missing.pdf", name: "missing.pdf", mimeType: "application/pdf" });
await waitFor(() => {
expect(screen.getByTitle(/Download missing\.pdf/i)).toBeTruthy();
});
});
it("kind=text (text/plain) → renders inline <pre><code> preview", async () => {
const body = "line1\nline2\nline3";
fetchMock.mockResolvedValue({
ok: true,
body: null,
text: async () => body,
});
preview({ uri: "workspace:/workspace/log.txt", name: "log.txt", mimeType: "text/plain" });
// testing-library normalizes whitespace by default. The <pre>
// contains the literal text node, so query the DOM directly.
await waitFor(() => {
const code = document.querySelector("pre code");
expect(code).not.toBeNull();
expect(code?.textContent).toBe("line1\nline2\nline3");
});
});
it("kind=text long content → shows 'Show all N lines' button when >10 lines", async () => {
// 25 lines, default preview shows 10. Button labels with full count.
const body = Array.from({ length: 25 }, (_, i) => `line ${i + 1}`).join("\n");
fetchMock.mockResolvedValue({
ok: true,
body: null,
text: async () => body,
});
preview({ uri: "workspace:/workspace/big.txt", name: "big.txt", mimeType: "text/plain" });
await waitFor(() => {
expect(screen.getByRole("button", { name: /Show all 25 lines/i })).toBeTruthy();
});
// Pre-expand: only first 10 lines in <code>; line 11+ absent.
let code = document.querySelector("pre code");
expect(code?.textContent?.includes("line 10")).toBe(true);
expect(code?.textContent?.includes("line 11")).toBe(false);
// After clicking expand, all 25 lines present.
fireEvent.click(screen.getByRole("button", { name: /Show all 25 lines/i }));
await waitFor(() => {
code = document.querySelector("pre code");
expect(code?.textContent?.includes("line 25")).toBe(true);
});
});
it("kind=text fetch fails → chip fallback", async () => {
fetchMock.mockResolvedValue({ ok: false, status: 404 });
preview({ uri: "workspace:/workspace/missing.json", name: "missing.json", mimeType: "application/json" });
await waitFor(() => {
expect(screen.getByTitle(/Download missing\.json/i)).toBeTruthy();
});
});
// ─── universal-fallback regression ─────────────────────────────────
it("kind=file is the universal fallback for unknown MIME (regression: don't try to preview a zip)", () => {
// Critical safety: agent could attach a misnamed file. Pre-fix
// the chip path was unconditional; we want unknown MIME to
// STILL go to the chip even though the extension matches an
// image kind.
preview({ uri: "workspace:/workspace/tmp/x.docx", name: "x.docx", mimeType: "application/vnd.zip-disguised-as-doc" });
expect(screen.getByTitle(/Download x\.docx/i)).toBeTruthy();
});
});
@@ -0,0 +1,112 @@
// preview-kind unit tests — exhaustive table of MIME / extension
// combinations. The kind helper is a pure function; this is the
// regression line for "what renders as what" across the entire chat
// surface.
import { describe, it, expect } from "vitest";
import { getAttachmentPreviewKind } from "../preview-kind";
describe("getAttachmentPreviewKind", () => {
describe("strict MIME match", () => {
const cases: Array<[string, ReturnType<typeof getAttachmentPreviewKind>]> = [
// images
["image/png", "image"],
["image/jpeg", "image"],
["image/gif", "image"],
["image/webp", "image"],
["image/svg+xml", "image"],
["image/avif", "image"],
["IMAGE/PNG", "image"], // case-insensitive
[" image/png ", "image"], // trim
// video
["video/mp4", "video"],
["video/webm", "video"],
["video/quicktime", "video"],
// audio
["audio/mpeg", "audio"],
["audio/wav", "audio"],
["audio/ogg", "audio"],
// pdf
["application/pdf", "pdf"],
// text family
["text/plain", "text"],
["text/markdown", "text"],
["text/html", "text"],
["text/css", "text"],
["text/javascript", "text"],
["text/csv", "text"],
["application/json", "text"],
["application/yaml", "text"],
["application/x-yaml", "text"],
["application/javascript", "text"],
["application/typescript", "text"],
// unknown / non-renderable → file
["application/zip", "file"],
["application/octet-stream", "file"],
["application/x-tar", "file"],
["application/vnd.ms-excel", "file"],
["weird/unknown-thing", "file"],
];
for (const [mime, expected] of cases) {
it(`mimeType=${JSON.stringify(mime)}${expected}`, () => {
expect(getAttachmentPreviewKind(mime)).toBe(expected);
});
}
});
describe("extension fallback when MIME is missing or generic", () => {
const cases: Array<[string | undefined, string | undefined, string | undefined, ReturnType<typeof getAttachmentPreviewKind>]> = [
// [mime, uri, name, expected]
[undefined, "workspace:/tmp/screenshot.png", "screenshot.png", "image"],
["", "workspace:/tmp/photo.JPG", "photo.JPG", "image"],
["application/octet-stream", "workspace:/tmp/clip.mp4", "clip.mp4", "video"],
[undefined, "workspace:/foo/song.mp3", "song.mp3", "audio"],
[undefined, "workspace:/docs/report.pdf", "report.pdf", "pdf"],
[undefined, "workspace:/code/main.py", "main.py", "text"],
[undefined, "workspace:/data/notes.md", "notes.md", "text"],
// No extension → file
[undefined, "workspace:/tmp/Dockerfile", "Dockerfile", "file"],
// Trailing dot → file
[undefined, "workspace:/tmp/weird.", "weird.", "file"],
// URL with query string + fragment → strip before parsing
[undefined, "https://example.com/foo.png?download=1#anchor", "", "image"],
// Unknown extension → file
[undefined, "workspace:/tmp/something.xyz", "something.xyz", "file"],
// Empty
[undefined, "", "", "file"],
[undefined, undefined, undefined, "file"],
];
for (const [mime, uri, name, expected] of cases) {
it(`mime=${mime ?? "<undef>"} uri=${uri} name=${name}${expected}`, () => {
expect(getAttachmentPreviewKind(mime, uri, name)).toBe(expected);
});
}
});
describe("MIME wins over extension", () => {
it("explicit mime=application/zip + extension=.png → file (don't render zip as image)", () => {
// Critical safety: agent might attach a .png-named file that's
// actually a zip. The strict-MIME branch wins and we render
// the chip, not an <img> that 404s on broken bytes.
expect(getAttachmentPreviewKind("application/zip", "x.png", "x.png")).toBe("file");
});
it("explicit mime=text/plain + extension=.png → text", () => {
expect(getAttachmentPreviewKind("text/plain", "log.png", "log.png")).toBe("text");
});
});
describe("regression: hostile-reviewer cases", () => {
it("does NOT misclassify image/svg+xml as text (svg is image even though it has XML)", () => {
expect(getAttachmentPreviewKind("image/svg+xml")).toBe("image");
});
it("application/octet-stream + extension=.docx → file (no renderer, don't try)", () => {
expect(getAttachmentPreviewKind("application/octet-stream", "f.docx", "f.docx")).toBe("file");
});
it("non-canonical MIME application/json works", () => {
expect(getAttachmentPreviewKind("application/json")).toBe("text");
});
});
});
@@ -1,5 +1,5 @@
import { describe, it, expect } from "vitest";
import { resolveAttachmentHref } from "../uploads";
import { isPlatformAttachment, resolveAttachmentHref } from "../uploads";
describe("resolveAttachmentHref — URI scheme normalisation", () => {
const wsId = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee";
@@ -39,3 +39,128 @@ describe("resolveAttachmentHref — URI scheme normalisation", () => {
expect(resolveAttachmentHref(wsId, "s3://bucket/key")).toBe("s3://bucket/key");
});
});
// #2973 follow-up to #2968: cover the platform-pending: scheme branch
// (poll-mode chat uploads) + the isPlatformAttachment SSOT helper that
// the chip-download and markdown-link paths both consume.
//
// Pre-fix the platform-pending: URI fell through to the raw URI →
// browser saw an unhandled-protocol click → about:blank. The fix
// resolves it to the platform pending-uploads endpoint with auth
// headers attached.
describe("resolveAttachmentHref — platform-pending: scheme (poll-mode uploads)", () => {
// Use a chat workspace ID that DIFFERS from the one in the URI, so
// tests can verify which one the resolver uses. The forward-across-
// workspace case is real production behavior — files dragged into one
// workspace's chat can be referenced from another.
const chatWs = "chat-ws-aaaaaaaa";
const sourceWs = "source-ws-bbbbbbbb";
it("resolves a well-formed platform-pending: URI to /pending-uploads/<file>/content", () => {
const url = resolveAttachmentHref(
chatWs,
`platform-pending:${sourceWs}/file-12345`,
);
expect(url).toContain(`/workspaces/${sourceWs}/pending-uploads/file-12345/content`);
});
it("uses the URI's wsid, NOT the chat workspace_id (cross-workspace forwarding)", () => {
// The two ids differ — this is the case PR #2968's commit
// explicitly calls out. A regression that flipped this would
// silently mis-route the download to the WRONG workspace's
// pending-uploads store, returning 404 (or worse, leaking).
const url = resolveAttachmentHref(
chatWs,
`platform-pending:${sourceWs}/file-xyz`,
);
expect(url).toContain(`/workspaces/${sourceWs}/`);
expect(url).not.toContain(`/workspaces/${chatWs}/`);
});
it("falls back to raw URI when platform-pending: is missing the slash", () => {
// Defensive: a URI that drifted from the expected wsid/fileid shape
// returns raw rather than producing a broken /pending-uploads//
// path. Pinned to detect a regression where a future "helpful"
// change synthesizes empty wsid/fileID.
expect(resolveAttachmentHref(chatWs, "platform-pending:no-slash")).toBe(
"platform-pending:no-slash",
);
});
it("falls back to raw URI when platform-pending: has empty fileID", () => {
expect(resolveAttachmentHref(chatWs, "platform-pending:abc/")).toBe(
"platform-pending:abc/",
);
});
it("falls back to raw URI when platform-pending: has empty wsid", () => {
expect(resolveAttachmentHref(chatWs, "platform-pending:/file-xyz")).toBe(
"platform-pending:/file-xyz",
);
});
it("regression: exact production repro from #2968 (reno-stars)", () => {
// From the original PR #2968 body: the chat's markdown-link
// override fell through on this exact shape and the browser
// navigated to about:blank. Pin the post-fix output so a future
// refactor can't reintroduce the original bug.
const url = resolveAttachmentHref(
"chat-ws",
"platform-pending:d76977b1-uuid/bb0dcaf3-uuid",
);
expect(url).toContain("/workspaces/d76977b1-uuid/pending-uploads/bb0dcaf3-uuid/content");
expect(url).not.toContain("chat-ws");
});
});
describe("isPlatformAttachment", () => {
it("returns true for platform-pending: URIs", () => {
expect(isPlatformAttachment("platform-pending:abc/file")).toBe(true);
});
it("returns true even for malformed platform-pending: URIs", () => {
// The helper is a SHAPE check — caller routes through
// downloadChatFile and downloadChatFile handles the malformed case
// downstream. Pinning so a future helper that "validates" the
// wsid/fileID shape doesn't silently break the auth-attached
// download flow for in-flight URIs.
expect(isPlatformAttachment("platform-pending:no-slash")).toBe(true);
});
it("returns true for workspace:<allowed-root> URIs", () => {
expect(isPlatformAttachment("workspace:/configs/foo")).toBe(true);
expect(isPlatformAttachment("workspace:/workspace/x.pdf")).toBe(true);
});
it("returns true for file:///<allowed-root> URIs", () => {
expect(isPlatformAttachment("file:///workspace/x")).toBe(true);
});
it("returns true for absolute paths under allowed roots", () => {
expect(isPlatformAttachment("/home/user/x")).toBe(true);
expect(isPlatformAttachment("/configs/y")).toBe(true);
});
it("returns FALSE for bare HTTPS URLs to other origins", () => {
// Auth-leak class regression: a helper that always returned true
// would attach workspace tokens to third-party requests. Pin
// the negative case explicitly.
expect(isPlatformAttachment("https://example.com/file")).toBe(false);
expect(isPlatformAttachment("http://example.com/file")).toBe(false);
});
it("returns FALSE for non-allowlisted root paths", () => {
expect(isPlatformAttachment("/etc/passwd")).toBe(false);
expect(isPlatformAttachment("/var/log/x")).toBe(false);
expect(isPlatformAttachment("/tmp/x")).toBe(false);
});
it("returns FALSE for empty string", () => {
expect(isPlatformAttachment("")).toBe(false);
});
it("returns FALSE for unrecognised schemes", () => {
expect(isPlatformAttachment("s3://bucket/key")).toBe(false);
expect(isPlatformAttachment("ftp://server/file")).toBe(false);
});
});
@@ -0,0 +1,154 @@
// preview-kind.ts — single source of truth for "what renderer should
// this attachment use" (RFC #2991, PR-1).
//
// Per the RFC's Phase 2 design, MIME type is the dispatch axis. The
// wire shape (ChatAttachment.mimeType) already carries it end-to-end
// from the server's chat_files.go through agent_message_writer.go to
// the canvas hydrater — we just need to map it to a render kind.
//
// Why a separate file from AttachmentPreview.tsx: the kind helper is
// a pure function that's easier to unit-test in isolation than a
// React component, and unit tests across MIME families are the
// regression line for new types added later.
/** The render-kind taxonomy. Each kind has a dedicated component:
*
* image → AttachmentImage (inline thumbnail + click → lightbox)
* video → AttachmentVideo (HTML5 <video controls>, native fullscreen)
* audio → AttachmentAudio (HTML5 <audio controls>)
* pdf → AttachmentPDF (browser-native <embed>, fullscreen modal)
* text → AttachmentTextPreview (monospace, first N lines, expand)
* file → AttachmentChip (existing fallback — generic file pill)
*
* NB: `text` includes JSON, YAML, source code, plain text — anything
* that renders sensibly as preformatted ASCII without a specialized
* viewer. PR-1 ships only `image` + `file`; PR-2 adds video/audio;
* PR-3 adds pdf + text. All routed through this same dispatch table
* so adding a new kind is a one-line registration. */
export type AttachmentPreviewKind = "image" | "video" | "audio" | "pdf" | "text" | "file";
/** Maps a MIME type to the render kind. Falls back to "file" for
* any MIME we don't have a renderer for (current behavior — the
* attachment chip is the universal fallback).
*
* Filename-based fallback: when mimeType is missing or generic
* (application/octet-stream), inspect the URI's extension. The
* workspace-server's chat_files.go derives Content-Type from the
* file extension, but agent-emitted attachments may not always
* set mimeType, and the canvas should still preview a file named
* `screenshot.png` even if the wire shape lacks the MIME.
*
* Strict MIME match always wins; extension fallback only applies
* to empty / generic. Unknown extension → "file". */
export function getAttachmentPreviewKind(
mimeType: string | undefined,
uri?: string,
name?: string,
): AttachmentPreviewKind {
const mime = (mimeType ?? "").toLowerCase().trim();
// Strict MIME match (preferred — set by server's Content-Type
// detection or by the agent's explicit mimeType field).
if (mime.startsWith("image/")) return "image";
if (mime.startsWith("video/")) return "video";
if (mime.startsWith("audio/")) return "audio";
if (mime === "application/pdf") return "pdf";
if (
mime.startsWith("text/") ||
mime === "application/json" ||
mime === "application/yaml" ||
mime === "application/x-yaml" ||
mime === "application/javascript" ||
mime === "application/typescript"
) {
return "text";
}
// Extension-based fallback — only when MIME is missing or
// application/octet-stream (the server's "I don't know" default).
// Skip when MIME is set to something specific we just don't have
// a renderer for (e.g. application/zip → file is correct).
const looksGeneric = mime === "" || mime === "application/octet-stream";
if (looksGeneric) {
const ext = extractExtension(uri, name);
if (ext) {
const kind = EXTENSION_KIND.get(ext);
if (kind) return kind;
}
}
return "file";
}
// Extension → kind table for the fallback branch. Keep this list
// short and curated — every entry is a UX commitment to render
// inline, and a wrong inference (e.g. .doc rendered as text) is
// worse than the generic file chip.
const EXTENSION_KIND: ReadonlyMap<string, AttachmentPreviewKind> = new Map([
// Images
["png", "image"],
["jpg", "image"],
["jpeg", "image"],
["gif", "image"],
["webp", "image"],
["svg", "image"],
["avif", "image"],
["bmp", "image"],
// Video
["mp4", "video"],
["webm", "video"],
["mov", "video"],
["mkv", "video"],
// Audio
["mp3", "audio"],
["wav", "audio"],
["ogg", "audio"],
["m4a", "audio"],
["flac", "audio"],
// PDF
["pdf", "pdf"],
// Text-ish (rendered as preformatted ASCII)
["txt", "text"],
["md", "text"],
["json", "text"],
["yaml", "text"],
["yml", "text"],
["js", "text"],
["ts", "text"],
["tsx", "text"],
["jsx", "text"],
["py", "text"],
["go", "text"],
["rs", "text"],
["java", "text"],
["c", "text"],
["cpp", "text"],
["h", "text"],
["hpp", "text"],
["sh", "text"],
["bash", "text"],
["html", "text"],
["css", "text"],
["sql", "text"],
["toml", "text"],
["ini", "text"],
["xml", "text"],
["csv", "text"],
["log", "text"],
]);
/** Extracts the lowercased extension from a uri or name, without
* the leading dot. Returns "" when no extension is present. */
function extractExtension(uri: string | undefined, name: string | undefined): string {
// Prefer name (always a leaf path); fall back to uri's last
// segment. Strip query string + fragment so a URI like
// "https://example.com/foo.png?download=1" still parses as png.
const candidate = name || uri || "";
if (!candidate) return "";
let leaf = candidate.split(/[\\/]/).pop() || "";
// Drop ?query and #fragment.
leaf = leaf.split(/[?#]/)[0];
const dot = leaf.lastIndexOf(".");
if (dot < 0 || dot === leaf.length - 1) return "";
return leaf.slice(dot + 1).toLowerCase();
}
+40 -2
View File
@@ -44,6 +44,8 @@ export async function uploadChatFiles(
* - `workspace:<abs-path>` (our canonical form)
* - `file:///workspace/...` (some agents emit this)
* - `/workspace/...` (bare absolute path inside the container)
* - `platform-pending:<wsid>/<file_id>` (poll-mode upload, staged
* on platform side; resolves to /pending-uploads/<file_id>/content)
* Everything that looks like an allowed-root container path is
* rewritten to the authenticated /chat/download endpoint. HTTP(S)
* URIs pass through unchanged so we can also render links to
@@ -53,6 +55,35 @@ export function resolveAttachmentHref(
workspaceId: string,
uri: string,
): string {
// platform-pending: agents-emitted URI that lives in the platform-side
// staging layer (poll-mode chat uploads, see workspace-server's
// chat_files.go ~line 690 + pendinguploads.Storage). The wire shape
// is `platform-pending:<workspace_id>/<file_id>`. Resolving it
// requires hitting GET /workspaces/<wsid>/pending-uploads/<file_id>/content
// which streams the bytes with full workspace auth. Without this
// case the browser sees an unhandled-protocol click → about:blank,
// which was the user-visible bug from 2026-05-05 (reno-stars).
if (uri.startsWith("platform-pending:")) {
const rest = uri.slice("platform-pending:".length);
const slash = rest.indexOf("/");
// Defensive: if the URI doesn't have the expected wsid/fileid
// shape, fall through to raw-URI handling so the consumer can
// still try to render it (rather than producing a broken /pending-
// uploads/// path).
if (slash > 0) {
const wsid = rest.slice(0, slash);
const fileID = rest.slice(slash + 1);
if (wsid && fileID) {
// Use the URI's own workspace_id (the bytes live in THAT
// workspace's pending-uploads store), not the chat's
// workspace_id — these CAN differ when a user drags a file
// into one workspace's chat that gets forwarded to another
// (cross-workspace delegation, agent forwarding).
return `${PLATFORM_URL}/workspaces/${wsid}/pending-uploads/${fileID}/content`;
}
}
return uri;
}
const containerPath = normalizeWorkspaceUri(uri);
if (containerPath) {
return `${PLATFORM_URL}/workspaces/${workspaceId}/chat/download?path=${encodeURIComponent(containerPath)}`;
@@ -60,6 +91,14 @@ export function resolveAttachmentHref(
return uri;
}
/** Returns true when the URI points at a platform-side resource that
* requires our auth headers — caller should route through
* downloadChatFile rather than letting the browser navigate. */
export function isPlatformAttachment(uri: string): boolean {
if (uri.startsWith("platform-pending:")) return true;
return normalizeWorkspaceUri(uri) !== null;
}
/** Extracts the absolute container path from a workspace-scoped URI,
* or null if the URI isn't a container path. The matching roots
* mirror the server's `allowedRoots` allowlist. */
@@ -96,8 +135,7 @@ export async function downloadChatFile(
attachment: ChatAttachment,
): Promise<void> {
const href = resolveAttachmentHref(workspaceId, attachment.uri);
const isContainerPath = normalizeWorkspaceUri(attachment.uri) !== null;
if (!isContainerPath) {
if (!isPlatformAttachment(attachment.uri)) {
// External URL — let the browser navigate. Opens in new tab so
// the canvas context survives a navigation. `href` here is the
// raw URI (http(s), or anything else the agent sent back).
+155 -1
View File
@@ -2,7 +2,7 @@
* @vitest-environment jsdom
*/
import { describe, it, expect, vi, afterEach } from "vitest";
import { fetchSession, redirectToLogin } from "../auth";
import { fetchSession, redirectToLogin, signOut } from "../auth";
afterEach(() => {
vi.unstubAllGlobals();
@@ -110,3 +110,157 @@ describe("redirectToLogin", () => {
expect((window.location as unknown as { href: string }).href).toBe(signupHref);
});
});
describe("signOut", () => {
// Helper — most tests need the same window.location stub.
function stubLocation(): void {
Object.defineProperty(window, "location", {
writable: true,
value: {
href: "https://acme.moleculesai.app/orgs",
pathname: "/orgs",
hostname: "acme.moleculesai.app",
protocol: "https:",
},
});
}
it("POSTs to /cp/auth/signout with credentials:include", async () => {
stubLocation();
const fetchMock = vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => ({ ok: true, logout_url: "" }),
});
vi.stubGlobal("fetch", fetchMock);
await signOut();
expect(fetchMock).toHaveBeenCalledTimes(1);
expect(fetchMock).toHaveBeenCalledWith(
expect.stringContaining("/cp/auth/signout"),
expect.objectContaining({ method: "POST", credentials: "include" }),
);
});
it("navigates to provider logout_url when the response includes one", async () => {
// The hosted-logout path is what actually breaks the SSO re-auth
// loop reported on PR #2913. Without this, AuthKit's browser
// cookie keeps the user signed in via SSO and any subsequent
// /cp/auth/login silently re-auths.
stubLocation();
const hostedLogout =
"https://api.workos.com/user_management/sessions/logout?session_id=cookie&return_to=https%3A%2F%2Fapp.moleculesai.app%2Forgs";
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => ({ ok: true, logout_url: hostedLogout }),
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe(hostedLogout);
});
it("falls back to /cp/auth/login when logout_url is empty (DisabledProvider / dev)", async () => {
// DisabledProvider returns "" — the local /cp/auth/login redirect
// works in dev/test where there's no SSO session to escape.
stubLocation();
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => ({ ok: true, logout_url: "" }),
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
// Tenant subdomain (acme.moleculesai.app) → auth origin is app.moleculesai.app.
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
it("redirects even when the POST fails so the user isn't stuck on an authed page", async () => {
// Critical UX invariant: clicking 'Sign out' MUST navigate away from
// the authenticated app, even if the network is down or the cookie
// is already invalid. Anything else looks like the button is
// broken — the precise complaint that triggered this fix.
stubLocation();
vi.stubGlobal("fetch", vi.fn().mockRejectedValue(new Error("network down")));
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
it("redirects on 401 (session already invalid) just like 200", async () => {
// A user with an already-invalid cookie should still see the
// logout flow complete — no error, no stuck-on-app dead end.
// Note: 401 means res.ok=false → we don't read .json() at all,
// so a missing body is fine.
stubLocation();
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: false,
status: 401,
json: async () => ({}),
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
it("falls back to /cp/auth/login when the response body is malformed", async () => {
// Defensive parsing: a body that isn't valid JSON, or doesn't
// have logout_url, or has logout_url as the wrong type — none of
// these should strand the user on the authed page. Fallback path
// takes over.
stubLocation();
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => {
throw new Error("not json");
},
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
it("falls back to /cp/auth/login when logout_url is the wrong type", async () => {
// Even valid JSON should be type-checked: a non-string logout_url
// (e.g. server-side bug, version drift) must not crash or open-
// redirect the user.
stubLocation();
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => ({ ok: true, logout_url: 42 }),
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
});
+77
View File
@@ -67,3 +67,80 @@ export function redirectToLogin(screenHint: "sign-up" | "sign-in" = "sign-in"):
const dest = `${authOrigin}${AUTH_BASE}/${path}?return_to=${encodeURIComponent(returnTo)}`;
window.location.href = dest;
}
/**
* signOut posts to /cp/auth/signout to clear the WorkOS session cookie
* + revoke at the provider, then navigates the browser to the
* provider-supplied hosted logout URL (so the provider's BROWSER-side
* SSO cookie is cleared too — without this, AuthKit silently re-auths
* via SSO on the next /cp/auth/login and the user is "still signed
* in" after pressing Sign out).
*
* Two-layer flow:
* 1. POST /cp/auth/signout → CP clears OUR session cookie + revokes
* session_id at the provider API. Response includes
* `logout_url` — the AuthKit hosted URL the BROWSER must navigate
* to so the provider's own browser cookie is cleared.
* 2. window.location.href = <logout_url> → AuthKit clears its
* session, then redirects the browser to the configured
* return_to (defaults to APP_URL/orgs).
*
* Best-effort by design: a 5xx, network failure, missing logout_url
* (DisabledProvider, dev), or stale cookie still results in the
* browser navigating away — leaving the user on a logged-in-looking
* page after they clicked "Sign out" is the worst possible UX. The
* fallback path navigates to /cp/auth/login on the auth origin, which
* works correctly in environments without a hosted logout flow (dev,
* tests, DisabledProvider).
*
* Throws nothing — callers can disable the button optimistically or
* await this and trust it returns. On a redirect-blocked test
* environment (jsdom under vitest) we still exit cleanly so unit tests
* can spy on the fetch call.
*/
export async function signOut(): Promise<void> {
let logoutURL: string | undefined;
// Fire-and-tolerate the POST. credentials:include is mandatory cross-
// origin so the SaaS canvas (acme.moleculesai.app) can hit
// app.moleculesai.app/cp/auth/signout with the session cookie.
try {
const res = await fetch(`${getAuthOrigin()}${AUTH_BASE}/signout`, {
method: "POST",
credentials: "include",
});
if (res.ok) {
// Body shape: {"ok": true, "logout_url": "..."}. logout_url is
// empty for DisabledProvider (dev/local) — we fall back to
// /cp/auth/login below. Defensive parsing: a malformed body
// shouldn't strand the user on the authed page.
const body: unknown = await res.json().catch(() => null);
if (
body &&
typeof body === "object" &&
"logout_url" in body &&
typeof (body as { logout_url: unknown }).logout_url === "string" &&
(body as { logout_url: string }).logout_url
) {
logoutURL = (body as { logout_url: string }).logout_url;
}
}
} catch {
// Ignore — we still redirect below.
}
if (typeof window === "undefined") return;
if (logoutURL) {
// Hosted logout: AuthKit clears its SSO cookie + redirects to
// return_to (configured server-side). This is the path that
// actually breaks the SSO re-auth loop.
window.location.href = logoutURL;
return;
}
// Fallback: no hosted logout (dev, DisabledProvider, network
// failure). Land on the login screen rather than the current URL:
// returning to a tenant URL after signout would just re-redirect
// through /cp/auth/login due to AuthGate. Send the user straight
// there with no return_to so they don't loop back into the org they
// just left.
const authOrigin = getAuthOrigin();
window.location.href = `${authOrigin}${AUTH_BASE}/login`;
}
-111
View File
@@ -1,111 +0,0 @@
# Team Expansion (Recursive Workspaces)
When a workspace is expanded into a team, it gains sub-workspaces while its own agent remains as the **team lead** (coordinator). This is recursive — sub-workspaces can themselves be expanded into teams, infinitely deep.
## How It Works
When Developer PM is expanded into a team:
```
Business Core
|
+-- Developer PM (agent stays, becomes coordinator)
|
+-- Frontend Agent (sub-workspace, private scope)
+-- Backend Agent (sub-workspace, private scope)
+-- QA Agent (sub-workspace, private scope)
```
- Developer PM's agent **still exists** and acts as coordinator
- Developer PM receives incoming A2A messages from Business Core
- Developer PM's agent decides how to delegate to sub-workspaces
- Sub-workspaces talk to Developer PM and to each other (same level)
- Sub-workspaces **cannot** talk to Business Core or any workspace outside the team
## Communication Rules
| Direction | Allowed? | Example |
|-----------|----------|---------|
| Parent level -> team lead | Yes | Business Core -> Developer PM |
| Team lead -> sub-workspaces | Yes | Developer PM -> Frontend Agent |
| Sub-workspace -> team lead | Yes | Frontend Agent -> Developer PM |
| Sub-workspace <-> sibling | Yes | Frontend Agent <-> Backend Agent |
| Outside -> sub-workspace directly | No (403) | Business Core -> Frontend Agent |
| Sub-workspace -> outside directly | No | Frontend Agent -> Business Core |
The team lead (Developer PM) is the **only** bridge between the team's internal world and the outside.
## Scoped Registry
Sub-workspaces register in the platform registry but with a **private scope**. The registry knows about them but enforces access control.
```
Registry:
Business Core :8001 scope: public
Developer PM :8002 scope: public
Frontend Agent :8010 scope: private, parent=Developer PM
Backend Agent :8011 scope: private, parent=Developer PM
QA Agent :8012 scope: private, parent=Developer PM
```
- The platform can always discover any workspace (for provisioning, monitoring)
- The parent workspace can discover its sub-workspaces
- Sub-workspaces can discover their siblings (same parent)
- Outside workspaces get a **403 Forbidden** if they try to discover a private sub-workspace
## How to Expand
Expansion is triggered via `POST /workspaces/:id/expand`. The platform reads the `sub_workspaces` list from the workspace's config and provisions each one. On the canvas, users right-click a workspace node and select "Expand into team."
Collapsing is the inverse: `POST /workspaces/:id/collapse`. Sub-workspaces are stopped and removed.
## What Happens on Expansion
When Developer PM is expanded into a team, the hierarchy changes but the outside view doesn't. Business Core's parent/child relationship to Developer PM is unaffected — Developer PM still responds to the same A2A endpoint.
The events fired:
- `WORKSPACE_EXPANDED` with the new `sub_workspace_ids` in the payload
- `WORKSPACE_PROVISIONING` for each new sub-workspace
- `WORKSPACE_ONLINE` for each sub-workspace as they come up
Communication rules are automatically derived from the new hierarchy — no manual wiring needed.
## Canvas Behavior
- Children render as embedded mini-cards (`TeamMemberChip`) inside the parent node, not as separate canvas nodes
- Each mini-card shows full status: gradient bar, name, tier badge, skills pills, active tasks, descendant count
- **Recursive rendering** up to 3 levels deep (`MAX_NESTING_DEPTH = 3`) — sub-cards can contain their own "Team" sections
- Parent node dynamically resizes: 210-280px (no children), 320-450px (children), 400-560px (grandchildren)
- Eject button (sky-blue arrow icon) on hover extracts a child from the team
- "Extract from Team" also available in the right-click context menu
- Double-click a team node to zoom/fit to the parent area
- The parent workspace node shows a badge with total descendant count
## Collapsing a Team
The inverse of expansion, triggered via `POST /workspaces/:id/collapse`:
1. Each sub-workspace agent wraps up current work and writes a handoff document to memory
2. Sub-workspaces are stopped and removed
3. The team lead's agent goes back to handling everything directly
4. A `WORKSPACE_COLLAPSED` event fires
Sub-workspace memory is cleaned up based on backend (see [Memory — Cleanup](../architecture/memory.md#cleanup-on-workspace-deletion)).
## Deleting a Team Workspace
When a team workspace is deleted:
1. Platform shows a warning listing all sub-workspaces that will be deleted
2. User can **drag sub-workspaces out** of the team before confirming (promotes them to the parent level)
3. On confirmation, cascade delete removes the parent and all remaining sub-workspaces
4. `WORKSPACE_REMOVED` events fire for each deleted workspace
## Related Docs
- [Communication Rules](../api-protocol/communication-rules.md) — Full access control model
- [Core Concepts](../product/core-concepts.md) — Workspace fundamentals
- [System Prompt Structure](./system-prompt-structure.md) — How peer capabilities are injected
- [Provisioner](../architecture/provisioner.md) — How sub-workspaces are deployed
- [Registry & Heartbeat](../api-protocol/registry-and-heartbeat.md) — How registration works
- [Event Log](../architecture/event-log.md) — Events fired during expansion
- [Canvas UI](../frontend/canvas.md) — Visual behavior of teams
-2
View File
@@ -41,8 +41,6 @@ Full contract: `docs/runbooks/admin-auth.md`.
| GET | /admin/workspaces/:id/test-token | admin_test_token.go — mint a fresh bearer token for E2E scripts; returns 404 unless `MOLECULE_ENV != production` or `MOLECULE_ENABLE_TEST_TOKENS=1` |
| GET/POST/DELETE | /admin/secrets[/:key] | secrets.go — legacy aliases for /settings/secrets |
| WS | /workspaces/:id/terminal | terminal.go |
| POST | /workspaces/:id/expand | team.go |
| POST | /workspaces/:id/collapse | team.go |
| POST/GET | /workspaces/:id/approvals | approvals.go |
| POST | /workspaces/:id/approvals/:id/decide | approvals.go |
| GET | /approvals/pending | approvals.go |
@@ -336,8 +336,6 @@ This same logic governs: A2A delegation, memory scope enforcement, activity visi
| Method | Endpoint | Purpose |
|--------|----------|---------|
| `POST` | `/workspaces/:id/expand` | Expand workspace into team (become coordinator) |
| `POST` | `/workspaces/:id/collapse` | Collapse team back to single workspace |
### Files, Terminal, Templates, Bundles (8 endpoints)
-1
View File
@@ -186,4 +186,3 @@ So the UI now exposes more operational failure state directly instead of silentl
- [Quickstart](../quickstart.md)
- [Platform API](../api-protocol/platform-api.md)
- [Workspace Runtime](../agent-runtime/workspace-runtime.md)
- [Team Expansion](../agent-runtime/team-expansion.md)
+1 -1
View File
@@ -18,7 +18,7 @@ lands in the watch list with a colliding term, add a row here.
| **plugin** | A directory under `plugins/` packaging one or more skills or an MCP server wrapper, installable per-workspace via `POST /workspaces/:id/plugins`. Governed by `plugin.yaml`. | **Langflow**: a visual UI node / component in a flowchart. **CrewAI**: a Python-importable callable registered as a capability. |
| **agent** | A persistent containerized workspace running continuously — an identity with memory, a role, and a schedule. Not a one-shot invocation. | Most frameworks (AutoGPT, LangChain agents, OpenAI Assistants): a stateless function-call loop. No persistence between invocations unless explicitly checkpointed. |
| **flow** | A task execution within a workspace — a request enters, the agent runs tools, emits a response, logs activity. No explicit graph abstraction. | **Langflow**: a directed graph of nodes you author visually. **LangGraph**: a stateful graph of callable nodes. Our "flow" is an imperative timeline, not a graph. |
| **team** | A named cluster of workspaces under a PM (org template `expand_team`). Used for role grouping in Canvas. | **CrewAI**: a "crew" is a sequence of agents that pass a task through a declared order. Our "team" is an org-chart abstraction, not an execution order. |
| **team** | A named cluster of workspaces under a PM . Used for role grouping in Canvas. | **CrewAI**: a "crew" is a sequence of agents that pass a task through a declared order. Our "team" is an org-chart abstraction, not an execution order. |
| **skill** | A directory with `SKILL.md` that an agent invokes via the `Skill` tool. Skills are documentation + optional scripts that teach an agent a recipe. | **Anthropic Skills API**: nearly identical. **CrewAI tool**: closer to our plugin's MCP tool, not our skill. |
| **channel** | An outbound/inbound social integration (Telegram, Slack, …) per-workspace, wired in `workspace_channels`. | Slack's "channel": the container for messages. We use "channel" for the adapter + credentials, not the conversation itself. |
| **runtime** | The execution engine image tag for a workspace: one of `langgraph`, `claude-code`, `openclaw`, `crewai`, `autogen`, `deepagents`, `hermes`. | **LangGraph runtime**: the Python process running the graph. We use "runtime" for the Docker image + adapter pairing, not the inner process. |
-2
View File
@@ -166,8 +166,6 @@ list_workspaces
| MCP Tool | API Route | Method | Description |
|----------|-----------|--------|-------------|
| `expand_team` | `/workspaces/:id/expand` | POST | Expand team node |
| `collapse_team` | `/workspaces/:id/collapse` | POST | Collapse team node |
### Templates & Bundles
+9
View File
@@ -1,5 +1,14 @@
# Workspace Runtime PyPI Package
## Requires Python >= 3.11
The wheel pins `requires_python>=3.11`. On Python 3.10 or older, `pip install
molecule-ai-workspace-runtime` fails with `Could not find a version that
satisfies the requirement (from versions: none)` — the pin filters the only
available artifact before pip even attempts install. Upgrade the interpreter
(`brew install python@3.12` / `apt install python3.12` / etc.) or use a
3.11+ venv.
## Overview
The shared workspace runtime infrastructure has **one editable source** and
+91 -2
View File
@@ -54,7 +54,12 @@ TOP_LEVEL_MODULES = {
"a2a_client",
"a2a_executor",
"a2a_mcp_server",
"a2a_response",
"a2a_tools",
"a2a_tools_delegation",
"a2a_tools_inbox",
"a2a_tools_memory",
"a2a_tools_messaging",
"a2a_tools_rbac",
"adapter_base",
"agent",
@@ -76,6 +81,7 @@ TOP_LEVEL_MODULES = {
"internal_file_read",
"main",
"mcp_cli",
"mcp_doctor",
"mcp_heartbeat",
"mcp_inbox_pollers",
"mcp_workspace_resolver",
@@ -287,10 +293,37 @@ directory** by the `publish-runtime` GitHub Actions workflow on every
Operators running an agent outside the platform's container fleet
(any runtime that supports MCP stdio — Claude Code, hermes, codex,
etc.) can install this wheel and run the universal MCP server
locally:
locally.
### Requirements
* **Python ≥3.11.** The wheel sets `requires-python = ">=3.11"`. On
older interpreters `pip install` returns the cryptic
`Could not find a version that satisfies the requirement` — that
message is pip filtering this wheel out, NOT the package missing
from PyPI. Upgrade with `brew install python@3.12` /
`apt install python3.12` / `pyenv install 3.12` first.
* **`pipx` recommended over `pip`.** `pipx install` puts
`molecule-mcp` on PATH automatically and isolates the runtime's
deps from your system Python. Plain `pip install --user` works
but the binary lands in `~/.local/bin` (Linux) or
`~/Library/Python/3.X/bin` (macOS) which is often not on PATH on
a fresh shell — `claude mcp add molecule -- molecule-mcp` then
fails with "command not found" at first use.
### Install
```sh
# Recommended:
pipx install molecule-ai-workspace-runtime
# Alternative (manage PATH yourself):
pip install --user molecule-ai-workspace-runtime
```
### Run
```sh
pip install molecule-ai-workspace-runtime
WORKSPACE_ID=<uuid> \\
PLATFORM_URL=https://<tenant>.staging.moleculesai.app \\
MOLECULE_WORKSPACE_TOKEN=<bearer> \\
@@ -303,10 +336,66 @@ runtimes already get via the workspace's auto-spawned MCP. Register
the binary in your agent's MCP config (e.g. Claude Code's
`claude mcp add molecule -- molecule-mcp` with the env above).
### Keeping the token out of shell history
Inline `MOLECULE_WORKSPACE_TOKEN=<bearer>` ends up in `~/.zsh_history`
and (when registered via `claude mcp add`) plaintext in
`~/.claude.json`. To avoid that, write the token to a 0600 file and
point `MOLECULE_WORKSPACE_TOKEN_FILE` at it:
```sh
umask 077
printf '%s' "<bearer>" > ~/.config/molecule/token
WORKSPACE_ID=<uuid> \\
PLATFORM_URL=https://<tenant>.staging.moleculesai.app \\
MOLECULE_WORKSPACE_TOKEN_FILE=$HOME/.config/molecule/token \\
molecule-mcp
```
Token resolution order: `MOLECULE_WORKSPACE_TOKEN` (inline env) →
`MOLECULE_WORKSPACE_TOKEN_FILE` (path) → `${CONFIGS_DIR}/.auth_token`
(in-container default).
The token comes from the canvas → Tokens tab. Restarting an external
workspace from the canvas no longer revokes the token (PR #2412), so
operator tokens persist across status nudges.
### Push vs poll delivery (Claude Code specifics)
By default the inbox runs in **poll mode** — every turn the agent
calls `wait_for_message`, which blocks up to ~60s on
`/activity?since_id=…`. Real-time push delivery is also supported,
but on Claude Code it requires THREE conditions, ALL of which must
hold:
1. **The MCP server declares `experimental.claude/channel`** — this
wheel does (see `_build_initialize_result`). Nothing for you to
do.
2. **Claude Code installs the server as a marketplace plugin** — a
plain `claude mcp add molecule -- molecule-mcp` produces a
non-plugin-sourced server, which Claude Code rejects with
`channel_enable requires a marketplace plugin`. Until the
official `moleculesai/claude-code-plugin` marketplace lands
(tracking [#2936](https://github.com/Molecule-AI/molecule-core/issues/2936)),
operators who want push must scaffold their own local marketplace
under
`~/.claude/marketplaces/molecule-local/` containing a
`marketplace.json` + `plugin.json` that points at this wheel.
3. **Claude Code is launched with the dev-channels flag** — pass
`--dangerously-load-development-channels plugin:molecule@<marketplace>`
on the `claude` invocation. Without this flag the channel
capability is silently ignored.
Symptom of any condition failing: messages arrive but only via the
poll path (every ~160s), not real-time. There's currently no
diagnostic surfaced — `molecule-mcp doctor` (tracking
[#2937](https://github.com/Molecule-AI/molecule-core/issues/2937)) is
planned.
If you don't need real-time push, the default poll path works
universally with no extra setup; both modes converge on the same
`inbox_pop` ack so messages never duplicate.
See [`docs/workspace-runtime-package.md`](https://github.com/Molecule-AI/molecule-core/blob/main/docs/workspace-runtime-package.md)
for the publish flow and architecture.
"""
+216
View File
@@ -0,0 +1,216 @@
#!/usr/bin/env bash
# scripts/check-stale-promote-pr.sh
#
# Scan open auto-promote PRs (base=main head=staging) for the
# silent-block failure mode that motivated issue #2975:
# - PR sat for hours with mergeStateStatus=BLOCKED
# - reviewDecision=REVIEW_REQUIRED (auto-merge armed but waiting
# on a human approval that never comes)
#
# When found, emit:
# - GitHub Actions notice/warning lines (workflow summary surface)
# - Optionally post a comment on the PR (--comment)
#
# Exit code is the count of stale PRs found, capped at 125 so callers
# can detect "alarm fired" via `if ! check-stale-promote-pr.sh; then …`.
# Exit 0 = clean, exit ≥1 = at least N stale PRs need attention.
#
# Used by .github/workflows/auto-promote-stale-alarm.yml. Logic lives
# here (not inline in the workflow YAML) so we can:
# - Unit-test it with a stubbed `gh` (see test-check-stale-promote-pr.sh)
# - Run it ad-hoc by an operator: `scripts/check-stale-promote-pr.sh`
# - Reuse the same surface in any sibling workflow that needs the same
# check (SSOT — one detector, many callers).
#
# Requires: `gh` CLI, `jq`. `GH_TOKEN` env in the workflow context.
set -euo pipefail
# -----------------------------------------------------------------------------
# Inputs
# -----------------------------------------------------------------------------
# Threshold beyond which a BLOCKED+REVIEW_REQUIRED promote PR is "stale"
# enough to alarm. 4 hours is the floor: most legitimate gates clear
# inside an hour, so 4× headroom is plenty for slow CI without false-
# alarming. Override via env for tests + edge ops.
STALE_HOURS="${STALE_HOURS:-4}"
# Repo defaults to the current `gh` context. Tests pass --repo explicitly.
REPO="${GITHUB_REPOSITORY:-}"
# Whether to post a comment to the PR. Off by default to avoid noise on
# manual ad-hoc runs; the cron workflow turns it on.
POST_COMMENT="${POST_COMMENT:-false}"
# Where to read the open-PR JSON from. Empty = call `gh` live. Tests
# point this at a fixture file.
PR_FIXTURE="${PR_FIXTURE:-}"
# Where to read "now" from. Empty = real clock. Tests freeze time so
# the staleness math is deterministic.
NOW_OVERRIDE="${NOW_OVERRIDE:-}"
while [ $# -gt 0 ]; do
case "$1" in
--repo) REPO="$2"; shift 2 ;;
--comment) POST_COMMENT="true"; shift ;;
--no-comment) POST_COMMENT="false"; shift ;;
--fixture) PR_FIXTURE="$2"; shift 2 ;;
--stale-hours) STALE_HOURS="$2"; shift 2 ;;
-h|--help)
sed -n '1,/^set /p' "$0" | grep '^# ' | sed 's/^# //'
exit 0
;;
*) echo "unknown arg: $1" >&2; exit 64 ;;
esac
done
if [ -z "$REPO" ] && [ -z "$PR_FIXTURE" ]; then
echo "::error::REPO env (or GITHUB_REPOSITORY) required when no fixture given" >&2
exit 2
fi
# -----------------------------------------------------------------------------
# Clock helpers — split out so tests can freeze time
# -----------------------------------------------------------------------------
now_epoch() {
if [ -n "$NOW_OVERRIDE" ]; then
printf '%s\n' "$NOW_OVERRIDE"
else
date -u +%s
fi
}
# Parse RFC3339 timestamps the way GitHub emits them (e.g.
# "2026-05-05T23:15:00Z"). gnu-date uses -d, bsd-date uses -j -f. Cover
# both because the workflow runs on ubuntu-latest (gnu) but operators
# may run this script on macOS (bsd).
to_epoch() {
local ts="$1"
# gnu-date path first.
if date -u -d "$ts" +%s 2>/dev/null; then
return 0
fi
# bsd-date fallback — strip optional fractional seconds before %S.
local ts_clean="${ts%%.*}"
ts_clean="${ts_clean%Z}Z"
date -u -j -f "%Y-%m-%dT%H:%M:%SZ" "$ts_clean" +%s 2>/dev/null || {
echo "::error::cannot parse timestamp: $ts" >&2
return 1
}
}
# -----------------------------------------------------------------------------
# Fetch open auto-promote PRs
# -----------------------------------------------------------------------------
fetch_prs() {
if [ -n "$PR_FIXTURE" ]; then
cat "$PR_FIXTURE"
return 0
fi
gh pr list --repo "$REPO" \
--base main --head staging --state open \
--json number,title,createdAt,mergeStateStatus,reviewDecision,url
}
# -----------------------------------------------------------------------------
# Stale detection
# -----------------------------------------------------------------------------
# Read PR list from stdin, emit one TSV line per stale PR:
# <num>\t<age_hours>\t<url>\t<title>
# Caller decides what to do (warn, comment, escalate).
detect_stale() {
local now_ts
now_ts="$(now_epoch)"
local stale_seconds=$((STALE_HOURS * 3600))
jq -r '.[] | [.number, .createdAt, .mergeStateStatus, .reviewDecision, .url, .title] | @tsv' \
| while IFS=$'\t' read -r num created_at merge_state review_decision url title; do
# Only alarm on the specific failure mode: BLOCKED + REVIEW_REQUIRED.
# Other BLOCKED reasons (DIRTY, BEHIND, failed checks) are the
# author's signal-to-fix; this script targets the silent
# "no human reviewed yet" wedge specifically.
[ "$merge_state" = "BLOCKED" ] || continue
[ "$review_decision" = "REVIEW_REQUIRED" ] || continue
local created_ts
created_ts="$(to_epoch "$created_at")" || continue
local age=$((now_ts - created_ts))
if [ "$age" -ge "$stale_seconds" ]; then
local age_h=$((age / 3600))
printf '%s\t%d\t%s\t%s\n' "$num" "$age_h" "$url" "$title"
fi
done
}
# -----------------------------------------------------------------------------
# Reporting
# -----------------------------------------------------------------------------
# Comment body — kept short; the issue body has the full design.
comment_body() {
local age_h="$1"
cat <<EOF
⚠️ This auto-promote PR has been BLOCKED on \`REVIEW_REQUIRED\` for **${age_h}h**.
Auto-merge is armed, but main's branch protection requires 1 review and no human has approved. Until someone reviews, the staging→main promote chain is wedged and downstream consumers (canvas builds, tenant redeploys) won't see new code.
**Action**: a human reviewer on \`@Molecule-AI/maintainers\` should approve this PR (or mark it as not ready and close).
Detected by \`scripts/check-stale-promote-pr.sh\` per issue #2975.
EOF
}
post_comment() {
local pr_num="$1"
local age_h="$2"
if [ "$POST_COMMENT" != "true" ]; then
return 0
fi
# Idempotency: only one alarm comment per PR. Look for the marker
# string in existing comments before posting a new one.
local existing
existing="$(gh pr view "$pr_num" --repo "$REPO" --json comments \
--jq '.comments[] | select(.body | test("scripts/check-stale-promote-pr.sh per issue #2975")) | .databaseId' \
| head -n1)"
if [ -n "$existing" ]; then
echo "::notice::PR #$pr_num already has a stale-alarm comment ($existing) — not re-posting"
return 0
fi
comment_body "$age_h" | gh pr comment "$pr_num" --repo "$REPO" --body-file -
echo "::notice::Posted stale-alarm comment on PR #$pr_num (age=${age_h}h)"
}
# -----------------------------------------------------------------------------
# Main
# -----------------------------------------------------------------------------
stale_count=0
while IFS=$'\t' read -r num age_h url title; do
[ -n "$num" ] || continue
stale_count=$((stale_count + 1))
echo "::warning title=Stale auto-promote PR::PR #$num — BLOCKED on REVIEW_REQUIRED for ${age_h}h. $url"
{
echo "## ⚠️ Stale auto-promote PR detected"
echo
echo "- PR: #$num — \`$title\`"
echo "- Age: ${age_h}h"
echo "- State: BLOCKED on REVIEW_REQUIRED"
echo "- URL: $url"
echo
echo "Auto-merge is armed but waiting on a human review. See issue #2975."
} >> "${GITHUB_STEP_SUMMARY:-/dev/null}"
post_comment "$num" "$age_h"
done < <(fetch_prs | detect_stale)
if [ "$stale_count" -eq 0 ]; then
echo "::notice::No stale auto-promote PRs detected (threshold: ${STALE_HOURS}h)"
fi
# Cap exit code so we don't accidentally break shells that interpret
# >125 as signal-style. 1..N maps to "1..N stale PRs".
exit $(( stale_count > 125 ? 125 : stale_count ))
+257
View File
@@ -0,0 +1,257 @@
#!/usr/bin/env bash
# scripts/test-check-stale-promote-pr.sh
#
# Exhaustive bash unit tests for check-stale-promote-pr.sh.
# Goal: 100% branch coverage on the detector logic.
#
# Each case writes a fixture JSON, freezes the clock with NOW_OVERRIDE,
# runs the script with --fixture + --no-comment (so we don't try to
# actually call `gh pr comment`), and asserts on stdout/exit code.
#
# Run: bash scripts/test-check-stale-promote-pr.sh
# Expected: "All N tests passed" + exit 0.
set -euo pipefail
SCRIPT="$(cd "$(dirname "$0")" && pwd)/check-stale-promote-pr.sh"
TMP="$(mktemp -d)"
trap 'rm -rf "$TMP"' EXIT
PASS=0
FAIL=0
# ─────────────────────────────────────────────────────────────────────────────
# Helpers
# ─────────────────────────────────────────────────────────────────────────────
# Frozen "now" — 2026-05-06T05:00:00Z. Compute dynamically so the
# tests stay correct regardless of platform-specific date semantics
# (gnu vs bsd) and any author math errors on the epoch.
if FROZEN_NOW="$(date -u -d '2026-05-06T05:00:00Z' +%s 2>/dev/null)"; then
: # gnu-date worked
elif FROZEN_NOW="$(date -u -j -f '%Y-%m-%dT%H:%M:%SZ' '2026-05-06T05:00:00Z' +%s 2>/dev/null)"; then
: # bsd-date worked
else
echo "FATAL: cannot compute FROZEN_NOW on this platform" >&2
exit 1
fi
run_script() {
# Args: <fixture-file>
# Returns stdout + exit code via a known marker.
local fixture="$1"
shift
set +e
NOW_OVERRIDE="$FROZEN_NOW" \
POST_COMMENT="false" \
bash "$SCRIPT" --fixture "$fixture" "$@" 2>&1
local rc=$?
set -e
echo "EXIT_CODE=$rc"
}
assert_pass() {
local name="$1"
local got="$2"
local want_pattern="$3"
if printf '%s' "$got" | grep -qE "$want_pattern"; then
PASS=$((PASS + 1))
printf ' ✓ %s\n' "$name"
else
FAIL=$((FAIL + 1))
printf ' ✗ %s\n want pattern: %s\n got:\n%s\n' "$name" "$want_pattern" "$got"
fi
}
assert_no_match() {
local name="$1"
local got="$2"
local bad_pattern="$3"
if printf '%s' "$got" | grep -qE "$bad_pattern"; then
FAIL=$((FAIL + 1))
printf ' ✗ %s\n bad pattern matched: %s\n got:\n%s\n' "$name" "$bad_pattern" "$got"
else
PASS=$((PASS + 1))
printf ' ✓ %s\n' "$name"
fi
}
# ─────────────────────────────────────────────────────────────────────────────
# Test cases
# ─────────────────────────────────────────────────────────────────────────────
echo "1. Empty PR list — clean exit"
echo '[]' > "$TMP/empty.json"
got=$(run_script "$TMP/empty.json")
assert_pass "empty-no-warning" "$got" "No stale auto-promote PRs detected"
assert_pass "empty-exit-zero" "$got" "EXIT_CODE=0"
echo
echo "2. Single PR, BLOCKED+REVIEW_REQUIRED, 5h old — fires alarm"
cat > "$TMP/stale1.json" <<EOF
[{
"number": 2963,
"title": "staging → main",
"createdAt": "2026-05-06T00:00:00Z",
"mergeStateStatus": "BLOCKED",
"reviewDecision": "REVIEW_REQUIRED",
"url": "https://github.com/test/test/pull/2963"
}]
EOF
got=$(run_script "$TMP/stale1.json")
assert_pass "stale1-warning" "$got" "Stale auto-promote PR"
assert_pass "stale1-pr-number" "$got" "PR #2963"
assert_pass "stale1-age" "$got" "for 5h"
assert_pass "stale1-exit-1" "$got" "EXIT_CODE=1"
echo
echo "3. Same PR but only 3h old — under threshold, NO alarm"
cat > "$TMP/young.json" <<EOF
[{
"number": 100,
"title": "fresh promote",
"createdAt": "2026-05-06T02:00:00Z",
"mergeStateStatus": "BLOCKED",
"reviewDecision": "REVIEW_REQUIRED",
"url": "https://github.com/test/test/pull/100"
}]
EOF
got=$(run_script "$TMP/young.json")
assert_pass "young-no-alarm" "$got" "No stale auto-promote PRs"
assert_pass "young-exit-zero" "$got" "EXIT_CODE=0"
assert_no_match "young-no-warning" "$got" "Stale auto-promote PR"
echo
echo "4. PR is BLOCKED but for the wrong reason (DIRTY, not REVIEW_REQUIRED)"
cat > "$TMP/dirty.json" <<EOF
[{
"number": 200,
"title": "needs rebase",
"createdAt": "2026-05-06T00:00:00Z",
"mergeStateStatus": "BLOCKED",
"reviewDecision": "APPROVED",
"url": "https://github.com/test/test/pull/200"
}]
EOF
got=$(run_script "$TMP/dirty.json")
assert_pass "dirty-no-alarm" "$got" "No stale auto-promote PRs"
assert_pass "dirty-exit-zero" "$got" "EXIT_CODE=0"
echo
echo "5. PR is APPROVED but mergeStateStatus is CLEAN — NOT alarming"
cat > "$TMP/clean.json" <<EOF
[{
"number": 300,
"title": "all green",
"createdAt": "2026-05-06T00:00:00Z",
"mergeStateStatus": "CLEAN",
"reviewDecision": "APPROVED",
"url": "https://github.com/test/test/pull/300"
}]
EOF
got=$(run_script "$TMP/clean.json")
assert_pass "clean-no-alarm" "$got" "No stale auto-promote PRs"
echo
echo "6. Multiple PRs — only the BLOCKED+REVIEW_REQUIRED+old one alarms"
cat > "$TMP/mixed.json" <<EOF
[
{
"number": 100,
"title": "fresh",
"createdAt": "2026-05-06T04:00:00Z",
"mergeStateStatus": "BLOCKED",
"reviewDecision": "REVIEW_REQUIRED",
"url": "https://x/100"
},
{
"number": 200,
"title": "stale + alarming",
"createdAt": "2026-05-05T20:00:00Z",
"mergeStateStatus": "BLOCKED",
"reviewDecision": "REVIEW_REQUIRED",
"url": "https://x/200"
},
{
"number": 300,
"title": "approved + clean",
"createdAt": "2026-05-05T20:00:00Z",
"mergeStateStatus": "CLEAN",
"reviewDecision": "APPROVED",
"url": "https://x/300"
}
]
EOF
got=$(run_script "$TMP/mixed.json")
assert_pass "mixed-only-200" "$got" "PR #200"
assert_no_match "mixed-not-100" "$got" "PR #100"
assert_no_match "mixed-not-300" "$got" "PR #300"
assert_pass "mixed-exit-1" "$got" "EXIT_CODE=1"
echo
echo "7. Custom STALE_HOURS via --stale-hours overrides threshold"
got=$(run_script "$TMP/young.json" --stale-hours 1)
assert_pass "custom-threshold-fires" "$got" "PR #100"
assert_pass "custom-threshold-exit-1" "$got" "EXIT_CODE=1"
echo
echo "8. Two stale PRs — exit code reflects count"
cat > "$TMP/two-stale.json" <<EOF
[
{
"number": 200,
"title": "stale-A",
"createdAt": "2026-05-05T20:00:00Z",
"mergeStateStatus": "BLOCKED",
"reviewDecision": "REVIEW_REQUIRED",
"url": "https://x/200"
},
{
"number": 201,
"title": "stale-B",
"createdAt": "2026-05-05T19:00:00Z",
"mergeStateStatus": "BLOCKED",
"reviewDecision": "REVIEW_REQUIRED",
"url": "https://x/201"
}
]
EOF
got=$(run_script "$TMP/two-stale.json")
assert_pass "two-stale-exit-2" "$got" "EXIT_CODE=2"
echo
echo "9. Help text is shown for --help"
set +e
help_out=$(bash "$SCRIPT" --help 2>&1)
help_rc=$?
set -e
assert_pass "help-exits-zero" "EXIT_CODE=$help_rc" "EXIT_CODE=0"
assert_pass "help-mentions-issue" "$help_out" "issue #2975"
echo
echo "10. Unknown arg exits 64 (EX_USAGE)"
set +e
bad_out=$(bash "$SCRIPT" --bogus 2>&1)
bad_rc=$?
set -e
assert_pass "unknown-arg-rc" "EXIT_CODE=$bad_rc" "EXIT_CODE=64"
echo
echo "11. Missing repo + missing fixture exits 2"
set +e
out=$(REPO="" bash "$SCRIPT" 2>&1)
rc=$?
set -e
assert_pass "no-repo-exit-2" "EXIT_CODE=$rc" "EXIT_CODE=2"
# ─────────────────────────────────────────────────────────────────────────────
# Summary
# ─────────────────────────────────────────────────────────────────────────────
echo
echo "─────────────────────────────────────────────"
echo "Tests: $PASS passed, $FAIL failed"
if [ "$FAIL" -gt 0 ]; then
exit 1
fi
echo "All tests passed."
+295
View File
@@ -0,0 +1,295 @@
#!/usr/bin/env bash
# E2E for poll-mode chat upload (RFC #2891 phases 1-5b).
#
# Round-trip: register a workspace as poll-mode (no callback URL) → POST a
# multi-file chat upload → verify each file becomes (a) one
# `chat_upload_receive` activity row and (b) one /pending-uploads row → fetch
# the bytes back via the poll endpoint → ack → verify the row 404s on
# subsequent fetch. Also pins cross-workspace bleed protection: workspace B
# cannot read workspace A's pending uploads even with its own valid bearer.
#
# Why this exists separately from test_chat_upload_e2e.sh: that script
# covers the PUSH path (the workspace's own /internal/chat/uploads/ingest).
# This script covers the POLL path: the same canvas-side request lands on
# the platform's pendinguploads.Storage instead, and the workspace fetches
# it later. The two paths share zero handler code on the platform side, so
# both need their own E2E.
#
# Requires: platform running on localhost:8080 with migrations applied.
# bash workspace-server/scripts/dev-start.sh
# bash workspace-server/scripts/run-migrations.sh
#
# Idempotent: each run uses fresh per-script workspace UUIDs so reruns
# don't collide. Best-effort cleanup on EXIT — does NOT call
# e2e_cleanup_all_workspaces (see
# `feedback_never_run_cluster_cleanup_tests_on_live_platform.md`).
set -euo pipefail
source "$(dirname "$0")/_lib.sh"
PASS=0
FAIL=0
TIMEOUT="${A2A_TIMEOUT:-30}"
gen_uuid() {
if command -v uuidgen >/dev/null 2>&1; then
uuidgen | tr '[:upper:]' '[:lower:]'
else
python3 -c 'import uuid; print(uuid.uuid4())'
fi
}
WS_A="$(gen_uuid)"
WS_B="$(gen_uuid)"
# Per-run scratch dir collected under one trap so every assertion-failure
# path drops the temp files it made (see test_chat_attachments_e2e.sh).
TMPDIR_E2E=$(mktemp -d -t poll-chat-upload-e2e-XXXXXX)
cleanup() {
local rc=$?
curl -s -X DELETE "$BASE/workspaces/$WS_A?confirm=true" >/dev/null 2>&1 || true
curl -s -X DELETE "$BASE/workspaces/$WS_B?confirm=true" >/dev/null 2>&1 || true
rm -rf "$TMPDIR_E2E"
exit $rc
}
trap cleanup EXIT INT TERM
check() {
local desc="$1" expected="$2" actual="$3"
if echo "$actual" | grep -qF -- "$expected"; then
echo "PASS: $desc"
PASS=$((PASS + 1))
else
echo "FAIL: $desc"
echo " expected to contain: $expected"
echo " got: $(echo "$actual" | head -10)"
FAIL=$((FAIL + 1))
fi
}
check_eq() {
local desc="$1" expected="$2" actual="$3"
if [ "$actual" = "$expected" ]; then
echo "PASS: $desc"
PASS=$((PASS + 1))
else
echo "FAIL: $desc"
echo " expected: $expected"
echo " got: $actual"
FAIL=$((FAIL + 1))
fi
}
echo "=== Poll-Mode Chat Upload E2E ==="
echo " base: $BASE"
echo " workspace A: $WS_A"
echo " workspace B: $WS_B"
echo ""
# ---------- Phase 1: register poll-mode workspace ----------
echo "--- Phase 1: Register poll-mode workspace A ---"
REG_A=$(curl -s -X POST "$BASE/registry/register" \
-H "Content-Type: application/json" \
-d "{
\"id\": \"$WS_A\",
\"delivery_mode\": \"poll\",
\"agent_card\": {\"name\": \"poll-chat-upload-test-a\"}
}")
check "register accepts poll mode without URL" '"status":"registered"' "$REG_A"
TOK_A=$(echo "$REG_A" | e2e_extract_token || true)
[ -n "$TOK_A" ] || { echo "FAIL: no auth_token in register response (ws A)"; FAIL=$((FAIL + 1)); exit 1; }
# ---------- Phase 2: multi-file chat upload ----------
echo ""
echo "--- Phase 2: POST /chat/uploads with two files ---"
FILE1="$TMPDIR_E2E/alpha.txt"
FILE2="$TMPDIR_E2E/beta.txt"
EXPECTED1="alpha-secret-$(openssl rand -hex 4)"
EXPECTED2="beta-secret-$(openssl rand -hex 4)"
printf '%s' "$EXPECTED1" > "$FILE1"
printf '%s' "$EXPECTED2" > "$FILE2"
UPLOAD=$(curl -s -X POST "$BASE/workspaces/$WS_A/chat/uploads" \
-H "Authorization: Bearer $TOK_A" \
-F "files=@$FILE1;filename=alpha.txt;type=text/plain" \
-F "files=@$FILE2;filename=beta.txt;type=text/plain" \
-w "\nHTTP_CODE=%{http_code}\n")
UPLOAD_CODE=$(echo "$UPLOAD" | grep -oE 'HTTP_CODE=[0-9]+' | cut -d= -f2)
UPLOAD_BODY=$(echo "$UPLOAD" | sed '/^HTTP_CODE=/,$d')
check_eq "upload returns 200" "200" "$UPLOAD_CODE"
check "upload response has files array" '"files":' "$UPLOAD_BODY"
# Pull file_ids out of the URI in the response. URI shape is
# `platform-pending:<wsid>/<file_id>` — proves the response came from the
# poll-mode branch, not the push-mode internal-ingest branch.
URI1=$(echo "$UPLOAD_BODY" | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["files"][0]["uri"])')
URI2=$(echo "$UPLOAD_BODY" | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["files"][1]["uri"])')
check "URI 1 has platform-pending: scheme" "platform-pending:$WS_A/" "$URI1"
check "URI 2 has platform-pending: scheme" "platform-pending:$WS_A/" "$URI2"
FID1="${URI1##*/}"
FID2="${URI2##*/}"
[ -n "$FID1" ] && [ -n "$FID2" ] || { echo "FAIL: could not extract file IDs"; FAIL=$((FAIL + 1)); exit 1; }
echo " file_id 1: $FID1"
echo " file_id 2: $FID2"
# ---------- Phase 3: activity rows visible to the workspace ----------
echo ""
echo "--- Phase 3: /activity shows two chat_upload_receive rows ---"
# activity_logs INSERTs run in a goroutine — give them a moment.
sleep 1
ACT=$(curl -s --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/activity?type=a2a_receive&limit=20")
check "activity feed has the alpha file" "$FID1" "$ACT"
check "activity feed has the beta file" "$FID2" "$ACT"
check "activity rows tagged chat_upload_receive" '"method":"chat_upload_receive"' "$ACT"
check "activity rows record alpha mimetype" '"mimeType":"text/plain"' "$ACT"
CHAT_UPLOAD_COUNT=$(echo "$ACT" | python3 -c '
import json, sys
rows = json.load(sys.stdin)
n = sum(1 for r in rows if (r.get("method") or "") == "chat_upload_receive")
print(n)
')
check_eq "exactly two chat_upload_receive rows" "2" "$CHAT_UPLOAD_COUNT"
# ---------- Phase 4: GET /pending-uploads/:file_id/content ----------
echo ""
echo "--- Phase 4: Fetch content for each pending upload ---"
GOT1=$(curl -s --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/content")
check_eq "alpha bytes round-trip" "$EXPECTED1" "$GOT1"
GOT2=$(curl -s --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID2/content")
check_eq "beta bytes round-trip" "$EXPECTED2" "$GOT2"
# Mimetype + Content-Disposition headers should match what was uploaded.
HEAD1=$(curl -s -D - -o /dev/null --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/content")
check "alpha response carries text/plain Content-Type" "Content-Type: text/plain" "$HEAD1"
check "alpha response carries Content-Disposition with filename" 'filename="alpha.txt"' "$HEAD1"
# ---------- Phase 5: idempotent re-fetch (until ack) ----------
echo ""
echo "--- Phase 5: Re-fetch before ack returns the same bytes ---"
RE_GOT1=$(curl -s --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/content")
check_eq "re-fetch returns same alpha bytes" "$EXPECTED1" "$RE_GOT1"
# ---------- Phase 6: ack each row ----------
echo ""
echo "--- Phase 6: Ack each pending upload ---"
ACK1=$(curl -s -X POST --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/ack")
check "alpha ack returns acked:true" '"acked":true' "$ACK1"
ACK2=$(curl -s -X POST --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID2/ack")
check "beta ack returns acked:true" '"acked":true' "$ACK2"
# Re-ack should still 200 (idempotent — the row's gone but the workspace's
# at-least-once intent was already honored, and the second ack hits the
# raced path which also returns 200).
RE_ACK1=$(curl -s -w '\n%{http_code}' -X POST --max-time "$TIMEOUT" \
-H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/ack")
RE_ACK1_CODE=$(printf '%s' "$RE_ACK1" | tail -n1)
# Acked rows return 404 on Get-before-Ack (the row's still in the table
# but Get filters acked_at IS NULL); workspace would not normally re-ack
# since it already saw the success. Accept both 200 and 404 here so the
# test pins the contract without being brittle on the inner ordering.
case "$RE_ACK1_CODE" in
200|404)
echo "PASS: re-ack returns 200 or 404 ($RE_ACK1_CODE)"
PASS=$((PASS + 1))
;;
*)
echo "FAIL: re-ack returned unexpected $RE_ACK1_CODE"
FAIL=$((FAIL + 1))
;;
esac
# ---------- Phase 7: GET content after ack returns 404 ----------
echo ""
echo "--- Phase 7: Acked file 404s on subsequent fetch ---"
POST_ACK=$(curl -s -w '\n%{http_code}' --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/content")
POST_ACK_CODE=$(printf '%s' "$POST_ACK" | tail -n1)
check_eq "acked alpha returns HTTP 404" "404" "$POST_ACK_CODE"
# ---------- Phase 8: cross-workspace bleed protection ----------
echo ""
echo "--- Phase 8: Workspace B cannot read workspace A's pending uploads ---"
# Stage a fresh upload on workspace A so we have an UN-acked row to probe.
PROBE_FILE="$TMPDIR_E2E/probe.txt"
printf '%s' "probe-bytes-$(openssl rand -hex 4)" > "$PROBE_FILE"
PROBE_UP=$(curl -s -X POST "$BASE/workspaces/$WS_A/chat/uploads" \
-H "Authorization: Bearer $TOK_A" \
-F "files=@$PROBE_FILE;filename=probe.txt;type=text/plain")
PROBE_FID=$(echo "$PROBE_UP" | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["files"][0]["uri"].split("/")[-1])')
[ -n "$PROBE_FID" ] || { echo "FAIL: probe upload returned no file_id"; FAIL=$((FAIL + 1)); exit 1; }
# Register a SECOND poll-mode workspace and capture its bearer.
REG_B=$(curl -s -X POST "$BASE/registry/register" \
-H "Content-Type: application/json" \
-d "{
\"id\": \"$WS_B\",
\"delivery_mode\": \"poll\",
\"agent_card\": {\"name\": \"poll-chat-upload-test-b\"}
}")
check "second workspace registers" '"status":"registered"' "$REG_B"
TOK_B=$(echo "$REG_B" | e2e_extract_token || true)
[ -n "$TOK_B" ] || { echo "FAIL: no auth_token (ws B)"; FAIL=$((FAIL + 1)); exit 1; }
# B's bearer hitting B's URL with A's file_id → 404 (handler checks the row's
# workspace_id matches the URL :id, not the bearer's workspace).
CROSS_RESP=$(curl -s -w '\n%{http_code}' --max-time "$TIMEOUT" \
-H "Authorization: Bearer $TOK_B" \
"$BASE/workspaces/$WS_B/pending-uploads/$PROBE_FID/content")
CROSS_CODE=$(printf '%s' "$CROSS_RESP" | tail -n1)
check_eq "B's URL with A's file_id returns 404" "404" "$CROSS_CODE"
# B's bearer hitting A's URL → 401 (wsAuth pins bearer to :id). This is the
# strictest cross-workspace check: a presented-but-wrong bearer is rejected
# in EVERY platform posture (dev-mode fail-open only triggers when no bearer
# is presented at all — invalid tokens always 401).
WRONG_BEARER=$(curl -s -w '\n%{http_code}' --max-time "$TIMEOUT" \
-H "Authorization: Bearer $TOK_B" \
"$BASE/workspaces/$WS_A/pending-uploads/$PROBE_FID/content")
WRONG_CODE=$(printf '%s' "$WRONG_BEARER" | tail -n1)
check_eq "B's bearer on A's URL returns 401" "401" "$WRONG_CODE"
# NB: a fully bearerless request to /pending-uploads/:fid/content returns
# 401 ONLY when the platform has MOLECULE_ENV != development (production /
# staging). On local-dev with MOLECULE_ENV=development the wsauth middleware
# fail-opens for bearerless requests so the canvas at :3000 can talk to the
# platform at :8080 without per-call token plumbing — see middleware/
# devmode.go. The strict bearerless-401 contract is covered by the wsauth
# unit + middleware tests; we don't reassert it here because the result
# depends on platform posture, not the poll-mode upload contract.
# ---------- Phase 9: invalid file_id rejected at the URL parser ----------
echo ""
echo "--- Phase 9: Invalid file_id returns 400 ---"
BAD_FID=$(curl -s -w '\n%{http_code}' --max-time "$TIMEOUT" \
-H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/not-a-uuid/content")
BAD_FID_CODE=$(printf '%s' "$BAD_FID" | tail -n1)
check_eq "invalid file_id UUID returns 400" "400" "$BAD_FID_CODE"
# ---------- Results ----------
echo ""
echo "=== Results: $PASS passed, $FAIL failed ==="
[ "$FAIL" -eq 0 ]
+37
View File
@@ -157,6 +157,43 @@ A2A_RESP=$(curl -s --max-time "$TIMEOUT" -X POST "$BASE/workspaces/$POLL_WS_ID/a
}')
check "poll-mode A2A returns queued status" '"status":"queued"' "$A2A_RESP"
# ---------- Phase 3.5: Python parser classifies queued envelope correctly ----------
# (#2967) — server emits the queued envelope, the wheel's a2a_response.parse()
# MUST classify it as the Queued variant, not Malformed. Pre-#2967 the bare
# message/send parser in a2a_client.py:587 misclassified this and returned
# "[A2A_ERROR] unexpected response shape", which broke external↔external A2A
# on poll-mode peers.
#
# This phase exercises the actual on-the-wire response from a real
# workspace-server (NOT a mocked dict) through the same module the production
# wheel ships, so a regression in either the server emit shape OR the client
# parser fails this E2E.
echo ""
echo "--- Phase 3.5: Python parser classifies real server response (#2967) ---"
# Pipe the queued response captured above through a2a_response.parse and
# assert the classification. WORKSPACE_ID is required at module import
# time but irrelevant to this parsing call (any UUID is fine).
PARSE_RESULT=$(WORKSPACE_ID="00000000-0000-0000-0000-000000000001" \
python3 -c "
import json, sys
sys.path.insert(0, '$(cd "$(dirname "$0")/../../workspace" && pwd)')
import a2a_response
data = json.loads(r'''$A2A_RESP''')
v = a2a_response.parse(data)
print(type(v).__name__)
if isinstance(v, a2a_response.Queued):
print(f'method={v.method} delivery_mode={v.delivery_mode}')
")
check_eq "Python parser classifies real server response as Queued" \
"Queued" "$(printf '%s' "$PARSE_RESULT" | head -n1)"
check "Queued variant captures method=message/send" \
"method=message/send" "$PARSE_RESULT"
check "Queued variant captures delivery_mode=poll" \
"delivery_mode=poll" "$PARSE_RESULT"
check "queued response echoes delivery_mode=poll" '"delivery_mode":"poll"' "$A2A_RESP"
check "queued response echoes the JSON-RPC method" '"method":"message/send"' "$A2A_RESP"
+14
View File
@@ -94,6 +94,13 @@ services:
CP_UPSTREAM_URL: "http://cp-stub:9090"
RATE_LIMIT: "1000"
CANVAS_PROXY_URL: "http://localhost:3000"
# Memory v2 sidecar (PR #2906) bundles the plugin into the
# tenant image and starts it before the main server. The plugin
# runs `CREATE EXTENSION vector` on first boot, which fails on
# the harness's plain postgres:15-alpine (no pgvector). The
# harness doesn't exercise memory features, so disable the
# sidecar via the entrypoint's documented escape hatch.
MEMORY_PLUGIN_DISABLE: "1"
networks: [harness-net]
healthcheck:
test: ["CMD-SHELL", "wget -q -O- http://localhost:8080/health || exit 1"]
@@ -142,6 +149,13 @@ services:
CP_UPSTREAM_URL: "http://cp-stub:9090"
RATE_LIMIT: "1000"
CANVAS_PROXY_URL: "http://localhost:3000"
# Memory v2 sidecar (PR #2906) bundles the plugin into the
# tenant image and starts it before the main server. The plugin
# runs `CREATE EXTENSION vector` on first boot, which fails on
# the harness's plain postgres:15-alpine (no pgvector). The
# harness doesn't exercise memory features, so disable the
# sidecar via the entrypoint's documented escape hatch.
MEMORY_PLUGIN_DISABLE: "1"
networks: [harness-net]
healthcheck:
test: ["CMD-SHELL", "wget -q -O- http://localhost:8080/health || exit 1"]
+66 -1
View File
@@ -21,6 +21,14 @@ ARG GIT_SHA=dev
RUN CGO_ENABLED=0 GOOS=linux go build \
-ldflags "-X github.com/Molecule-AI/molecule-monorepo/platform/internal/buildinfo.GitSHA=${GIT_SHA}" \
-o /platform ./cmd/server
# Bundle the built-in memory-plugin-postgres binary so an operator can
# activate Memory v2 by setting MEMORY_V2_CUTOVER=true + (default)
# MEMORY_PLUGIN_URL=http://localhost:9100. The entrypoint starts this
# binary in the background; main /platform talks to it over loopback.
# Stays inert until the operator flips the cutover env var.
RUN CGO_ENABLED=0 GOOS=linux go build \
-ldflags "-X github.com/Molecule-AI/molecule-monorepo/platform/internal/buildinfo.GitSHA=${GIT_SHA}" \
-o /memory-plugin ./cmd/memory-plugin-postgres
# Clone templates + plugins at build time from manifest.json
FROM alpine:3.20 AS templates
@@ -30,8 +38,9 @@ COPY scripts/clone-manifest.sh /scripts/clone-manifest.sh
RUN chmod +x /scripts/clone-manifest.sh && /scripts/clone-manifest.sh /manifest.json /workspace-configs-templates /org-templates /plugins
FROM alpine:3.20
RUN apk add --no-cache ca-certificates git tzdata
RUN apk add --no-cache ca-certificates git tzdata wget
COPY --from=builder /platform /platform
COPY --from=builder /memory-plugin /memory-plugin
COPY workspace-server/migrations /migrations
COPY --from=templates /workspace-configs-templates /workspace-configs-templates
COPY --from=templates /org-templates /org-templates
@@ -41,6 +50,7 @@ RUN addgroup -g 1000 platform && adduser -u 1000 -G platform -s /bin/sh -D platf
EXPOSE 8080
COPY <<'ENTRY' /entrypoint.sh
#!/bin/sh
# Set up docker-socket group (unchanged from pre-sidecar entrypoint).
if [ -S /var/run/docker.sock ]; then
SOCK_GID=$(stat -c '%g' /var/run/docker.sock 2>/dev/null || stat -f '%g' /var/run/docker.sock 2>/dev/null)
if [ -n "$SOCK_GID" ] && [ "$SOCK_GID" != "0" ]; then
@@ -50,6 +60,61 @@ if [ -S /var/run/docker.sock ]; then
addgroup platform root 2>/dev/null || true
fi
fi
# Memory v2 sidecar (built-in postgres plugin). Co-located with the
# main server so operators flipping MEMORY_V2_CUTOVER=true don't need
# to provision a separate service.
#
# Spawn-gating: only start the sidecar when the operator has indicated
# they want it — either MEMORY_V2_CUTOVER=true OR MEMORY_PLUGIN_URL set.
# Without that signal, the sidecar adds zero value (the platform's
# wiring.go skips building the client too) but pays a real cost: the
# plugin's first migration runs `CREATE EXTENSION vector`, which fails
# on tenant Postgres without pgvector preinstalled and aborts container
# boot via the 30s health gate. Caught on staging redeploy 2026-05-05.
#
# Env defaults (when sidecar IS spawned):
# MEMORY_PLUGIN_DATABASE_URL = $DATABASE_URL (share existing Postgres;
# plugin's `memory_namespaces` / `memory_records` tables coexist
# with `agent_memories` and the rest of the platform schema —
# no conflicts. Operator can override with a separate URL.)
# MEMORY_PLUGIN_LISTEN_ADDR = 127.0.0.1:9100
#
# Set MEMORY_PLUGIN_DISABLE=1 to force-skip the sidecar even with
# cutover env set (e.g. running the plugin externally on a separate host).
memory_plugin_wanted=""
if [ "$MEMORY_V2_CUTOVER" = "true" ] || [ -n "$MEMORY_PLUGIN_URL" ]; then
memory_plugin_wanted=1
fi
if [ -z "$MEMORY_PLUGIN_DISABLE" ] && [ -n "$memory_plugin_wanted" ] && [ -n "$DATABASE_URL" ]; then
: "${MEMORY_PLUGIN_DATABASE_URL:=$DATABASE_URL}"
: "${MEMORY_PLUGIN_LISTEN_ADDR:=:9100}"
export MEMORY_PLUGIN_DATABASE_URL MEMORY_PLUGIN_LISTEN_ADDR
echo "memory-plugin: starting sidecar on $MEMORY_PLUGIN_LISTEN_ADDR" >&2
# Drop privs to the platform user — the plugin doesn't need root and
# runs unprivileged elsewhere (tenant image already starts as canvas).
su-exec platform /memory-plugin &
MEMORY_PLUGIN_PID=$!
# Wait up to 30s for the plugin's /v1/health to return 200. Boot
# failure here is fatal — better to crash-loop than to silently
# serve cutover traffic against a dead plugin.
health_port=${MEMORY_PLUGIN_LISTEN_ADDR#:}
ready=0
for _ in $(seq 1 30); do
if wget -qO- --timeout=2 "http://localhost:${health_port}/v1/health" >/dev/null 2>&1; then
ready=1
break
fi
sleep 1
done
if [ "$ready" != "1" ]; then
echo "memory-plugin: ❌ /v1/health never returned 200 after 30s — aborting boot. Check that DATABASE_URL is reachable, has the pgvector extension, and the plugin's migrations applied." >&2
kill "$MEMORY_PLUGIN_PID" 2>/dev/null || true
exit 1
fi
echo "memory-plugin: ✅ sidecar healthy on :$health_port" >&2
fi
exec su-exec platform /platform "$@"
ENTRY
RUN chmod +x /entrypoint.sh && apk add --no-cache su-exec
+10 -2
View File
@@ -34,6 +34,13 @@ ARG GIT_SHA=dev
RUN CGO_ENABLED=0 GOOS=linux go build \
-ldflags "-X github.com/Molecule-AI/molecule-monorepo/platform/internal/buildinfo.GitSHA=${GIT_SHA}" \
-o /platform ./cmd/server
# Memory v2 sidecar binary (Memory v2 #2728). Bundled so an operator
# can activate cutover by flipping MEMORY_V2_CUTOVER=true without
# provisioning a separate service. See entrypoint-tenant.sh for the
# launch logic.
RUN CGO_ENABLED=0 GOOS=linux go build \
-ldflags "-X github.com/Molecule-AI/molecule-monorepo/platform/internal/buildinfo.GitSHA=${GIT_SHA}" \
-o /memory-plugin ./cmd/memory-plugin-postgres
# ── Stage 2: Canvas Next.js standalone ────────────────────────────────
FROM node:20-alpine AS canvas-builder
@@ -74,8 +81,9 @@ RUN deluser --remove-home node 2>/dev/null || true; \
delgroup node 2>/dev/null || true; \
addgroup -g 1000 canvas && adduser -u 1000 -G canvas -s /bin/sh -D canvas
# Go platform binary
# Go platform binary + Memory v2 sidecar
COPY --from=go-builder /platform /platform
COPY --from=go-builder /memory-plugin /memory-plugin
COPY workspace-server/migrations /migrations
# Templates + plugins (cloned from GitHub in stage 3)
@@ -91,7 +99,7 @@ COPY --from=canvas-builder /canvas/public ./public
COPY workspace-server/entrypoint-tenant.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh && \
chown -R canvas:canvas /canvas /platform /migrations
chown -R canvas:canvas /canvas /platform /memory-plugin /migrations
EXPOSE 8080
# entrypoint.sh starts as root to fix volume perms, then drops to
@@ -21,6 +21,7 @@ import (
"os"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/contract"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/textutil"
)
// verifyConfig is the typed dependency bundle for verifyParity.
@@ -121,7 +122,7 @@ func verifyParity(ctx context.Context, cfg verifyConfig, stdout *os.File) (*veri
matched := true
for _, c := range legacy {
if pluginContents[c] == 0 {
fmt.Fprintf(stdout, "[mismatch] workspace=%s missing-from-plugin content=%q\n", wsID, truncate(c, 80))
fmt.Fprintf(stdout, "[mismatch] workspace=%s missing-from-plugin content=%q\n", wsID, textutil.TruncateBytes(c, 80))
matched = false
break
}
@@ -192,9 +193,4 @@ func queryLegacyMemories(ctx context.Context, db *sql.DB, workspaceID string) ([
return out, rows.Err()
}
func truncate(s string, n int) string {
if len(s) <= n {
return s
}
return s[:n] + "…"
}
// truncation moved to internal/textutil.TruncateBytes (#2962 SSOT).
@@ -349,16 +349,8 @@ func TestVerifyParity_PickSampleError(t *testing.T) {
}
}
// --- Truncate ---
func TestVerifyTruncate(t *testing.T) {
if got := truncate("short", 10); got != "short" {
t.Errorf("got %q", got)
}
if got := truncate(strings.Repeat("a", 200), 10); !strings.HasSuffix(got, "…") {
t.Errorf("expected ellipsis: %q", got)
}
}
// Truncate moved to internal/textutil — coverage in
// internal/textutil/truncate_test.go (TestTruncateBytes_RuneBoundary).
// --- CLI: -verify mode ---
@@ -0,0 +1,50 @@
package main
import (
"strings"
"testing"
)
// TestLoadConfig_DefaultListenAddrIsLoopback pins the default-bind contract.
//
// Why this matters: with the prior `:9100` default, the plugin listened on
// every interface. Inside the container it didn't matter (no host port
// mapping today), but a future change that publishes 9100 OR a cross-host
// sidecar deploy would have exposed an unauth'd memory store. Loopback by
// default is the least-privilege baseline; operators with a multi-host
// topology override via MEMORY_PLUGIN_LISTEN_ADDR.
func TestLoadConfig_DefaultListenAddrIsLoopback(t *testing.T) {
t.Setenv("MEMORY_PLUGIN_DATABASE_URL", "postgres://stub")
t.Setenv("MEMORY_PLUGIN_LISTEN_ADDR", "")
cfg, err := loadConfig()
if err != nil {
t.Fatalf("loadConfig: %v", err)
}
if !strings.HasPrefix(cfg.ListenAddr, "127.0.0.1:") {
t.Errorf("default ListenAddr must bind loopback-only, got %q "+
"(security regression — would expose plugin on every interface)",
cfg.ListenAddr)
}
}
func TestLoadConfig_ListenAddrEnvOverride(t *testing.T) {
t.Setenv("MEMORY_PLUGIN_DATABASE_URL", "postgres://stub")
t.Setenv("MEMORY_PLUGIN_LISTEN_ADDR", ":9100")
cfg, err := loadConfig()
if err != nil {
t.Fatalf("loadConfig: %v", err)
}
if cfg.ListenAddr != ":9100" {
t.Errorf("env override ignored: want :9100, got %q", cfg.ListenAddr)
}
}
func TestLoadConfig_MissingDatabaseURL(t *testing.T) {
t.Setenv("MEMORY_PLUGIN_DATABASE_URL", "")
if _, err := loadConfig(); err == nil {
t.Fatal("loadConfig must error when MEMORY_PLUGIN_DATABASE_URL is empty")
}
}
@@ -10,6 +10,7 @@ package main
import (
"context"
"database/sql"
"embed"
"errors"
"fmt"
"log"
@@ -17,6 +18,7 @@ import (
"net/http"
"os"
"os/signal"
"sort"
"strings"
"syscall"
"time"
@@ -26,12 +28,28 @@ import (
"github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/pgplugin"
)
// migrationsFS bundles the .up.sql files into the binary at build time
// so the prebuilt image doesn't need the source tree at runtime. The
// prior `os.ReadDir("cmd/memory-plugin-postgres/migrations")` path
// only resolved during `go test` from the repo root — in the published
// image the path didn't exist and boot failed after the 30s health gate
// (caught on staging redeploy 2026-05-05 after PR #2906).
//
//go:embed migrations/*.up.sql
var migrationsFS embed.FS
const (
envDatabaseURL = "MEMORY_PLUGIN_DATABASE_URL"
envListenAddr = "MEMORY_PLUGIN_LISTEN_ADDR"
envSkipMigrate = "MEMORY_PLUGIN_SKIP_MIGRATE"
defaultListenAddr = ":9100"
// Loopback-only by default (defense in depth). The platform talks to
// the plugin over `http://localhost:9100` from the same container, so
// binding to all interfaces would only widen the reachable surface
// without enabling any in-design caller. Operators running the plugin
// on a separate host override via MEMORY_PLUGIN_LISTEN_ADDR=:9100 (or
// some other interface).
defaultListenAddr = "127.0.0.1:9100"
)
func main() {
@@ -143,32 +161,71 @@ func openDB(databaseURL string) (*sql.DB, error) {
return db, nil
}
// runMigrations applies the schema migrations bundled at
// cmd/memory-plugin-postgres/migrations/. Idempotent on repeat boot.
// runMigrations applies the schema migrations bundled into the binary
// via go:embed (see migrationsFS at the top of this file). Idempotent
// on repeat boot — every migration file uses CREATE … IF NOT EXISTS.
//
// Implementation note: rather than embedding the full migrate engine,
// we read the migration files at boot from a known relative path. The
// down migrations are deliberately NOT applied here — that's a manual
// operator action. This keeps the binary tiny and avoids dragging in
// golang-migrate's drivers.
// The down migrations are deliberately NOT applied here — that's a
// manual operator action. This keeps the binary tiny and avoids
// dragging in golang-migrate's drivers.
//
// MEMORY_PLUGIN_MIGRATIONS_DIR (filesystem path) is honored as an
// override for operators who need to ship custom migrations alongside
// the binary without rebuilding. When unset (the common case) we read
// from the embedded FS.
func runMigrations(db *sql.DB) error {
// Find the migrations directory. In `go run` mode it's relative
// to the cmd dir; in the prebuilt binary case it's expected next
// to the binary OR via env var override.
dir := os.Getenv("MEMORY_PLUGIN_MIGRATIONS_DIR")
if dir == "" {
// Best-effort: try the cwd-relative path that works for `go test`.
dir = "cmd/memory-plugin-postgres/migrations"
if dir := strings.TrimSpace(os.Getenv("MEMORY_PLUGIN_MIGRATIONS_DIR")); dir != "" {
return runMigrationsFromDisk(db, dir)
}
entries, err := os.ReadDir(dir)
return runMigrationsFromEmbed(db)
}
// runMigrationsFromEmbed applies the *.up.sql files bundled into the
// binary at build time. Order is alphabetical (matches the on-disk
// behavior of os.ReadDir on Linux for the same set of names).
func runMigrationsFromEmbed(db *sql.DB) error {
entries, err := migrationsFS.ReadDir("migrations")
if err != nil {
return fmt.Errorf("read migrations dir %q: %w", dir, err)
return fmt.Errorf("read embedded migrations: %w", err)
}
names := make([]string, 0, len(entries))
for _, e := range entries {
if e.IsDir() || !strings.HasSuffix(e.Name(), ".up.sql") {
continue
}
path := dir + "/" + e.Name()
names = append(names, e.Name())
}
sort.Strings(names)
for _, name := range names {
data, err := migrationsFS.ReadFile("migrations/" + name)
if err != nil {
return fmt.Errorf("read embedded %q: %w", name, err)
}
if _, err := db.Exec(string(data)); err != nil {
return fmt.Errorf("apply %q: %w", name, err)
}
log.Printf("applied embedded migration %s", name)
}
return nil
}
// runMigrationsFromDisk preserves the legacy filesystem-path mode for
// operator-supplied custom migrations.
func runMigrationsFromDisk(db *sql.DB, dir string) error {
entries, err := os.ReadDir(dir)
if err != nil {
return fmt.Errorf("read migrations dir %q: %w", dir, err)
}
names := make([]string, 0, len(entries))
for _, e := range entries {
if e.IsDir() || !strings.HasSuffix(e.Name(), ".up.sql") {
continue
}
names = append(names, e.Name())
}
sort.Strings(names)
for _, name := range names {
path := dir + "/" + name
data, err := os.ReadFile(path)
if err != nil {
return fmt.Errorf("read %q: %w", path, err)
@@ -176,7 +233,7 @@ func runMigrations(db *sql.DB) error {
if _, err := db.Exec(string(data)); err != nil {
return fmt.Errorf("apply %q: %w", path, err)
}
log.Printf("applied migration %s", e.Name())
log.Printf("applied disk migration %s (from %s)", name, dir)
}
return nil
}
@@ -0,0 +1,72 @@
package main
import (
"strings"
"testing"
)
// TestMigrationsEmbedded_ContainsCreateTable pins that the migrations
// are bundled into the binary at build time, NOT loaded from a
// filesystem path that doesn't exist at runtime in the published image.
//
// Pre-fix: PR #2906 shipped the binary without the migrations dir;
// `os.ReadDir("cmd/memory-plugin-postgres/migrations")` errored on every
// tenant boot, the 30s health gate aborted the container, and the
// staging redeploy fleet job marked all tenants as failed. Embedding
// the migrations into the binary removes the runtime path entirely.
func TestMigrationsEmbedded_ContainsCreateTable(t *testing.T) {
entries, err := migrationsFS.ReadDir("migrations")
if err != nil {
t.Fatalf("embedded migrations dir unreadable: %v", err)
}
if len(entries) == 0 {
t.Fatal("embedded migrations dir is empty — go:embed pattern matched no files")
}
var seenUp bool
for _, e := range entries {
if e.IsDir() || !strings.HasSuffix(e.Name(), ".up.sql") {
continue
}
seenUp = true
data, err := migrationsFS.ReadFile("migrations/" + e.Name())
if err != nil {
t.Errorf("read embedded %q: %v", e.Name(), err)
continue
}
if !strings.Contains(string(data), "CREATE TABLE") {
t.Errorf("embedded %q has no CREATE TABLE — wrong file embedded?", e.Name())
}
}
if !seenUp {
t.Fatal("no *.up.sql in embedded migrations — runtime would have no schema to apply")
}
}
// TestRunMigrationsFromEmbed_OrderingIsAlphabetic pins that we apply
// migrations in deterministic alphabetical order, not in whatever
// arbitrary order migrationsFS.ReadDir happens to return. With one
// migration today this is moot, but a future second migration ('002_…')
// MUST run after '001_…' or the schema is broken.
//
// We can't easily exercise db.Exec here (no test DB); instead pin the
// sort step on the directory listing itself.
func TestRunMigrationsFromEmbed_OrderingIsAlphabetic(t *testing.T) {
entries, err := migrationsFS.ReadDir("migrations")
if err != nil {
t.Fatalf("embedded migrations dir unreadable: %v", err)
}
var names []string
for _, e := range entries {
if e.IsDir() || !strings.HasSuffix(e.Name(), ".up.sql") {
continue
}
names = append(names, e.Name())
}
for i := 1; i < len(names); i++ {
if names[i-1] > names[i] {
t.Errorf("ReadDir returned non-sorted names; runMigrationsFromEmbed must sort. "+
"Got %q before %q", names[i-1], names[i])
}
}
}
+53 -3
View File
@@ -20,6 +20,51 @@ cd /canvas
PORT=3000 HOSTNAME=0.0.0.0 node server.js &
CANVAS_PID=$!
# Memory v2 sidecar (built-in postgres plugin). See Dockerfile entrypoint
# comment for rationale.
#
# Spawn-gating: only start the sidecar when the operator has indicated
# they want it (MEMORY_V2_CUTOVER=true OR MEMORY_PLUGIN_URL set).
# Without that signal, the sidecar adds zero value and risks aborting
# tenant boot via the 30s health gate when the tenant Postgres lacks
# pgvector. Caught on staging redeploy 2026-05-05:
# pq: extension "vector" is not available
#
# Defaults (when sidecar IS spawned): MEMORY_PLUGIN_DATABASE_URL
# falls back to the tenant's DATABASE_URL.
MEMORY_PLUGIN_PID=""
memory_plugin_wanted=""
if [ "$MEMORY_V2_CUTOVER" = "true" ] || [ -n "$MEMORY_PLUGIN_URL" ]; then
memory_plugin_wanted=1
fi
if [ -z "$MEMORY_PLUGIN_DISABLE" ] && [ -n "$memory_plugin_wanted" ] && [ -n "$DATABASE_URL" ]; then
: "${MEMORY_PLUGIN_DATABASE_URL:=$DATABASE_URL}"
: "${MEMORY_PLUGIN_LISTEN_ADDR:=:9100}"
export MEMORY_PLUGIN_DATABASE_URL MEMORY_PLUGIN_LISTEN_ADDR
echo "memory-plugin: starting sidecar on $MEMORY_PLUGIN_LISTEN_ADDR" >&2
/memory-plugin &
MEMORY_PLUGIN_PID=$!
# Wait up to 30s for /v1/health. Boot failure is fatal so a misconfigured
# tenant crash-loops instead of silently serving cutover traffic against
# a dead plugin.
health_port=${MEMORY_PLUGIN_LISTEN_ADDR#:}
ready=0
for _ in $(seq 1 30); do
if wget -qO- --timeout=2 "http://localhost:${health_port}/v1/health" >/dev/null 2>&1; then
ready=1
break
fi
sleep 1
done
if [ "$ready" != "1" ]; then
echo "memory-plugin: ❌ /v1/health never returned 200 after 30s — aborting boot. Check DATABASE_URL reachability + pgvector extension + migrations." >&2
kill "$MEMORY_PLUGIN_PID" 2>/dev/null || true
kill "$CANVAS_PID" 2>/dev/null || true
exit 1
fi
echo "memory-plugin: ✅ sidecar healthy on :$health_port" >&2
fi
# Start Go platform in foreground-ish (we trap signals)
# CANVAS_PROXY_URL tells the platform to proxy unmatched routes to Canvas.
# CONTAINER_BACKEND: empty = Docker (default for self-hosted/local).
@@ -29,15 +74,20 @@ cd /
/platform &
PLATFORM_PID=$!
# If either process exits, kill the other
# If any process exits, kill the others
cleanup() {
kill $CANVAS_PID 2>/dev/null || true
kill $PLATFORM_PID 2>/dev/null || true
[ -n "$MEMORY_PLUGIN_PID" ] && kill $MEMORY_PLUGIN_PID 2>/dev/null || true
}
trap cleanup EXIT SIGTERM SIGINT
# Wait for either to exit — whichever exits first triggers cleanup
wait -n $CANVAS_PID $PLATFORM_PID
# Wait for any to exit — whichever exits first triggers cleanup
if [ -n "$MEMORY_PLUGIN_PID" ]; then
wait -n $CANVAS_PID $PLATFORM_PID $MEMORY_PLUGIN_PID
else
wait -n $CANVAS_PID $PLATFORM_PID
fi
EXIT_CODE=$?
cleanup
exit $EXIT_CODE
+2 -2
View File
@@ -51,7 +51,7 @@ func Import(
return result
}
_ = broadcaster.RecordAndBroadcast(ctx, "WORKSPACE_PROVISIONING", wsID, map[string]interface{}{
_ = broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceProvisioning), wsID, map[string]interface{}{
"name": b.Name,
"tier": b.Tier,
"source_bundle_id": b.ID,
@@ -142,7 +142,7 @@ func markFailed(ctx context.Context, wsID string, broadcaster *events.Broadcaste
db.DB.ExecContext(ctx,
`UPDATE workspaces SET status = $1, last_sample_error = $2, updated_at = now() WHERE id = $3`,
models.StatusFailed, msg, wsID)
broadcaster.RecordAndBroadcast(ctx, "WORKSPACE_PROVISION_FAILED", wsID, map[string]interface{}{
broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceProvisionFailed), wsID, map[string]interface{}{
"error": msg,
})
}
+11 -10
View File
@@ -10,6 +10,7 @@ import (
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/events"
)
const (
@@ -304,14 +305,14 @@ func (m *Manager) HandleInbound(ctx context.Context, ch ChannelRow, msg *Inbound
"parts": []map[string]interface{}{{"kind": "text", "text": msg.Text}},
},
"metadata": map[string]interface{}{
"source": ch.ChannelType,
"channel_id": ch.ID,
"chat_id": msg.ChatID,
"user_id": msg.UserID,
"username": msg.Username,
"message_id": msg.MessageID,
"history": history,
"extra": msg.Metadata,
"source": ch.ChannelType,
"channel_id": ch.ID,
"chat_id": msg.ChatID,
"user_id": msg.UserID,
"username": msg.Username,
"message_id": msg.MessageID,
"history": history,
"extra": msg.Metadata,
},
},
})
@@ -383,7 +384,7 @@ func (m *Manager) HandleInbound(ctx context.Context, ch ChannelRow, msg *Inbound
// Broadcast event
if m.broadcaster != nil {
m.broadcaster.RecordAndBroadcast(ctx, "CHANNEL_MESSAGE", ch.WorkspaceID, map[string]interface{}{
m.broadcaster.RecordAndBroadcast(ctx, string(events.EventChannelMessage), ch.WorkspaceID, map[string]interface{}{
"channel_id": ch.ID,
"channel_type": ch.ChannelType,
"username": msg.Username,
@@ -427,7 +428,7 @@ func (m *Manager) SendOutbound(ctx context.Context, channelID string, text strin
}
if m.broadcaster != nil {
m.broadcaster.RecordAndBroadcast(ctx, "CHANNEL_MESSAGE", ch.WorkspaceID, map[string]interface{}{
m.broadcaster.RecordAndBroadcast(ctx, string(events.EventChannelMessage), ch.WorkspaceID, map[string]interface{}{
"channel_id": ch.ID,
"channel_type": ch.ChannelType,
"direction": "outbound",
+125
View File
@@ -0,0 +1,125 @@
package events
// types.go — typed taxonomy of WebSocket event names emitted by the
// workspace-server.
//
// RFC #2945 PR-B. Pre-consolidation, every BroadcastOnly /
// RecordAndBroadcast call site passed a bare string literal:
//
// h.broadcaster.BroadcastOnly(workspaceID, "AGENT_MESSAGE", payload)
//
// Producers (Go workspace-server, ~30 call sites across handlers/,
// scheduler/, registry/, bundle/) and consumers (canvas TS store +
// component listeners) duplicated the same string with no shared
// definition. A producer renaming an event silently broke every
// consumer — same drift class that produced the reno-stars data-loss
// regression on the persistence side. The fix on that side was the
// AgentMessageWriter SSOT (PR-A); the fix on this side is named
// constants.
//
// Why a typed string (not a plain enum / iota): the event name
// crosses the wire to TypeScript consumers as the literal string in
// `WSMessage.Event`. Iota integers would break the canvas store's
// switch (`case "AGENT_MESSAGE":`); a typed string preserves the
// wire contract while giving Go callers compile-time discipline.
//
// Mirror in canvas: a parity gate (PR-B-2 follow-up) will assert this
// constant set ≡ the TypeScript union members in
// `canvas/src/lib/ws-events.ts`. Today the canvas consumes the names
// via bare-string comparisons; the mirror lands separately to keep
// PR-B narrow.
// EventType is the wire-typed name of a WebSocket event the platform
// broadcasts. Always emit constants from this file rather than bare
// strings — the AST gate in events_types_drift_test.go guards
// against bare-string usage in the broadcaster surfaces.
type EventType string
// Event constants — the canonical taxonomy. New events MUST be added
// here AND mirrored in canvas/src/lib/ws-events.ts (parity gate
// pending in PR-B-2). Group by semantic family so the list stays
// scan-friendly as it grows.
const (
// Chat / agent messaging — surfaces in canvas chat panels.
EventAgentMessage EventType = "AGENT_MESSAGE"
EventA2AResponse EventType = "A2A_RESPONSE"
EventActivityLogged EventType = "ACTIVITY_LOGGED"
EventChannelMessage EventType = "CHANNEL_MESSAGE"
// Workspace lifecycle.
EventWorkspaceProvisioning EventType = "WORKSPACE_PROVISIONING"
EventWorkspaceProvisionFailed EventType = "WORKSPACE_PROVISION_FAILED"
EventWorkspaceOnline EventType = "WORKSPACE_ONLINE"
EventWorkspaceOffline EventType = "WORKSPACE_OFFLINE"
EventWorkspaceDegraded EventType = "WORKSPACE_DEGRADED"
EventWorkspaceHibernated EventType = "WORKSPACE_HIBERNATED"
EventWorkspacePaused EventType = "WORKSPACE_PAUSED"
EventWorkspaceRemoved EventType = "WORKSPACE_REMOVED"
EventWorkspaceAwaitingAgent EventType = "WORKSPACE_AWAITING_AGENT"
EventWorkspaceHeartbeat EventType = "WORKSPACE_HEARTBEAT"
// Agent assignment + identity.
EventAgentAssigned EventType = "AGENT_ASSIGNED"
EventAgentReplaced EventType = "AGENT_REPLACED"
EventAgentRemoved EventType = "AGENT_REMOVED"
EventAgentMoved EventType = "AGENT_MOVED"
EventAgentCardUpdated EventType = "AGENT_CARD_UPDATED"
// Delegation lifecycle.
EventDelegationSent EventType = "DELEGATION_SENT"
EventDelegationStatus EventType = "DELEGATION_STATUS"
EventDelegationComplete EventType = "DELEGATION_COMPLETE"
EventDelegationFailed EventType = "DELEGATION_FAILED"
// Task progression + scheduler.
EventTaskUpdated EventType = "TASK_UPDATED"
EventCronExecuted EventType = "CRON_EXECUTED"
EventCronSkipped EventType = "CRON_SKIPPED"
// Approvals.
EventApprovalRequested EventType = "APPROVAL_REQUESTED"
EventApprovalEscalated EventType = "APPROVAL_ESCALATED"
// Auth / credentials.
EventExternalCredentialsRotated EventType = "EXTERNAL_CREDENTIALS_ROTATED"
)
// AllEventTypes lists every constant in this file. Used by the
// snapshot test (events_types_drift_test.go) to detect when a new
// constant is added without updating the snapshot — the catch-up
// step is mirroring the addition into canvas/src/lib/ws-events.ts so
// canvas consumers can switch on it.
//
// Keep in lexicographic order so the snapshot diff is stable on
// renames and the parity-with-TS comparison is order-independent.
var AllEventTypes = []EventType{
EventA2AResponse,
EventActivityLogged,
EventAgentAssigned,
EventAgentCardUpdated,
EventAgentMessage,
EventAgentMoved,
EventAgentRemoved,
EventAgentReplaced,
EventApprovalEscalated,
EventApprovalRequested,
EventChannelMessage,
EventCronExecuted,
EventCronSkipped,
EventDelegationComplete,
EventDelegationFailed,
EventDelegationSent,
EventDelegationStatus,
EventExternalCredentialsRotated,
EventTaskUpdated,
EventWorkspaceAwaitingAgent,
EventWorkspaceDegraded,
EventWorkspaceHeartbeat,
EventWorkspaceHibernated,
EventWorkspaceOffline,
EventWorkspaceOnline,
EventWorkspacePaused,
EventWorkspaceProvisionFailed,
EventWorkspaceProvisioning,
EventWorkspaceRemoved,
}
@@ -0,0 +1,117 @@
package events
import (
"sort"
"strings"
"testing"
)
// TestAllEventTypes_IsSnapshot pins the canonical event taxonomy.
// Adding a new constant in types.go without updating AllEventTypes
// (or vice versa) fails this test.
//
// The snapshot is also the authoritative input to the canvas-side
// parity gate (PR-B-2 follow-up): the TypeScript union members in
// canvas/src/lib/ws-events.ts MUST match this list exactly. A drift
// gate at CI time will assert set equality once the TS file lands.
func TestAllEventTypes_IsSnapshot(t *testing.T) {
// Every named constant must appear in AllEventTypes. Walk via
// reflection over the package-level vars would over-include test
// fixtures, so list the canonical names here. When a constant
// is added in types.go, append the EventType's literal value
// to the expected list below — the failure message names
// exactly what's missing so the diff is one-line obvious.
expected := []string{
"A2A_RESPONSE",
"ACTIVITY_LOGGED",
"AGENT_ASSIGNED",
"AGENT_CARD_UPDATED",
"AGENT_MESSAGE",
"AGENT_MOVED",
"AGENT_REMOVED",
"AGENT_REPLACED",
"APPROVAL_ESCALATED",
"APPROVAL_REQUESTED",
"CHANNEL_MESSAGE",
"CRON_EXECUTED",
"CRON_SKIPPED",
"DELEGATION_COMPLETE",
"DELEGATION_FAILED",
"DELEGATION_SENT",
"DELEGATION_STATUS",
"EXTERNAL_CREDENTIALS_ROTATED",
"TASK_UPDATED",
"WORKSPACE_AWAITING_AGENT",
"WORKSPACE_DEGRADED",
"WORKSPACE_HEARTBEAT",
"WORKSPACE_HIBERNATED",
"WORKSPACE_OFFLINE",
"WORKSPACE_ONLINE",
"WORKSPACE_PAUSED",
"WORKSPACE_PROVISIONING",
"WORKSPACE_PROVISION_FAILED",
"WORKSPACE_REMOVED",
}
sort.Strings(expected)
actual := make([]string, 0, len(AllEventTypes))
for _, e := range AllEventTypes {
actual = append(actual, string(e))
}
sort.Strings(actual)
if len(actual) != len(expected) {
t.Errorf("AllEventTypes count = %d, want %d\nactual: %s\nexpected: %s",
len(actual), len(expected),
strings.Join(actual, ", "),
strings.Join(expected, ", "))
return
}
for i, want := range expected {
if actual[i] != want {
t.Errorf("AllEventTypes[%d] = %q, want %q (full diff:\n actual: %v\n expected: %v\n)",
i, actual[i], want, actual, expected)
}
}
}
// TestEventType_NoEmptyConstants pins that no constant declared in
// types.go has an accidentally-empty value. The catch is the
// "WORKSPACE_X" → forgot-to-fill pattern: a typo in the literal
// would surface as the empty string, and broadcast pipelines would
// silently filter empty-name events without any error signal.
func TestEventType_NoEmptyConstants(t *testing.T) {
for _, e := range AllEventTypes {
if string(e) == "" {
t.Errorf("found empty EventType in AllEventTypes — typo in types.go?")
}
}
}
// TestEventType_AllUppercaseSnakeCase pins the wire format. Mixed
// case or kebab-case would break the canvas TypeScript switch
// statements (every consumer's `case "AGENT_MESSAGE":` is upper-
// snake). The check is the catch for an accidental
// `"agent_message"` typo that wouldn't fail the snapshot gate.
func TestEventType_AllUppercaseSnakeCase(t *testing.T) {
for _, e := range AllEventTypes {
s := string(e)
// Allowed chars: A-Z, 0-9, _ — nothing else, no leading/
// trailing underscores, no consecutive underscores.
if s != strings.ToUpper(s) {
t.Errorf("EventType %q is not all-uppercase — wire format requires upper-snake", s)
}
if strings.HasPrefix(s, "_") || strings.HasSuffix(s, "_") {
t.Errorf("EventType %q has leading/trailing underscore — disallowed", s)
}
if strings.Contains(s, "__") {
t.Errorf("EventType %q has consecutive underscores — disallowed", s)
}
for _, r := range s {
if !((r >= 'A' && r <= 'Z') || (r >= '0' && r <= '9') || r == '_') {
t.Errorf("EventType %q contains disallowed char %q", s, r)
break
}
}
}
}
@@ -14,10 +14,12 @@ import (
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/events"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/models"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/wsauth"
"github.com/gin-gonic/gin"
)
// proxyDispatchBuildError is a sentinel wrapper for failures inside
// http.NewRequestWithContext. handleA2ADispatchError unwraps it to emit the
// "failed to create proxy request" 500 instead of the standard 502/503 paths.
@@ -90,10 +92,10 @@ func (h *WorkspaceHandler) handleA2ADispatchError(ctx context.Context, workspace
Status: http.StatusServiceUnavailable,
Headers: map[string]string{"Retry-After": strconv.Itoa(busyRetryAfterSeconds)},
Response: gin.H{
"error": "workspace agent busy — adapter handles retry (native_session)",
"busy": true,
"retry_after": busyRetryAfterSeconds,
"native_session": true,
"error": "workspace agent busy — adapter handles retry (native_session)",
"busy": true,
"retry_after": busyRetryAfterSeconds,
"native_session": true,
},
}
}
@@ -149,7 +151,7 @@ func (h *WorkspaceHandler) handleA2ADispatchError(ctx context.Context, workspace
// Provisioner selection (mutually exclusive in production):
// - h.provisioner != nil → local Docker deployment; IsRunning does docker inspect.
// - h.cpProv != nil → SaaS / EC2 deployment; IsRunning calls CP's
// /cp/workspaces/:id/status to read the EC2 state.
// /cp/workspaces/:id/status to read the EC2 state.
//
// Pre-fix this function ONLY consulted h.provisioner — for SaaS tenants
// (h.provisioner=nil, h.cpProv=set) it short-circuited to false on every
@@ -191,7 +193,7 @@ func (h *WorkspaceHandler) maybeMarkContainerDead(ctx context.Context, workspace
log.Printf("ProxyA2A: failed to mark workspace %s offline: %v", workspaceID, err)
}
db.ClearWorkspaceKeys(ctx, workspaceID)
h.broadcaster.RecordAndBroadcast(ctx, "WORKSPACE_OFFLINE", workspaceID, map[string]interface{}{})
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOffline), workspaceID, map[string]interface{}{})
go h.RestartByID(workspaceID)
return true
}
@@ -272,7 +274,7 @@ func (h *WorkspaceHandler) logA2ASuccess(ctx context.Context, workspaceID, calle
}(ctx)
if callerID == "" && statusCode < 400 {
h.broadcaster.BroadcastOnly(workspaceID, "A2A_RESPONSE", map[string]interface{}{
h.broadcaster.BroadcastOnly(workspaceID, string(events.EventA2AResponse), map[string]interface{}{
"response_body": json.RawMessage(respBody),
"method": a2aMethod,
"duration_ms": durationMs,
@@ -21,6 +21,8 @@ import (
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/events"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/textutil"
)
// extractIdempotencyKey pulls params.message.messageId out of an A2A JSON-RPC
@@ -419,7 +421,7 @@ func (h *WorkspaceHandler) stitchDrainResponseToDelegation(ctx context.Context,
AND method = 'delegate_result'
AND target_id = $4
AND response_body->>'delegation_id' = $5
`, "Delegation completed ("+truncate(responseText, 80)+")", string(respJSON),
`, "Delegation completed ("+textutil.TruncateBytes(responseText, 80)+")", string(respJSON),
sourceID, targetID, delegationID)
if err != nil {
log.Printf("A2AQueue drain stitch: update failed for delegation %s: %v", delegationID, err)
@@ -435,10 +437,10 @@ func (h *WorkspaceHandler) stitchDrainResponseToDelegation(ctx context.Context,
// "⏸ queued" line to "✓ completed" in real time. Without this the
// transition only surfaces after the user reloads or polls activity.
if h.broadcaster != nil {
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_COMPLETE", sourceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationComplete), sourceID, map[string]interface{}{
"delegation_id": delegationID,
"target_id": targetID,
"response_preview": truncate(responseText, 200),
"response_preview": textutil.TruncateBytes(responseText, 200),
"via": "queue_drain",
})
}
+75 -79
View File
@@ -55,7 +55,7 @@ func NewActivityHandler(b *events.Broadcaster) *ActivityHandler {
func (h *ActivityHandler) List(c *gin.Context) {
workspaceID := c.Param("id")
activityType := c.Query("type")
source := c.Query("source") // "canvas" = source_id IS NULL, "agent" = source_id IS NOT NULL
source := c.Query("source") // "canvas" = source_id IS NULL, "agent" = source_id IS NOT NULL
peerID := c.Query("peer_id") // optional UUID — restrict to rows where this peer is sender OR target
limitStr := c.DefaultQuery("limit", "100")
sinceSecsStr := c.Query("since_secs")
@@ -465,78 +465,30 @@ func (h *ActivityHandler) Notify(c *gin.Context) {
}
}
// Verify workspace exists
var wsName string
err := db.DB.QueryRowContext(c.Request.Context(),
`SELECT name FROM workspaces WHERE id = $1 AND status != 'removed'`, workspaceID,
).Scan(&wsName)
if err != nil {
c.JSON(http.StatusNotFound, gin.H{"error": "workspace not found"})
return
// Single source of truth for chat-bearing agent → user messages —
// see agent_message_writer.go for the contract. Pre-RFC-#2945, the
// broadcast + INSERT pair was inlined here and again in
// mcp_tools.go's send_message_to_user, and the duplication is what
// produced the reno-stars data-loss regression. Both paths now
// route through the same writer; future channels (Slack, Discord,
// Lark) hook in here too.
attachments := make([]AgentMessageAttachment, 0, len(body.Attachments))
for _, a := range body.Attachments {
attachments = append(attachments, AgentMessageAttachment{
URI: a.URI,
Name: a.Name,
MimeType: a.MimeType,
Size: a.Size,
})
}
broadcastPayload := map[string]interface{}{
"message": body.Message,
"workspace_id": workspaceID,
"name": wsName,
}
if len(body.Attachments) > 0 {
broadcastPayload["attachments"] = body.Attachments
}
h.broadcaster.BroadcastOnly(workspaceID, "AGENT_MESSAGE", broadcastPayload)
// Persist to activity_logs so the chat history loader restores this
// message after a page reload. Pre-fix, send_message_to_user pushes
// were broadcast-only — survived the WebSocket session but vanished
// when the user refreshed because nothing wrote them to the DB.
//
// Shape chosen to match the existing loader query
// (`type=a2a_receive&source=canvas`):
// - activity_type='a2a_receive' so it joins the same query path
// - source_id=NULL so the canvas-source filter accepts it
// - method='notify' to distinguish from real A2A receives in audits
// - request_body=NULL so the loader doesn't append a duplicate
// "user message" bubble for it
// - response_body={"result": "<text>"} matches extractResponseText's
// simplest branch ({result: string} → take verbatim)
//
// Errors are logged-only — broadcast already succeeded, the user
// sees the message; persistence failure just means the message
// won't survive reload (pre-fix behavior). Don't fail the whole
// notify on a DB hiccup.
// response_body shape — chosen to feed BOTH:
// - extractResponseText: looks at body.result (string) and returns it
// - extractFilesFromTask: looks at body.parts[] for kind=file
// so a chat reload after a notify-with-attachments restores both
// the text bubble AND the download chips.
respPayload := map[string]interface{}{"result": body.Message}
if len(body.Attachments) > 0 {
fileParts := make([]map[string]interface{}, 0, len(body.Attachments))
for _, a := range body.Attachments {
fileMeta := map[string]interface{}{"uri": a.URI, "name": a.Name}
if a.MimeType != "" {
fileMeta["mimeType"] = a.MimeType
}
if a.Size > 0 {
fileMeta["size"] = a.Size
}
fileParts = append(fileParts, map[string]interface{}{
"kind": "file",
"file": fileMeta,
})
writer := NewAgentMessageWriter(db.DB, h.broadcaster)
if err := writer.Send(c.Request.Context(), workspaceID, body.Message, attachments); err != nil {
if errors.Is(err, ErrWorkspaceNotFound) {
c.JSON(http.StatusNotFound, gin.H{"error": "workspace not found"})
return
}
respPayload["parts"] = fileParts
}
respJSON, _ := json.Marshal(respPayload)
preview := body.Message
if len(preview) > 80 {
preview = preview[:80] + "…"
}
if _, err := db.DB.ExecContext(c.Request.Context(), `
INSERT INTO activity_logs (workspace_id, activity_type, method, summary, response_body, status)
VALUES ($1, 'a2a_receive', 'notify', $2, $3::jsonb, 'ok')
`, workspaceID, "Agent message: "+preview, string(respJSON)); err != nil {
log.Printf("Notify: failed to persist message for %s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "internal error"})
return
}
c.JSON(http.StatusOK, gin.H{"status": "sent"})
@@ -628,7 +580,45 @@ func (h *ActivityHandler) Report(c *gin.Context) {
// LogActivity inserts an activity log and optionally broadcasts via WebSocket.
// Takes events.EventEmitter (#1814) so callers passing a stub broadcaster
// in tests no longer need to construct the full *events.Broadcaster.
//
// Errors are logged and swallowed — this is the fire-and-forget contract
// most callers expect. For atomic-with-sibling-writes use LogActivityTx
// and propagate the error.
func LogActivity(ctx context.Context, broadcaster events.EventEmitter, params ActivityParams) {
hook, err := logActivityExec(ctx, db.DB, broadcaster, params)
if err != nil {
log.Printf("LogActivity insert error: %v", err)
return
}
hook()
}
// LogActivityTx inserts the activity row inside the caller-provided tx
// and returns a commitHook that fires the post-commit ACTIVITY_LOGGED
// broadcast. Caller MUST invoke commitHook AFTER tx.Commit() — firing
// it before commit can leak a WebSocket event for a row that ends up
// rolled back, which the canvas's optimistic UI then shows then loses.
//
// Returns an error if the INSERT fails — caller should Rollback. Caller
// is also responsible for tx.BeginTx + tx.Commit/Rollback. Used by
// chat_files uploadPollMode so PutBatchTx + N activity rows commit
// atomically; if any activity row fails, the pending_uploads rows roll
// back too and the client retries the entire multipart upload cleanly.
func LogActivityTx(ctx context.Context, tx *sql.Tx, broadcaster events.EventEmitter, params ActivityParams) (commitHook func(), err error) {
if tx == nil {
return nil, errors.New("LogActivityTx: tx is nil")
}
return logActivityExec(ctx, tx, broadcaster, params)
}
// activityExecutor is the SQL surface LogActivity[Tx] needs. *sql.Tx
// and *sql.DB both satisfy it, so the same insert path serves the
// fire-and-forget caller (db.DB) and the Tx-aware caller (*sql.Tx).
type activityExecutor interface {
ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}
func logActivityExec(ctx context.Context, exec activityExecutor, broadcaster events.EventEmitter, params ActivityParams) (commitHook func(), err error) {
reqJSON, reqErr := json.Marshal(params.RequestBody)
if reqErr != nil {
log.Printf("LogActivity: failed to marshal request_body for %s: %v", params.WorkspaceID, reqErr)
@@ -654,20 +644,21 @@ func LogActivity(ctx context.Context, broadcaster events.EventEmitter, params Ac
traceStr = &s
}
_, err := db.DB.ExecContext(ctx, `
if _, err := exec.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, source_id, target_id, method, summary, request_body, response_body, tool_trace, duration_ms, status, error_detail)
VALUES ($1, $2, $3, $4, $5, $6, $7::jsonb, $8::jsonb, $9::jsonb, $10, $11, $12)
`, params.WorkspaceID, params.ActivityType, params.SourceID, params.TargetID,
params.Method, params.Summary, reqStr, respStr, traceStr,
params.DurationMs, params.Status, params.ErrorDetail)
if err != nil {
log.Printf("LogActivity insert error: %v", err)
return
params.DurationMs, params.Status, params.ErrorDetail); err != nil {
return nil, err
}
// Broadcast ACTIVITY_LOGGED event
// Build the broadcast payload up-front so the post-commit hook is a
// pure in-memory call — no JSON marshaling between commit and emit
// where a panic would leak the row without an event.
var payload map[string]interface{}
if broadcaster != nil {
payload := map[string]interface{}{
payload = map[string]interface{}{
"activity_type": params.ActivityType,
"method": params.Method,
"summary": params.Summary,
@@ -698,8 +689,13 @@ func LogActivity(ctx context.Context, broadcaster events.EventEmitter, params Ac
if respStr != nil {
payload["response_body"] = json.RawMessage(respJSON)
}
broadcaster.BroadcastOnly(params.WorkspaceID, "ACTIVITY_LOGGED", payload)
}
return func() {
if broadcaster != nil {
broadcaster.BroadcastOnly(params.WorkspaceID, string(events.EventActivityLogged), payload)
}
}, nil
}
type ActivityParams struct {
@@ -5,6 +5,7 @@ import (
"context"
"database/sql/driver"
"encoding/json"
"errors"
"fmt"
"net/http"
"net/http/httptest"
@@ -909,6 +910,114 @@ func TestLogActivity_Broadcast_IncludesRequestAndResponseBodies(t *testing.T) {
}
}
// TestLogActivityTx_DefersBroadcastUntilCommitHook pins the #149
// contract: LogActivityTx returns a commitHook that the caller MUST
// invoke after tx.Commit(); the broadcast MUST NOT fire from inside
// LogActivityTx itself. Firing inside would leak a websocket event
// for a row that the caller may roll back, painting a ghost message
// into the canvas's optimistic UI that disappears on the next refresh.
func TestLogActivityTx_DefersBroadcastUntilCommitHook(t *testing.T) {
mock := setupTestDB(t)
defer mock.ExpectationsWereMet()
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO activity_logs").
WillReturnResult(sqlmock.NewResult(1, 1))
mock.ExpectCommit()
tx, err := db.DB.BeginTx(context.Background(), nil)
if err != nil {
t.Fatalf("BeginTx: %v", err)
}
cb := &recordingBroadcaster{}
method := "chat_upload_receive"
hook, err := LogActivityTx(context.Background(), tx, cb, ActivityParams{
WorkspaceID: "ws-123",
ActivityType: "a2a_receive",
Method: &method,
Status: "ok",
})
if err != nil {
t.Fatalf("LogActivityTx: %v", err)
}
if len(cb.calls) != 0 {
t.Errorf("broadcast leaked before commitHook: got %d calls", len(cb.calls))
}
if err := tx.Commit(); err != nil {
t.Fatalf("Commit: %v", err)
}
hook()
if len(cb.calls) != 1 {
t.Fatalf("commitHook must broadcast exactly once, got %d", len(cb.calls))
}
if cb.calls[0].eventType != "ACTIVITY_LOGGED" {
t.Errorf("event type = %q, want ACTIVITY_LOGGED", cb.calls[0].eventType)
}
}
// TestLogActivityTx_InsertError_NoHook_NoBroadcast — when the INSERT
// fails inside the Tx, LogActivityTx returns an error and a nil
// commitHook. The caller is expected to Rollback; no broadcast can
// possibly fire because the hook never exists.
func TestLogActivityTx_InsertError_NoHook_NoBroadcast(t *testing.T) {
mock := setupTestDB(t)
defer mock.ExpectationsWereMet()
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO activity_logs").
WillReturnError(errors.New("constraint violation simulated"))
mock.ExpectRollback()
tx, err := db.DB.BeginTx(context.Background(), nil)
if err != nil {
t.Fatalf("BeginTx: %v", err)
}
cb := &recordingBroadcaster{}
method := "chat_upload_receive"
hook, err := LogActivityTx(context.Background(), tx, cb, ActivityParams{
WorkspaceID: "ws-123",
ActivityType: "a2a_receive",
Method: &method,
Status: "ok",
})
if err == nil {
t.Fatal("expected error on INSERT failure, got nil")
}
if hook != nil {
t.Errorf("commitHook must be nil on insert error, got non-nil hook")
}
if err := tx.Rollback(); err != nil {
t.Fatalf("Rollback: %v", err)
}
if len(cb.calls) != 0 {
t.Errorf("broadcast must NOT fire on insert error, got %d calls", len(cb.calls))
}
}
// TestLogActivityTx_NilTx_Errors — passing a nil tx is caller misuse.
// Return an error rather than panicking on the nil receiver inside
// ExecContext (which would crash the request goroutine and surface as
// a 500 with no log line tying it to the bad call site).
func TestLogActivityTx_NilTx_Errors(t *testing.T) {
cb := &recordingBroadcaster{}
hook, err := LogActivityTx(context.Background(), nil, cb, ActivityParams{
WorkspaceID: "ws-123",
ActivityType: "a2a_receive",
Status: "ok",
})
if err == nil {
t.Fatal("nil tx must error, got nil")
}
if hook != nil {
t.Errorf("commitHook must be nil when tx is nil, got non-nil hook")
}
if len(cb.calls) != 0 {
t.Errorf("broadcast must NOT fire on nil-tx error, got %d", len(cb.calls))
}
}
func TestLogActivity_Broadcast_IncludesResponseBody(t *testing.T) {
mock := setupTestDB(t)
defer mock.ExpectationsWereMet()
+15 -15
View File
@@ -69,7 +69,7 @@ func (h *AgentHandler) Assign(c *gin.Context) {
return
}
h.broadcaster.RecordAndBroadcast(ctx, "AGENT_ASSIGNED", workspaceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventAgentAssigned), workspaceID, map[string]interface{}{
"agent_id": agentID,
"model": body.Model,
})
@@ -118,7 +118,7 @@ func (h *AgentHandler) Replace(c *gin.Context) {
return
}
h.broadcaster.RecordAndBroadcast(ctx, "AGENT_REPLACED", workspaceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventAgentReplaced), workspaceID, map[string]interface{}{
"agent_id": agentID,
"model": body.Model,
"old_model": oldModel,
@@ -148,7 +148,7 @@ func (h *AgentHandler) Remove(c *gin.Context) {
return
}
h.broadcaster.RecordAndBroadcast(ctx, "AGENT_REMOVED", workspaceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventAgentRemoved), workspaceID, map[string]interface{}{
"agent_id": agentID,
"model": model,
})
@@ -215,21 +215,21 @@ func (h *AgentHandler) Move(c *gin.Context) {
}
// Broadcast on both workspaces
h.broadcaster.RecordAndBroadcast(ctx, "AGENT_MOVED", sourceID, map[string]interface{}{
"agent_id": agentID,
"model": model,
"target_workspace_id": body.TargetWorkspaceID,
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventAgentMoved), sourceID, map[string]interface{}{
"agent_id": agentID,
"model": model,
"target_workspace_id": body.TargetWorkspaceID,
})
h.broadcaster.RecordAndBroadcast(ctx, "AGENT_MOVED", body.TargetWorkspaceID, map[string]interface{}{
"agent_id": agentID,
"model": model,
"source_workspace_id": sourceID,
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventAgentMoved), body.TargetWorkspaceID, map[string]interface{}{
"agent_id": agentID,
"model": model,
"source_workspace_id": sourceID,
})
c.JSON(http.StatusOK, gin.H{
"agent_id": agentID,
"model": model,
"from_workspace": sourceID,
"to_workspace": body.TargetWorkspaceID,
"agent_id": agentID,
"model": model,
"from_workspace": sourceID,
"to_workspace": body.TargetWorkspaceID,
})
}
@@ -0,0 +1,177 @@
package handlers
import (
"go/ast"
"go/parser"
"go/token"
"os"
"path/filepath"
"sort"
"strconv"
"strings"
"testing"
)
// TestAgentMessageBroadcastsArePersisted is a forward-looking AST
// gate: every function in this package that broadcasts an
// `AGENT_MESSAGE` WebSocket event MUST also call
// `INSERT INTO activity_logs` somewhere in its body.
//
// The reno-stars production data-loss bug (CEO Ryan PC's long-form
// onboarding-friction message visible live but missing on reload)
// happened because mcp_tools.go:toolSendMessageToUser broadcast WS
// without a paired INSERT — while the HTTP /notify sibling DID
// persist. The fix added the INSERT; this gate prevents the regression
// class from re-emerging in any future chat-bearing tool.
//
// Why an AST gate vs a code-review checklist (per memory
// feedback_behavior_based_ast_gates.md): "pin invariants by what a
// function calls, not what it's named". The shape that loses data is:
//
// BroadcastOnly(_, "AGENT_MESSAGE", _) without an INSERT companion
//
// Any new tool that emits AGENT_MESSAGE must persist or the next
// canvas refresh drops the message — same shape as reno-stars. A
// reviewer can miss this; the AST walk can't.
//
// Allowlist: empty by intent. If a future use case genuinely needs
// fire-and-forget broadcast (e.g., transient typing indicators that
// should NOT survive reload), add an entry here AND document why.
// "Doesn't need to persist" is rarely the right answer for chat —
// the canvas history is the source of truth.
func TestAgentMessageBroadcastsArePersisted(t *testing.T) {
wd, err := os.Getwd()
if err != nil {
t.Fatalf("getwd: %v", err)
}
entries, err := os.ReadDir(wd)
if err != nil {
t.Fatalf("readdir %s: %v", wd, err)
}
type violation struct {
file string
fn string
}
var violations []violation
for _, ent := range entries {
name := ent.Name()
if ent.IsDir() || !strings.HasSuffix(name, ".go") || strings.HasSuffix(name, "_test.go") {
continue
}
path := filepath.Join(wd, name)
fset := token.NewFileSet()
file, err := parser.ParseFile(fset, path, nil, parser.ParseComments)
if err != nil {
t.Fatalf("parse %s: %v", path, err)
}
for _, decl := range file.Decls {
fn, ok := decl.(*ast.FuncDecl)
if !ok || fn.Body == nil {
continue
}
if !funcEmitsAgentMessageBroadcast(fn) {
continue
}
if !funcInsertsIntoActivityLogs(fn) {
violations = append(violations, violation{file: name, fn: fn.Name.Name})
}
}
}
if len(violations) > 0 {
sort.Slice(violations, func(i, j int) bool {
if violations[i].file != violations[j].file {
return violations[i].file < violations[j].file
}
return violations[i].fn < violations[j].fn
})
var buf strings.Builder
for _, v := range violations {
buf.WriteString(" - ")
buf.WriteString(v.file)
buf.WriteString(":")
buf.WriteString(v.fn)
buf.WriteString("\n")
}
t.Errorf(`function(s) broadcast `+"`AGENT_MESSAGE`"+` without persisting to activity_logs:
%s
This is the reno-stars data-loss regression class: live message
visible to the user, but missing on reload because activity_log was
never written. Every chat-bearing broadcast MUST be paired with:
INSERT INTO activity_logs (workspace_id, activity_type, method,
summary, response_body, status)
VALUES ($1, 'a2a_receive', 'notify', $2, $3::jsonb, 'ok')
See activity.go:Notify and mcp_tools.go:toolSendMessageToUser for
the canonical shapes. Don't add an allowlist entry without a
documented reason — the canvas chat history is the source of truth
and silently dropping messages is a P0 user trust break.`,
buf.String())
}
}
// funcEmitsAgentMessageBroadcast walks fn.Body for any CallExpr that
// looks like `*.BroadcastOnly(_, "AGENT_MESSAGE", _)`.
func funcEmitsAgentMessageBroadcast(fn *ast.FuncDecl) bool {
var found bool
ast.Inspect(fn.Body, func(n ast.Node) bool {
call, ok := n.(*ast.CallExpr)
if !ok {
return true
}
sel, ok := call.Fun.(*ast.SelectorExpr)
if !ok || sel.Sel.Name != "BroadcastOnly" {
return true
}
// BroadcastOnly(workspaceID, eventType, payload) — the second
// arg is the event name. Match by string-literal value.
if len(call.Args) < 2 {
return true
}
lit, ok := call.Args[1].(*ast.BasicLit)
if !ok || lit.Kind != token.STRING {
return true
}
raw := lit.Value
if unq, err := strconv.Unquote(raw); err == nil {
raw = unq
}
if raw == "AGENT_MESSAGE" {
found = true
return false
}
return true
})
return found
}
// funcInsertsIntoActivityLogs walks fn.Body for any STRING BasicLit
// whose body contains `INSERT INTO activity_logs` (the SQL literal
// passed to ExecContext). Matches the substring rather than a strict
// regex because we don't care about the exact INSERT shape here —
// only that the function persists. Specific shape pinning lives in
// the per-handler test (see TestMCPHandler_SendMessageToUser_*).
func funcInsertsIntoActivityLogs(fn *ast.FuncDecl) bool {
var found bool
ast.Inspect(fn.Body, func(n ast.Node) bool {
lit, ok := n.(*ast.BasicLit)
if !ok || lit.Kind != token.STRING {
return true
}
raw := lit.Value
if unq, err := strconv.Unquote(raw); err == nil {
raw = unq
}
if strings.Contains(raw, "INSERT INTO activity_logs") {
found = true
return false
}
return true
})
return found
}
@@ -0,0 +1,173 @@
package handlers
// AgentMessageWriter is the SSOT for "agent → user" message delivery in the
// workspace-server. Every chat-bearing path that surfaces a message to the
// canvas — HTTP /notify (Notify handler), MCP tools/call
// send_message_to_user (toolSendMessageToUser), any future channel — MUST
// route through this writer rather than re-implement the broadcast +
// persist contract inline.
//
// Why: pre-consolidation, two handlers duplicated the same "broadcast then
// INSERT activity_logs" sequence. The reno-stars production data-loss
// incident (2026-05-05, RFC #2945, PR #2944) was the symptom — the
// persistence half landed for /notify but lagged for the MCP bridge by
// months, silently dropping every long-form external-agent message until
// reload. The AST gate from #2944 catches drift; this writer eliminates
// the *possibility* of drift by giving both call sites a single
// well-tested function to call.
//
// Contract:
// 1. Look up the workspace by id; ErrWorkspaceNotFound on miss so the
// caller can return 404 with a clean message.
// 2. Broadcast a WS AGENT_MESSAGE event with {message, workspace_id,
// name, attachments?}.
// 3. INSERT a row into activity_logs:
// type='a2a_receive', method='notify', source_id NULL,
// response_body={"result": message[, "parts": [file kind...]]},
// status='ok'
// Best-effort — INSERT failure logs only, returns nil so the broadcast
// success isn't undone on the caller side.
// 4. Returns nil on success.
//
// The shape (especially the JSON response_body) is the wire contract the
// canvas's chat-history hydrator (canvas/src/.../historyHydration.ts)
// reads. Drift here silently breaks chat replay across all consumers, so
// changes to the JSON shape MUST be cross-verified against the hydrator
// in the same PR.
import (
"context"
"database/sql"
"encoding/json"
"errors"
"fmt"
"log"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/events"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/textutil"
)
// ErrWorkspaceNotFound is returned by AgentMessageWriter.Send when the
// workspace lookup turns up nothing (or the workspace is in
// status='removed'). Callers translate to HTTP 404 / JSON-RPC error /
// whatever surface they expose. Real DB errors (connection drop, query
// timeout) surface as wrapped errors and should be treated as 503.
var ErrWorkspaceNotFound = errors.New("agent_message: workspace not found")
// AgentMessageAttachment is one file attached to an agent → user
// message. Identical to handlers.NotifyAttachment in field set; kept
// distinct so the writer's API doesn't import a handler type with HTTP
// binding tags.
type AgentMessageAttachment struct {
URI string
Name string
MimeType string
Size int64
}
// AgentMessageWriter persists + broadcasts agent → user messages. Construct
// once per process via NewAgentMessageWriter; pass the same instance to
// every handler that delivers chat (Notify, toolSendMessageToUser, etc.).
//
// Takes events.EventEmitter (not the *Broadcaster concrete type) so tests
// can substitute a fake emitter and producers in other packages can wrap
// the real broadcaster behind their own metrics / retries without leaking
// the concrete dependency.
type AgentMessageWriter struct {
db *sql.DB
broadcaster events.EventEmitter
}
// NewAgentMessageWriter binds the writer to the platform's DB pool +
// WebSocket broadcaster.
func NewAgentMessageWriter(db *sql.DB, broadcaster events.EventEmitter) *AgentMessageWriter {
return &AgentMessageWriter{db: db, broadcaster: broadcaster}
}
// Send delivers a single agent → user message. Look up + broadcast +
// persist in that order; ErrWorkspaceNotFound short-circuits before any
// broadcast or DB write so callers can 404 cleanly.
//
// Returns nil on success — including on DB-INSERT failure (the broadcast
// already returned successfully and the user has seen the message; the
// persistence-failure mode is logged at WARN but the caller's response
// stays 200 so the agent doesn't retry and double-broadcast).
func (w *AgentMessageWriter) Send(
ctx context.Context,
workspaceID, message string,
attachments []AgentMessageAttachment,
) error {
// 1. Workspace lookup. status='removed' filter is the same shape /notify
// used pre-consolidation; deleted workspaces don't get notifications.
//
// Distinguish sql.ErrNoRows ("workspace genuinely not present" — caller
// should 404) from real DB errors (connection drop, statement timeout,
// pool exhaustion — caller should 503). Pre-fix this branch returned
// ErrWorkspaceNotFound for any error, so during a DB outage every
// notify call surfaced as "workspace not found" and masked real
// incidents in the alert path.
var wsName string
err := w.db.QueryRowContext(ctx,
`SELECT name FROM workspaces WHERE id = $1 AND status != 'removed'`,
workspaceID,
).Scan(&wsName)
if errors.Is(err, sql.ErrNoRows) {
return ErrWorkspaceNotFound
}
if err != nil {
return fmt.Errorf("agent_message: workspace lookup: %w", err)
}
// 2. Build broadcast payload + WS-emit. Same shape that ChatTab's
// AGENT_MESSAGE handler in canvas/src/store/canvas-events.ts has
// consumed since the canvas chat shipped — drift here would orphan
// every live chat panel.
broadcastPayload := map[string]interface{}{
"message": message,
"workspace_id": workspaceID,
"name": wsName,
}
if len(attachments) > 0 {
broadcastPayload["attachments"] = attachments
}
w.broadcaster.BroadcastOnly(workspaceID, string(events.EventAgentMessage), broadcastPayload)
// 3. Persist for chat-history hydration. response_body shape MUST stay
// in sync with extractResponseText + extractFilesFromTask in
// canvas/src/components/tabs/chat/historyHydration.ts:
// - extractResponseText reads body.result (string) → renders text
// - extractFilesFromTask reads body.parts[] (kind=file) → renders chips
respPayload := map[string]interface{}{"result": message}
if len(attachments) > 0 {
fileParts := make([]map[string]interface{}, 0, len(attachments))
for _, a := range attachments {
fileMeta := map[string]interface{}{"uri": a.URI, "name": a.Name}
if a.MimeType != "" {
fileMeta["mimeType"] = a.MimeType
}
if a.Size > 0 {
fileMeta["size"] = a.Size
}
fileParts = append(fileParts, map[string]interface{}{
"kind": "file",
"file": fileMeta,
})
}
respPayload["parts"] = fileParts
}
respJSON, _ := json.Marshal(respPayload)
preview := textutil.TruncateRunes(message, 80)
if _, err := w.db.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, summary, response_body, status)
VALUES ($1, 'a2a_receive', 'notify', $2, $3::jsonb, 'ok')
`, workspaceID, "Agent message: "+preview, string(respJSON)); err != nil {
// Best-effort: the broadcast already returned ok and the user
// has seen the message. Logging a structured line lets operators
// notice persistence-failure rates spike if the DB is unhealthy,
// without breaking the tool response or causing the agent to
// retry-and-double-broadcast.
log.Printf("agent_message: failed to persist for %s: %v", workspaceID, err)
}
return nil
}
@@ -0,0 +1,414 @@
package handlers
import (
"context"
"database/sql/driver"
"encoding/json"
"errors"
"strings"
"testing"
"unicode/utf8"
"github.com/DATA-DOG/go-sqlmock"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
)
// AgentMessageWriter is the SSOT for agent → user chat delivery
// (RFC #2945 PR-A). These tests pin the contract the writer
// guarantees: workspace lookup, broadcast, INSERT, error semantics —
// every shape that producers (Notify, toolSendMessageToUser, future
// channels) rely on.
//
// Pre-consolidation, the broadcast-then-INSERT logic was duplicated
// across two handlers and they drifted (reno-stars, 2026-05-05). With
// the writer being the only place this logic lives, these tests are
// the regression line for every chat-bearing path simultaneously.
// jsonMatcher is a sqlmock Argument matcher that decodes the actual
// SQL arg as JSON and runs a caller-supplied predicate over the
// resulting structure. Tighter than substring matching (which can
// false-pass on a renamed key) and tolerant of map-key ordering
// (which exact-string matching is not).
type jsonMatcher struct {
predicate func(parsed map[string]any) bool
desc string
}
func (m jsonMatcher) Match(v driver.Value) bool {
s, ok := v.(string)
if !ok {
return false
}
var parsed map[string]any
if err := json.Unmarshal([]byte(s), &parsed); err != nil {
return false
}
return m.predicate(parsed)
}
// stringMatcher pins exact prefix/suffix/equality checks against a
// driver.Value that's actually a string.
type stringMatcher func(string) bool
func (f stringMatcher) Match(v driver.Value) bool {
s, ok := v.(string)
if !ok {
return false
}
return f(s)
}
// capturingEmitter records every BroadcastOnly call so tests can pin
// the WS event shape without a real ws.Hub. RecordAndBroadcast is
// also captured for completeness — the writer doesn't call it today,
// but a future producer might, and a captured-but-unasserted record
// is easier to diagnose than a nil panic.
type capturingEmitter struct {
events []capturedEvent
}
type capturedEvent struct {
workspaceID string
eventType string
payload interface{}
}
func (c *capturingEmitter) BroadcastOnly(workspaceID string, eventType string, payload interface{}) {
c.events = append(c.events, capturedEvent{workspaceID, eventType, payload})
}
func (c *capturingEmitter) RecordAndBroadcast(_ context.Context, eventType string, workspaceID string, payload interface{}) error {
c.events = append(c.events, capturedEvent{workspaceID, eventType, payload})
return nil
}
// TestAgentMessageWriter_Send_Success_NoAttachments pins the happy
// path: workspace lookup, broadcast, INSERT, return nil.
func TestAgentMessageWriter_Send_Success_NoAttachments(t *testing.T) {
mock := setupTestDB(t)
w := NewAgentMessageWriter(db.DB, newTestBroadcaster())
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-1").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("CEO Ryan PC"))
mock.ExpectExec(`INSERT INTO activity_logs.*'a2a_receive'.*'notify'`).
WithArgs(
"ws-1",
sqlmock.AnyArg(), // summary
`{"result":"hi"}`,
).
WillReturnResult(sqlmock.NewResult(1, 1))
if err := w.Send(context.Background(), "ws-1", "hi", nil); err != nil {
t.Fatalf("Send returned %v, want nil", err)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("DB expectations: %v", err)
}
}
// TestAgentMessageWriter_Send_Success_WithAttachments pins the file
// attachment shape — response_body MUST contain a parts[] array with
// kind=file entries so the canvas hydrater renders download chips.
// Drift here = chips disappear on chat reload.
func TestAgentMessageWriter_Send_Success_WithAttachments(t *testing.T) {
mock := setupTestDB(t)
w := NewAgentMessageWriter(db.DB, newTestBroadcaster())
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-att").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("Ryan"))
mock.ExpectExec(`INSERT INTO activity_logs.*'a2a_receive'.*'notify'`).
WithArgs(
"ws-att",
sqlmock.AnyArg(),
jsonMatcher{
desc: "response_body has result + parts with kind=file metadata",
predicate: func(p map[string]any) bool {
if p["result"] != "see attached" {
return false
}
parts, ok := p["parts"].([]any)
if !ok || len(parts) != 1 {
return false
}
part, ok := parts[0].(map[string]any)
if !ok {
return false
}
if part["kind"] != "file" {
return false
}
file, ok := part["file"].(map[string]any)
if !ok {
return false
}
return file["uri"] == "workspace://x.zip" &&
file["name"] == "x.zip" &&
file["mimeType"] == "application/zip" &&
file["size"].(float64) == 1234
},
},
).
WillReturnResult(sqlmock.NewResult(1, 1))
atts := []AgentMessageAttachment{
{URI: "workspace://x.zip", Name: "x.zip", MimeType: "application/zip", Size: 1234},
}
if err := w.Send(context.Background(), "ws-att", "see attached", atts); err != nil {
t.Fatalf("Send returned %v, want nil", err)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("DB expectations: %v", err)
}
}
// TestAgentMessageWriter_Send_WorkspaceNotFound pins ErrWorkspaceNotFound
// short-circuit. Must NOT broadcast, MUST NOT INSERT — caller will 404
// or surface a JSON-RPC error.
func TestAgentMessageWriter_Send_WorkspaceNotFound(t *testing.T) {
mock := setupTestDB(t)
emitter := &capturingEmitter{}
w := NewAgentMessageWriter(db.DB, emitter)
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-missing").
WillReturnRows(sqlmock.NewRows([]string{"name"}))
err := w.Send(context.Background(), "ws-missing", "lost in the void", nil)
if !errors.Is(err, ErrWorkspaceNotFound) {
t.Errorf("Send returned %v, want ErrWorkspaceNotFound", err)
}
if len(emitter.events) != 0 {
t.Errorf("workspace-not-found path MUST NOT broadcast, got %d events", len(emitter.events))
}
// Implicit: no INSERT expectation registered, so a stray INSERT
// would fail ExpectationsWereMet.
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("DB expectations (INSERT must NOT fire on workspace-not-found): %v", err)
}
}
// TestAgentMessageWriter_Send_DBInsertFailureStillReturnsNil pins the
// "best-effort persistence" contract: when the activity_log INSERT
// fails (DB hiccup, transient connection, constraint), the writer
// MUST still return nil. The broadcast already succeeded; the user
// has seen the message; returning an error here would cause the
// caller (and the agent calling the tool) to retry and double-
// broadcast.
func TestAgentMessageWriter_Send_DBInsertFailureStillReturnsNil(t *testing.T) {
mock := setupTestDB(t)
w := NewAgentMessageWriter(db.DB, newTestBroadcaster())
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-dbfail").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("CEO Ryan PC"))
mock.ExpectExec(`INSERT INTO activity_logs`).
WillReturnError(errors.New("transient db error"))
err := w.Send(context.Background(), "ws-dbfail", "should not be lost from live chat", nil)
if err != nil {
t.Errorf("DB INSERT failure must return nil (broadcast already succeeded), got %v", err)
}
}
// TestAgentMessageWriter_Send_PreviewTruncation pins the summary
// preview cap. Long messages (Ryan's onboarding-friction report was
// ~2k chars) must summarise to ≤80 chars + ellipsis so the activity
// table doesn't carry multi-KB summaries that bloat list queries.
func TestAgentMessageWriter_Send_PreviewTruncation(t *testing.T) {
mock := setupTestDB(t)
w := NewAgentMessageWriter(db.DB, newTestBroadcaster())
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-trunc").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("Ryan"))
longMsg := strings.Repeat("x", 200)
mock.ExpectExec(`INSERT INTO activity_logs`).
WithArgs(
"ws-trunc",
stringMatcher(func(s string) bool {
if !strings.HasPrefix(s, "Agent message: ") {
return false
}
preview := strings.TrimPrefix(s, "Agent message: ")
if !strings.HasSuffix(preview, "…") {
return false
}
body := strings.TrimSuffix(preview, "…")
return len(body) == 80
}),
sqlmock.AnyArg(),
).
WillReturnResult(sqlmock.NewResult(1, 1))
if err := w.Send(context.Background(), "ws-trunc", longMsg, nil); err != nil {
t.Fatalf("Send: %v", err)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("preview truncation drift: %v", err)
}
}
// TestAgentMessageWriter_Send_BroadcastsAgentMessageEvent pins the
// WS event name + payload shape. The canvas's
// canvas-events.ts:AGENT_MESSAGE handler reads {message, workspace_id,
// name, attachments?} — drift here orphans every live chat panel.
func TestAgentMessageWriter_Send_BroadcastsAgentMessageEvent(t *testing.T) {
mock := setupTestDB(t)
emitter := &capturingEmitter{}
w := NewAgentMessageWriter(db.DB, emitter)
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-bc").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("Workspace Name"))
mock.ExpectExec(`INSERT INTO activity_logs`).
WillReturnResult(sqlmock.NewResult(1, 1))
atts := []AgentMessageAttachment{
{URI: "workspace://a.txt", Name: "a.txt"},
}
if err := w.Send(context.Background(), "ws-bc", "hi", atts); err != nil {
t.Fatalf("Send: %v", err)
}
if len(emitter.events) != 1 {
t.Fatalf("expected exactly 1 broadcast, got %d", len(emitter.events))
}
ev := emitter.events[0]
if ev.eventType != "AGENT_MESSAGE" {
t.Errorf("event type = %q, want AGENT_MESSAGE", ev.eventType)
}
if ev.workspaceID != "ws-bc" {
t.Errorf("workspace_id = %q, want ws-bc", ev.workspaceID)
}
pl, ok := ev.payload.(map[string]interface{})
if !ok {
t.Fatalf("payload not a map: %T", ev.payload)
}
if pl["message"] != "hi" {
t.Errorf("payload.message = %v, want hi", pl["message"])
}
if pl["workspace_id"] != "ws-bc" {
t.Errorf("payload.workspace_id = %v, want ws-bc", pl["workspace_id"])
}
if pl["name"] != "Workspace Name" {
t.Errorf("payload.name = %v, want Workspace Name", pl["name"])
}
if pl["attachments"] == nil {
t.Error("payload.attachments missing on attachment-bearing send")
}
}
// TestAgentMessageWriter_Send_DBErrorOnLookupReturnsWrapped pins the
// distinction between sql.ErrNoRows (legit not-found → 404) and real
// DB errors (connection drop → 503). Pre-followup the lookup branch
// returned ErrWorkspaceNotFound for ANY error, so during a DB outage
// every notify call surfaced as "workspace not found" and masked
// real incidents in alerting.
func TestAgentMessageWriter_Send_DBErrorOnLookupReturnsWrapped(t *testing.T) {
mock := setupTestDB(t)
w := NewAgentMessageWriter(db.DB, newTestBroadcaster())
transientErr := errors.New("connection refused")
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-dbdown").
WillReturnError(transientErr)
err := w.Send(context.Background(), "ws-dbdown", "hi", nil)
if err == nil {
t.Fatal("expected wrapped DB error, got nil")
}
if errors.Is(err, ErrWorkspaceNotFound) {
t.Errorf("DB outage MUST NOT surface as ErrWorkspaceNotFound (masks incidents in alerting); got %v", err)
}
if !errors.Is(err, transientErr) {
t.Errorf("expected wrapped %v, got %v", transientErr, err)
}
}
// Helper-level truncate tests now live in
// internal/textutil/truncate_test.go (TestTruncateRunes). The
// integration-level coverage that exercises the agent_message_writer
// path with non-ASCII content is TestAgentMessageWriter_Send_NonASCIIMessagePersists
// below.
// TestAgentMessageWriter_Send_NonASCIIMessagePersists pins the end-to-end
// path for non-ASCII messages — the original reno-stars regression
// surfaced via byte-slice truncation breaking JSONB INSERT. Every
// handler-level test had ASCII content, so this branch had no
// coverage. Now it does.
func TestAgentMessageWriter_Send_NonASCIIMessagePersists(t *testing.T) {
mock := setupTestDB(t)
w := NewAgentMessageWriter(db.DB, newTestBroadcaster())
// 200-rune CJK message — exceeds the 80-rune cap, would have hit
// the byte-slice bug.
msg := strings.Repeat("你", 200)
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-cjk").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("CEO Ryan PC"))
mock.ExpectExec(`INSERT INTO activity_logs`).
WithArgs(
"ws-cjk",
stringMatcher(func(s string) bool {
if !strings.HasPrefix(s, "Agent message: ") {
return false
}
preview := strings.TrimPrefix(s, "Agent message: ")
if !strings.HasSuffix(preview, "…") {
return false
}
body := strings.TrimSuffix(preview, "…")
// 80 runes of 你 = 80 codepoints. Each is 3 bytes UTF-8.
if utf8.RuneCountInString(body) != 80 {
return false
}
// MUST be valid UTF-8 — pre-fix byte-slice would have
// returned half a codepoint here.
return utf8.ValidString(body)
}),
sqlmock.AnyArg(),
).
WillReturnResult(sqlmock.NewResult(1, 1))
if err := w.Send(context.Background(), "ws-cjk", msg, nil); err != nil {
t.Fatalf("Send: %v", err)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("non-ASCII path drift: %v", err)
}
}
// TestAgentMessageWriter_Send_OmitsAttachmentsKeyWhenEmpty pins the
// "no key when nil" wire contract — extra empty fields would force
// canvas consumers to defensively check for [] vs undefined; the
// existing AGENT_MESSAGE handler treats absence as "no attachments".
func TestAgentMessageWriter_Send_OmitsAttachmentsKeyWhenEmpty(t *testing.T) {
mock := setupTestDB(t)
emitter := &capturingEmitter{}
w := NewAgentMessageWriter(db.DB, emitter)
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-noatt").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("X"))
mock.ExpectExec(`INSERT INTO activity_logs`).
WillReturnResult(sqlmock.NewResult(1, 1))
if err := w.Send(context.Background(), "ws-noatt", "plain text", nil); err != nil {
t.Fatalf("Send: %v", err)
}
if len(emitter.events) != 1 {
t.Fatalf("expected 1 event, got %d", len(emitter.events))
}
pl := emitter.events[0].payload.(map[string]interface{})
if _, present := pl["attachments"]; present {
t.Errorf("attachments key MUST NOT be present when empty (canvas treats absence as 'none'); payload=%v", pl)
}
}
@@ -51,7 +51,7 @@ func (h *ApprovalsHandler) Create(c *gin.Context) {
return
}
h.broadcaster.RecordAndBroadcast(ctx, "APPROVAL_REQUESTED", workspaceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventApprovalRequested), workspaceID, map[string]interface{}{
"approval_id": approvalID,
"action": body.Action,
"reason": body.Reason,
@@ -62,7 +62,7 @@ func (h *ApprovalsHandler) Create(c *gin.Context) {
var parentID *string
db.DB.QueryRowContext(ctx, `SELECT parent_id FROM workspaces WHERE id = $1`, workspaceID).Scan(&parentID)
if parentID != nil {
h.broadcaster.RecordAndBroadcast(ctx, "APPROVAL_ESCALATED", *parentID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventApprovalEscalated), *parentID, map[string]interface{}{
"approval_id": approvalID,
"from_workspace_id": workspaceID,
"action": body.Action,
+149 -50
View File
@@ -600,14 +600,21 @@ func (h *ChatFilesHandler) uploadPollMode(c *gin.Context, ctx context.Context, w
return
}
out := make([]uploadedFile, 0, len(headers))
// Phase 1: pre-validate + read every part BEFORE any DB write.
// A multi-file upload must commit all-or-nothing; a per-file
// failure halfway through used to leave rows 1..K-1 in the table
// while the client got a 500 and retried the whole batch — duplicate
// rows, orphan activity rows. Validating up-front + atomic PutBatch
// closes that gap.
type prepped struct {
Sanitized string
Mimetype string
Content []byte
Original string // original (unsanitized) filename for error messages
}
prepReady := make([]prepped, 0, len(headers))
items := make([]pendinguploads.PutItem, 0, len(headers))
for _, fh := range headers {
// Read full content. Per-file cap enforced post-read so an
// oversized file fails with a clean 413 rather than a torn
// stream. The +1 byte ReadAll trick that the Python side
// uses isn't easy through multipart.FileHeader; instead we
// rely on the multipart layer's ContentLength header and
// short-circuit before opening the part.
if fh.Size > pendinguploads.MaxFileBytes {
log.Printf("chat_files uploadPollMode: per-file cap exceeded for %s: %s (%d bytes)",
workspaceID, fh.Filename, fh.Size)
@@ -621,47 +628,81 @@ func (h *ChatFilesHandler) uploadPollMode(c *gin.Context, ctx context.Context, w
}
content, err := readMultipartFile(fh)
if err != nil {
log.Printf("chat_files uploadPollMode: read part failed for %s/%s: %v", workspaceID, fh.Filename, err)
log.Printf("chat_files uploadPollMode: read part failed for %s/%s: %v",
workspaceID, fh.Filename, err)
c.JSON(http.StatusBadRequest, gin.H{"error": "could not read file part"})
return
}
sanitized := SanitizeFilename(fh.Filename)
mimetype := fh.Header.Get("Content-Type")
fileID, err := h.pendingUploads.Put(ctx, wsUUID, content, sanitized, mimetype)
if err != nil {
if errors.Is(err, pendinguploads.ErrTooLarge) {
// Belt + suspenders: the size check above already
// caught this, but Storage.Put re-validates so a
// malformed FileHeader can't slip through. 413 with
// the same shape so the client sees one error class.
c.JSON(http.StatusRequestEntityTooLarge, gin.H{
"error": "file exceeds per-file cap",
"filename": fh.Filename,
"size": len(content),
"max": pendinguploads.MaxFileBytes,
})
return
}
log.Printf("chat_files uploadPollMode: storage.Put failed for %s/%s: %v",
workspaceID, sanitized, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "could not stage file"})
// Belt-and-braces post-read cap (multipart.FileHeader.Size can lie
// on some clients that don't set Content-Length per part).
if len(content) > pendinguploads.MaxFileBytes {
log.Printf("chat_files uploadPollMode: per-file cap exceeded post-read for %s: %s (%d bytes)",
workspaceID, fh.Filename, len(content))
c.JSON(http.StatusRequestEntityTooLarge, gin.H{
"error": "file exceeds per-file cap",
"filename": fh.Filename,
"size": len(content),
"max": pendinguploads.MaxFileBytes,
})
return
}
sanitized := SanitizeFilename(fh.Filename)
mimetype := safeMimetype(fh.Header.Get("Content-Type"))
prepReady = append(prepReady, prepped{
Sanitized: sanitized, Mimetype: mimetype, Content: content, Original: fh.Filename,
})
items = append(items, pendinguploads.PutItem{
Content: content, Filename: sanitized, Mimetype: mimetype,
})
}
// Activity row so the workspace's inbox poller picks this up
// on its next cycle. activity_type=a2a_receive (NOT a new
// type) so the existing poll filter
// `?type=a2a_receive` catches it without poll-side changes;
// method=chat_upload_receive is the discriminator the
// workspace's adapter (Phase 2) uses to route to the upload
// fetcher instead of the agent's message handler. Same
// shape as A2A's tasks/send vs message/send method split.
// Phase 2+3: PutBatch + N activity-row inserts run in ONE Tx so
// either every pending_uploads row + every activity_logs row commits,
// or none do. Per-file pre-validation already happened above so the
// only failure modes inside the Tx are DB-side; either way Rollback
// leaves the table state unchanged and the client retries the whole
// multipart upload cleanly. Broadcasts are deferred until after
// Commit — emitting an ACTIVITY_LOGGED event for a row that ends up
// rolled back would leak a ghost message into the canvas's
// optimistic UI.
tx, err := db.DB.BeginTx(ctx, nil)
if err != nil {
log.Printf("chat_files uploadPollMode: begin tx for %s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "could not stage files"})
return
}
// Defer-rollback is safe even after a successful Commit — the second
// Rollback is a no-op (database/sql tracks tx state).
defer func() {
_ = tx.Rollback()
}()
fileIDs, err := h.pendingUploads.PutBatchTx(ctx, tx, wsUUID, items)
if err != nil {
if errors.Is(err, pendinguploads.ErrTooLarge) {
// Belt + suspenders: pre-validation above already caught
// this; surface a clean 413 if a malformed FileHeader
// somehow slipped through.
c.JSON(http.StatusRequestEntityTooLarge, gin.H{
"error": "one or more files exceed per-file cap",
"max": pendinguploads.MaxFileBytes,
})
return
}
log.Printf("chat_files uploadPollMode: storage.PutBatchTx failed for %s: %v",
workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "could not stage files"})
return
}
out := make([]uploadedFile, 0, len(prepReady))
broadcasts := make([]func(), 0, len(prepReady))
for i, p := range prepReady {
fileID := fileIDs[i]
uri := fmt.Sprintf("platform-pending:%s/%s", workspaceID, fileID)
summary := "chat_upload_receive: " + sanitized
summary := "chat_upload_receive: " + p.Sanitized
method := "chat_upload_receive"
LogActivity(ctx, h.broadcaster, ActivityParams{
hook, err := LogActivityTx(ctx, tx, h.broadcaster, ActivityParams{
WorkspaceID: workspaceID,
ActivityType: "a2a_receive",
TargetID: &workspaceID,
@@ -669,28 +710,86 @@ func (h *ChatFilesHandler) uploadPollMode(c *gin.Context, ctx context.Context, w
Summary: &summary,
RequestBody: map[string]interface{}{
"file_id": fileID.String(),
"name": sanitized,
"mimeType": mimetype,
"size": len(content),
"name": p.Sanitized,
"mimeType": p.Mimetype,
"size": len(p.Content),
"uri": uri,
},
Status: "ok",
})
log.Printf("chat_files uploadPollMode: staged %s/%s (file_id=%s size=%d mimetype=%q)",
workspaceID, sanitized, fileID, len(content), mimetype)
if err != nil {
log.Printf("chat_files uploadPollMode: activity insert failed for %s/%s: %v",
workspaceID, p.Sanitized, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "could not log upload activity"})
return
}
broadcasts = append(broadcasts, hook)
out = append(out, uploadedFile{
URI: uri,
Name: sanitized,
Mimetype: mimetype,
Size: int64(len(content)),
Name: p.Sanitized,
Mimetype: p.Mimetype,
Size: int64(len(p.Content)),
})
}
if err := tx.Commit(); err != nil {
log.Printf("chat_files uploadPollMode: commit failed for %s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "could not stage files"})
return
}
// Post-commit: fire deferred broadcasts and emit the staged log
// lines now that the rows are durable. Broadcasts are pure in-memory
// (no I/O); panicking here would NOT leak a row but would leak a
// log line, so the order doesn't matter for correctness.
for _, b := range broadcasts {
b()
}
for i, p := range prepReady {
log.Printf("chat_files uploadPollMode: staged %s/%s (file_id=%s size=%d mimetype=%q)",
workspaceID, p.Sanitized, fileIDs[i], len(p.Content), p.Mimetype)
}
c.JSON(http.StatusOK, gin.H{"files": out})
}
// safeMimetype validates a multipart-supplied Content-Type header and
// returns a sanitized value safe to store + serve back unmodified.
//
// The platform's GET /content handler reflects the stored mimetype as
// the response Content-Type. An attacker-controlled header that
// embedded CR/LF could split the response (header injection); a value
// containing semicolons could carry an unexpected charset parameter
// that confuses a downstream renderer. Strip CR/LF/control chars +
// keep only the type/subtype prefix; reject anything that doesn't
// match a basic `type/subtype` regex by falling back to the safe
// default (application/octet-stream — the workspace-side handler does
// the same fallback).
func safeMimetype(raw string) string {
const fallback = "application/octet-stream"
// Trim parameters (`text/html; charset=utf-8` → `text/html`).
if i := strings.IndexByte(raw, ';'); i >= 0 {
raw = raw[:i]
}
raw = strings.TrimSpace(raw)
if raw == "" {
return ""
}
// Reject if any control char or whitespace is present (header
// injection defense). RFC 7231 mimetype grammar forbids whitespace.
for _, r := range raw {
if r < 0x21 || r > 0x7e {
return fallback
}
}
// Require exactly one slash separating type and subtype.
parts := strings.Split(raw, "/")
if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
return fallback
}
return raw
}
// readMultipartFile reads a multipart part fully into memory. Wraps
// the open + io.ReadAll + close idiom so the call site stays clean,
// and so a future change (chunked reads / hashing) has one place to
@@ -67,6 +67,56 @@ func (s *inMemStorage) Put(_ context.Context, ws uuid.UUID, content []byte, file
return id, nil
}
// PutBatch mirrors the production atomic-batch contract: any per-item
// failure leaves the in-memory state unchanged, simulating Tx rollback.
// Pre-validation matches PostgresStorage.PutBatch; oversized items
// return ErrTooLarge before any row is added.
func (s *inMemStorage) PutBatch(_ context.Context, ws uuid.UUID, items []pendinguploads.PutItem) ([]uuid.UUID, error) {
s.mu.Lock()
defer s.mu.Unlock()
if s.putErr != nil {
return nil, s.putErr
}
// Pre-validate so an oversized item rejects the whole batch before
// any state mutation — matches the Tx-rollback semantics.
for _, it := range items {
if len(it.Content) > pendinguploads.MaxFileBytes {
return nil, pendinguploads.ErrTooLarge
}
}
ids := make([]uuid.UUID, 0, len(items))
stagedRows := make(map[uuid.UUID]pendinguploads.Record, len(items))
stagedPuts := make([]putCall, 0, len(items))
for _, it := range items {
id := uuid.New()
stagedRows[id] = pendinguploads.Record{
FileID: id, WorkspaceID: ws, Content: it.Content,
Filename: it.Filename, Mimetype: it.Mimetype,
SizeBytes: int64(len(it.Content)), CreatedAt: time.Now(),
ExpiresAt: time.Now().Add(24 * time.Hour),
}
stagedPuts = append(stagedPuts, putCall{
WorkspaceID: ws, Filename: it.Filename, Mimetype: it.Mimetype, Size: len(it.Content),
})
ids = append(ids, id)
}
for id, r := range stagedRows {
s.rows[id] = r
}
s.puts = append(s.puts, stagedPuts...)
return ids, nil
}
// PutBatchTx mirrors PutBatch for the Tx-aware caller path. The tx
// argument is not consulted — production atomicity (PutBatch INSERTs +
// activity_logs INSERTs in the same Tx) is verified by the dedicated
// integration test against real Postgres. This in-mem fake records the
// puts immediately; tests that exercise the rollback path use
// putErr/sqlmock to simulate the failure.
func (s *inMemStorage) PutBatchTx(ctx context.Context, _ *sql.Tx, ws uuid.UUID, items []pendinguploads.PutItem) ([]uuid.UUID, error) {
return s.PutBatch(ctx, ws, items)
}
func (s *inMemStorage) Get(context.Context, uuid.UUID) (pendinguploads.Record, error) {
return pendinguploads.Record{}, pendinguploads.ErrNotFound
}
@@ -98,11 +148,37 @@ func expectPollDeliveryModeMissing(mock sqlmock.Sqlmock, workspaceID string) {
// expectActivityInsert stubs the LogActivity INSERT so the poll branch's
// per-file activity row write doesn't fail the sqlmock expectations.
// In the post-#149 path this INSERT runs inside the BeginTx that wraps
// PutBatchTx + N activity rows — pair it with expectUploadPollTxBegin
// + expectUploadPollTxCommit (or Rollback) when the test exercises
// uploadPollMode.
func expectActivityInsert(mock sqlmock.Sqlmock) {
mock.ExpectExec(`INSERT INTO activity_logs`).
WillReturnResult(sqlmock.NewResult(1, 1))
}
// expectUploadPollTxBegin marks the start of the BeginTx that
// uploadPollMode opens around PutBatchTx + per-file LogActivityTx.
// inMemStorage doesn't drive sqlmock for the pending_uploads INSERTs
// (it's a process-local fake), so the only Tx-scoped DB calls
// sqlmock sees are the activity_logs INSERTs.
func expectUploadPollTxBegin(mock sqlmock.Sqlmock) {
mock.ExpectBegin()
}
// expectUploadPollTxCommit pairs with expectUploadPollTxBegin on the
// happy path — every activity row inserted, Tx committed.
func expectUploadPollTxCommit(mock sqlmock.Sqlmock) {
mock.ExpectCommit()
}
// expectUploadPollTxRollback pairs with expectUploadPollTxBegin on a
// failure path — PutBatchTx error, activity insert error, or any other
// abort that triggers the deferred tx.Rollback() in uploadPollMode.
func expectUploadPollTxRollback(mock sqlmock.Sqlmock) {
mock.ExpectRollback()
}
// expectActivityInsertWithTypeAndMethod is a strict variant that pins
// the activity_type and method positional args. Used in the discriminator
// regression test below — the workspace inbox poller filters
@@ -158,10 +234,12 @@ func TestPollUpload_HappyPath_OneFile_StagesAndLogs(t *testing.T) {
wsID := "11111111-2222-3333-4444-555555555555"
expectPollDeliveryMode(mock, wsID, "poll")
expectUploadPollTxBegin(mock)
expectActivityInsert(mock)
expectUploadPollTxCommit(mock)
store := newInMemStorage()
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil)).
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
body, ct := pollUploadFixture(t, map[string][]byte{"report.pdf": []byte("PDF-bytes")})
@@ -214,12 +292,14 @@ func TestPollUpload_MultipleFiles_AllStagedAndLogged(t *testing.T) {
wsID := "11111111-aaaa-bbbb-cccc-555555555555"
expectPollDeliveryMode(mock, wsID, "poll")
expectUploadPollTxBegin(mock)
expectActivityInsert(mock)
expectActivityInsert(mock)
expectActivityInsert(mock)
expectUploadPollTxCommit(mock)
store := newInMemStorage()
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil)).
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
body, ct := pollUploadFixture(t, map[string][]byte{
@@ -257,7 +337,7 @@ func TestPollUpload_PushModeFallsThroughToForward(t *testing.T) {
// URL empty + mode=push → 503 (no inbound secret check needed).
store := newInMemStorage()
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil)).
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
body, ct := pollUploadFixture(t, map[string][]byte{"x": []byte("data")})
@@ -281,7 +361,7 @@ func TestPollUpload_NotConfigured_FallsThrough(t *testing.T) {
wsID := "33333333-2222-3333-4444-555555555555"
expectURLAndMode(mock, wsID, "", "poll") // resolveWorkspaceForwardCreds emits 422
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
// No WithPendingUploads — pendingUploads is nil.
body, ct := pollUploadFixture(t, map[string][]byte{"x": []byte("data")})
@@ -302,7 +382,7 @@ func TestPollUpload_WorkspaceMissing_404(t *testing.T) {
wsID := "44444444-2222-3333-4444-555555555555"
expectPollDeliveryModeMissing(mock, wsID)
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil)).
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(newInMemStorage(), nil)
body, ct := pollUploadFixture(t, map[string][]byte{"x": []byte("d")})
@@ -322,7 +402,7 @@ func TestPollUpload_DeliveryModeLookupDBError_500(t *testing.T) {
mock.ExpectQuery(`SELECT delivery_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).WillReturnError(errors.New("connection lost"))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil)).
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(newInMemStorage(), nil)
body, ct := pollUploadFixture(t, map[string][]byte{"x": []byte("d")})
@@ -342,7 +422,7 @@ func TestPollUpload_NoFilesField_400(t *testing.T) {
expectPollDeliveryMode(mock, wsID, "poll")
store := newInMemStorage()
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil)).
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
// Multipart with a non-files field — no actual files.
@@ -367,7 +447,7 @@ func TestPollUpload_MalformedMultipart_400(t *testing.T) {
expectPollDeliveryMode(mock, wsID, "poll")
store := newInMemStorage()
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil)).
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
// Body that doesn't match the boundary in Content-Type.
@@ -385,10 +465,12 @@ func TestPollUpload_StorageError_500(t *testing.T) {
wsID := "88888888-2222-3333-4444-555555555555"
expectPollDeliveryMode(mock, wsID, "poll")
expectUploadPollTxBegin(mock)
expectUploadPollTxRollback(mock)
store := newInMemStorage()
store.putErr = errors.New("disk full")
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil)).
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
body, ct := pollUploadFixture(t, map[string][]byte{"x.bin": []byte("data")})
@@ -406,10 +488,12 @@ func TestPollUpload_StorageTooLarge_413(t *testing.T) {
wsID := "99999999-2222-3333-4444-555555555555"
expectPollDeliveryMode(mock, wsID, "poll")
expectUploadPollTxBegin(mock)
expectUploadPollTxRollback(mock)
store := newInMemStorage()
store.putErr = pendinguploads.ErrTooLarge
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil)).
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
body, ct := pollUploadFixture(t, map[string][]byte{"x.bin": []byte("data")})
@@ -429,7 +513,7 @@ func TestPollUpload_TooManyFiles_400(t *testing.T) {
expectPollDeliveryMode(mock, wsID, "poll")
store := newInMemStorage()
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil)).
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
// 65 files — over the per-batch cap.
@@ -464,7 +548,7 @@ func TestPollUpload_NullDeliveryMode_TreatedAsPush(t *testing.T) {
expectURLAndMode(mock, wsID, "", "")
store := newInMemStorage()
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil)).
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
body, ct := pollUploadFixture(t, map[string][]byte{"x.bin": []byte("data")})
@@ -497,7 +581,7 @@ func TestPollUpload_PerFileCapPreStorage_413(t *testing.T) {
expectPollDeliveryMode(mock, wsID, "poll")
store := newInMemStorage()
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil)).
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
// 25 MB + 1 byte. Single file, large enough to trip the early
@@ -529,10 +613,12 @@ func TestPollUpload_SanitizesFilenameInResponse(t *testing.T) {
wsID := "bbbbbbbb-2222-3333-4444-555555555555"
expectPollDeliveryMode(mock, wsID, "poll")
expectUploadPollTxBegin(mock)
expectActivityInsert(mock)
expectUploadPollTxCommit(mock)
store := newInMemStorage()
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil)).
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
body, ct := pollUploadFixture(t, map[string][]byte{"hello world!.pdf": []byte("data")})
@@ -557,6 +643,174 @@ func TestPollUpload_SanitizesFilenameInResponse(t *testing.T) {
}
}
// TestPollUpload_AtomicRollbackOnSecondFileTooLarge pins the
// transactional contract introduced in phase 5: when one file in a
// multi-file batch fails pre-validation (oversize), NONE of the files
// in the batch land in storage. Previously a per-file Put loop would
// stage rows 1..K-1 before failing on row K, leaving orphan
// pending_uploads + activity rows the client would re-create on retry.
//
// Pinned via inMemStorage's PutBatch (which mirrors PostgresStorage's
// Tx-rollback behavior on a per-item validation failure) — but the
// real atomicity guarantee is the integration test in
// pending_uploads_integration_test.go.
func TestPollUpload_AtomicRollbackOnSecondFileTooLarge(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
wsID := "aaaaaaaa-3333-3333-4444-555555555555"
expectPollDeliveryMode(mock, wsID, "poll")
store := newInMemStorage()
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
// Two files: first OK, second over the per-file cap. Pre-validation
// in uploadPollMode catches it BEFORE any Put — store.puts must
// stay empty. (If the test ever sees len=1, the regression is
// "first file slipped through into storage on a partial-failure
// batch.")
tooBig := bytes.Repeat([]byte{0x42}, pendinguploads.MaxFileBytes+1)
body, ct := pollUploadFixture(t, map[string][]byte{
"ok.txt": []byte("small"),
"huge.bin": tooBig,
})
c, w := makeUploadRequest(t, wsID, body, ct)
h.Upload(c)
if w.Code != http.StatusRequestEntityTooLarge {
t.Errorf("status=%d body=%s, want 413", w.Code, w.Body.String())
}
if len(store.puts) != 0 {
t.Errorf("expected zero Puts on rollback, got %d: %+v", len(store.puts), store.puts)
}
}
// TestPollUpload_AtomicRollbackOnPutBatchError validates that an in-
// flight PutBatch failure (e.g. simulated DB error) leaves zero rows
// — same guarantee as the pre-validation path, but exercises the
// "Tx-Rollback after BEGIN" branch via the fake.
func TestPollUpload_AtomicRollbackOnPutBatchError(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
wsID := "bbbbbbbb-3333-3333-4444-555555555555"
expectPollDeliveryMode(mock, wsID, "poll")
expectUploadPollTxBegin(mock)
expectUploadPollTxRollback(mock)
store := newInMemStorage()
store.putErr = errors.New("db down mid-batch")
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
body, ct := pollUploadFixture(t, map[string][]byte{
"a.txt": []byte("aaa"),
"b.txt": []byte("bbb"),
"c.txt": []byte("ccc"),
})
c, w := makeUploadRequest(t, wsID, body, ct)
h.Upload(c)
if w.Code != http.StatusInternalServerError {
t.Errorf("status=%d, want 500", w.Code)
}
if len(store.puts) != 0 {
t.Errorf("expected zero Puts after PutBatch error, got %d", len(store.puts))
}
}
// TestPollUpload_AtomicRollbackOnActivityInsertFailure pins the #149
// guarantee: if an activity_logs INSERT fails mid-loop (after some
// rows have already been INSERTed in the same Tx), uploadPollMode
// MUST Rollback so neither the pending_uploads nor the activity rows
// commit. Pre-#149 the activity rows were written one-by-one outside
// any Tx; a mid-loop failure left orphan pending_uploads rows the
// 24h TTL would later sweep, but the user never saw the file in the
// canvas. Post-#149 the contract is all-or-nothing.
//
// What this pins: the second activity insert errors → Tx rolls back
// → response is 500 → no Commit. Pin via the sqlmock rollback
// expectation; the inMemStorage will report puts=N (it doesn't model
// Tx state), but at the SQL layer no rows committed.
func TestPollUpload_AtomicRollbackOnActivityInsertFailure(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
wsID := "cccccccc-3333-3333-4444-555555555555"
expectPollDeliveryMode(mock, wsID, "poll")
expectUploadPollTxBegin(mock)
// File 1 inserts cleanly. File 2's INSERT fails. uploadPollMode
// must NOT call Commit and the deferred tx.Rollback() runs.
mock.ExpectExec(`INSERT INTO activity_logs`).
WillReturnResult(sqlmock.NewResult(1, 1))
mock.ExpectExec(`INSERT INTO activity_logs`).
WillReturnError(errors.New("constraint violation simulated"))
expectUploadPollTxRollback(mock)
store := newInMemStorage()
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
body, ct := pollUploadFixture(t, map[string][]byte{
"a.txt": []byte("aaa"),
"b.txt": []byte("bbb"),
"c.txt": []byte("ccc"),
})
c, w := makeUploadRequest(t, wsID, body, ct)
h.Upload(c)
if w.Code != http.StatusInternalServerError {
t.Fatalf("status=%d body=%s, want 500 on activity-insert mid-loop failure",
w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
// This is the load-bearing assertion: ExpectationsWereMet only
// passes if Rollback was called and Commit was NOT — the SQL-
// level proof of the all-or-nothing contract.
t.Errorf("Tx must rollback (and NOT commit) on activity-insert failure: %v", err)
}
}
// TestPollUpload_MimetypeWithCRLFInjectionStripped pins the safeMimetype
// hardening: a multipart-supplied Content-Type header with CR/LF is
// rewritten to application/octet-stream so the eventual /content
// response can't be header-split on the wire.
func TestPollUpload_MimetypeWithCRLFInjectionStripped(t *testing.T) {
got := safeMimetype("text/html\r\nX-Injected: pwn")
if got != "application/octet-stream" {
t.Errorf("CRLF mimetype not stripped, got %q", got)
}
got = safeMimetype("image/png\x00")
if got != "application/octet-stream" {
t.Errorf("NUL byte mimetype not stripped, got %q", got)
}
got = safeMimetype("text/plain; charset=utf-8")
if got != "text/plain" {
t.Errorf("parameter not stripped, got %q", got)
}
got = safeMimetype("application/pdf")
if got != "application/pdf" {
t.Errorf("clean mime modified, got %q", got)
}
got = safeMimetype("")
if got != "" {
t.Errorf("empty input should pass through, got %q", got)
}
got = safeMimetype("notamime")
if got != "application/octet-stream" {
t.Errorf("non-type/subtype not coerced, got %q", got)
}
got = safeMimetype("/empty-type")
if got != "application/octet-stream" {
t.Errorf("missing type half not coerced, got %q", got)
}
got = safeMimetype("type/")
if got != "application/octet-stream" {
t.Errorf("missing subtype half not coerced, got %q", got)
}
}
// TestPollUpload_ActivityRowDiscriminator pins the
// activity_type / method shape that the workspace inbox poller depends
// on. The poller filters `GET /workspaces/:id/activity?type=a2a_receive`
@@ -577,10 +831,12 @@ func TestPollUpload_ActivityRowDiscriminator(t *testing.T) {
wsID := "abc12345-6789-4abc-8def-000000000999"
expectPollDeliveryMode(mock, wsID, "poll")
expectUploadPollTxBegin(mock)
expectActivityInsertWithTypeAndMethod(mock, wsID, "a2a_receive", "chat_upload_receive")
expectUploadPollTxCommit(mock)
store := newInMemStorage()
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil)).
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
WithPendingUploads(store, nil)
body, ct := pollUploadFixture(t, map[string][]byte{"x.pdf": []byte("xx")})
@@ -105,7 +105,7 @@ func TestChatUpload_InvalidWorkspaceID(t *testing.T) {
setupTestDB(t)
setupTestRedis(t)
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
c, w := makeUploadRequest(t, "not-a-uuid", &bytes.Buffer{}, "")
h.Upload(c)
@@ -122,7 +122,7 @@ func TestChatUpload_WorkspaceNotInDB(t *testing.T) {
wsID := "00000000-0000-0000-0000-000000000099"
expectURLMissing(mock, wsID)
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
body, ct := uploadFixture(t)
c, w := makeUploadRequest(t, wsID, body, ct)
h.Upload(c)
@@ -166,7 +166,7 @@ func TestChatUpload_NoInboundSecret_LazyHeal(t *testing.T) {
WithArgs(sqlmock.AnyArg(), wsID).
WillReturnResult(sqlmock.NewResult(0, 1))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
body, ct := uploadFixture(t)
c, w := makeUploadRequest(t, wsID, body, ct)
h.Upload(c)
@@ -203,7 +203,7 @@ func TestChatUpload_NoInboundSecret_LazyHealFailure(t *testing.T) {
WithArgs(sqlmock.AnyArg(), wsID).
WillReturnError(sql.ErrConnDone) // mint fails
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
body, ct := uploadFixture(t)
c, w := makeUploadRequest(t, wsID, body, ct)
h.Upload(c)
@@ -231,7 +231,7 @@ func TestChatUpload_NoURL(t *testing.T) {
wsID := "00000000-0000-0000-0000-000000000042"
expectURLAndMode(mock, wsID, "", "push")
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
body, ct := uploadFixture(t)
c, w := makeUploadRequest(t, wsID, body, ct)
h.Upload(c)
@@ -256,7 +256,7 @@ func TestChatUpload_PollModeEmptyURL(t *testing.T) {
wsID := "00000000-0000-0000-0000-000000000099"
expectURLAndMode(mock, wsID, "", "poll")
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
body, ct := uploadFixture(t)
c, w := makeUploadRequest(t, wsID, body, ct)
h.Upload(c)
@@ -286,7 +286,7 @@ func TestChatUpload_NullModeEmptyURL(t *testing.T) {
wsID := "30ba7f0b-b303-4a20-aefe-3a4a675b8aa4" // user's "mac laptop"
expectURLNullMode(mock, wsID, "")
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
body, ct := uploadFixture(t)
c, w := makeUploadRequest(t, wsID, body, ct)
h.Upload(c)
@@ -338,7 +338,7 @@ func TestChatUpload_ForwardsToWorkspace_HappyPath(t *testing.T) {
expectURL(mock, wsID, srv.URL)
expectInboundSecret(mock, wsID, "super-secret-123")
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
body, ct := uploadFixture(t)
c, w := makeUploadRequest(t, wsID, body, ct)
h.Upload(c)
@@ -380,7 +380,7 @@ func TestChatUpload_ForwardsErrorStatusUnchanged(t *testing.T) {
expectURL(mock, wsID, srv.URL)
expectInboundSecret(mock, wsID, "tok")
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
body, ct := uploadFixture(t)
c, w := makeUploadRequest(t, wsID, body, ct)
h.Upload(c)
@@ -402,7 +402,7 @@ func TestChatUpload_WorkspaceUnreachable(t *testing.T) {
expectURL(mock, wsID, "http://127.0.0.1:1")
expectInboundSecret(mock, wsID, "tok")
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
body, ct := uploadFixture(t)
c, w := makeUploadRequest(t, wsID, body, ct)
h.Upload(c)
@@ -418,7 +418,7 @@ func TestChatDownload_InvalidPath(t *testing.T) {
setupTestDB(t)
setupTestRedis(t)
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
cases := []struct {
name, path, wantSubstr string
@@ -507,7 +507,7 @@ func TestChatDownload_WorkspaceNotInDB(t *testing.T) {
WithArgs(wsID).
WillReturnError(sql.ErrNoRows)
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
c, w := makeDownloadRequest(t, wsID, "/workspace/foo.txt")
h.Download(c)
@@ -533,7 +533,7 @@ func TestChatDownload_NoInboundSecret_LazyHeal(t *testing.T) {
WithArgs(sqlmock.AnyArg(), wsID).
WillReturnResult(sqlmock.NewResult(0, 1))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
c, w := makeDownloadRequest(t, wsID, "/workspace/foo.txt")
h.Download(c)
@@ -559,7 +559,7 @@ func TestChatDownload_NoInboundSecret_LazyHealFailure(t *testing.T) {
WithArgs(sqlmock.AnyArg(), wsID).
WillReturnError(sql.ErrConnDone)
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
c, w := makeDownloadRequest(t, wsID, "/workspace/foo.txt")
h.Download(c)
@@ -592,7 +592,7 @@ func TestChatDownload_ForwardsToWorkspace_HappyPath(t *testing.T) {
expectURL(mock, wsID, srv.URL)
expectInboundSecret(mock, wsID, "the-secret")
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
c, w := makeDownloadRequest(t, wsID, "/workspace/report.txt")
h.Download(c)
@@ -634,7 +634,7 @@ func TestChatDownload_404FromWorkspacePropagated(t *testing.T) {
expectURL(mock, wsID, srv.URL)
expectInboundSecret(mock, wsID, "tok")
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil))
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil))
c, w := makeDownloadRequest(t, wsID, "/workspace/missing.txt")
h.Download(c)
@@ -0,0 +1,113 @@
package handlers
// chat_history.go — HTTP-shape adapter over messagestore.MessageStore
// (RFC #2945 PR-D).
//
// Pre-PR-D, this file owned the activity_logs query AND the parser
// AND the HTTP plumbing. PR-D extracts the storage + parser into
// internal/messagestore/ so OSS operators can plug in alternative
// backends (S3-tiered, vector store, in-memory). The handler is now
// a thin adapter: parse query params → call store → emit JSON.
//
// Endpoint: GET /workspaces/:id/chat-history?limit=N&before_ts=T
// Auth: same wsAuth chain as /workspaces/:id/activity (tenant
// ADMIN_TOKEN + X-Molecule-Org-Id header). No new trust boundary.
//
// Behavioral parity with canvas TS is enforced at the messagestore
// layer (internal/messagestore/postgres_store_test.go); this file's
// tests cover the HTTP-shape concerns only.
import (
"net/http"
"strconv"
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/messagestore"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
)
// ChatHistoryResponse is the wire shape for GET /chat-history.
type ChatHistoryResponse struct {
Messages []messagestore.ChatMessage `json:"messages"`
ReachedEnd bool `json:"reached_end"`
}
// ChatHistoryHandler exposes the typed chat-history endpoint over a
// MessageStore. The store is injected so OSS operators can swap the
// backend without forking the handler.
type ChatHistoryHandler struct {
store messagestore.MessageStore
}
// NewChatHistoryHandler wires a MessageStore (typically
// messagestore.NewPostgresMessageStore at production startup).
//
// Tests inject fakes (see internal/handlers/chat_history_test.go).
// Constructor takes the interface, not a concrete type, so the
// platform-default vs OSS-alternative decision happens at wiring
// time in router.go.
func NewChatHistoryHandler(store messagestore.MessageStore) *ChatHistoryHandler {
return &ChatHistoryHandler{store: store}
}
// List handles GET /workspaces/:id/chat-history?limit=N&before_ts=T.
//
// Query parameters mirror /activity for caller convenience:
//
// - limit (default 100, max 1000) — page size
// - before_ts (RFC3339, optional) — cursor for paginating backward
//
// Validates inputs at the trust boundary; the store sees only
// well-formed ListOptions.
func (h *ChatHistoryHandler) List(c *gin.Context) {
workspaceID := c.Param("id")
if _, err := uuid.Parse(workspaceID); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "workspace id must be a UUID"})
return
}
limit := 100
if v := c.Query("limit"); v != "" {
if n, err := strconv.Atoi(v); err == nil && n > 0 {
limit = n
}
}
if limit > 1000 {
limit = 1000
}
opts := messagestore.ListOptions{Limit: limit}
if v := c.Query("before_ts"); v != "" {
t, err := time.Parse(time.RFC3339, v)
if err != nil {
c.JSON(http.StatusBadRequest, gin.H{
"error": "before_ts must be an RFC3339 timestamp (e.g. 2026-05-01T00:00:00Z)",
})
return
}
opts.BeforeTS = t
opts.HasBefore = true
}
messages, reachedEnd, err := h.store.List(c.Request.Context(), workspaceID, opts)
if err != nil {
// Errors here are infra (DB unreachable, store impl failure).
// Surface as 502 so the canvas can retry vs. treating as
// "no rows."
c.JSON(http.StatusBadGateway, gin.H{"error": "chat history unavailable"})
return
}
// Defensive: if the store returns nil messages slice (any impl
// might), emit empty array rather than `null` so canvas's JSON
// parser doesn't have to handle two empty representations.
if messages == nil {
messages = []messagestore.ChatMessage{}
}
c.JSON(http.StatusOK, ChatHistoryResponse{
Messages: messages,
ReachedEnd: reachedEnd,
})
}
@@ -0,0 +1,276 @@
package handlers
// chat_history_test.go — handler-level tests against a fake
// MessageStore. The parser-level parity tests against the canvas TS
// fixtures live in internal/messagestore/postgres_store_test.go;
// this file covers the HTTP-shape concerns (param validation,
// pagination passthrough, error mapping) without touching a DB.
//
// Why the split: PR-D extracted storage to messagestore.MessageStore.
// The handler is now a thin adapter — its tests should exercise the
// adapter (ParseQuery → store.List → emitJSON), not the parser. A
// future MessageStore impl (S3, vector store) shares the same
// handler; testing the handler against the interface keeps the
// adapter test independent of any specific impl.
import (
"context"
"encoding/json"
"errors"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/messagestore"
"github.com/gin-gonic/gin"
)
const testWorkspaceID = "550e8400-e29b-41d4-a716-446655440000"
func init() {
gin.SetMode(gin.TestMode)
}
// fakeStore is a stub MessageStore for handler-level tests. Every
// real store impl (Postgres, S3, vector) shares the handler — so a
// fake that records inputs + returns scripted outputs is the right
// granularity for HTTP-shape coverage.
type fakeStore struct {
// LastWorkspaceID + LastOpts capture the call shape so the test
// can assert the handler passed the right args to the store.
LastWorkspaceID string
LastOpts messagestore.ListOptions
// Returns — set per test.
ReturnMessages []messagestore.ChatMessage
ReturnReachedEnd bool
ReturnErr error
// Panic — if non-empty, List panics with this string. Used by
// the resilience test to confirm the handler returns 502 on
// store-impl failures rather than crashing the goroutine.
PanicWith string
}
func (s *fakeStore) List(ctx context.Context, workspaceID string, opts messagestore.ListOptions) ([]messagestore.ChatMessage, bool, error) {
if s.PanicWith != "" {
panic(s.PanicWith)
}
s.LastWorkspaceID = workspaceID
s.LastOpts = opts
return s.ReturnMessages, s.ReturnReachedEnd, s.ReturnErr
}
// Compile-time assertion that fakeStore satisfies the interface.
// Catches drift if the interface changes and the fake stops being a
// drop-in for tests.
var _ messagestore.MessageStore = (*fakeStore)(nil)
func newRouter(store messagestore.MessageStore) *gin.Engine {
r := gin.New()
h := NewChatHistoryHandler(store)
r.GET("/workspaces/:id/chat-history", h.List)
return r
}
func doChatHistoryRequest(t *testing.T, r *gin.Engine, path string) *httptest.ResponseRecorder {
t.Helper()
req := httptest.NewRequest(http.MethodGet, path, nil)
w := httptest.NewRecorder()
r.ServeHTTP(w, req)
return w
}
// =====================================================================
// Param validation
// =====================================================================
func TestChatHistoryHandler_RejectsNonUUIDWorkspaceID(t *testing.T) {
store := &fakeStore{}
r := newRouter(store)
w := doChatHistoryRequest(t, r, "/workspaces/not-a-uuid/chat-history")
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400 for non-UUID, got %d", w.Code)
}
if store.LastWorkspaceID != "" {
t.Errorf("non-UUID reached the store: %q", store.LastWorkspaceID)
}
}
func TestChatHistoryHandler_RejectsMalformedBeforeTS(t *testing.T) {
store := &fakeStore{}
r := newRouter(store)
w := doChatHistoryRequest(t, r, "/workspaces/"+testWorkspaceID+"/chat-history?before_ts=not-a-timestamp")
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400 for malformed before_ts, got %d", w.Code)
}
if !strings.Contains(w.Body.String(), "RFC3339") {
t.Errorf("error message should mention RFC3339; got %q", w.Body.String())
}
}
func TestChatHistoryHandler_DefaultsLimitTo100(t *testing.T) {
store := &fakeStore{}
r := newRouter(store)
doChatHistoryRequest(t, r, "/workspaces/"+testWorkspaceID+"/chat-history")
if store.LastOpts.Limit != 100 {
t.Errorf("default limit=%d want 100", store.LastOpts.Limit)
}
if store.LastOpts.HasBefore {
t.Errorf("HasBefore should be false when no cursor passed")
}
}
func TestChatHistoryHandler_ClampsLimitToMax1000(t *testing.T) {
store := &fakeStore{}
r := newRouter(store)
doChatHistoryRequest(t, r, "/workspaces/"+testWorkspaceID+"/chat-history?limit=99999")
if store.LastOpts.Limit != 1000 {
t.Errorf("limit not clamped: got %d, want 1000", store.LastOpts.Limit)
}
}
func TestChatHistoryHandler_IgnoresInvalidLimit(t *testing.T) {
// Negative or zero limits should fall back to default rather
// than reach the store (which rejects them as a programming bug).
store := &fakeStore{}
r := newRouter(store)
for _, bad := range []string{"-1", "0", "abc"} {
store.LastOpts = messagestore.ListOptions{}
doChatHistoryRequest(t, r, "/workspaces/"+testWorkspaceID+"/chat-history?limit="+bad)
if store.LastOpts.Limit != 100 {
t.Errorf("limit=%q yielded %d, want default 100", bad, store.LastOpts.Limit)
}
}
}
// =====================================================================
// Pagination passthrough
// =====================================================================
func TestChatHistoryHandler_BeforeTSPassedToStore(t *testing.T) {
store := &fakeStore{}
r := newRouter(store)
doChatHistoryRequest(t, r, "/workspaces/"+testWorkspaceID+"/chat-history?before_ts=2026-04-25T18:00:00Z&limit=25")
if !store.LastOpts.HasBefore {
t.Errorf("HasBefore=false but query passed before_ts")
}
got := store.LastOpts.BeforeTS.UTC().Format("2006-01-02T15:04:05Z")
if got != "2026-04-25T18:00:00Z" {
t.Errorf("BeforeTS=%q want 2026-04-25T18:00:00Z", got)
}
if store.LastOpts.Limit != 25 {
t.Errorf("limit=%d want 25", store.LastOpts.Limit)
}
}
// =====================================================================
// Response shape
// =====================================================================
func TestChatHistoryHandler_EmptyResultIsArrayNotNull(t *testing.T) {
// nil messages slice from the store must serialize as `[]`,
// not `null` — canvas's JSON parser has one path.
store := &fakeStore{ReturnMessages: nil, ReturnReachedEnd: true}
r := newRouter(store)
w := doChatHistoryRequest(t, r, "/workspaces/"+testWorkspaceID+"/chat-history")
if w.Code != http.StatusOK {
t.Fatalf("status=%d", w.Code)
}
var resp ChatHistoryResponse
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("body not JSON: %v", err)
}
// json.Unmarshal of `null` into a []slice yields a nil — assert
// the JSON literally contains "[]" so a future change that
// forgets the nil-coercion would fail loudly.
if !strings.Contains(w.Body.String(), `"messages":[]`) {
t.Errorf("body should contain `\"messages\":[]`; got %s", w.Body.String())
}
if !resp.ReachedEnd {
t.Errorf("reached_end not propagated")
}
}
func TestChatHistoryHandler_NonEmptyResponsePreservesShape(t *testing.T) {
size := int64(4096)
store := &fakeStore{
ReturnMessages: []messagestore.ChatMessage{
{
ID: "msg-1",
Role: "user",
Content: "hi",
Timestamp: "2026-04-25T18:00:00Z",
},
{
ID: "msg-2",
Role: "agent",
Content: "hello back",
Attachments: []messagestore.ChatAttachment{
{Name: "img.png", URI: "workspace:/img.png", MimeType: "image/png", Size: &size},
},
Timestamp: "2026-04-25T18:00:01Z",
},
},
ReturnReachedEnd: false,
}
r := newRouter(store)
w := doChatHistoryRequest(t, r, "/workspaces/"+testWorkspaceID+"/chat-history")
if w.Code != http.StatusOK {
t.Fatalf("status=%d body=%s", w.Code, w.Body.String())
}
var resp ChatHistoryResponse
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("body not JSON: %v", err)
}
if len(resp.Messages) != 2 {
t.Fatalf("messages=%d want 2", len(resp.Messages))
}
if resp.Messages[1].Attachments[0].Size == nil || *resp.Messages[1].Attachments[0].Size != 4096 {
t.Errorf("size pointer flattened in JSON round-trip")
}
}
// =====================================================================
// Error mapping — store errors become 502, not 500/panic
// =====================================================================
func TestChatHistoryHandler_StoreErrorReturns502(t *testing.T) {
store := &fakeStore{ReturnErr: errors.New("simulated DB unreachable")}
r := newRouter(store)
w := doChatHistoryRequest(t, r, "/workspaces/"+testWorkspaceID+"/chat-history")
if w.Code != http.StatusBadGateway {
t.Errorf("expected 502 on store error, got %d", w.Code)
}
if !strings.Contains(w.Body.String(), "unavailable") {
t.Errorf("response body should communicate unavailability; got %q", w.Body.String())
}
}
// =====================================================================
// Interface conformance — the platform-default Postgres impl is the
// only impl in tree today, but the assertion catches future drift if
// the interface evolves and the impl falls behind.
// =====================================================================
func TestMessageStoreInterface_PostgresImplSatisfies(t *testing.T) {
// Compile-time assertion lives in messagestore/postgres_store.go
// (`var _ MessageStore = (*PostgresMessageStore)(nil)`). This
// runtime test exists only to keep the conformance visible in
// the handler test file — a reader of chat_history_test.go
// shouldn't have to traverse to the messagestore package to see
// what the handler is paired with.
var s messagestore.MessageStore = messagestore.NewPostgresMessageStore(nil)
_ = s
}
@@ -0,0 +1,468 @@
package handlers
// class1_ast_gate_test.go — generic Class 1 leak gate per #2867 PR-A.
//
// What this gate prevents:
// The tenant-hongming leak class — a handler iterates a YAML-derived
// slice (ws.Children, sub_workspaces, etc.) and calls
// `INSERT INTO workspaces` inside the loop body without first
// checking whether a workspace with the same (parent_id, name) is
// already there. Each call to such a handler doubles the tree.
//
// Why this is broader than TestCreateWorkspaceTree_CallsLookupBeforeInsert:
// The existing gate is hard-coded to org_import.go's createWorkspaceTree.
// That catches the specific function that triggered the original
// incident — but a future handler written from scratch in a different
// file would not be covered. This gate walks every production handler
// .go file and applies a structural rule that does not depend on
// function or file names.
//
// The rule (verbatim from #2867 PR-A):
//
// "No handler in handlers/ may iterate a slice (any RangeStmt) AND
// call INSERT INTO workspaces inside the loop body without a
// preceding SELECT id FROM workspaces WHERE name=$1 AND parent_id IS
// NOT DISTINCT FROM $2 in the same function (== a lookupExistingChild
// call, OR an ON CONFLICT clause baked into the same INSERT, OR an
// explicit allowlist annotation)."
//
// Allowlist mechanism: a function whose body contains the exact comment
// string `// class1-gate: idempotent-by-design` is treated as safe.
// Use this only after writing a unit test that pins WHY the function
// is safe. The annotation is intentionally awkward to type — it should
// be rare.
import (
"go/ast"
"go/parser"
"go/token"
"os"
"path/filepath"
"regexp"
"sort"
"strings"
"testing"
)
// reINSERTWorkspaces matches the exact statement shape we care about.
// Tightened (vs bytes.Index "INSERT INTO workspaces") so the audit
// table `workspaces_audit` literal — or any other lookalike — does not
// false-positive trigger this gate. The same regex is used in the
// existing createWorkspaceTree gate (workspaces_insert_allowlist_test.go)
// — keep them in sync if either changes.
var reINSERTWorkspaces = regexp.MustCompile(`(?m)^\s*INSERT INTO workspaces\s*\(`)
// reONCONFLICT matches ON CONFLICT clauses anywhere in the same SQL
// literal. An UPSERT (INSERT ... ON CONFLICT ... DO UPDATE) is
// idempotent by definition, so the gate exempts it.
var reONCONFLICT = regexp.MustCompile(`(?i)\bON CONFLICT\b`)
// gateAllowlistComment is the magic comment a function author writes
// to opt out of this gate. Forces an explicit decision.
const gateAllowlistComment = "// class1-gate: idempotent-by-design"
// preflightCallNames are function names whose presence in a function
// body counts as "did a SELECT-by-(parent_id, name) preflight". Add
// new names here as new preflight helpers are introduced. Keep the
// list TIGHT — any sloppy addition weakens the gate.
var preflightCallNames = map[string]bool{
"lookupExistingChild": true,
}
// TestClass1_NoUnpreflightedInsertInsideRange walks every production
// .go file in this package, parses the AST, and fails the test if any
// FuncDecl violates the rule above.
//
// Failure message must include: file path, function name, line of
// the offending INSERT, line of the enclosing range, and a hint at
// the three escape hatches (preflight call, ON CONFLICT, allowlist
// comment).
func TestClass1_NoUnpreflightedInsertInsideRange(t *testing.T) {
wd, err := os.Getwd()
if err != nil {
t.Fatalf("getwd: %v", err)
}
entries, err := os.ReadDir(wd)
if err != nil {
t.Fatalf("readdir %s: %v", wd, err)
}
type violation struct {
file string
fn string
insertLine int
rangeLine int
}
var violations []violation
scanned := 0
for _, e := range entries {
name := e.Name()
if e.IsDir() || !strings.HasSuffix(name, ".go") {
continue
}
if strings.HasSuffix(name, "_test.go") {
continue
}
path := filepath.Join(wd, name)
src, err := os.ReadFile(path)
if err != nil {
t.Fatalf("read %s: %v", path, err)
}
fset := token.NewFileSet()
file, err := parser.ParseFile(fset, name, src, parser.ParseComments)
if err != nil {
t.Fatalf("parse %s: %v", path, err)
}
scanned++
// Walk every function declaration and apply the rule.
for _, decl := range file.Decls {
fd, ok := decl.(*ast.FuncDecl)
if !ok || fd.Body == nil {
continue
}
// Allowlist: skip if the function body contains the magic
// comment. We check via the source range of the function
// — comments inside the body are in file.Comments and
// must overlap the function's Pos/End range.
if functionHasAllowlistComment(file, fd) {
continue
}
// First pass: locate every INSERT INTO workspaces literal
// in this function. We treat each such literal as a
// candidate violation and try to clear it via the rules.
candidates := findInsertWorkspacesLiterals(fd, src, fset)
if len(candidates) == 0 {
continue
}
// Has the function called a preflight helper? Single
// pass — if any preflight name appears, every INSERT in
// the function is considered preflighted. This is more
// permissive than position-aware (preflight could be
// AFTER the INSERT and still satisfy the gate), but the
// existing org_import.go gate already pins the position
// invariant for createWorkspaceTree, and a function that
// preflights AFTER inserting would fail the position
// gate in a separate test.
hasPreflight := functionCallsAny(fd, preflightCallNames)
for _, c := range candidates {
if c.hasONCONFLICT {
continue
}
if hasPreflight {
continue
}
if c.enclosingRangeLine == 0 {
// INSERT not inside any RangeStmt — single-shot,
// not the bug pattern.
continue
}
violations = append(violations, violation{
file: name,
fn: fd.Name.Name,
insertLine: c.insertLine,
rangeLine: c.enclosingRangeLine,
})
}
}
}
if scanned == 0 {
t.Fatal("scanned 0 .go files — wrong working directory? gate would always pass")
}
if len(violations) > 0 {
// Stable sort so the failure message is deterministic across
// reruns.
sort.Slice(violations, func(i, j int) bool {
if violations[i].file != violations[j].file {
return violations[i].file < violations[j].file
}
return violations[i].insertLine < violations[j].insertLine
})
var b strings.Builder
b.WriteString("Class 1 leak gate (#2867 PR-A) — these handler functions iterate a slice and INSERT INTO workspaces inside the loop body without a (parent_id, name) preflight.\n\n")
b.WriteString("This is the bug shape that triggered the tenant-hongming leak (TeamHandler.Expand re-inserting the entire sub_workspaces tree on every call). To fix any reported violation, choose ONE of:\n")
b.WriteString(" 1. Call h.lookupExistingChild(ctx, name, parentID) before the INSERT and skip the INSERT when it returns existing=true. (preferred)\n")
b.WriteString(" 2. Use INSERT ... ON CONFLICT ... DO ... (idempotent UPSERT, like registry.go).\n")
b.WriteString(" 3. Annotate the function with a `// class1-gate: idempotent-by-design` comment AND a unit test that pins why the function is structurally idempotent. (rare; require code review)\n\n")
b.WriteString("Violations:\n")
for _, v := range violations {
b.WriteString(" - ")
b.WriteString(v.file)
b.WriteString(":")
b.WriteString(itoa(v.insertLine))
b.WriteString(" — function ")
b.WriteString(v.fn)
b.WriteString("() INSERTs inside RangeStmt at line ")
b.WriteString(itoa(v.rangeLine))
b.WriteString("\n")
}
t.Fatal(b.String())
}
}
func itoa(n int) string {
// Avoid strconv import for one call site — keeps the test focused.
if n == 0 {
return "0"
}
neg := n < 0
if neg {
n = -n
}
var buf [20]byte
i := len(buf)
for n > 0 {
i--
buf[i] = byte('0' + n%10)
n /= 10
}
if neg {
i--
buf[i] = '-'
}
return string(buf[i:])
}
// candidateInsert holds the per-INSERT facts needed to decide whether
// the gate fires.
type candidateInsert struct {
insertLine int
hasONCONFLICT bool
enclosingRangeLine int // 0 means not inside any range
}
// findInsertWorkspacesLiterals walks fd's body and returns one
// candidateInsert per INSERT INTO workspaces string literal.
//
// Position-based detection: collect every RangeStmt's body span first,
// then for each INSERT literal check if its position is inside any
// span. ast.Inspect's nil-call ordering does NOT give per-node pop
// semantics, so a stack-based approach against ast.Inspect would
// silently miscount. Position spans are deterministic and easy to
// reason about.
func findInsertWorkspacesLiterals(fd *ast.FuncDecl, src []byte, fset *token.FileSet) []candidateInsert {
var out []candidateInsert
type span struct{ start, end token.Pos }
var ranges []span
ast.Inspect(fd.Body, func(n ast.Node) bool {
rs, ok := n.(*ast.RangeStmt)
if !ok || rs.Body == nil {
return true
}
ranges = append(ranges, span{rs.Body.Lbrace, rs.Body.Rbrace})
return true
})
enclosingRangeLineFor := func(p token.Pos) int {
// Pick the innermost enclosing range — i.e., the one with the
// largest start that still covers p. Innermost is the one
// whose body actually contains the INSERT, which is the line
// most useful in a violation message.
bestStart := token.NoPos
bestLine := 0
for _, s := range ranges {
if p > s.start && p < s.end && s.start > bestStart {
bestStart = s.start
bestLine = fset.Position(s.start).Line
}
}
return bestLine
}
ast.Inspect(fd.Body, func(n ast.Node) bool {
bl, ok := n.(*ast.BasicLit)
if !ok || bl.Kind != token.STRING {
return true
}
// Strip surrounding backticks/quotes — value includes them.
lit := bl.Value
if len(lit) >= 2 {
lit = lit[1 : len(lit)-1]
}
if !reINSERTWorkspaces.MatchString(lit) {
return true
}
out = append(out, candidateInsert{
insertLine: fset.Position(bl.Pos()).Line,
hasONCONFLICT: reONCONFLICT.MatchString(lit),
enclosingRangeLine: enclosingRangeLineFor(bl.Pos()),
})
return true
})
return out
}
// functionCallsAny returns true if any CallExpr in fd's body has a
// function name (either a SelectorExpr Sel.Name or an Ident name)
// matching a key in names.
func functionCallsAny(fd *ast.FuncDecl, names map[string]bool) bool {
found := false
ast.Inspect(fd.Body, func(n ast.Node) bool {
if found {
return false
}
ce, ok := n.(*ast.CallExpr)
if !ok {
return true
}
switch fun := ce.Fun.(type) {
case *ast.Ident:
if names[fun.Name] {
found = true
return false
}
case *ast.SelectorExpr:
if names[fun.Sel.Name] {
found = true
return false
}
}
return true
})
return found
}
// functionHasAllowlistComment returns true if the function body
// (between fd.Body.Lbrace and fd.Body.Rbrace) contains a comment
// equal to gateAllowlistComment.
func functionHasAllowlistComment(file *ast.File, fd *ast.FuncDecl) bool {
if fd.Body == nil {
return false
}
start := fd.Body.Lbrace
end := fd.Body.Rbrace
for _, cg := range file.Comments {
for _, c := range cg.List {
if c.Pos() < start || c.Pos() > end {
continue
}
if strings.TrimSpace(c.Text) == gateAllowlistComment {
return true
}
}
}
return false
}
// TestClass1_GateFiresOnSyntheticBuggySource — proves the gate actually
// catches the bug shape it's named after. Without this, a regression
// to "always pass" would not be noticed until the leak shipped again.
// Per memory feedback_assert_exact_not_substring.md: tighten the test
// + verify it FAILS on old-shape source before merging.
func TestClass1_GateFiresOnSyntheticBuggySource(t *testing.T) {
const buggySrc = `package handlers
import "context"
type fakeDB struct{}
func (fakeDB) ExecContext(ctx context.Context, sql string, args ...interface{}) {}
func buggyExpand(db fakeDB, ctx context.Context, children []string) {
for _, child := range children {
// Bug shape: INSERT inside the range body, no preflight.
db.ExecContext(ctx, ` + "`INSERT INTO workspaces (id, name) VALUES ($1, $2)`" + `, "x", child)
}
}
`
fset := token.NewFileSet()
file, err := parser.ParseFile(fset, "buggy.go", buggySrc, parser.ParseComments)
if err != nil {
t.Fatalf("parse synthetic source: %v", err)
}
for _, decl := range file.Decls {
fd, ok := decl.(*ast.FuncDecl)
if !ok || fd.Name.Name != "buggyExpand" {
continue
}
candidates := findInsertWorkspacesLiterals(fd, []byte(buggySrc), fset)
if len(candidates) != 1 {
t.Fatalf("expected 1 INSERT literal, got %d", len(candidates))
}
c := candidates[0]
if c.enclosingRangeLine == 0 {
t.Errorf("synthetic INSERT inside `for _, child := range` should be detected as enclosed by range, got enclosingRangeLine=0 — gate would miss the bug shape")
}
if c.hasONCONFLICT {
t.Errorf("synthetic INSERT has no ON CONFLICT, gate falsely treated it as idempotent")
}
if functionCallsAny(fd, preflightCallNames) {
t.Errorf("synthetic function does not call lookupExistingChild — gate falsely treated it as preflighted")
}
// All three guards say the gate WOULD fire. Pass.
return
}
t.Fatal("buggyExpand FuncDecl not found in synthetic source")
}
// TestClass1_GateAllowsONCONFLICT — pins that an INSERT with ON
// CONFLICT inside a range body is NOT flagged. registry.go's
// upsert pattern is the prod example.
func TestClass1_GateAllowsONCONFLICT(t *testing.T) {
const safeSrc = `package handlers
import "context"
type fakeDB struct{}
func (fakeDB) ExecContext(ctx context.Context, sql string, args ...interface{}) {}
func upsertLoop(db fakeDB, ctx context.Context, children []string) {
for _, child := range children {
db.ExecContext(ctx, ` + "`INSERT INTO workspaces (id, name) VALUES ($1, $2) ON CONFLICT (id) DO UPDATE SET name = $2`" + `, "x", child)
}
}
`
fset := token.NewFileSet()
file, _ := parser.ParseFile(fset, "safe.go", safeSrc, parser.ParseComments)
for _, decl := range file.Decls {
fd, ok := decl.(*ast.FuncDecl)
if !ok || fd.Name.Name != "upsertLoop" {
continue
}
candidates := findInsertWorkspacesLiterals(fd, []byte(safeSrc), fset)
if len(candidates) != 1 {
t.Fatalf("expected 1 candidate, got %d", len(candidates))
}
if !candidates[0].hasONCONFLICT {
t.Errorf("ON CONFLICT clause should be detected, was missed — gate would falsely flag idempotent UPSERTs")
}
}
}
// TestClass1_GateAllowsAllowlistAnnotation — pins the escape hatch
// works. Annotated functions are skipped at the FuncDecl level.
func TestClass1_GateAllowsAllowlistAnnotation(t *testing.T) {
const annotatedSrc = `package handlers
import "context"
type fakeDB struct{}
func (fakeDB) ExecContext(ctx context.Context, sql string, args ...interface{}) {}
func intentionallyUnpreflighted(db fakeDB, ctx context.Context, children []string) {
// class1-gate: idempotent-by-design
for _, child := range children {
db.ExecContext(ctx, ` + "`INSERT INTO workspaces (id, name) VALUES ($1, $2)`" + `, "x", child)
}
}
`
fset := token.NewFileSet()
file, _ := parser.ParseFile(fset, "annotated.go", annotatedSrc, parser.ParseComments)
for _, decl := range file.Decls {
fd, ok := decl.(*ast.FuncDecl)
if !ok || fd.Name.Name != "intentionallyUnpreflighted" {
continue
}
if !functionHasAllowlistComment(file, fd) {
t.Error("allowlist comment should be detected for the intentionallyUnpreflighted function — escape hatch not working")
}
}
}
@@ -10,6 +10,7 @@ import (
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/events"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/textutil"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
)
@@ -164,10 +165,10 @@ func (h *DelegationHandler) Delegate(c *gin.Context) {
go h.executeDelegation(sourceID, body.TargetID, delegationID, a2aBody)
// Broadcast event so canvas shows delegation in real-time
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_SENT", sourceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationSent), sourceID, map[string]interface{}{
"delegation_id": delegationID,
"target_id": body.TargetID,
"task_preview": truncate(body.Task, 100),
"task_preview": textutil.TruncateBytes(body.Task, 100),
})
resp := gin.H{
@@ -317,7 +318,7 @@ func (h *DelegationHandler) executeDelegation(sourceID, targetID, delegationID s
// Update status: pending → dispatched
h.updateDelegationStatus(sourceID, delegationID, "dispatched", "")
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_STATUS", sourceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationStatus), sourceID, map[string]interface{}{
"delegation_id": delegationID, "target_id": targetID, "status": "dispatched",
})
@@ -352,7 +353,7 @@ func (h *DelegationHandler) executeDelegation(sourceID, targetID, delegationID s
log.Printf("Delegation %s: failed to insert error log: %v", delegationID, err)
}
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_FAILED", sourceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationFailed), sourceID, map[string]interface{}{
"delegation_id": delegationID, "target_id": targetID, "error": proxyErr.Error(),
})
// RFC #2829 PR-2 result-push (see UpdateStatus for rationale).
@@ -388,7 +389,7 @@ func (h *DelegationHandler) executeDelegation(sourceID, targetID, delegationID s
`, sourceID, sourceID, targetID, "Delegation queued — target at capacity", string(queuedJSON)); err != nil {
log.Printf("Delegation %s: failed to insert queued log: %v", delegationID, err)
}
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_STATUS", sourceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationStatus), sourceID, map[string]interface{}{
"delegation_id": delegationID, "target_id": targetID, "status": "queued",
})
return
@@ -407,7 +408,7 @@ func (h *DelegationHandler) executeDelegation(sourceID, targetID, delegationID s
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, response_body, status)
VALUES ($1, 'delegation', 'delegate_result', $2, $3, $4, $5::jsonb, 'completed')
`, sourceID, sourceID, targetID, "Delegation completed ("+truncate(responseText, 80)+")", string(respJSON)); err != nil {
`, sourceID, sourceID, targetID, "Delegation completed ("+textutil.TruncateBytes(responseText, 80)+")", string(respJSON)); err != nil {
log.Printf("Delegation %s: failed to insert success log: %v", delegationID, err)
}
@@ -420,10 +421,10 @@ func (h *DelegationHandler) executeDelegation(sourceID, targetID, delegationID s
// delegation_ledger_integration_test.go.
recordLedgerStatus(ctx, delegationID, "completed", "", responseText)
h.updateDelegationStatus(sourceID, delegationID, "completed", "")
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_COMPLETE", sourceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationComplete), sourceID, map[string]interface{}{
"delegation_id": delegationID,
"target_id": targetID,
"response_preview": truncate(responseText, 200),
"response_preview": textutil.TruncateBytes(responseText, 200),
})
// RFC #2829 PR-2 result-push (see UpdateStatus for rationale).
pushDelegationResultToInbox(ctx, sourceID, delegationID, "completed", responseText, "")
@@ -503,10 +504,10 @@ func (h *DelegationHandler) Record(c *gin.Context) {
recordLedgerInsert(ctx, sourceID, body.TargetID, body.DelegationID, body.Task, "")
recordLedgerStatus(ctx, body.DelegationID, "dispatched", "", "")
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_SENT", sourceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationSent), sourceID, map[string]interface{}{
"delegation_id": body.DelegationID,
"target_id": body.TargetID,
"task_preview": truncate(body.Task, 100),
"task_preview": textutil.TruncateBytes(body.Task, 100),
})
c.JSON(http.StatusAccepted, gin.H{
@@ -555,12 +556,12 @@ func (h *DelegationHandler) UpdateStatus(c *gin.Context) {
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, summary, response_body, status)
VALUES ($1, 'delegation', 'delegate_result', $2, $3, $4::jsonb, 'completed')
`, sourceID, sourceID, "Delegation completed ("+truncate(body.ResponsePreview, 80)+")", string(respJSON)); err != nil {
`, sourceID, sourceID, "Delegation completed ("+textutil.TruncateBytes(body.ResponsePreview, 80)+")", string(respJSON)); err != nil {
log.Printf("Delegation UpdateStatus: result insert failed for %s: %v", delegationID, err)
}
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_COMPLETE", sourceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationComplete), sourceID, map[string]interface{}{
"delegation_id": delegationID,
"response_preview": truncate(body.ResponsePreview, 200),
"response_preview": textutil.TruncateBytes(body.ResponsePreview, 200),
})
// RFC #2829 PR-2 result-push: when the gate is on, also write an
// a2a_receive row so the caller's inbox poller surfaces this to
@@ -570,7 +571,7 @@ func (h *DelegationHandler) UpdateStatus(c *gin.Context) {
// the result instead of holding open an HTTP connection.
pushDelegationResultToInbox(ctx, sourceID, delegationID, "completed", body.ResponsePreview, "")
} else {
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_FAILED", sourceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationFailed), sourceID, map[string]interface{}{
"delegation_id": delegationID,
"error": body.Error,
})
@@ -626,7 +627,7 @@ func (h *DelegationHandler) ListDelegations(c *gin.Context) {
entry["error"] = errorDetail
}
if responseBody != "" {
entry["response_preview"] = truncate(responseBody, 300)
entry["response_preview"] = textutil.TruncateBytes(responseBody, 300)
}
delegations = append(delegations, entry)
}
@@ -727,9 +728,3 @@ func extractResponseText(body []byte) string {
return string(body)
}
func truncate(s string, max int) string {
if len(s) <= max {
return s
}
return s[:max] + "..."
}
@@ -8,6 +8,7 @@ import (
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/textutil"
)
// delegation_ledger.go — durable per-task ledger for A2A delegation
@@ -50,18 +51,15 @@ func NewDelegationLedger(handle *sql.DB) *DelegationLedger {
return &DelegationLedger{db: handle}
}
// truncatePreview caps stored preview at 4KB. The full prompt/response is
// already in activity_logs.{request,response}_body — this is the at-a-glance
// view for the dashboard, not a forensic record.
// previewCap caps stored preview at 4KB. The full prompt/response is
// already in activity_logs.{request,response}_body — this is the
// at-a-glance view for the dashboard, not a forensic record.
//
// Truncation goes through textutil.TruncateBytesNoMarker so it's
// rune-safe (#2026 / #2959 / #2962 bug class: byte-slice mid-codepoint
// → Postgres JSONB rejects → silent INSERT failure → audit gap).
const previewCap = 4096
func truncatePreview(s string) string {
if len(s) <= previewCap {
return s
}
return s[:previewCap]
}
// InsertOpts is the agent's record-of-intent. Caller, callee, task preview,
// and the chosen delegation_id are required; idempotency_key is optional.
type InsertOpts struct {
@@ -96,7 +94,7 @@ func (l *DelegationLedger) Insert(ctx context.Context, opts InsertOpts) {
) VALUES ($1, $2, $3, $4, 'queued', $5, $6)
ON CONFLICT (delegation_id) DO NOTHING
`, opts.DelegationID, opts.CallerID, opts.CalleeID,
truncatePreview(opts.TaskPreview), deadline, idemArg)
textutil.TruncateBytesNoMarker(opts.TaskPreview, previewCap), deadline, idemArg)
if err != nil {
log.Printf("delegation_ledger Insert(%s): %v", opts.DelegationID, err)
}
@@ -175,7 +173,7 @@ func (l *DelegationLedger) SetStatus(ctx context.Context,
result_preview = NULLIF($4, ''),
updated_at = now()
WHERE delegation_id = $1
`, delegationID, status, errorDetail, truncatePreview(resultPreview))
`, delegationID, status, errorDetail, textutil.TruncateBytesNoMarker(resultPreview, previewCap))
return err
}
@@ -2,9 +2,11 @@ package handlers
import (
"context"
"database/sql/driver"
"errors"
"strings"
"testing"
"unicode/utf8"
"github.com/DATA-DOG/go-sqlmock"
)
@@ -73,15 +75,20 @@ func TestLedgerInsert_TruncatesOversizedPreview(t *testing.T) {
mock := setupTestDB(t)
l := NewDelegationLedger(nil)
huge := strings.Repeat("x", 10_000) // > previewCap
// 4096 / 3 = 1365 runes; +10 for margin so we cross the cap.
// '世' is 3 bytes in UTF-8 (worst case for byte-cap rune walking).
huge := strings.Repeat("世", (previewCap/3)+10)
if len(huge) <= previewCap {
t.Fatalf("test setup: input too short (%d bytes) — must exceed previewCap=%d", len(huge), previewCap)
}
mock.ExpectExec(`INSERT INTO delegations`).
WithArgs(
"deleg-big",
"c", "ca",
sqlmock.AnyArg(), // truncated preview — verify length below via custom matcher
sqlmock.AnyArg(),
sqlmock.AnyArg(),
capValidUTF8Matcher{cap: previewCap}, // truncated preview must fit cap AND be valid UTF-8
sqlmock.AnyArg(), // deadline
sqlmock.AnyArg(), // idempotency_key
).
WillReturnResult(sqlmock.NewResult(0, 1))
@@ -96,30 +103,28 @@ func TestLedgerInsert_TruncatesOversizedPreview(t *testing.T) {
}
}
// ---------- truncatePreview unit ----------
// capValidUTF8Matcher pins #2962 at the integration boundary: the
// preview that lands in the INSERT MUST be valid UTF-8 (else Postgres
// JSONB rejects → silent audit gap) AND fit within the byte cap. Pre-
// migration this would have asserted on the corrupted "世" mid-codepoint
// byte slice; post-migration it asserts the truncated preview is a
// clean rune-aligned prefix.
type capValidUTF8Matcher struct{ cap int }
func TestTruncatePreview_UnderCap(t *testing.T) {
in := "short"
if got := truncatePreview(in); got != in {
t.Errorf("under-cap should passthrough; got %q", got)
func (m capValidUTF8Matcher) Match(v driver.Value) bool {
s, ok := v.(string)
if !ok {
return false
}
return len(s) <= m.cap && utf8.ValidString(s)
}
func TestTruncatePreview_OverCapTruncatesAtBoundary(t *testing.T) {
in := strings.Repeat("a", previewCap+100)
got := truncatePreview(in)
if len(got) != previewCap {
t.Errorf("expected len=%d got len=%d", previewCap, len(got))
}
}
func TestTruncatePreview_ExactlyAtCap(t *testing.T) {
in := strings.Repeat("a", previewCap)
got := truncatePreview(in)
if got != in {
t.Errorf("at-cap should passthrough unchanged")
}
}
// Helper-level truncation tests now live in
// internal/textutil/truncate_test.go. The integration-level path
// (TestLedgerInsert_TruncatesOversizedPreview above) still exercises
// the previewCap boundary through the SQL write so a regression in
// the wiring (wrong cap, wrong helper, missing call) would still go
// red here.
// ---------- SetStatus lifecycle ----------
@@ -109,6 +109,12 @@ curl -fsS -X POST "{{PLATFORM_URL}}/registry/register" \
"version": "0.1.0"
}
}'
# Need help?
# Documentation: https://doc.moleculesai.app/docs/guides/external-agent-registration
# Common errors:
# • 401 / 403 on register — WORKSPACE_AUTH_TOKEN must be the value
# shown at workspace create. Tokens are shown only once.
`
// externalChannelTemplate — Claude Code channel plugin install + .env. For
@@ -172,6 +178,18 @@ claude --dangerously-load-development-channels \
# Multi-workspace: comma-separate IDs and tokens (same order). See
# https://github.com/Molecule-AI/molecule-mcp-claude-channel for
# pairing flow, push-mode upgrade, and v0.2 roadmap.
# Need help?
# Documentation: https://doc.moleculesai.app/docs/guides/claude-code-channel-plugin
# Common errors:
# • "plugin not installed" — run /plugin marketplace add then
# /plugin install lines above; /reload-plugins or restart.
# • "not on the approved channels allowlist" — custom channels need
# --dangerously-load-development-channels; team/enterprise orgs
# need admin to set channelsEnabled + allowedChannelPlugins.
# • "Inbound messages not arriving" — stderr should show
# "molecule channel: connected — watching N workspace(s)";
# verify ~/.claude/channels/molecule/.env has PLATFORM_URL + token.
`
// externalUniversalMcpTemplate — runtime-agnostic standalone path.
@@ -198,6 +216,13 @@ const externalUniversalMcpTemplate = `# Universal MCP — standalone register +
# Pair with the Claude Code or Python SDK tab if your runtime needs
# inbound A2A delivery (canvas messages → agent conversation turns).
# Requires Python >= 3.11. On 3.10 or older pip says
# "Could not find a version that satisfies the requirement
# (from versions: none)" — the wheel's requires_python pin filters
# the only available artifact before pip even attempts install.
# Upgrade the interpreter (brew install python@3.12 / apt install
# python3.12 / etc.) or use a 3.11+ venv.
# 1. Install the workspace runtime wheel:
pip install molecule-ai-workspace-runtime
@@ -217,6 +242,17 @@ claude mcp add molecule -s user -- env \
#
# Origin/WAF handling is built into the wheel — no manual headers
# needed when calling tools through the MCP server.
# Need help?
# Where to install: https://pypi.org/project/molecule-ai-workspace-runtime/
# Documentation: https://doc.moleculesai.app/docs/guides/mcp-server-setup
# Common errors:
# • "Tools not appearing in your agent" — run ` + "`claude mcp list`" + ` (or
# your runtime's equivalent) and confirm the molecule entry. If
# missing, re-run the ` + "`claude mcp add`" + ` line above.
# • "ConnectionRefused / DNS error on first call" — PLATFORM_URL must
# include the scheme (https://) and have NO trailing slash. Verify
# with: curl ${PLATFORM_URL}/healthz
`
// externalPythonTemplate uses molecule-sdk-python's RemoteAgentClient +
@@ -255,6 +291,15 @@ async def main():
if __name__ == "__main__":
asyncio.run(main())
# Need help?
# Where to install: https://pypi.org/project/molecule-ai-workspace-runtime/
# Documentation: https://doc.moleculesai.app/docs/guides/external-agent-registration
# Common errors:
# • 401 from /heartbeat — AUTH_TOKEN expired or wrong workspace_id.
# Tokens shown only once at create time; re-create to get a fresh one.
# • AGENT_URL not reachable from platform — public HTTPS URL required
# for inbound A2A. Use ngrok or Cloudflare Tunnel if behind NAT.
`
// externalHermesChannelTemplate — install snippet for operators whose
@@ -322,6 +367,16 @@ hermes gateway --replace
#
# Source + issue tracker:
# https://github.com/Molecule-AI/hermes-channel-molecule
# Need help?
# Documentation: https://doc.moleculesai.app/docs/guides/external-agent-registration
# Common errors:
# • Gateway start failure — tail ~/.hermes/gateway.log. YAML
# duplicate-key in config.yaml is the most common cause; the
# gateway: block must appear exactly once.
# • Plugin not discovered after install — pip show hermes-channel-molecule
# to confirm install. Some hermes builds need ` + "`hermes plugin reload`" + `
# before the new platform_plugins entry takes effect.
`
// externalCodexTemplate — for operators whose external agent is a
@@ -368,14 +423,23 @@ mkdir -p ~/.codex
# (then open ~/.codex/config.toml in your editor and paste:)
#
# [mcp_servers.molecule]
# command = "python3"
# args = ["-m", "molecule_runtime.a2a_mcp_server"]
# command = "molecule-mcp"
# args = []
# startup_timeout_sec = 30
#
# [mcp_servers.molecule.env]
# WORKSPACE_ID = "{{WORKSPACE_ID}}"
# PLATFORM_URL = "{{PLATFORM_URL}}"
# MOLECULE_WORKSPACE_TOKEN = "<paste from create response>"
#
# Use the "molecule-mcp" console-script wrapper (NOT
# "python3 -m molecule_runtime.a2a_mcp_server"). The wrapper is what
# keeps the workspace ALIVE on the canvas: it POSTs /registry/register
# at startup and runs a 20s heartbeat thread alongside the MCP stdio
# loop. The bare a2a_mcp_server module exposes tools but does NOT
# heartbeat — pointing codex at it leaves the canvas showing this
# workspace as awaiting_agent (OFFLINE) within 60-90s even while
# tools work.
# 3. Run the bridge daemon as a durable background process — this
# is the INBOUND path. Long-polls the platform inbox and runs
@@ -403,6 +467,18 @@ disown
# available to the agent, and the bridge wakes a non-interactive
# codex turn for any inbound canvas/peer message:
codex
# Need help?
# Documentation: https://doc.moleculesai.app/docs/guides/mcp-server-setup
# Common errors:
# • [mcp_servers.molecule] not loaded — codex must be ≥ 0.57.
# Check with ` + "`codex --version`" + `; upgrade via npm install -g @openai/codex@latest.
# • TOML parse error after re-running setup — TOML rejects duplicate
# [mcp_servers.molecule] tables. Open ~/.codex/config.toml and
# remove the old block before pasting the new one.
# • Canvas messages don't wake codex — step 3 (codex-channel-molecule
# bridge daemon) is required for inbound push. Check
# pgrep -f codex-channel-molecule and tail ~/.codex-channel-molecule/daemon.log.
`
// externalOpenClawTemplate — for operators whose external agent is an
@@ -440,11 +516,20 @@ pip install molecule-ai-workspace-runtime
# 3. Wire the molecule MCP server. {{WORKSPACE_ID}} + {{PLATFORM_URL}}
# are stamped server-side; paste the auth token before running.
#
# Use the "molecule-mcp" console-script wrapper (NOT
# "python3 -m molecule_runtime.a2a_mcp_server"). The wrapper is what
# keeps the workspace ALIVE on the canvas: it POSTs /registry/register
# at startup and runs a 20s heartbeat thread alongside the MCP stdio
# loop. The bare a2a_mcp_server module exposes tools but does NOT
# heartbeat — pointing openclaw at it leaves the canvas showing this
# workspace as awaiting_agent (OFFLINE) within 60-90s even while
# tools work.
WORKSPACE_TOKEN="<paste from create response>"
openclaw mcp set molecule "$(cat <<EOF
{
"command": "python3",
"args": ["-m", "molecule_runtime.a2a_mcp_server"],
"command": "molecule-mcp",
"args": [],
"env": {
"WORKSPACE_ID": "{{WORKSPACE_ID}}",
"PLATFORM_URL": "{{PLATFORM_URL}}",
@@ -464,4 +549,13 @@ disown
# 5. Run an agent turn — molecule tools are now available:
openclaw agent --message "list my peers"
# Need help?
# Documentation: https://doc.moleculesai.app/docs/guides/mcp-server-setup
# Common errors:
# • Gateway not starting — tail ~/.openclaw/gateway.log. The loopback
# bind requires :18789 to be free; check with ` + "`lsof -iTCP:18789`" + `.
# • ` + "`openclaw mcp set`" + ` rejected — the heredoc generates JSON;
# verify with ` + "`jq < ~/.openclaw/mcp/molecule.json`" + ` and re-run
# ` + "`openclaw mcp set`" + ` if the file is malformed.
`
@@ -38,3 +38,40 @@ func TestExternalTemplates_NoMoleculeOrgIDPlaceholder(t *testing.T) {
}
}
}
// TestExternalMcpTemplates_UseMoleculeMcpWrapper pins the invariant
// that operator-facing snippets configuring an MCP server entry point
// use the ``molecule-mcp`` console-script wrapper (mcp_cli.main),
// NOT the bare ``a2a_mcp_server`` module.
//
// Why: a2a_mcp_server exposes the MCP tools but does NOT call
// /registry/register or run the 20s heartbeat thread. mcp_cli wraps
// it with both, which is what flips the canvas presence indicator
// from awaiting_agent (OFFLINE) to online and keeps it that way.
// Originally tracked by molecule-core#2957 — operator hit the
// silent-OFFLINE failure mode when the Codex tab pointed at the bare
// module.
//
// The hermes-channel template intentionally uses the bare module: it
// owns the platform plugin path and runs its own
// register_platform/heartbeat code in-process, so wrapping with
// mcp_cli would double-heartbeat. universalMcp / codex / openclaw
// must all use the wrapper.
func TestExternalMcpTemplates_UseMoleculeMcpWrapper(t *testing.T) {
mustUseWrapper := map[string]string{
"externalUniversalMcpTemplate": externalUniversalMcpTemplate,
"externalCodexTemplate": externalCodexTemplate,
"externalOpenClawTemplate": externalOpenClawTemplate,
}
for name, body := range mustUseWrapper {
if !strings.Contains(body, "molecule-mcp") {
t.Errorf("%s does not reference 'molecule-mcp' — operator-facing MCP snippets must point at the heartbeat-wrapping console script, not the bare a2a_mcp_server module (#2957)", name)
}
if strings.Contains(body, `"-m", "molecule_runtime.a2a_mcp_server"`) {
t.Errorf("%s spawns 'python3 -m molecule_runtime.a2a_mcp_server' — that bypasses the standalone register/heartbeat wrapper, leaving the canvas showing the workspace OFFLINE (#2957). Use 'molecule-mcp' instead.", name)
}
if strings.Contains(body, `["-m", "molecule_runtime.a2a_mcp_server"]`) {
t.Errorf("%s spawns 'python3 -m molecule_runtime.a2a_mcp_server' — that bypasses the standalone register/heartbeat wrapper, leaving the canvas showing the workspace OFFLINE (#2957). Use 'molecule-mcp' instead.", name)
}
}
}
@@ -8,6 +8,7 @@ import (
"net/http"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/events"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/wsauth"
"github.com/gin-gonic/gin"
)
@@ -100,7 +101,7 @@ func (h *WorkspaceHandler) RotateExternalCredentials(c *gin.Context) {
// see when credentials were rotated. No PII; the token plaintext
// is NOT logged.
if h.broadcaster != nil {
h.broadcaster.RecordAndBroadcast(ctx, "EXTERNAL_CREDENTIALS_ROTATED", id, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventExternalCredentialsRotated), id, map[string]interface{}{
"workspace_id": id,
})
}
@@ -159,11 +159,15 @@ func generateAppInstallationToken() (string, time.Time, error) {
req, _ := http.NewRequest("POST", fmt.Sprintf("https://api.github.com/app/installations/%d/access_tokens", installID), nil)
req.Header.Set("Authorization", "Bearer "+signed)
req.Header.Set("Accept", "application/vnd.github+json")
resp, err := http.DefaultClient.Do(req)
client := &http.Client{Timeout: 15 * time.Second}
resp, err := client.Do(req)
if err != nil {
return "", time.Time{}, err
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
return "", time.Time{}, fmt.Errorf("github API returned status %d", resp.StatusCode)
}
var result struct {
Token string `json:"token"`
ExpiresAt time.Time `json:"expires_at"`
+170 -3
View File
@@ -11,18 +11,21 @@ import (
"os"
"testing"
"errors"
"github.com/DATA-DOG/go-sqlmock"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/events"
"github.com/gin-gonic/gin"
)
// newMCPHandler is a test helper that constructs an MCPHandler backed by the
// sqlmock DB set up by setupTestDB.
// sqlmock DB set up by setupTestDB. Uses newTestBroadcaster so handlers
// that BroadcastOnly (send_message_to_user, etc.) don't nil-panic on the
// hub — events.NewBroadcaster(nil) crashes inside hub.Broadcast.
func newMCPHandler(t *testing.T) (*MCPHandler, sqlmock.Sqlmock) {
t.Helper()
mock := setupTestDB(t)
h := NewMCPHandler(db.DB, events.NewBroadcaster(nil))
h := NewMCPHandler(db.DB, newTestBroadcaster())
return h, mock
}
@@ -628,6 +631,170 @@ func TestMCPHandler_SendMessageToUser_Blocked_WhenEnvNotSet(t *testing.T) {
}
}
// TestMCPHandler_SendMessageToUser_DBErrorLogsAndStill200s pins the
// "best-effort persistence" contract: when the activity_log INSERT
// fails (DB hiccup, constraint violation, transient connection drop),
// the tool MUST still return success to the agent because the WS
// broadcast already succeeded — the user has seen the message.
//
// This matches /notify (activity.go) behavior. Returning an error
// here would cause the agent to retry and re-broadcast, double-
// rendering the message in the user's live chat panel for every
// retry until the DB recovers.
func TestMCPHandler_SendMessageToUser_DBErrorLogsAndStill200s(t *testing.T) {
t.Setenv("MOLECULE_MCP_ALLOW_SEND_MESSAGE", "true")
h, mock := newMCPHandler(t)
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-err").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("CEO Ryan PC"))
// INSERT fails — must NOT abort the tool response.
mock.ExpectExec(`INSERT INTO activity_logs.*'a2a_receive'.*'notify'`).
WillReturnError(errors.New("transient db error"))
w := mcpPost(t, h, "ws-err", map[string]interface{}{
"jsonrpc": "2.0",
"id": 100,
"method": "tools/call",
"params": map[string]interface{}{
"name": "send_message_to_user",
"arguments": map[string]interface{}{
"message": "should not be lost from the live chat",
},
},
})
var resp mcpResponse
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("response was not valid JSON-RPC: %v", err)
}
// Tool response is success — INSERT failure logged, broadcast
// already succeeded.
if resp.Error != nil {
t.Errorf("tool response should be success on DB error (broadcast won), got JSON-RPC error: %+v", resp.Error)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("expected DB calls in order: %v", err)
}
}
// TestMCPHandler_SendMessageToUser_ResponseBodyShape pins the
// response_body JSON shape stored in activity_logs. This shape MUST
// match what the canvas hydrater (extractResponseText in
// historyHydration.ts) reads — specifically `{"result": "<text>"}`.
// Any drift in the JSON shape silently breaks chat history without
// failing the INSERT.
//
// Caught the same drift class flagged in
// feedback_assert_exact_not_substring.md: a substring match on
// "result" would pass even if the field were renamed; we assert the
// exact JSON shape.
func TestMCPHandler_SendMessageToUser_ResponseBodyShape(t *testing.T) {
t.Setenv("MOLECULE_MCP_ALLOW_SEND_MESSAGE", "true")
h, mock := newMCPHandler(t)
const userMessage = "Hi there from the agent"
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-shape").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("CEO Ryan PC"))
// Capture the response_body argument and assert its exact shape.
mock.ExpectExec(`INSERT INTO activity_logs.*'a2a_receive'.*'notify'`).
WithArgs(
"ws-shape",
sqlmock.AnyArg(), // summary
// The response_body MUST be JSON `{"result": "<message>"}`.
// Any other shape (e.g., wrapping in a Task object) breaks
// the canvas hydrater's `body.result` extractor.
`{"result":"`+userMessage+`"}`,
).
WillReturnResult(sqlmock.NewResult(1, 1))
w := mcpPost(t, h, "ws-shape", map[string]interface{}{
"jsonrpc": "2.0",
"id": 101,
"method": "tools/call",
"params": map[string]interface{}{
"name": "send_message_to_user",
"arguments": map[string]interface{}{
"message": userMessage,
},
},
})
if w.Code != 200 {
t.Fatalf("expected 200, got %d body=%s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("response_body shape drift — would silently break canvas chat history: %v", err)
}
}
// TestMCPHandler_SendMessageToUser_PersistsToActivityLog pins the fix
// for the reno-stars / CEO Ryan PC chat-history data-loss bug:
// external claude-code agents using molecule-mcp's send_message_to_user
// tool route through THIS handler (not the HTTP /notify endpoint),
// and the handler used to broadcast WS only — visible live, gone on
// reload because nothing wrote to activity_logs.
//
// Pins:
// - INSERT happens on the success path (broadcast + DB write).
// - INSERT shape mirrors the HTTP /notify handler exactly:
// activity_type='a2a_receive', method='notify', request_body NULL,
// response_body={"result": message}, status='ok'. The canvas
// hydration query (`type=a2a_receive&source=canvas`) treats
// both writers as the same shape — drift here means the bug
// re-surfaces silently.
func TestMCPHandler_SendMessageToUser_PersistsToActivityLog(t *testing.T) {
t.Setenv("MOLECULE_MCP_ALLOW_SEND_MESSAGE", "true")
h, mock := newMCPHandler(t)
// Workspace lookup — the handler verifies the workspace exists
// before it does anything else. Returning a name lets the
// broadcast payload populate; the test doesn't assert on the
// broadcast (no observable WS in this fake), only on the DB.
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-msg").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("CEO Ryan PC"))
// The persistence INSERT — pin the exact shape so a future
// refactor that switches columns or drops `method='notify'`
// breaks the test loud, not silently. Match by regex on the
// table + activity_type + method literals.
mock.ExpectExec(`INSERT INTO activity_logs.*'a2a_receive'.*'notify'`).
WithArgs(
"ws-msg",
sqlmock.AnyArg(), // summary "Agent message: ..."
sqlmock.AnyArg(), // response_body JSON
).
WillReturnResult(sqlmock.NewResult(1, 1))
w := mcpPost(t, h, "ws-msg", map[string]interface{}{
"jsonrpc": "2.0",
"id": 99,
"method": "tools/call",
"params": map[string]interface{}{
"name": "send_message_to_user",
"arguments": map[string]interface{}{
"message": "Hello, this should persist!",
},
},
})
var resp mcpResponse
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("response was not valid JSON-RPC: %v\nbody=%s", err, w.Body.String())
}
if resp.Error != nil {
t.Errorf("unexpected JSON-RPC error: %+v", resp.Error)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("DB expectations not met (INSERT missing → reno-stars data-loss regression): %v", err)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// Parse error
// ─────────────────────────────────────────────────────────────────────────────
+17 -13
View File
@@ -11,6 +11,7 @@ import (
"context"
"database/sql"
"encoding/json"
"errors"
"fmt"
"io"
"log"
@@ -330,20 +331,23 @@ func (h *MCPHandler) toolSendMessageToUser(ctx context.Context, workspaceID stri
return "", fmt.Errorf("send_message_to_user is not enabled on this MCP bridge (set MOLECULE_MCP_ALLOW_SEND_MESSAGE=true)")
}
var wsName string
err := h.database.QueryRowContext(ctx,
`SELECT name FROM workspaces WHERE id = $1 AND status != 'removed'`, workspaceID,
).Scan(&wsName)
if err != nil {
return "", fmt.Errorf("workspace not found")
// Single source of truth for chat-bearing agent → user messages —
// see agent_message_writer.go for the contract. The pre-RFC-#2945
// duplication of broadcast + INSERT logic between this handler and
// activity.go:Notify is what produced the reno-stars data-loss
// regression; both paths now route through the same writer.
//
// MCP send_message_to_user does not currently surface attachments
// (the tool args don't accept them); pass nil. If a future tool
// schema adds an attachments arg, build []AgentMessageAttachment
// and pass through.
writer := NewAgentMessageWriter(h.database, h.broadcaster)
if err := writer.Send(ctx, workspaceID, message, nil); err != nil {
if errors.Is(err, ErrWorkspaceNotFound) {
return "", fmt.Errorf("workspace not found")
}
return "", err
}
h.broadcaster.BroadcastOnly(workspaceID, "AGENT_MESSAGE", map[string]interface{}{
"message": message,
"workspace_id": workspaceID,
"name": wsName,
})
return "Message sent.", nil
}
@@ -0,0 +1,457 @@
package handlers
// memories_v2.go — HTTP endpoints that expose Memory v2 plugin state to
// the canvas Memory tab. Reads-only; writes still go through the MCP
// path (see mcp_tools_memory_v2.go) where SAFE-T1201 redaction +
// org-write audit happen at a single funnel.
//
// Why a separate v2 endpoint set rather than retrofitting memories.go:
//
// - memories.go reads `agent_memories` (legacy v1 table). After the
// 2026-05-05 cutover, agent commits go to the plugin's
// memory_records — agent_memories is frozen. The canvas Memory
// tab reading memories.go shows STALE data.
// - The plugin is loopback-only on each tenant (127.0.0.1:9100), so
// the canvas (browser) cannot call it directly. workspace-server
// proxies through these endpoints.
// - v2 has different shape (namespace tree, kind/source/pin/TTL,
// score) — overloading memories.go would break v1 consumers
// (admin export, the back-compat MCP shim).
//
// All endpoints sit under the same wsAuth group memories.go uses,
// so the existing per-tenant token gates them automatically.
import (
"errors"
"log"
"net/http"
"strconv"
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/client"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/contract"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/namespace"
"github.com/gin-gonic/gin"
)
// MemoriesV2Handler bundles the plugin client + namespace resolver
// behind a slim HTTP surface. Construction matches the rest of the
// handlers package: NewMemoriesV2Handler followed by WithMemoryV2 (or
// the test-only withMemoryV2APIs) at boot.
type MemoriesV2Handler struct {
plugin memoryPluginAPI
resolver namespaceResolverAPI
}
// NewMemoriesV2Handler constructs an unwired handler. Every method
// returns 503 until WithMemoryV2 is called — keeps a partial deploy
// (MEMORY_PLUGIN_URL absent) from crashing the canvas with 500s.
func NewMemoriesV2Handler() *MemoriesV2Handler {
return &MemoriesV2Handler{}
}
// WithMemoryV2 attaches the live plugin client + resolver. Returns
// the receiver for fluent boot-time wiring, mirroring MCPHandler.
func (h *MemoriesV2Handler) WithMemoryV2(plugin *client.Client, resolver *namespace.Resolver) *MemoriesV2Handler {
h.plugin = plugin
h.resolver = resolver
return h
}
// withMemoryV2APIs is the test-only injection path: takes the
// interfaces directly so unit tests don't have to construct a real
// *client.Client / namespace.Resolver. Keep symmetric with
// MCPHandler.withMemoryV2APIs so handler tests can re-use the same
// stubs.
func (h *MemoriesV2Handler) withMemoryV2APIs(plugin memoryPluginAPI, resolver namespaceResolverAPI) *MemoriesV2Handler {
h.plugin = plugin
h.resolver = resolver
return h
}
// available reports whether the v2 deps are wired. Each route checks
// this and returns 503 + a clear hint when the plugin isn't
// configured, matching the MCP-side error.
func (h *MemoriesV2Handler) available() error {
if h == nil || h.plugin == nil || h.resolver == nil {
return errors.New("memory plugin is not configured (set MEMORY_PLUGIN_URL)")
}
return nil
}
// ─────────────────────────────────────────────────────────────────────────────
// GET /workspaces/:id/v2/namespaces
//
// Returns the namespace tree the canvas uses to drive the Memory tab's
// namespace dropdown. Two arrays:
//
// - readable[]: every namespace this workspace can READ from. Drives
// the "show me memories from X" filter dropdown.
// - writable[]: subset of readable that this workspace can WRITE to.
// Used for future canvas-side commit (not in this PR but the
// contract is symmetric so the dropdown can disable read-only
// entries when wiring up commit).
//
// Each entry carries name + kind + a friendly label so the canvas
// doesn't have to parse `workspace:abc-123` itself. Kind ranks the
// dropdown grouping (workspace → team → org → custom).
// ─────────────────────────────────────────────────────────────────────────────
// NamespaceView is the UI-friendly DTO returned by GET v2/namespaces.
// Internal namespace.Namespace has fields the canvas doesn't need
// (resolver-internal flags, raw metadata blobs); this strips it down.
type NamespaceView struct {
Name string `json:"name"`
Kind contract.NamespaceKind `json:"kind"`
// Label is a stable display string the canvas can render directly.
// For workspace:<id> it's "Workspace (<short-id>)"; for team:<id>
// it's "Team (<short-id>)"; org/custom carry the raw suffix.
Label string `json:"label"`
}
// NamespacesResponse is the body of GET v2/namespaces.
type NamespacesResponse struct {
Readable []NamespaceView `json:"readable"`
Writable []NamespaceView `json:"writable"`
}
// Namespaces handles GET /workspaces/:id/v2/namespaces.
func (h *MemoriesV2Handler) Namespaces(c *gin.Context) {
if err := h.available(); err != nil {
c.JSON(http.StatusServiceUnavailable, gin.H{"error": err.Error()})
return
}
workspaceID := c.Param("id")
ctx := c.Request.Context()
readable, err := h.resolver.ReadableNamespaces(ctx, workspaceID)
if err != nil {
log.Printf("v2/namespaces readable error workspace=%s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to resolve readable namespaces"})
return
}
writable, err := h.resolver.WritableNamespaces(ctx, workspaceID)
if err != nil {
log.Printf("v2/namespaces writable error workspace=%s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to resolve writable namespaces"})
return
}
c.JSON(http.StatusOK, NamespacesResponse{
Readable: namespacesToViews(readable),
Writable: namespacesToViews(writable),
})
}
// ─────────────────────────────────────────────────────────────────────────────
// GET /workspaces/:id/v2/memories
//
// Search the plugin for memories visible to this workspace.
//
// Query params (all optional):
// - namespace: a single readable namespace to scope to. Omitted ⇒ all
// readable namespaces (dropdown's "All" mode).
// - q: full-text query string. Empty ⇒ recency-ordered listing.
// - kind: one of fact|summary|checkpoint. Empty ⇒ all kinds.
// - limit: max rows. Defaults to 50, clamped to 100. Matches the
// v1 endpoint's clamp shape (memories.go:memoryRecallMaxLimit).
//
// Server-side ACL invariant: the request is ALWAYS intersected with
// the resolver's readable set on the server. A canvas-supplied
// `namespace=foo:bar` that this workspace can't read returns an empty
// list, NOT 403 — the canvas dropdown is built from /v2/namespaces
// so a forbidden value is a stale-cache bug, not malice. Existence
// non-inference: empty result is indistinguishable from "you can't
// read this namespace" — same as the wsAuth-protected v1 endpoints.
// ─────────────────────────────────────────────────────────────────────────────
const memoriesV2DefaultLimit = 50
const memoriesV2MaxLimit = 100
// Search handles GET /workspaces/:id/v2/memories.
func (h *MemoriesV2Handler) Search(c *gin.Context) {
if err := h.available(); err != nil {
c.JSON(http.StatusServiceUnavailable, gin.H{"error": err.Error()})
return
}
workspaceID := c.Param("id")
ctx := c.Request.Context()
requestedNS := c.Query("namespace")
query := c.Query("q")
kindStr := c.Query("kind")
limit := parseLimit(c.Query("limit"))
// Resolve the readable set, then intersect the request.
// IntersectReadable handles both the empty-request case (return
// all readable) and the explicit-namespace case (return [ns] iff
// readable, else []).
var requested []string
if requestedNS != "" {
requested = []string{requestedNS}
}
scopedNamespaces, err := h.resolver.IntersectReadable(ctx, workspaceID, requested)
if err != nil {
log.Printf("v2/memories intersect error workspace=%s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to resolve namespaces"})
return
}
// Empty after intersection — caller asked for a namespace they
// can't read, OR they have no readable namespaces at all. Return
// [] (not 404) so the canvas can render its empty-state without
// special-casing.
if len(scopedNamespaces) == 0 {
c.JSON(http.StatusOK, MemoriesResponse{Memories: []MemoryView{}})
return
}
req := contract.SearchRequest{
Namespaces: scopedNamespaces,
Query: query,
Limit: limit,
}
if kindStr != "" {
req.Kinds = []contract.MemoryKind{contract.MemoryKind(kindStr)}
}
resp, err := h.plugin.Search(ctx, req)
if err != nil {
log.Printf("v2/memories plugin error workspace=%s: %v", workspaceID, err)
c.JSON(http.StatusBadGateway, gin.H{"error": "memory plugin search failed"})
return
}
out := MemoriesResponse{Memories: make([]MemoryView, 0, len(resp.Memories))}
for _, m := range resp.Memories {
out.Memories = append(out.Memories, memoryToView(m))
}
c.JSON(http.StatusOK, out)
}
// ─────────────────────────────────────────────────────────────────────────────
// DELETE /workspaces/:id/v2/memories/:memoryId
//
// Forget a memory. The plugin enforces its own ownership model — we
// pass `requested_by_namespace = workspace:<id>` so the audit trail
// records who initiated the forget; the plugin's ACL gate decides
// whether the deletion is allowed.
//
// 404 (not 403) on a missing or non-owned memory: existence-non-
// inferring response, matches the v1 DELETE in memories.go.
// ─────────────────────────────────────────────────────────────────────────────
// Forget handles DELETE /workspaces/:id/v2/memories/:memoryId.
func (h *MemoriesV2Handler) Forget(c *gin.Context) {
if err := h.available(); err != nil {
c.JSON(http.StatusServiceUnavailable, gin.H{"error": err.Error()})
return
}
workspaceID := c.Param("id")
memoryID := c.Param("memoryId")
ctx := c.Request.Context()
if memoryID == "" {
c.JSON(http.StatusBadRequest, gin.H{"error": "memoryId is required"})
return
}
body := contract.ForgetRequest{
RequestedByNamespace: "workspace:" + workspaceID,
}
if err := h.plugin.ForgetMemory(ctx, memoryID, body); err != nil {
// Map plugin not_found → 404. Anything else is upstream error.
var ce *contract.Error
if errors.As(err, &ce) && ce.Code == contract.ErrorCodeNotFound {
c.JSON(http.StatusNotFound, gin.H{"error": "memory not found"})
return
}
log.Printf("v2/memories forget error workspace=%s memory=%s: %v", workspaceID, memoryID, err)
c.JSON(http.StatusBadGateway, gin.H{"error": "memory plugin delete failed"})
return
}
c.JSON(http.StatusOK, gin.H{"status": "deleted"})
}
// ─────────────────────────────────────────────────────────────────────────────
// View shaping helpers
// ─────────────────────────────────────────────────────────────────────────────
// MemoryView is the canvas-facing shape of a v2 memory record. The raw
// contract.Memory carries internal fields we don't expose (raw
// `propagation` blob); MemoryView strips it to what the Memory tab
// renders.
type MemoryView struct {
ID string `json:"id"`
Namespace string `json:"namespace"`
Content string `json:"content"`
Kind contract.MemoryKind `json:"kind"`
Source contract.MemorySource `json:"source"`
Pin bool `json:"pin"`
ExpiresAt *time.Time `json:"expires_at,omitempty"`
CreatedAt time.Time `json:"created_at"`
// Score is the plugin's similarity score (1.0 = exact); only
// populated when ?q= is set and the plugin supports embedding.
Score *float64 `json:"score,omitempty"`
// SourceWorkspaceID is parsed out of `propagation.source_workspace_id`
// when present (cross-workspace propagation) — lets the canvas
// render a "from <peer>" badge so users can tell their own writes
// apart from team-shared memory.
SourceWorkspaceID string `json:"source_workspace_id,omitempty"`
}
// MemoriesResponse is the body of GET v2/memories.
type MemoriesResponse struct {
Memories []MemoryView `json:"memories"`
}
func memoryToView(m contract.Memory) MemoryView {
v := MemoryView{
ID: m.ID,
Namespace: m.Namespace,
Content: m.Content,
Kind: m.Kind,
Source: m.Source,
Pin: m.Pin,
ExpiresAt: m.ExpiresAt,
CreatedAt: m.CreatedAt,
Score: m.Score,
}
if m.Propagation != nil {
// `source_workspace_id` is a propagation contract field
// (RFC #2728 §5). Plugin emits it on writes that originated
// from a different workspace. Best-effort string extraction —
// don't fail rendering if shape drifts.
if raw, ok := m.Propagation["source_workspace_id"]; ok {
if s, ok := raw.(string); ok && s != "" {
v.SourceWorkspaceID = s
}
}
}
return v
}
// namespacesToViews converts resolver namespaces into UI-friendly
// views. Prefers `DisplayName` from the resolver (workspace.name from
// the DB) when present; falls back to a UUID-prefix label.
//
// Issue #2988: pre-fix, every namespace used a shortID-truncated UUID
// label. On a root workspace where workspace==team==org IDs collide
// (resolver derive() degenerate case), all three labels rendered
// identically. DisplayName disambiguates by surfacing real workspace
// names — the canvas dropdown now reads "Workspace (mac laptop)" /
// "Team (mac laptop)" / "Org (mac laptop)" for a root workspace
// rather than three identical UUID prefixes. The `kind` prefix
// "Workspace/Team/Org" still carries the semantic distinction.
func namespacesToViews(in []namespace.Namespace) []NamespaceView {
views := make([]NamespaceView, 0, len(in))
for _, n := range in {
views = append(views, NamespaceView{
Name: n.Name,
Kind: n.Kind,
Label: namespaceLabelWithName(n.Name, n.Kind, n.DisplayName),
})
}
return views
}
// namespaceLabel renders a human-friendly label for a namespace using
// the UUID-prefix fallback only. Kept for back-compat with callers
// that don't yet plumb a display name. New callers should use
// namespaceLabelWithName which prefers the workspace's display name
// when available.
//
// Format (UUID-prefix fallback):
// workspace:abc-123 → "Workspace (abc-123)"
// team:t-1 → "Team (t-1)"
// org:acme → "Org (acme)"
// custom:foo → "foo"
func namespaceLabel(name string, kind contract.NamespaceKind) string {
return namespaceLabelWithName(name, kind, "")
}
// namespaceLabelWithName renders the human-friendly label, preferring
// `displayName` when non-empty.
//
// When displayName is set:
// Workspace, "mac laptop" → "Workspace (mac laptop)"
// Team, "Engineering team" → "Team (Engineering team)"
// Org, "Hongming's Org" → "Org (Hongming's Org)"
//
// When displayName is empty (lookup miss, future-migration drop, etc.),
// falls back to the UUID-prefix shape for back-compat.
//
// Custom namespaces ignore displayName because they're operator-defined
// — the operator chose the raw suffix as the label, surfacing a
// different "name" would be a UX surprise.
func namespaceLabelWithName(name string, kind contract.NamespaceKind, displayName string) string {
suffix := ""
if i := indexOfColon(name); i >= 0 && i+1 < len(name) {
suffix = name[i+1:]
}
switch kind {
case contract.NamespaceKindWorkspace:
if displayName != "" {
return "Workspace (" + displayName + ")"
}
return "Workspace (" + shortID(suffix) + ")"
case contract.NamespaceKindTeam:
if displayName != "" {
return "Team (" + displayName + ")"
}
return "Team (" + shortID(suffix) + ")"
case contract.NamespaceKindOrg:
if displayName != "" {
return "Org (" + displayName + ")"
}
return "Org (" + suffix + ")"
case contract.NamespaceKindCustom:
// Operator-defined; the suffix IS the label they chose.
// displayName is ignored — surfacing a different name would
// be a UX surprise for an operator who deliberately named
// the namespace.
if suffix == "" {
return name
}
return suffix
default:
return name
}
}
// shortID truncates a UUID-like string to the first 8 chars so the
// dropdown stays readable. Keeps the full id available via the
// `name` field for click-to-copy / debugging.
func shortID(s string) string {
if len(s) <= 8 {
return s
}
return s[:8]
}
// indexOfColon is strings.IndexByte without the import, kept inline so
// the helper stays trivially auditable next to namespaceLabel.
func indexOfColon(s string) int {
for i := 0; i < len(s); i++ {
if s[i] == ':' {
return i
}
}
return -1
}
// parseLimit validates the ?limit= query value. Defaults +
// clamps mirror memoriesV2DefaultLimit / memoriesV2MaxLimit.
func parseLimit(raw string) int {
if raw == "" {
return memoriesV2DefaultLimit
}
n, err := strconv.Atoi(raw)
if err != nil || n <= 0 {
return memoriesV2DefaultLimit
}
if n > memoriesV2MaxLimit {
return memoriesV2MaxLimit
}
return n
}
@@ -0,0 +1,755 @@
package handlers
// memories_v2_test.go — comprehensive coverage for the Memory v2
// canvas-facing HTTP surface. Pinned shape:
//
// - 503 path when plugin unwired (every route)
// - GET /v2/namespaces success + readable/writable propagation
// - GET /v2/namespaces error path (resolver failure on either call)
// - GET /v2/memories: empty intersection, namespace passthrough,
// query+kind+limit propagation, plugin error mapping
// - DELETE /v2/memories/:id: success, plugin not_found→404, other
// plugin errors→502, missing memoryId→400
// - View shaping: namespaceLabel for all four kinds + truncation,
// memoryToView with/without propagation source, parseLimit edge
// cases (default, negative, zero, over-cap, non-numeric)
//
// Tests use the same `memoryPluginAPI` / `namespaceResolverAPI` fakes
// the MCP v2 tests use so we don't spin up a real plugin server.
import (
"context"
"encoding/json"
"errors"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/contract"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/namespace"
"github.com/gin-gonic/gin"
)
// ─────────────────────────────────────────────────────────────────────────────
// Fakes
// ─────────────────────────────────────────────────────────────────────────────
type fakePlugin struct {
searchResp *contract.SearchResponse
searchErr error
searchReq contract.SearchRequest // captured for assertion
forgetErr error
forgetID string
forgetReq contract.ForgetRequest
}
func (f *fakePlugin) CommitMemory(ctx context.Context, ns string, body contract.MemoryWrite) (*contract.MemoryWriteResponse, error) {
return nil, errors.New("not implemented in fake")
}
func (f *fakePlugin) Search(ctx context.Context, body contract.SearchRequest) (*contract.SearchResponse, error) {
f.searchReq = body
if f.searchErr != nil {
return nil, f.searchErr
}
return f.searchResp, nil
}
func (f *fakePlugin) ForgetMemory(ctx context.Context, id string, body contract.ForgetRequest) error {
f.forgetID = id
f.forgetReq = body
return f.forgetErr
}
type fakeNSResolver struct {
readable []namespace.Namespace
readableErr error
writable []namespace.Namespace
writableErr error
intersect []string
intersectErr error
intersectIn []string // captured
}
func (f *fakeNSResolver) ReadableNamespaces(ctx context.Context, ws string) ([]namespace.Namespace, error) {
return f.readable, f.readableErr
}
func (f *fakeNSResolver) WritableNamespaces(ctx context.Context, ws string) ([]namespace.Namespace, error) {
return f.writable, f.writableErr
}
func (f *fakeNSResolver) CanWrite(ctx context.Context, ws, ns string) (bool, error) {
return true, nil
}
func (f *fakeNSResolver) IntersectReadable(ctx context.Context, ws string, requested []string) ([]string, error) {
f.intersectIn = requested
return f.intersect, f.intersectErr
}
// ─────────────────────────────────────────────────────────────────────────────
// Test helpers
// ─────────────────────────────────────────────────────────────────────────────
func init() {
gin.SetMode(gin.TestMode)
}
// newWiredHandler returns a handler with both the fake plugin + fake
// resolver attached. Tests that need the unwired (503) path use
// NewMemoriesV2Handler() directly.
func newWiredHandler(p *fakePlugin, r *fakeNSResolver) *MemoriesV2Handler {
return NewMemoriesV2Handler().withMemoryV2APIs(p, r)
}
func doRequest(t *testing.T, h *MemoriesV2Handler, method, path string, params gin.Params) *httptest.ResponseRecorder {
t.Helper()
rec := httptest.NewRecorder()
c, _ := gin.CreateTestContext(rec)
c.Params = params
req := httptest.NewRequest(method, path, nil)
c.Request = req
switch {
case method == http.MethodGet && strings.HasSuffix(path, "/v2/namespaces"):
h.Namespaces(c)
case method == http.MethodGet && strings.Contains(path, "/v2/memories"):
h.Search(c)
case method == http.MethodDelete:
h.Forget(c)
default:
t.Fatalf("doRequest: don't know how to dispatch %s %s", method, path)
}
return rec
}
func mustJSON(t *testing.T, body []byte, out interface{}) {
t.Helper()
if err := json.Unmarshal(body, out); err != nil {
t.Fatalf("json decode: %v\nbody=%s", err, string(body))
}
}
// ─────────────────────────────────────────────────────────────────────────────
// 503 — plugin unwired
// ─────────────────────────────────────────────────────────────────────────────
func TestMemoriesV2_PluginUnwired_All503(t *testing.T) {
h := NewMemoriesV2Handler() // no WithMemoryV2 / withMemoryV2APIs
cases := []struct {
name string
method string
path string
params gin.Params
}{
{"namespaces", http.MethodGet, "/workspaces/ws-a/v2/namespaces", gin.Params{{Key: "id", Value: "ws-a"}}},
{"search", http.MethodGet, "/workspaces/ws-a/v2/memories", gin.Params{{Key: "id", Value: "ws-a"}}},
{"forget", http.MethodDelete, "/workspaces/ws-a/v2/memories/m-1", gin.Params{{Key: "id", Value: "ws-a"}, {Key: "memoryId", Value: "m-1"}}},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
rec := doRequest(t, h, tc.method, tc.path, tc.params)
if rec.Code != http.StatusServiceUnavailable {
t.Errorf("expected 503, got %d", rec.Code)
}
var body map[string]string
mustJSON(t, rec.Body.Bytes(), &body)
if !strings.Contains(body["error"], "MEMORY_PLUGIN_URL") {
t.Errorf("503 body missing operator hint, got: %q", body["error"])
}
})
}
}
// ─────────────────────────────────────────────────────────────────────────────
// GET /v2/namespaces
// ─────────────────────────────────────────────────────────────────────────────
func TestMemoriesV2_Namespaces_Success(t *testing.T) {
resolver := &fakeNSResolver{
readable: []namespace.Namespace{
{Name: "workspace:abc-1234-5678", Kind: contract.NamespaceKindWorkspace},
{Name: "team:t-99", Kind: contract.NamespaceKindTeam},
{Name: "org:acme", Kind: contract.NamespaceKindOrg},
{Name: "custom:special", Kind: contract.NamespaceKindCustom},
},
writable: []namespace.Namespace{
{Name: "workspace:abc-1234-5678", Kind: contract.NamespaceKindWorkspace},
},
}
h := newWiredHandler(&fakePlugin{}, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/namespaces",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != 200 {
t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
}
var body NamespacesResponse
mustJSON(t, rec.Body.Bytes(), &body)
if len(body.Readable) != 4 {
t.Errorf("expected 4 readable, got %d", len(body.Readable))
}
if len(body.Writable) != 1 {
t.Errorf("expected 1 writable, got %d", len(body.Writable))
}
// Label shaping pinned exactly — drift would silently break the
// dropdown rendering.
wantLabels := map[string]string{
"workspace:abc-1234-5678": "Workspace (abc-1234)",
"team:t-99": "Team (t-99)",
"org:acme": "Org (acme)",
"custom:special": "special",
}
for _, v := range body.Readable {
want, ok := wantLabels[v.Name]
if !ok {
t.Errorf("unexpected namespace name %q", v.Name)
continue
}
if v.Label != want {
t.Errorf("namespace %q: want label %q, got %q", v.Name, want, v.Label)
}
}
}
func TestMemoriesV2_Namespaces_ReadableError(t *testing.T) {
resolver := &fakeNSResolver{readableErr: errors.New("boom")}
h := newWiredHandler(&fakePlugin{}, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/namespaces",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != http.StatusInternalServerError {
t.Errorf("expected 500, got %d", rec.Code)
}
}
func TestMemoriesV2_Namespaces_WritableError(t *testing.T) {
resolver := &fakeNSResolver{
readable: []namespace.Namespace{},
writableErr: errors.New("boom"),
}
h := newWiredHandler(&fakePlugin{}, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/namespaces",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != http.StatusInternalServerError {
t.Errorf("expected 500, got %d", rec.Code)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// GET /v2/memories — search path
// ─────────────────────────────────────────────────────────────────────────────
func TestMemoriesV2_Search_NoReadableNamespaces_EmptyResult(t *testing.T) {
// Empty intersection (e.g. workspace just provisioned, plugin
// hasn't created namespaces yet, OR caller asked for ns they
// can't read). Expected: 200 with empty memories array, NOT 404.
resolver := &fakeNSResolver{intersect: []string{}}
plugin := &fakePlugin{searchResp: &contract.SearchResponse{Memories: []contract.Memory{}}}
h := newWiredHandler(plugin, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/memories",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != 200 {
t.Errorf("expected 200, got %d", rec.Code)
}
var body MemoriesResponse
mustJSON(t, rec.Body.Bytes(), &body)
if body.Memories == nil {
t.Error("Memories should be empty array, not nil — JSON would render null")
}
if len(body.Memories) != 0 {
t.Errorf("expected empty memories, got %d", len(body.Memories))
}
// Plugin must NOT be called when intersection is empty.
if plugin.searchReq.Namespaces != nil {
t.Error("plugin Search should not be called when intersection is empty")
}
}
func TestMemoriesV2_Search_FullPath_NamespaceQueryKindLimit(t *testing.T) {
expiresAt := time.Now().Add(24 * time.Hour)
resolver := &fakeNSResolver{intersect: []string{"workspace:ws-a"}}
score := 0.87
plugin := &fakePlugin{
searchResp: &contract.SearchResponse{
Memories: []contract.Memory{
{
ID: "m-1",
Namespace: "workspace:ws-a",
Content: "fact one",
Kind: contract.MemoryKindFact,
Source: contract.MemorySourceAgent,
Pin: true,
ExpiresAt: &expiresAt,
CreatedAt: time.Now(),
Score: &score,
Propagation: map[string]interface{}{
"source_workspace_id": "ws-peer-42",
},
},
},
},
}
h := newWiredHandler(plugin, resolver)
rec := httptest.NewRecorder()
c, _ := gin.CreateTestContext(rec)
c.Params = gin.Params{{Key: "id", Value: "ws-a"}}
c.Request = httptest.NewRequest(http.MethodGet,
"/workspaces/ws-a/v2/memories?namespace=workspace:ws-a&q=hello&kind=fact&limit=10", nil)
h.Search(c)
if rec.Code != 200 {
t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
}
// Resolver received the requested namespace as a single-element list
if len(resolver.intersectIn) != 1 || resolver.intersectIn[0] != "workspace:ws-a" {
t.Errorf("resolver.IntersectReadable received %v, want [workspace:ws-a]", resolver.intersectIn)
}
// Plugin received query + kind + limit propagated through
if plugin.searchReq.Query != "hello" {
t.Errorf("plugin.Query=%q, want hello", plugin.searchReq.Query)
}
if len(plugin.searchReq.Kinds) != 1 || plugin.searchReq.Kinds[0] != contract.MemoryKindFact {
t.Errorf("plugin.Kinds=%v, want [fact]", plugin.searchReq.Kinds)
}
if plugin.searchReq.Limit != 10 {
t.Errorf("plugin.Limit=%d, want 10", plugin.searchReq.Limit)
}
// Response shape — pin/expires_at/score/source_workspace_id all
// surfaced into MemoryView so the canvas doesn't have to dig
// through propagation map.
var body MemoriesResponse
mustJSON(t, rec.Body.Bytes(), &body)
if len(body.Memories) != 1 {
t.Fatalf("expected 1 memory, got %d", len(body.Memories))
}
m := body.Memories[0]
if !m.Pin {
t.Error("Pin not propagated")
}
if m.ExpiresAt == nil {
t.Error("ExpiresAt not propagated")
}
if m.Score == nil || *m.Score != 0.87 {
t.Errorf("Score=%v, want 0.87", m.Score)
}
if m.SourceWorkspaceID != "ws-peer-42" {
t.Errorf("SourceWorkspaceID=%q, want ws-peer-42", m.SourceWorkspaceID)
}
}
func TestMemoriesV2_Search_NoNamespaceQuery_AllReadable(t *testing.T) {
// No ?namespace= → resolver.IntersectReadable receives nil (empty
// requested) and returns ALL readable. Plugin gets full set.
resolver := &fakeNSResolver{intersect: []string{"workspace:ws-a", "team:t-1"}}
plugin := &fakePlugin{searchResp: &contract.SearchResponse{}}
h := newWiredHandler(plugin, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/memories",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != 200 {
t.Errorf("expected 200, got %d", rec.Code)
}
if resolver.intersectIn != nil {
t.Errorf("requested should be nil for unscoped query, got %v", resolver.intersectIn)
}
if len(plugin.searchReq.Namespaces) != 2 {
t.Errorf("plugin.Namespaces=%v, want both readable", plugin.searchReq.Namespaces)
}
}
func TestMemoriesV2_Search_IntersectError(t *testing.T) {
resolver := &fakeNSResolver{intersectErr: errors.New("db down")}
h := newWiredHandler(&fakePlugin{}, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/memories",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != http.StatusInternalServerError {
t.Errorf("expected 500, got %d", rec.Code)
}
}
func TestMemoriesV2_Search_PluginError(t *testing.T) {
resolver := &fakeNSResolver{intersect: []string{"workspace:ws-a"}}
plugin := &fakePlugin{searchErr: errors.New("plugin down")}
h := newWiredHandler(plugin, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/memories",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != http.StatusBadGateway {
t.Errorf("expected 502 (plugin error), got %d", rec.Code)
}
}
func TestMemoriesV2_Search_PropagationMissing_NoSourceWorkspaceID(t *testing.T) {
resolver := &fakeNSResolver{intersect: []string{"workspace:ws-a"}}
plugin := &fakePlugin{
searchResp: &contract.SearchResponse{
Memories: []contract.Memory{
{ID: "m-1", Namespace: "workspace:ws-a", Content: "no propagation"},
},
},
}
h := newWiredHandler(plugin, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/memories",
gin.Params{{Key: "id", Value: "ws-a"}})
var body MemoriesResponse
mustJSON(t, rec.Body.Bytes(), &body)
if len(body.Memories) != 1 || body.Memories[0].SourceWorkspaceID != "" {
t.Errorf("SourceWorkspaceID should be empty when propagation is nil, got %q", body.Memories[0].SourceWorkspaceID)
}
}
func TestMemoriesV2_Search_PropagationWrongType_DoesNotPanic(t *testing.T) {
resolver := &fakeNSResolver{intersect: []string{"workspace:ws-a"}}
plugin := &fakePlugin{
searchResp: &contract.SearchResponse{
Memories: []contract.Memory{
{
ID: "m-1",
Content: "wrong-type propagation",
Propagation: map[string]interface{}{
"source_workspace_id": 12345, // int, not string
},
},
},
},
}
h := newWiredHandler(plugin, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/memories",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != 200 {
t.Fatalf("expected 200 (graceful), got %d", rec.Code)
}
var body MemoriesResponse
mustJSON(t, rec.Body.Bytes(), &body)
// Wrong-typed prop entry → empty SourceWorkspaceID, no panic.
if body.Memories[0].SourceWorkspaceID != "" {
t.Errorf("expected empty SourceWorkspaceID for non-string propagation, got %q", body.Memories[0].SourceWorkspaceID)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// DELETE /v2/memories/:memoryId
// ─────────────────────────────────────────────────────────────────────────────
func TestMemoriesV2_Forget_Success(t *testing.T) {
plugin := &fakePlugin{} // forgetErr nil
h := newWiredHandler(plugin, &fakeNSResolver{})
rec := doRequest(t, h, http.MethodDelete, "/workspaces/ws-a/v2/memories/mem-42",
gin.Params{{Key: "id", Value: "ws-a"}, {Key: "memoryId", Value: "mem-42"}})
if rec.Code != 200 {
t.Errorf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
}
if plugin.forgetID != "mem-42" {
t.Errorf("plugin received memoryID=%q, want mem-42", plugin.forgetID)
}
if plugin.forgetReq.RequestedByNamespace != "workspace:ws-a" {
t.Errorf("requested_by_namespace=%q, want workspace:ws-a", plugin.forgetReq.RequestedByNamespace)
}
}
func TestMemoriesV2_Forget_PluginNotFound_Maps404(t *testing.T) {
plugin := &fakePlugin{
forgetErr: &contract.Error{Code: contract.ErrorCodeNotFound, Message: "no such memory"},
}
h := newWiredHandler(plugin, &fakeNSResolver{})
rec := doRequest(t, h, http.MethodDelete, "/workspaces/ws-a/v2/memories/m-1",
gin.Params{{Key: "id", Value: "ws-a"}, {Key: "memoryId", Value: "m-1"}})
if rec.Code != http.StatusNotFound {
t.Errorf("expected 404, got %d", rec.Code)
}
}
func TestMemoriesV2_Forget_PluginOtherError_Maps502(t *testing.T) {
plugin := &fakePlugin{
forgetErr: &contract.Error{Code: contract.ErrorCodeInternal, Message: "db dead"},
}
h := newWiredHandler(plugin, &fakeNSResolver{})
rec := doRequest(t, h, http.MethodDelete, "/workspaces/ws-a/v2/memories/m-1",
gin.Params{{Key: "id", Value: "ws-a"}, {Key: "memoryId", Value: "m-1"}})
if rec.Code != http.StatusBadGateway {
t.Errorf("expected 502, got %d", rec.Code)
}
}
func TestMemoriesV2_Forget_NonContractError_Maps502(t *testing.T) {
// A raw error (e.g. transport failure) — not a contract.Error —
// also bubbles up as 502.
plugin := &fakePlugin{forgetErr: errors.New("connection reset")}
h := newWiredHandler(plugin, &fakeNSResolver{})
rec := doRequest(t, h, http.MethodDelete, "/workspaces/ws-a/v2/memories/m-1",
gin.Params{{Key: "id", Value: "ws-a"}, {Key: "memoryId", Value: "m-1"}})
if rec.Code != http.StatusBadGateway {
t.Errorf("expected 502, got %d", rec.Code)
}
}
func TestMemoriesV2_Forget_MissingMemoryID_400(t *testing.T) {
h := newWiredHandler(&fakePlugin{}, &fakeNSResolver{})
rec := doRequest(t, h, http.MethodDelete, "/workspaces/ws-a/v2/memories/",
gin.Params{{Key: "id", Value: "ws-a"}, {Key: "memoryId", Value: ""}})
if rec.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d", rec.Code)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// View-shaping unit tests — pin individual helpers
// ─────────────────────────────────────────────────────────────────────────────
// namespaceLabelWithName tests — the new code path that prefers
// DisplayName over UUID-prefix fallback (issue #2988).
func TestNamespaceLabelWithName_PrefersDisplayNameWhenSet(t *testing.T) {
cases := []struct {
name string
raw string
kind contract.NamespaceKind
display string
want string
}{
{"workspace with name", "workspace:abc-1234", contract.NamespaceKindWorkspace, "mac laptop", "Workspace (mac laptop)"},
{"team with name", "team:abc-1234", contract.NamespaceKindTeam, "Engineering", "Team (Engineering)"},
{"org with name", "org:acme", contract.NamespaceKindOrg, "Hongming's Org", "Org (Hongming's Org)"},
// Custom ignores displayName by design — operator chose the suffix.
{"custom ignores displayName", "custom:ops-shared", contract.NamespaceKindCustom, "FancyName", "ops-shared"},
{"unknown kind falls through", "weird:x", contract.NamespaceKind("future"), "WhoCares", "weird:x"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got := namespaceLabelWithName(tc.raw, tc.kind, tc.display)
if got != tc.want {
t.Errorf("namespaceLabelWithName(%q, %q, %q) = %q, want %q",
tc.raw, tc.kind, tc.display, got, tc.want)
}
})
}
}
func TestNamespaceLabelWithName_FallsBackToUUIDPrefixWhenEmpty(t *testing.T) {
// When displayName is empty (NULL in DB, lookup miss, etc.), the
// label shape MUST match the legacy UUID-prefix shape exactly so
// existing canvas behaviour is unchanged for callers that don't
// plumb a name.
cases := []struct {
raw string
kind contract.NamespaceKind
want string
}{
{"workspace:abcdefghij", contract.NamespaceKindWorkspace, "Workspace (abcdefgh)"},
{"team:t-99", contract.NamespaceKindTeam, "Team (t-99)"},
{"org:acme", contract.NamespaceKindOrg, "Org (acme)"},
}
for _, tc := range cases {
got := namespaceLabelWithName(tc.raw, tc.kind, "")
if got != tc.want {
t.Errorf("displayName=\"\" path: got %q, want %q", got, tc.want)
}
}
}
func TestNamespacesToViews_PassesDisplayNameThrough(t *testing.T) {
in := []namespace.Namespace{
{Name: "workspace:root-1", Kind: contract.NamespaceKindWorkspace, DisplayName: "mac laptop"},
{Name: "team:root-1", Kind: contract.NamespaceKindTeam, DisplayName: "mac laptop"}, // root → team aliases self
{Name: "org:root-1", Kind: contract.NamespaceKindOrg, DisplayName: "mac laptop"},
}
out := namespacesToViews(in)
if len(out) != 3 {
t.Fatalf("len = %d, want 3", len(out))
}
wantLabels := []string{
"Workspace (mac laptop)",
"Team (mac laptop)",
"Org (mac laptop)",
}
for i, v := range out {
if v.Label != wantLabels[i] {
t.Errorf("[%d] label = %q, want %q", i, v.Label, wantLabels[i])
}
}
}
func TestNamespacesToViews_FallsBackToUUIDLabelWhenDisplayNameEmpty(t *testing.T) {
// Exercises the back-compat path — DisplayName="" plumbs through
// to namespaceLabelWithName which returns the legacy UUID-prefix
// label. This is what callers see when the workspaces table
// has a NULL name (defensive — workspaces.name is NOT NULL today).
in := []namespace.Namespace{
{Name: "workspace:root-1", Kind: contract.NamespaceKindWorkspace}, // no DisplayName
}
out := namespacesToViews(in)
if out[0].Label != "Workspace (root-1)" {
t.Errorf("fallback label = %q, want %q", out[0].Label, "Workspace (root-1)")
}
}
func TestNamespaceLabel_AllKinds(t *testing.T) {
cases := []struct {
name string
kind contract.NamespaceKind
want string
}{
{"workspace:abcdefghij", contract.NamespaceKindWorkspace, "Workspace (abcdefgh)"}, // truncated to 8
{"workspace:abc", contract.NamespaceKindWorkspace, "Workspace (abc)"}, // shorter than 8, kept as-is
{"team:t-99", contract.NamespaceKindTeam, "Team (t-99)"},
{"org:acme", contract.NamespaceKindOrg, "Org (acme)"},
{"custom:my-ns", contract.NamespaceKindCustom, "my-ns"},
{"custom:", contract.NamespaceKindCustom, "custom:"}, // empty suffix → fallback to raw name
{"weird-no-colon", contract.NamespaceKindWorkspace, "Workspace ()"},
{"unknown:x", contract.NamespaceKind("future"), "unknown:x"}, // unknown kind → fallback to raw name
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got := namespaceLabel(tc.name, tc.kind)
if got != tc.want {
t.Errorf("namespaceLabel(%q, %q) = %q, want %q", tc.name, tc.kind, got, tc.want)
}
})
}
}
func TestParseLimit(t *testing.T) {
cases := []struct {
raw string
want int
}{
{"", memoriesV2DefaultLimit},
{"10", 10},
{"0", memoriesV2DefaultLimit}, // ≤0 → default, not error
{"-5", memoriesV2DefaultLimit}, // negative → default
{"abc", memoriesV2DefaultLimit}, // non-numeric → default
{"99999", memoriesV2MaxLimit}, // over cap → clamped
{"100", memoriesV2MaxLimit}, // exactly cap → kept
{"99", 99}, // just under cap → kept
}
for _, tc := range cases {
t.Run("raw="+tc.raw, func(t *testing.T) {
if got := parseLimit(tc.raw); got != tc.want {
t.Errorf("parseLimit(%q) = %d, want %d", tc.raw, got, tc.want)
}
})
}
}
func TestMemoryToView_AllFieldsPropagated(t *testing.T) {
now := time.Now()
exp := now.Add(time.Hour)
score := 0.95
m := contract.Memory{
ID: "m-1",
Namespace: "team:t-1",
Content: "hello",
Kind: contract.MemoryKindSummary,
Source: contract.MemorySourceUser,
Pin: true,
ExpiresAt: &exp,
CreatedAt: now,
Score: &score,
Propagation: map[string]interface{}{
"source_workspace_id": "ws-other",
},
}
v := memoryToView(m)
if v.ID != m.ID || v.Namespace != m.Namespace || v.Content != m.Content {
t.Errorf("basic fields: %+v", v)
}
if v.Kind != contract.MemoryKindSummary || v.Source != contract.MemorySourceUser {
t.Errorf("kind/source: %+v", v)
}
if !v.Pin || v.ExpiresAt == nil || v.Score == nil || *v.Score != 0.95 {
t.Errorf("pin/expires/score: %+v", v)
}
if v.SourceWorkspaceID != "ws-other" {
t.Errorf("SourceWorkspaceID=%q, want ws-other", v.SourceWorkspaceID)
}
}
func TestNamespacesToViews_PreservesOrder(t *testing.T) {
in := []namespace.Namespace{
{Name: "team:t1", Kind: contract.NamespaceKindTeam},
{Name: "workspace:w1", Kind: contract.NamespaceKindWorkspace},
}
out := namespacesToViews(in)
if len(out) != 2 {
t.Fatalf("len=%d", len(out))
}
// Resolver determines order; we just preserve it. (Sorting can be
// added at the resolver layer if the canvas needs it.)
if out[0].Name != "team:t1" || out[1].Name != "workspace:w1" {
t.Errorf("order not preserved: %+v", out)
}
}
func TestNamespacesToViews_EmptyInput_EmptySlice(t *testing.T) {
out := namespacesToViews(nil)
if out == nil {
t.Error("expected empty slice, not nil — JSON-marshals as null otherwise")
}
if len(out) != 0 {
t.Errorf("expected len 0, got %d", len(out))
}
}
func TestIndexOfColon(t *testing.T) {
cases := []struct {
s string
want int
}{
{"abc:def", 3},
{":foo", 0},
{"nocolon", -1},
{"", -1},
{"a:b:c", 1}, // first colon only
}
for _, tc := range cases {
if got := indexOfColon(tc.s); got != tc.want {
t.Errorf("indexOfColon(%q) = %d, want %d", tc.s, got, tc.want)
}
}
}
func TestWithMemoryV2_FluentReturnsReceiver(t *testing.T) {
// WithMemoryV2 is the production wiring path (takes *client.Client +
// *namespace.Resolver). withMemoryV2APIs is the test path. The
// production call is structural — assigns the two fields and
// returns the receiver — but we still want a 100% coverage gate
// to catch a future refactor that accidentally drops the fluent
// return (breaking the boot-time chain in router.go).
//
// We can't pass nil for the typed pointers and call available()
// here because Go interface-with-nil-pointer is non-nil at the
// interface level — `available()` would not detect that as
// "unwired". The unwired-plugin behaviour is exhaustively
// covered by TestMemoriesV2_PluginUnwired_All503; this test just
// pins the fluent contract.
h := NewMemoriesV2Handler()
got := h.WithMemoryV2(nil, nil)
if got != h {
t.Error("WithMemoryV2 must return receiver for fluent chaining")
}
}
func TestShortID(t *testing.T) {
cases := map[string]string{
"": "",
"short": "short",
"exactly8": "exactly8",
"longer-than-eight": "longer-t",
"abc-1234-5678-90ab": "abc-1234",
}
for in, want := range cases {
if got := shortID(in); got != want {
t.Errorf("shortID(%q) = %q, want %q", in, got, want)
}
}
}
@@ -7,6 +7,7 @@ import (
"context"
"database/sql"
"encoding/json"
"errors"
"fmt"
"log"
"os"
@@ -19,11 +20,14 @@ import (
"github.com/Molecule-AI/molecule-monorepo/platform/internal/channels"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/crypto"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/events"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/models"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/provisioner"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/provlog"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/scheduler"
"github.com/google/uuid"
)
// createWorkspaceTree recursively materialises an OrgWorkspace (and its
// descendants) into the workspaces + canvas_layouts tables and kicks off
// Docker provisioning. absX/absY are THIS workspace's absolute canvas
@@ -61,34 +65,21 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
tier = defaults.Tier
}
if tier == 0 {
tier = 2
}
ctxLookup := context.Background()
// Idempotency: if a workspace with the same (parent_id, name) already
// exists, skip the INSERT + canvas_layouts + broadcast + provisioning.
// This is what makes /org/import safe to call multiple times — the
// historical leak was every call recreating the entire tree (see
// tenant-hongming, 72 distinct child workspaces in 4 days, all from
// repeated org-template spawns of the same template).
//
// Recursion still runs on the existing id so partial-match templates
// (parent exists, some children missing) backfill the missing children
// instead of either no-op'ing the whole subtree or duplicating the
// existing children.
existingID, existing, lookupErr := h.lookupExistingChild(ctxLookup, ws.Name, parentID)
if lookupErr != nil {
return fmt.Errorf("idempotency check for %s: %w", ws.Name, lookupErr)
}
if existing {
log.Printf("Org import: %q already exists (id=%s) — skipping create+provision, recursing into children for partial-match", ws.Name, existingID)
*results = append(*results, map[string]interface{}{
"id": existingID,
"name": ws.Name,
"tier": tier,
"skipped": true,
})
return h.recurseChildrenForImport(ws, existingID, absX, absY, defaults, orgBaseDir, results, provisionSem)
// Resolved via the same DefaultTier helper Create + Templates
// use (#2910 PR-E). SaaS → T4 (one container per sibling EC2,
// no neighbour to protect from), self-hosted → T3. Pre-#2910
// this path returned T2 on self-hosted, asymmetric with
// workspace.go's T3 — undocumented drift. Lifting to
// DefaultTier collapses both call sites onto one source of
// truth so a future tier-default change sweeps every entry
// point at once. Templates that want a different floor still
// declare `tier:` in config.yaml or `defaults.tier` in
// org.yaml.
if h.workspace != nil {
tier = h.workspace.DefaultTier()
} else {
tier = 3
}
}
id := uuid.New().String()
@@ -142,10 +133,67 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
if maxConcurrent <= 0 {
maxConcurrent = models.DefaultMaxConcurrentTasks
}
_, err := db.DB.ExecContext(ctx, `
// TOCTOU-safe insert (#2872 Critical 1).
//
// `ON CONFLICT DO NOTHING` paired with the partial unique index
// from migration 20260506000000_workspaces_unique_parent_name.up.sql
// atomically resolves a race window that the prior
// lookup-then-insert had: two concurrent /org/import POSTs both
// saw "not found" in lookupExistingChild and both INSERT'd the
// same (parent_id, name). After this swap the SECOND INSERT
// silently no-ops, RETURNING returns 0 rows → sql.ErrNoRows, and
// the skip-path runs.
//
// ON CONFLICT target uses (COALESCE(parent_id,...), name) WHERE
// status != 'removed' — must match the partial-index predicate
// EXACTLY for Postgres to consider the index applicable.
var insertedID string
err := db.DB.QueryRowContext(ctx, `
INSERT INTO workspaces (id, name, role, tier, runtime, awareness_namespace, status, parent_id, workspace_dir, workspace_access, max_concurrent_tasks)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11)
`, id, ws.Name, role, tier, runtime, awarenessNS, "provisioning", parentID, workspaceDir, workspaceAccess, maxConcurrent)
ON CONFLICT (COALESCE(parent_id, '00000000-0000-0000-0000-000000000000'::uuid), name)
WHERE status != 'removed'
DO NOTHING
RETURNING id
`, id, ws.Name, role, tier, runtime, awarenessNS, "provisioning", parentID, workspaceDir, workspaceAccess, maxConcurrent).Scan(&insertedID)
if errors.Is(err, sql.ErrNoRows) {
// Skip path — a non-removed row already exists for
// (parent_id, name). Re-select its id; idempotency-friendly
// semantics match the original lookupExistingChild path
// (parent_id IS NOT DISTINCT FROM matches NULL too,
// status='removed' rows are ignored).
ctxLookup, cancelLookup := context.WithTimeout(context.Background(), 5*time.Second)
defer cancelLookup()
existingID, found, selErr := h.lookupExistingChild(ctxLookup, ws.Name, parentID)
if selErr != nil {
return fmt.Errorf("post-conflict re-select for %s: %w", ws.Name, selErr)
}
if !found {
// Index conflicted but row vanished between INSERT and
// re-SELECT (status flipped to 'removed' concurrently).
// Surface as an error rather than silently retrying —
// the user can re-trigger /org/import safely.
return fmt.Errorf("workspace %q conflicted on insert but not visible on re-select (concurrent status flip?)", ws.Name)
}
log.Printf("Org import: %q already exists (id=%s) — skipping create+provision, recursing into children for partial-match", ws.Name, existingID)
parentRef := ""
if parentID != nil {
parentRef = *parentID
}
provlog.Event("provision.skip_existing", map[string]any{
"name": ws.Name,
"existing_id": existingID,
"parent_id": parentRef,
"tier": tier,
})
*results = append(*results, map[string]interface{}{
"id": existingID,
"name": ws.Name,
"tier": tier,
"skipped": true,
})
return h.recurseChildrenForImport(ws, existingID, absX, absY, defaults, orgBaseDir, results, provisionSem)
}
if err != nil {
log.Printf("Org import: failed to create %s: %v", ws.Name, err)
return fmt.Errorf("failed to create %s: %w", ws.Name, err)
@@ -183,7 +231,7 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
if parentID != nil {
payload["parent_id"] = *parentID
}
h.broadcaster.RecordAndBroadcast(ctx, "WORKSPACE_PROVISIONING", id, payload)
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceProvisioning), id, payload)
// Seed initial memories from workspace config or defaults (issue #1050).
// Per-workspace initial_memories override defaults; if workspace has none,
@@ -199,7 +247,7 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
if _, err := db.DB.ExecContext(ctx, `UPDATE workspaces SET status = $1, url = $2 WHERE id = $3`, models.StatusOnline, ws.URL, id); err != nil {
log.Printf("Org import: external workspace status update failed for %s: %v", ws.Name, err)
}
h.broadcaster.RecordAndBroadcast(ctx, "WORKSPACE_ONLINE", id, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOnline), id, map[string]interface{}{
"name": ws.Name, "external": true,
})
} else if h.workspace.HasProvisioner() {
@@ -580,6 +628,12 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
//
// On sql.ErrNoRows: returns ("", false, nil) — caller should INSERT.
// On a real DB error: returns ("", false, err) — caller propagates.
//
// errors.Is is wrap-safe — a future caller wrapping the error
// (database/sql can wrap driver errors with %w in some setups) would
// silently break a `err == sql.ErrNoRows` equality check, causing the
// no-rows path to fall through to the "real DB error" branch and
// abort the import. errors.Is unwraps.
func (h *OrgHandler) lookupExistingChild(ctx context.Context, name string, parentID *string) (string, bool, error) {
var existingID string
err := db.DB.QueryRowContext(ctx, `
@@ -589,7 +643,7 @@ func (h *OrgHandler) lookupExistingChild(ctx context.Context, name string, paren
AND status != 'removed'
LIMIT 1
`, name, parentID).Scan(&existingID)
if err == sql.ErrNoRows {
if errors.Is(err, sql.ErrNoRows) {
return "", false, nil
}
if err != nil {
@@ -2,7 +2,9 @@ package handlers
import (
"context"
"database/sql"
"errors"
"fmt"
"go/ast"
"go/parser"
"go/token"
@@ -123,6 +125,36 @@ func TestLookupExistingChild_DBError_Propagates(t *testing.T) {
}
}
// TestLookupExistingChild_WrappedNoRows_TreatedAsNotFound — pins the
// wrap-safety of the errors.Is(err, sql.ErrNoRows) check. The previous
// `err == sql.ErrNoRows` equality would fall through to the
// "real DB error" branch on a wrapped no-rows error, aborting the
// import for what is in fact the no-rows happy path. driver/sql
// wrapping is currently a non-issue but a future driver change or a
// caller that wraps the result via fmt.Errorf("…: %w", err) would
// silently break the equality check. errors.Is unwraps.
func TestLookupExistingChild_WrappedNoRows_TreatedAsNotFound(t *testing.T) {
mock := setupTestDB(t)
parent := "parent-1"
wrapped := fmt.Errorf("driver-wrapped: %w", sql.ErrNoRows)
mock.ExpectQuery(`SELECT id FROM workspaces`).
WithArgs("Alpha", &parent).
WillReturnError(wrapped)
h := &OrgHandler{}
id, found, err := h.lookupExistingChild(context.Background(), "Alpha", &parent)
if err != nil {
t.Fatalf("expected wrapped no-rows to be treated as not-found (err=nil), got: %v", err)
}
if found {
t.Errorf("expected found=false on wrapped no-rows, got found=true")
}
if id != "" {
t.Errorf("expected empty id on wrapped no-rows, got %q", id)
}
}
// workspacesInsertRE matches a SQL literal that begins (after optional
// leading whitespace) with `INSERT INTO workspaces` followed by `(` —
// requiring the open-paren rules out lookalikes like
@@ -177,19 +209,42 @@ func findLookupAndWorkspacesInsertPos(t *testing.T, fname string, src []byte) (l
return
}
// Source-level guard — pins that org_import.go calls
// h.lookupExistingChild BEFORE its INSERT INTO workspaces.
// onConflictDoNothingRE pins the TOCTOU-safe shape introduced by
// migration 20260506000000_workspaces_unique_parent_name.up.sql +
// the org_import.go INSERT swap (#2872 Critical 1). The workspaces
// INSERT MUST funnel concurrent collisions through the partial unique
// index — `ON CONFLICT (...) WHERE status != 'removed' DO NOTHING`
// is the literal pg statement form that achieves it.
//
// The pattern intentionally requires both the COALESCE expression
// (so root-workspace NULL parents collide) AND the partial-index WHERE
// clause (so 'removed' rows don't block re-imports). A regression that
// drops either piece would make the index target a different shape
// than the migration created, and Postgres would emit
// "no unique or exclusion constraint matching the ON CONFLICT
// specification" at runtime — but only on the FIRST collision attempt
// in production, not in CI without a live race. This regex catches
// the shape in source so the bug never ships.
var onConflictDoNothingRE = regexp.MustCompile(
`(?s)ON\s+CONFLICT\s*\(\s*COALESCE\s*\(\s*parent_id\s*,\s*'00000000-0000-0000-0000-000000000000'::uuid\s*\)\s*,\s*name\s*\).*?WHERE\s+status\s*!=\s*'removed'.*?DO\s+NOTHING`,
)
// Source-level guard — pins that org_import.go's INSERT INTO workspaces
// uses the TOCTOU-safe ON CONFLICT DO NOTHING pattern.
//
// Per memory feedback_behavior_based_ast_gates.md: pin the behavior
// (idempotency check before INSERT), not just function names. If a
// future refactor reintroduces the un-checked INSERT (the original
// bug shape that leaked 72 workspaces in 4 days), this test fails.
// (atomic conflict resolution at the DB), not just function names.
// If a future refactor reintroduces the un-checked INSERT (the original
// bug shape that leaked 72 workspaces in 4 days at tenant-hongming),
// this test fails BEFORE the broken code reaches production where the
// race window opens.
//
// AST-walk implementation closes the silent-false-pass mode that the
// previous bytes.Index gate had — see workspacesInsertRE comment for
// the failure mode (workspaces_audit / workspace_secrets / etc.
// shadowing the real target via prefix match).
func TestCreateWorkspaceTree_CallsLookupBeforeInsert(t *testing.T) {
// Replaces an earlier "lookup-before-insert" gate that became obsolete
// when this swap moved idempotency into the database. The earlier
// gate would silent-false-pass against ON CONFLICT — even though that
// shape is correct — because lookupExistingChild now runs AFTER the
// INSERT (only on the skip path, to retrieve the existing id).
func TestCreateWorkspaceTree_InsertUsesOnConflictDoNothing(t *testing.T) {
wd, err := os.Getwd()
if err != nil {
t.Fatalf("getwd: %v", err)
@@ -198,30 +253,24 @@ func TestCreateWorkspaceTree_CallsLookupBeforeInsert(t *testing.T) {
if err != nil {
t.Fatalf("read org_import.go: %v", err)
}
lookupPos, insertPos, fset := findLookupAndWorkspacesInsertPos(t, "org_import.go", src)
if lookupPos == token.NoPos {
t.Fatalf("AST: no call to lookupExistingChild in org_import.go — idempotency check removed?")
}
if insertPos == token.NoPos {
insertSQL := findWorkspacesInsertSQL(t, "org_import.go", src)
if insertSQL == "" {
t.Fatalf("AST: no SQL literal matching `^\\s*INSERT INTO workspaces\\s*\\(` in any CallExpr in org_import.go — schema change or rename?")
}
if lookupPos > insertPos {
t.Errorf("lookupExistingChild call at %s must come BEFORE INSERT INTO workspaces at %s — non-idempotent ordering would re-leak under repeat /org/import calls",
fset.Position(lookupPos), fset.Position(insertPos))
if !onConflictDoNothingRE.MatchString(insertSQL) {
t.Errorf("workspaces INSERT SQL does NOT use the TOCTOU-safe ON CONFLICT shape — concurrent /org/import POSTs will silently double-insert. Required pattern:\n ON CONFLICT (COALESCE(parent_id, '00000000-...'::uuid), name) WHERE status != 'removed' DO NOTHING\n\nActual SQL:\n%s", insertSQL)
}
}
// TestGate_FailsWhenLookupAfterInsert proves the gate actually catches
// the bug it's named after — running it against synthetic Go source
// where the lookup call is positioned AFTER the workspaces INSERT must
// produce lookupPos > insertPos, which the production gate flags as
// an ERROR. Without this test the gate could regress to "always pass"
// and we wouldn't notice until the bug shipped again.
// TestGate_FailsWhenInsertOmitsOnConflict proves the gate actually
// catches the bug it's named after — running it against synthetic Go
// source where the workspaces INSERT lacks the ON CONFLICT clause must
// fail the regex match. Without this test the gate could regress to
// "always pass" and the TOCTOU window would silently reopen.
//
// Per memory feedback_assert_exact_not_substring.md: verify a
// tightened test FAILS on old code before merging.
func TestGate_FailsWhenLookupAfterInsert(t *testing.T) {
// Per memory feedback_assert_exact_not_substring.md: verify the
// tightened test FAILS on the bug shape before merging.
func TestGate_FailsWhenInsertOmitsOnConflict(t *testing.T) {
const buggySrc = `package handlers
import "context"
@@ -232,26 +281,57 @@ func (fakeDB) ExecContext(ctx context.Context, sql string, args ...interface{})
type fakeOrgHandler struct{}
func (h *fakeOrgHandler) lookupExistingChild(ctx context.Context, name string, parentID *string) (string, bool, error) {
return "", false, nil
}
func buggyCreate(h *fakeOrgHandler, db fakeDB, ctx context.Context, name string, parentID *string) {
// Bug shape: INSERT runs FIRST, lookup runs AFTER. This is the
// non-idempotent ordering the gate exists to forbid.
// Bug shape: bare INSERT, no ON CONFLICT. Two concurrent calls
// race past the unique-index check before either completes the
// transaction; constraint failure surfaces as a 500 to the
// caller (not graceful skip). Pre-#2872 this would silently
// duplicate-insert.
db.ExecContext(ctx, ` + "`INSERT INTO workspaces (id, name) VALUES ($1, $2)`" + `, "x", name)
h.lookupExistingChild(ctx, name, parentID)
}
`
lookupPos, insertPos, _ := findLookupAndWorkspacesInsertPos(t, "buggy.go", []byte(buggySrc))
if lookupPos == token.NoPos || insertPos == token.NoPos {
t.Fatalf("synthetic buggy source missing expected nodes (lookupPos=%v insertPos=%v) — helper logic regression", lookupPos, insertPos)
insertSQL := findWorkspacesInsertSQL(t, "buggy.go", []byte(buggySrc))
if insertSQL == "" {
t.Fatalf("synthetic buggy source missing workspaces INSERT — helper logic regression")
}
if lookupPos < insertPos {
t.Fatalf("synthetic bug shape (lookup AFTER insert) returned lookupPos=%d < insertPos=%d — gate would NOT fire on actual bug, regression!", lookupPos, insertPos)
if onConflictDoNothingRE.MatchString(insertSQL) {
t.Fatalf("synthetic bug shape (bare INSERT, no ON CONFLICT) was MATCHED by the gate — regression: gate would not flag the actual bug. SQL:\n%s", insertSQL)
}
// Implicit: lookupPos > insertPos here, which the production gate
// flags via t.Errorf. This proves the gate is live, not vestigial.
}
// findWorkspacesInsertSQL walks `src` and returns the unquoted SQL of
// the first string literal matching workspacesInsertRE inside any
// CallExpr's argument list. Returns "" if none found. Helper for the
// ON CONFLICT gate above.
func findWorkspacesInsertSQL(t *testing.T, fname string, src []byte) string {
t.Helper()
fset := token.NewFileSet()
file, err := parser.ParseFile(fset, fname, src, parser.ParseComments)
if err != nil {
t.Fatalf("parse %s: %v", fname, err)
}
var sql string
ast.Inspect(file, func(n ast.Node) bool {
call, ok := n.(*ast.CallExpr)
if !ok {
return true
}
for _, arg := range call.Args {
lit, ok := arg.(*ast.BasicLit)
if !ok || lit.Kind != token.STRING {
continue
}
raw := lit.Value
if unq, err := strconv.Unquote(raw); err == nil {
raw = unq
}
if workspacesInsertRE.MatchString(raw) && sql == "" {
sql = raw
}
}
return true
})
return sql
}
// TestGate_IgnoresAuditTableShadow proves the regex tightening
@@ -44,6 +44,7 @@ import (
"context"
"database/sql"
"os"
"strings"
"testing"
"time"
@@ -273,6 +274,378 @@ func TestIntegration_PendingUploads_PutEnforcesSizeCap(t *testing.T) {
}
}
// TestIntegration_PendingUploads_PutBatch_HappyPath_AllRowsCommit pins the
// "all rows commit" leg of the PutBatch atomicity contract against a real
// Postgres. sqlmock can't catch a regression where the Go-side Tx machinery
// silently no-ops the inserts (e.g., wrong driver options on BeginTx); only
// COUNT(*) on the real table can.
func TestIntegration_PendingUploads_PutBatch_HappyPath_AllRowsCommit(t *testing.T) {
conn := integrationDB_PendingUploads(t)
store := pendinguploads.NewPostgres(conn)
ctx := context.Background()
wsID := uuid.New()
// Pre-existing row so the COUNT(*) baseline is non-zero — proves
// PutBatch adds rows incrementally rather than overwriting.
if _, err := store.Put(ctx, wsID, []byte("seed"), "seed.txt", "text/plain"); err != nil {
t.Fatalf("seed Put: %v", err)
}
items := []pendinguploads.PutItem{
{Content: []byte("alpha"), Filename: "alpha.txt", Mimetype: "text/plain"},
{Content: []byte("beta"), Filename: "beta.bin", Mimetype: "application/octet-stream"},
{Content: []byte("gamma"), Filename: "gamma.pdf", Mimetype: "application/pdf"},
}
ids, err := store.PutBatch(ctx, wsID, items)
if err != nil {
t.Fatalf("PutBatch: %v", err)
}
if len(ids) != len(items) {
t.Fatalf("ids length %d, want %d", len(ids), len(items))
}
// Each returned id round-trips through Get with the right content.
for i, id := range ids {
rec, err := store.Get(ctx, id)
if err != nil {
t.Fatalf("Get item %d (%s): %v", i, id, err)
}
if string(rec.Content) != string(items[i].Content) {
t.Errorf("item %d content = %q, want %q", i, rec.Content, items[i].Content)
}
if rec.Filename != items[i].Filename {
t.Errorf("item %d filename = %q, want %q", i, rec.Filename, items[i].Filename)
}
}
var n int
if err := conn.QueryRowContext(ctx, `SELECT COUNT(*) FROM pending_uploads WHERE workspace_id = $1`, wsID).Scan(&n); err != nil {
t.Fatalf("count: %v", err)
}
if n != 4 {
t.Errorf("workspace row count = %d, want 4 (1 seed + 3 batch)", n)
}
}
// TestIntegration_PendingUploads_PutBatch_AtomicRollback_NoLeakOnFailure
// proves the all-or-nothing contract end-to-end against real Postgres MVCC.
//
// Strategy: build a 3-item batch where item index 1 carries a filename with
// an embedded NUL byte. lib/pq rejects NULs in TEXT columns at the protocol
// layer (`pq: invalid byte sequence for encoding "UTF8": 0x00`), which
// triggers the per-row INSERT error path in PutBatch. The first item's
// INSERT…RETURNING already wrote a row to the Tx's snapshot, so a buggy
// rollback would leave that row visible after PutBatch returns.
//
// Postgrest semantics: ROLLBACK is the only way a real DB can guarantee the
// "no leak" contract; a unit test with sqlmock can prove the Go function
// CALLED Rollback, but only this integration test proves Postgres actually
// HONORED it.
func TestIntegration_PendingUploads_PutBatch_AtomicRollback_NoLeakOnFailure(t *testing.T) {
conn := integrationDB_PendingUploads(t)
store := pendinguploads.NewPostgres(conn)
ctx := context.Background()
wsID := uuid.New()
// Baseline COUNT(*) for this workspace — must remain 0 after a failed batch.
var before int
if err := conn.QueryRowContext(ctx, `SELECT COUNT(*) FROM pending_uploads WHERE workspace_id = $1`, wsID).Scan(&before); err != nil {
t.Fatalf("baseline count: %v", err)
}
if before != 0 {
t.Fatalf("workspace not isolated: baseline = %d, want 0", before)
}
// Item 1 has a NUL byte in the filename — Go-side pre-validation
// (which only checks empty/length) lets it through, so the INSERT
// reaches lib/pq, which rejects it at the protocol level. That's the
// canonical "DB-side error mid-batch" we want to exercise.
items := []pendinguploads.PutItem{
{Content: []byte("ok"), Filename: "ok.txt", Mimetype: "text/plain"},
{Content: []byte("bad"), Filename: "bad\x00name.txt", Mimetype: "text/plain"},
{Content: []byte("never"), Filename: "never.txt", Mimetype: "text/plain"},
}
_, err := store.PutBatch(ctx, wsID, items)
if err == nil {
t.Fatalf("expected error from NUL-byte filename, got nil")
}
// THE assertion this whole test exists for: even though item 0's
// INSERT…RETURNING succeeded inside the Tx, the rollback unwound
// it — zero rows for this workspace, not one (let alone three).
var after int
if err := conn.QueryRowContext(ctx, `SELECT COUNT(*) FROM pending_uploads WHERE workspace_id = $1`, wsID).Scan(&after); err != nil {
t.Fatalf("post-failure count: %v", err)
}
if after != 0 {
t.Errorf("Tx rollback leaked rows: workspace count = %d, want 0", after)
}
}
// TestIntegration_PendingUploads_PutBatch_Oversize_NoTxOpened verifies the
// pre-validation short-circuit: an oversized item rejects with ErrTooLarge
// BEFORE any Tx opens, so the table is untouched. The unit test (sqlmock
// with zero expectations) catches the Go-side path; this test sanity-checks
// no real DB I/O happens by confirming COUNT(*) doesn't move.
func TestIntegration_PendingUploads_PutBatch_Oversize_NoTxOpened(t *testing.T) {
conn := integrationDB_PendingUploads(t)
store := pendinguploads.NewPostgres(conn)
ctx := context.Background()
wsID := uuid.New()
tooBig := make([]byte, pendinguploads.MaxFileBytes+1)
_, err := store.PutBatch(ctx, wsID, []pendinguploads.PutItem{
{Content: []byte("ok"), Filename: "ok.txt"},
{Content: tooBig, Filename: "too-big.bin"},
})
if err != pendinguploads.ErrTooLarge {
t.Fatalf("expected ErrTooLarge, got %v", err)
}
var n int
if err := conn.QueryRowContext(ctx, `SELECT COUNT(*) FROM pending_uploads WHERE workspace_id = $1`, wsID).Scan(&n); err != nil {
t.Fatalf("count: %v", err)
}
if n != 0 {
t.Errorf("pre-validation did NOT short-circuit: count = %d, want 0", n)
}
}
// TestIntegration_PendingUploads_AckedIndexExists verifies the Phase 5a
// migration (20260505200000_pending_uploads_acked_index.up.sql) actually
// created idx_pending_uploads_acked with the right partial-index predicate.
//
// Why pg_indexes and not EXPLAIN: the planner prefers Seq Scan on tiny
// tables regardless of available indexes — a plan-shape check would be
// flaky under real test loads. The contract we care about is "the index
// exists with the predicate we wrote in the migration"; pg_indexes is
// the canonical source for that, robust to row count and planner version.
func TestIntegration_PendingUploads_AckedIndexExists(t *testing.T) {
conn := integrationDB_PendingUploads(t)
ctx := context.Background()
var indexdef string
err := conn.QueryRowContext(ctx, `
SELECT indexdef FROM pg_indexes
WHERE schemaname = 'public'
AND tablename = 'pending_uploads'
AND indexname = 'idx_pending_uploads_acked'
`).Scan(&indexdef)
if err == sql.ErrNoRows {
t.Fatal("idx_pending_uploads_acked is missing — migration 20260505200000 not applied")
}
if err != nil {
t.Fatalf("pg_indexes query: %v", err)
}
// Pin the partial-index predicate. Without "WHERE acked_at IS NOT NULL"
// we'd be indexing the entire table (defeats the point — most rows are
// unacked), and the existing idx_pending_uploads_unacked already covers
// the inverse predicate.
if !strings.Contains(indexdef, "(acked_at)") {
t.Errorf("index missing acked_at column: %s", indexdef)
}
if !strings.Contains(indexdef, "WHERE (acked_at IS NOT NULL)") {
t.Errorf("index missing partial predicate: %s", indexdef)
}
}
// TestIntegration_PollUpload_AtomicRollback_AcrossBothTables proves the
// #149 cross-table contract at the database layer: when PutBatchTx and
// LogActivityTx run in the same caller-owned Tx and an activity INSERT
// fails after some rows have already been INSERTed, Rollback unwinds
// BOTH tables, leaving zero rows.
//
// Coverage map (#149):
// - chat_files_poll_test.go's TestPollUpload_AtomicRollbackOnActivityInsertFailure
// uses sqlmock to prove the Go handler issues Begin / N inserts /
// Rollback in the right order (no Commit on failure path).
// - This integration test proves the helpers + real Postgres compose
// correctly: rollback after a mid-Tx activity insert failure
// actually reverts BOTH the prior activity row AND the
// pending_uploads rows from PutBatchTx.
// - The pre-existing TestIntegration_PendingUploads_PutBatch_AtomicRollback
// covers the pending_uploads-only case.
//
// Failure injection: a NUL byte in `summary` (TEXT column) — lib/pq
// rejects it at the protocol layer. Same trick the existing PutBatch
// AtomicRollback test uses for the pending_uploads INSERT.
func TestIntegration_PollUpload_AtomicRollback_AcrossBothTables(t *testing.T) {
conn := integrationDB_PendingUploads(t)
ctx := context.Background()
// activity_logs has a FK to workspaces(id) — seed a real row so
// non-failing inserts succeed. Wipe activity_logs + this workspaces
// row at end so the next test sees a clean slate (the integrationDB
// helper only wipes pending_uploads).
wsID := uuid.New()
if _, err := conn.ExecContext(ctx,
`INSERT INTO workspaces (id, name) VALUES ($1, 'test-149-rollback')`, wsID,
); err != nil {
t.Fatalf("seed workspace: %v", err)
}
t.Cleanup(func() {
// CASCADE on workspaces FK deletes the activity_logs rows; explicit
// DELETE on activity_logs catches any rows that somehow leaked.
_, _ = conn.ExecContext(context.Background(), `DELETE FROM activity_logs WHERE workspace_id = $1`, wsID)
_, _ = conn.ExecContext(context.Background(), `DELETE FROM workspaces WHERE id = $1`, wsID)
})
store := pendinguploads.NewPostgres(conn)
// Mirror uploadPollMode's Tx shape: BeginTx → PutBatchTx → N ×
// LogActivityTx → Commit (or Rollback on failure).
tx, err := conn.BeginTx(ctx, nil)
if err != nil {
t.Fatalf("BeginTx: %v", err)
}
items := []pendinguploads.PutItem{
{Content: []byte("first"), Filename: "a.txt", Mimetype: "text/plain"},
{Content: []byte("second"), Filename: "b.txt", Mimetype: "text/plain"},
}
fileIDs, err := store.PutBatchTx(ctx, tx, wsID, items)
if err != nil {
t.Fatalf("PutBatchTx: %v", err)
}
if len(fileIDs) != 2 {
t.Fatalf("len(fileIDs) = %d, want 2", len(fileIDs))
}
// First activity insert succeeds — would commit if not for the
// rollback that the second insert's failure forces.
wsIDStr := wsID.String()
method := "chat_upload_receive"
okSummary := "chat_upload_receive: a.txt"
if _, err := LogActivityTx(ctx, tx, nil, ActivityParams{
WorkspaceID: wsIDStr,
ActivityType: "a2a_receive",
TargetID: &wsIDStr,
Method: &method,
Summary: &okSummary,
Status: "ok",
}); err != nil {
t.Fatalf("first LogActivityTx (should succeed): %v", err)
}
// Second activity insert: NUL byte in summary triggers lib/pq
// "invalid byte sequence for encoding UTF8: 0x00" — the canonical
// "DB-side error after some Tx work has already happened" we want.
badSummary := "chat_upload_receive: b\x00.txt"
_, err = LogActivityTx(ctx, tx, nil, ActivityParams{
WorkspaceID: wsIDStr,
ActivityType: "a2a_receive",
TargetID: &wsIDStr,
Method: &method,
Summary: &badSummary,
Status: "ok",
})
if err == nil {
t.Fatal("expected error from NUL-byte summary, got nil")
}
// Caller (uploadPollMode in production) rolls back on the error.
if rbErr := tx.Rollback(); rbErr != nil {
t.Fatalf("Rollback: %v", rbErr)
}
// THE assertion this test exists for: BOTH tables must have zero
// rows for this workspace. Pre-#149 the activity_logs row from the
// first insert would persist (separate fire-and-forget INSERT) and
// pending_uploads would also persist (committed by PutBatch's own
// Tx). Post-#149 the shared Tx + Rollback unwinds both.
var puCount, alCount int
if err := conn.QueryRowContext(ctx,
`SELECT COUNT(*) FROM pending_uploads WHERE workspace_id = $1`, wsID,
).Scan(&puCount); err != nil {
t.Fatalf("count pending_uploads: %v", err)
}
if err := conn.QueryRowContext(ctx,
`SELECT COUNT(*) FROM activity_logs WHERE workspace_id = $1`, wsID,
).Scan(&alCount); err != nil {
t.Fatalf("count activity_logs: %v", err)
}
if puCount != 0 {
t.Errorf("pending_uploads leaked %d row(s) after Rollback — #149 regression", puCount)
}
if alCount != 0 {
t.Errorf("activity_logs leaked %d row(s) after Rollback — #149 regression "+
"(THIS is the scenario the ticket called out: pre-fix, the first activity row "+
"committed in its own implicit Tx, leaving an orphan)", alCount)
}
}
// TestIntegration_PollUpload_HappyPath_AcrossBothTables is the positive
// counterpart to the rollback test: when nothing fails, both tables
// commit together and the row counts match.
func TestIntegration_PollUpload_HappyPath_AcrossBothTables(t *testing.T) {
conn := integrationDB_PendingUploads(t)
ctx := context.Background()
wsID := uuid.New()
if _, err := conn.ExecContext(ctx,
`INSERT INTO workspaces (id, name) VALUES ($1, 'test-149-happy')`, wsID,
); err != nil {
t.Fatalf("seed workspace: %v", err)
}
t.Cleanup(func() {
_, _ = conn.ExecContext(context.Background(), `DELETE FROM activity_logs WHERE workspace_id = $1`, wsID)
_, _ = conn.ExecContext(context.Background(), `DELETE FROM workspaces WHERE id = $1`, wsID)
})
store := pendinguploads.NewPostgres(conn)
tx, err := conn.BeginTx(ctx, nil)
if err != nil {
t.Fatalf("BeginTx: %v", err)
}
items := []pendinguploads.PutItem{
{Content: []byte("a"), Filename: "a.txt", Mimetype: "text/plain"},
{Content: []byte("b"), Filename: "b.txt", Mimetype: "text/plain"},
{Content: []byte("c"), Filename: "c.txt", Mimetype: "text/plain"},
}
if _, err := store.PutBatchTx(ctx, tx, wsID, items); err != nil {
t.Fatalf("PutBatchTx: %v", err)
}
wsIDStr := wsID.String()
method := "chat_upload_receive"
for _, it := range items {
summary := "chat_upload_receive: " + it.Filename
if _, err := LogActivityTx(ctx, tx, nil, ActivityParams{
WorkspaceID: wsIDStr,
ActivityType: "a2a_receive",
TargetID: &wsIDStr,
Method: &method,
Summary: &summary,
Status: "ok",
}); err != nil {
t.Fatalf("LogActivityTx %q: %v", it.Filename, err)
}
}
if err := tx.Commit(); err != nil {
t.Fatalf("Commit: %v", err)
}
var puCount, alCount int
if err := conn.QueryRowContext(ctx,
`SELECT COUNT(*) FROM pending_uploads WHERE workspace_id = $1`, wsID,
).Scan(&puCount); err != nil {
t.Fatalf("count pending_uploads: %v", err)
}
if err := conn.QueryRowContext(ctx,
`SELECT COUNT(*) FROM activity_logs WHERE workspace_id = $1`, wsID,
).Scan(&alCount); err != nil {
t.Fatalf("count activity_logs: %v", err)
}
if puCount != 3 {
t.Errorf("pending_uploads count = %d, want 3", puCount)
}
if alCount != 3 {
t.Errorf("activity_logs count = %d, want 3", alCount)
}
}
func TestIntegration_PendingUploads_GetIgnoresExpiredAndAcked(t *testing.T) {
conn := integrationDB_PendingUploads(t)
store := pendinguploads.NewPostgres(conn)
@@ -2,6 +2,7 @@ package handlers_test
import (
"context"
"database/sql"
"encoding/json"
"errors"
"net/http"
@@ -77,6 +78,17 @@ func (f *fakeStorage) Sweep(_ context.Context, _ time.Duration) (pendinguploads.
return pendinguploads.SweepResult{}, nil
}
// PutBatch is required by the Storage interface; the upload handler
// tests live in chat_files_poll_test.go and use a separate fake
// (inMemStorage). Stubbed here because the Get/Ack tests don't drive
// PutBatch, but the interface must be satisfied.
func (f *fakeStorage) PutBatch(_ context.Context, _ uuid.UUID, _ []pendinguploads.PutItem) ([]uuid.UUID, error) {
return nil, nil
}
func (f *fakeStorage) PutBatchTx(_ context.Context, _ *sql.Tx, _ uuid.UUID, _ []pendinguploads.PutItem) ([]uuid.UUID, error) {
return nil, nil
}
func newRouter(handler *handlers.PendingUploadsHandler) *gin.Engine {
gin.SetMode(gin.TestMode)
r := gin.New()
@@ -0,0 +1,112 @@
package handlers
// provlog_emit_test.go — pins that the structured-logging emit sites
// added for #2867 PR-D actually fire when their boundary is crossed.
//
// These are call-site contract tests, not provlog package tests (those
// live next to the helper). The assertion is "this dispatcher path
// emits this event name" — if a refactor moves the call out of the
// boundary helper, the gate fails. Fields are NOT pinned here on
// purpose; the field set is convenience for ops, not contract for the
// emit point. Pinning fields would block additive evolution of the
// payload (see also feedback_behavior_based_ast_gates.md).
import (
"bytes"
"context"
"log"
"strings"
"sync"
"testing"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/models"
)
// captureProvLog redirects the global logger to a buffer for the test
// duration. provlog.Event uses log.Printf, so this is the only seam.
// Returned mutex protects against concurrent reads from the goroutine
// fired by provisionWorkspaceAuto (the goroutine never returns in
// these tests because Start() is stubbed, but the buffer can still be
// touched by it racing the assertion).
func captureProvLog(t *testing.T) (read func() string) {
t.Helper()
var buf bytes.Buffer
var mu sync.Mutex
prevWriter := log.Writer()
prevFlags := log.Flags()
log.SetFlags(0)
log.SetOutput(&safeWriter{buf: &buf, mu: &mu})
t.Cleanup(func() {
log.SetOutput(prevWriter)
log.SetFlags(prevFlags)
})
return func() string {
mu.Lock()
defer mu.Unlock()
return buf.String()
}
}
// TestProvisionWorkspaceAutoSync_EmitsProvisionStart — sync variant is
// chosen for the assertion path because it returns once the (stubbed)
// Start() has been called, so we know the emit has flushed. The async
// variant would race a goroutine.
func TestProvisionWorkspaceAutoSync_EmitsProvisionStart(t *testing.T) {
read := captureProvLog(t)
h := &WorkspaceHandler{cpProv: &trackingCPProv{}}
// Best-effort: the body will hit DB code under provisionWorkspaceCP
// — we only need the emit at the entry, which fires unconditionally
// before the dispatch. Recovering from any later panic keeps the
// test focused.
defer func() { _ = recover() }()
h.provisionWorkspaceAutoSync("ws-test-1", "tmpl", nil, models.CreateWorkspacePayload{
Name: "n", Tier: 4, Runtime: "claude-code",
})
got := read()
if !strings.Contains(got, "evt: provision.start ") {
t.Fatalf("expected provision.start emit, got log:\n%s", got)
}
if !strings.Contains(got, `"workspace_id":"ws-test-1"`) {
t.Errorf("workspace_id not in payload: %s", got)
}
if !strings.Contains(got, `"sync":true`) {
t.Errorf("sync flag not pinned for sync dispatcher: %s", got)
}
}
// TestStopForRestart_EmitsRestartPreStop — emit fires before the actual
// Stop call, so the trackingCPProv stub doesn't need to be wired for
// real Stop semantics. Backend label "cp" pinned because that's the
// SaaS path; we don't pin "docker" or "none" branches here (separate
// tests would only re-test the trivial branch label switch).
func TestStopForRestart_EmitsRestartPreStop(t *testing.T) {
read := captureProvLog(t)
h := &WorkspaceHandler{cpProv: &trackingCPProv{}}
defer func() { _ = recover() }()
h.stopForRestart(context.Background(), "ws-restart-1")
got := read()
if !strings.Contains(got, "evt: restart.pre_stop ") {
t.Fatalf("expected restart.pre_stop emit, got log:\n%s", got)
}
if !strings.Contains(got, `"workspace_id":"ws-restart-1"`) {
t.Errorf("workspace_id not in payload: %s", got)
}
if !strings.Contains(got, `"backend":"cp"`) {
t.Errorf("backend label missing or wrong: %s", got)
}
}
// TestStopForRestart_EmitsBackendNoneWhenUnwired — pin the no-backend
// branch so a future refactor that drops the label switch is caught.
// This is the silent-Stop case (workspace_dispatchers.go:StopWorkspaceAuto
// returns nil for unwired backends); the emit ensures the operator can
// still see the boundary in the log.
func TestStopForRestart_EmitsBackendNoneWhenUnwired(t *testing.T) {
read := captureProvLog(t)
h := &WorkspaceHandler{} // both nil
h.stopForRestart(context.Background(), "ws-restart-2")
got := read()
if !strings.Contains(got, `"backend":"none"`) {
t.Fatalf("expected backend=none for unwired handler: %s", got)
}
}
+10 -10
View File
@@ -414,7 +414,7 @@ func (h *RegistryHandler) Register(c *gin.Context) {
}
// Broadcast WORKSPACE_ONLINE
if err := h.broadcaster.RecordAndBroadcast(ctx, "WORKSPACE_ONLINE", payload.ID, map[string]interface{}{
if err := h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOnline), payload.ID, map[string]interface{}{
"url": cachedURL,
"agent_card": payload.AgentCard,
"delivery_mode": effectiveMode,
@@ -572,7 +572,7 @@ func (h *RegistryHandler) Heartbeat(c *gin.Context) {
// Broadcast current task update only when it changed (avoid spamming on every heartbeat)
if payload.CurrentTask != prevTask {
h.broadcaster.BroadcastOnly(payload.WorkspaceID, "TASK_UPDATED", map[string]interface{}{
h.broadcaster.BroadcastOnly(payload.WorkspaceID, string(events.EventTaskUpdated), map[string]interface{}{
"current_task": payload.CurrentTask,
"active_tasks": payload.ActiveTasks,
})
@@ -593,7 +593,7 @@ func (h *RegistryHandler) Heartbeat(c *gin.Context) {
// so per-heartbeat cost is one in-memory channel send per active
// SSE subscriber and one WS hub fan-out. At 30s heartbeat cadence
// this is far below any noise floor on either path.
h.broadcaster.BroadcastOnly(payload.WorkspaceID, "WORKSPACE_HEARTBEAT", map[string]interface{}{
h.broadcaster.BroadcastOnly(payload.WorkspaceID, string(events.EventWorkspaceHeartbeat), map[string]interface{}{
"active_tasks": payload.ActiveTasks,
"uptime_seconds": payload.UptimeSeconds,
})
@@ -678,7 +678,7 @@ func (h *RegistryHandler) evaluateStatus(c *gin.Context, payload models.Heartbea
if err != nil {
log.Printf("Heartbeat: failed to mark %s degraded (wedged): %v", payload.WorkspaceID, err)
}
h.broadcaster.RecordAndBroadcast(ctx, "WORKSPACE_DEGRADED", payload.WorkspaceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceDegraded), payload.WorkspaceID, map[string]interface{}{
"runtime_state": "wedged",
"sample_error": payload.SampleError,
})
@@ -699,7 +699,7 @@ func (h *RegistryHandler) evaluateStatus(c *gin.Context, payload models.Heartbea
if _, err := db.DB.ExecContext(ctx, `UPDATE workspaces SET status = $1, updated_at = now() WHERE id = $2`, models.StatusDegraded, payload.WorkspaceID); err != nil {
log.Printf("Heartbeat: failed to mark %s degraded: %v", payload.WorkspaceID, err)
}
h.broadcaster.RecordAndBroadcast(ctx, "WORKSPACE_DEGRADED", payload.WorkspaceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceDegraded), payload.WorkspaceID, map[string]interface{}{
"error_rate": payload.ErrorRate,
"sample_error": payload.SampleError,
})
@@ -718,7 +718,7 @@ func (h *RegistryHandler) evaluateStatus(c *gin.Context, payload models.Heartbea
if _, err := db.DB.ExecContext(ctx, `UPDATE workspaces SET status = $1, updated_at = now() WHERE id = $2`, models.StatusOnline, payload.WorkspaceID); err != nil {
log.Printf("Heartbeat: failed to recover %s to online: %v", payload.WorkspaceID, err)
}
h.broadcaster.RecordAndBroadcast(ctx, "WORKSPACE_ONLINE", payload.WorkspaceID, map[string]interface{}{})
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOnline), payload.WorkspaceID, map[string]interface{}{})
}
// Recovery: if workspace was offline but is now sending heartbeats, bring it back online.
@@ -728,7 +728,7 @@ func (h *RegistryHandler) evaluateStatus(c *gin.Context, payload models.Heartbea
if _, err := db.DB.ExecContext(ctx, `UPDATE workspaces SET status = $1, updated_at = now() WHERE id = $2 AND status = 'offline'`, models.StatusOnline, payload.WorkspaceID); err != nil {
log.Printf("Heartbeat: failed to recover %s from offline: %v", payload.WorkspaceID, err)
}
h.broadcaster.RecordAndBroadcast(ctx, "WORKSPACE_ONLINE", payload.WorkspaceID, map[string]interface{}{})
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOnline), payload.WorkspaceID, map[string]interface{}{})
}
// Auto-recovery: if a workspace is marked "provisioning" but is actively sending
@@ -743,7 +743,7 @@ func (h *RegistryHandler) evaluateStatus(c *gin.Context, payload models.Heartbea
} else {
log.Printf("Heartbeat: transitioned %s from provisioning to online (heartbeat received)", payload.WorkspaceID)
}
h.broadcaster.RecordAndBroadcast(ctx, "WORKSPACE_ONLINE", payload.WorkspaceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOnline), payload.WorkspaceID, map[string]interface{}{
"recovered_from": currentStatus,
})
}
@@ -771,7 +771,7 @@ func (h *RegistryHandler) evaluateStatus(c *gin.Context, payload models.Heartbea
} else {
log.Printf("Heartbeat: transitioned %s from awaiting_agent to online (heartbeat received)", payload.WorkspaceID)
}
h.broadcaster.RecordAndBroadcast(ctx, "WORKSPACE_ONLINE", payload.WorkspaceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOnline), payload.WorkspaceID, map[string]interface{}{
"recovered_from": currentStatus,
})
}
@@ -820,7 +820,7 @@ func (h *RegistryHandler) UpdateCard(c *gin.Context) {
return
}
h.broadcaster.RecordAndBroadcast(c.Request.Context(), "AGENT_CARD_UPDATED", payload.WorkspaceID, map[string]interface{}{
h.broadcaster.RecordAndBroadcast(c.Request.Context(), string(events.EventAgentCardUpdated), payload.WorkspaceID, map[string]interface{}{
"agent_card": payload.AgentCard,
})

Some files were not shown because too many files have changed in this diff Show More