User asked for VSCode-style drag-drop upload (#2999): "drag local to
upload to target folder just like vscode does". Today the only upload
path is the toolbar's Upload button (folder picker). Drag-drop lets
users grab files from Finder/Explorer and drop them directly on a
specific subdirectory in the tree.
1. New `uploadDataTransferItems(items, targetDir)` in `useFilesApi`
— walks the HTML5 DataTransferItemList via `webkitGetAsEntry()`,
recursing folders to a flat (relativePath, file) list, then PUTs
each via the existing /files/<path> endpoint. The walker (also
exported via `__testables`) calls `readEntries()` in a loop until
empty so multi-batch folders (browsers cap each call at ~100
entries) aren't silently truncated.
2. `uploadFiles` (folder-picker path) gained an optional `targetDir`
parameter. Same prefixing semantics so future surfaces (e.g. an
"upload here" toolbar button on a row) can reuse it.
3. `FileTree` directory rows gained `onDragOver` / `onDragEnter` /
`onDragLeave` / `onDrop` handlers + a hover-target highlight
(accent-tinted background + outline). dragLeave uses
`currentTarget.contains(relatedTarget)` to suppress the flicker
that fires when the cursor crosses any child of the row (icon,
label, ✕ button) — without this the highlight strobes on every
sub-element transition.
4. `FilesTab` wraps the tree column in an outer drop zone for
"drop on root" — drops outside any specific subdir row land at
root. The empty-state placeholder copy now includes a
"drag files here to upload" hint when the active root is
/configs (the only writable root today).
5. Both the row drop and the root drop are gated on
`root === "/configs"` (the same gate that already blocks the
toolbar's New / Upload / Clear). Other roots ignore the drag
entirely (no highlight, no drop), so the user doesn't get a
misleading drag affordance followed by a "switch root" toast.
`dragDropUpload.test.tsx` (9 tests, two layers):
Walker tests (pure function, no DOM):
- `walkEntry` collects a single dropped file with correct relpath.
- `walkEntry` walks a folder + preserves folder name in the path.
- **Multi-batch loop**: a fake reader that emits two batches of 2
+ an empty terminator must yield 4 files. A walker that called
readEntries once would see only 2 — this is the load-bearing
assertion against silent folder truncation.
- Nested directories: outer/inner/file.md → "outer/inner/file.md".
FileTree drag-drop wiring (DOM):
- `dragover` on a directory row preventDefault's (load-bearing —
without it the drop event never fires).
- `drop` on a directory row fires `onDropToTarget(path, items)`.
- `drop` on a FILE row does NOT fire (only directories are valid
drop targets).
- `drop` with no DataTransferItems does NOT fire (defensive guard
against text-only drags).
- `dragenter` adds the highlight class to the directory row.
1. The 1MB per-file size cap is inherited from the existing
`uploadFiles`. A user dropping a 5MB skill bundle silently
skips the file (the loop's `continue` on `file.size >
1_000_000`). Same behavior as the toolbar Upload, so consistent
if not great. Surfacing skipped-files would be a UX improvement
tracked separately — not load-bearing for this PR.
2. Drop-zone highlight on the column wrapper uses an outline that
sits inside the column's overflow-y-auto scroll container. If
the user drags onto a row that's mid-scroll, the highlight may
clip slightly at the scroll boundary. Cosmetic only; the drop
still works.
3. The `?root=` query is NOT passed on the underlying writeFile
call (matches the existing uploadFiles behavior). On a backend
without #2999 PR-A, this means uploads always land in /configs
regardless of selected root — but we already gated drop on
`root === "/configs"` so the practical effect is nil today.
Once PR-A merges and the canvas threads ?root= through writes
(separate follow-up), drops on /home etc. would be enableable
by lifting the canDelete-style gate.
- `npx tsc --noEmit` clean
- 177/177 canvas tab tests pass
- Manual on local dev: drag a file from Finder onto /configs/skills
row → file appears under /configs/skills/<name>. Drag a folder of
3 files onto root area → 3 files uploaded with folder structure
preserved. Drag onto /home tree → no highlight, no drop.
Refs #2999. Pairs with PR-A (backend EIC) — without PR-A the tree
is empty on SaaS and there's nothing to drop ONTO; PR-D still works
on self-hosted today.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
## Why
User asked for a VSCode-style right-click menu on file rows (#2999):
"right click to have a menu to download". Today the only download
affordance is the toolbar's Export-all (bulk JSON dump), and the
inline ✕ button is the only delete UX (small click target, easy to
miss).
## Fix
1. New `FileTreeContextMenu` component — fixed-position popover with
Open / Download / Delete items composed per-row (files get all
three; directories get Delete only since "open a directory in the
editor" doesn't apply). Esc + outside-click + Tab + scroll
dismiss. ↓/↑ arrow keys rove focus between menu items. role=menu
+ role=menuitem + autofocus on first item for a11y.
2. Menu state lifted to the top-level `FileTree` (not per-row) so
opening a second row's menu auto-closes the first — only one
menu open at a time, matching VSCode/Theia. Pinned by the
`replaces the first` test.
3. New `downloadFileByPath(path)` in `useFilesApi` — fetches via the
existing GET /workspaces/<id>/files/<path>?root= endpoint and
triggers a browser download. Distinct from the existing
`handleDownloadFile` which downloads the in-editor buffer
(round-trips unsaved edits to disk); the context-menu download
targets arbitrary tree rows the user hasn't opened.
4. `canDelete` prop threaded from FilesTab → FileTree → menu →
item. Same gate as the toolbar (Clear/New/Upload all gated to
/configs); context menu's Delete renders as disabled with a
muted background on other roots, matching the "feature exists
but isn't applicable here" pattern.
## Test coverage
`FileTreeContextMenu.test.tsx` (8 tests):
- File row → menu opens with Open + Download + Delete.
- Directory row → menu opens with Delete only.
- Click Download → onDownload(path) fires + menu closes.
- Click Delete (canDelete=true) → onDelete(path) fires.
- Click Delete (canDelete=false) → onDelete NOT called + menu stays
open (disabled-state UX).
- Esc dismisses.
- Outside-click (mousedown on document.body) dismisses.
- Opening second context menu replaces the first (only-one-open
invariant).
Each test uses fireEvent + screen.getByRole, so they fail on a
deleted-code regression — none would pass on the pre-PR shape.
## Three weakest spots (hostile self-review)
1. The menu is positioned at `clientX/clientY` without viewport
clamping. If the user right-clicks at the very bottom-right of
the panel, part of the menu may overflow off-screen. VSCode
handles this by flipping the anchor; we don't yet. Acceptable
v1 because the FilesTab is fixed-width (≤ side-panel width)
and the menu is small (140×~80px); the overflow would be a few
pixels of one item. Filed as a follow-up.
2. Auto-focus on the first item shifts keyboard focus away from
the row that opened the menu. Closing with Esc returns focus
to the body, not the row. Same behavior as TerminalTab's
placeholder + the canvas's other context menus; consistent
isn't ideal but at least uniform. Documented inline.
3. The download request reuses the API client's 15s default
timeout — large config files (multi-MB skill bundles) on a
slow connection could time out. Same risk applies to the
existing toolbar Export. If we see real download failures we
can add a `timeoutMs` override at the call site without
touching the menu.
## Verification
- `npx tsc --noEmit` clean
- 176/176 canvas tab tests pass
- Manual on local dev: right-click a config.yaml row → menu opens
→ click Download → file lands in Downloads. Right-click on
/home root → Delete renders disabled.
Refs #2999. Pairs with PR-A (backend EIC) — without PR-A the tree
is empty and there's nothing to right-click on a SaaS workspace.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
## Why
Reported by user (issue #2999): external workspaces (mac laptop, mac
mini, hermes-on-home-server — runtime="external") render the FilesTab
identically to the SaaS empty-listing bug, showing "0 files / No
config files yet" even though the platform doesn't actually own the
filesystem of these workspaces. Visually indistinguishable from the
broken state, reads as a bug.
## Fix
Mirror the affordance TerminalTab adopted in PR #2830 for runtimes
without a TTY:
1. New `NotAvailablePanel` in `canvas/src/components/tabs/FilesTab/`
— folder-with-slash icon + "Files not available" headline + body
text that names the runtime and points the user at Chat.
2. `FilesTab` now takes optional `data?: WorkspaceNodeData`. When
`data.runtime` is in `RUNTIMES_WITHOUT_FILES` (currently just
"external"), early-return the placeholder before mounting the
useFilesApi hook. Mirrors TerminalTab's prop shape exactly so the
review pattern is uniform across tabs.
3. SidePanel passes `node.data` to FilesTab (matches existing pattern
for ChatTab / TerminalTab).
## Test coverage
`FilesTab.notAvailable.test.tsx` (4 tests):
- external runtime → banner renders with runtime name + Chat-tab
guidance copy.
- external runtime → NO `/files` API request fires (asserted by
inspecting the mocked api.get call log).
- claude-code runtime → no banner, normal mount proceeds (toolbar's
root selector is the discriminator).
- data prop omitted → falls through to normal mount (back-compat
with any caller that doesn't thread data through, e.g. legacy
tests).
Each branch is independent and discriminating — none would pass on
a code-deleted version of the early-return.
## Three weakest spots (hostile self-review)
1. `RUNTIMES_WITHOUT_FILES` is a hardcoded set in this file. If a
future runtime joins (e.g. a "byok-claude" that runs on user
hardware), someone has to remember to add it here. Reviewed
alternatives: pull from a runtime-capabilities registry — same
shape as `RUNTIMES_WITHOUT_TERMINAL` already in TerminalTab. We
chose the parallel pattern over a new abstraction; consolidating
into a shared registry can land if/when a third tab grows the
same gate (rule of three). Documented inline.
2. The placeholder is a static panel — no retry, no "report bug"
link. Same as TerminalTab's. Acceptable because the absence is
intentional, not transient.
3. Chat-tab guidance is hardcoded English. No i18n in canvas yet;
matches the rest of the codebase. Will move with the i18n
migration when that lands.
## Verification
- `npx tsc --noEmit` clean
- 54/54 canvas tab + SidePanel tests pass
- Will be live-verified on staging post-merge: open Files tab on an
external workspace (mac laptop) → expect placeholder; open on a
platform-owned workspace (Hongming Personal Brand Agent) → expect
normal tree (assuming PR-A also lands).
Refs #2999. Pairs with PR-A (backend EIC fix) — without PR-A the
platform-owned path still shows "0 files" because the backend never
returns rows.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
## User-visible bug
Canvas Files tab returns "0 files / No config files yet" for every
SaaS workspace, every root (/configs, /home, /workspace, /plugins).
Reported by user (canvas screenshot, hongming.moleculesai.app,
Hongming Personal Brand Agent — claude-code, T4, online).
## Root cause
`ListFiles` (templates.go) was missing the SSH-via-EIC branch that
ReadFile (PR #2785) and WriteFile (PR #1702) already have. On SaaS,
dockerCli is nil → findContainer returns "" → falls through to
host-side resolveTemplateDir which only matches baked-in template
names. For a user-named workspace it matches nothing, so the handler
silently returns []fileEntry{}.
DeleteFile had the same gap — right-click delete (introduced in PR-C
of this issue) would silently no-op once #1 was fixed.
## Fix
1. Extracted shared EIC plumbing into `withEICTunnel` (closure-based,
single SSOT for keypair → key push → tunnel → port-wait → cleanup).
Refactored writeFileViaEIC + readFileViaEIC to use it. Added
listFilesViaEIC + deleteFileViaEIC on the same scaffold. The
`LogLevel=ERROR` shim from PR #2822 now lives in one
`eicSSHSession.sshArgs()` helper instead of being duplicated per
helper — the next time we need to tweak ssh options, one place.
2. Factored remote shell strings into pure functions
(buildInstallShell / buildCatShell / buildRmShell / buildFindShell
+ parseFindOutput) so the wire shape can be pinned without booting
a real EIC tunnel.
3. Refactored `resolveWorkspaceFilePath(runtime, root, relPath)` to
honor `?root=`. New rule: `/configs` (or empty / unrecognized) →
runtime managed-config dir via workspaceFilePathPrefix (preserves
the v1 ReadFile/WriteFile behaviour where canvas's Config tab
GETs/PUTs config.yaml without specifying a root and lands in the
right per-runtime dir); `/home`, `/workspace`, `/plugins` →
literal absolute path on the EC2 host. List/Read/Write/Delete now
agree on what file a tree row points to — pre-fix List would say
"/home contents" but Read/Write would route to /configs.
4. ListFiles + DeleteFile dispatch on instance_id != "" → EIC helper.
Errors from the EIC path produce 500 (not silent fall-through to
local-Docker, which would mask the failure as "0 files" — the
exact user-visible symptom).
5. Added ?root= validation gate to WriteFile + DeleteFile so an
out-of-allowlist root is rejected before the resolver runs.
## Test coverage
- TestResolveWorkspaceFilePath_RuntimeIndirection — pins the
/configs → runtime prefix translation per-runtime (hermes,
claude-code, langgraph, external, unknown). Catches the regression
where a future edit accidentally drops the runtime indirection.
- TestResolveWorkspaceFilePath_LiteralRoots — pins /home,
/workspace, /plugins as literal pass-through regardless of
runtime. Catches the symmetric regression where the literal roots
start getting rewritten to the runtime prefix (which would mean
the FilesTab "/home" selector silently routes to /configs on
hermes).
- TestResolveWorkspaceRootPath — directory-only translation used
by listFilesViaEIC, same indirection rules.
- TestSSHArgs_HardenedFlags — pins the centralised ssh option set
(LogLevel=ERROR + hardening). Catches drift in the
one-place-where-ssh-flags-live.
- TestEicSSHSessionSingleSourceForSSHFlags — behaviour-based AST
gate (per memory). Counts s.sshArgs() callers (must be ≥4 —
list/read/write/delete) and asserts LogLevel=ERROR appears
exactly once in the source. Fires if anyone copy-pastes a raw
ssh args slice instead of going through the helper.
- TestBuildInstallShell / TestBuildCatShell / TestBuildRmShell /
TestBuildFindShell — pure-function tests pinning the remote
command shape. Catches regression like "rm -f silently becomes
rm -rf" or "find loses node_modules pruning" without needing a
real EC2.
- TestBuildFindShell_DepthForwarding — catches a regression where
the helper hard-codes a depth instead of using the caller's value.
- TestParseFindOutput / TestParseFindOutput_EmptyInput — pin the
TYPE|SIZE|REL parser. Empty-input case explicitly returns []
not nil so the JSON wire shape stays a list.
- TestListFiles_EICDispatch_Success / Error — sqlmock-driven
handler test. Verifies instance_id != "" routes to listFilesViaEIC
and surfaces errors as 500 (does NOT silently fall through to
local-Docker, which is the exact regression-mode of the original
bug).
- TestListFiles_EICBranch_NotTakenForSelfHosted — back-compat
guard: instance_id == "" must NOT enter the EIC branch (would
break self-hosted operators).
- TestDeleteFile_EICDispatch_Success / Error — same shape for
DeleteFile.
- TestListFiles_RootValidation / TestDeleteFile_RootValidation —
?root=/etc must 400 before any DB query or EIC call.
## Verification
- `go build ./...` clean
- `go test ./...` clean (full workspace-server suite)
- Will be live-verified against staging on hongming.moleculesai.app
after merge: open Files tab → expect populated /home + /configs +
/workspace listings (not "0 files"); right-click delete on
/configs/old.yaml → expect file removed on the EC2 host.
## Three weakest spots (hostile self-review)
1. The LogLevel=ERROR drift gate counts source occurrences. A
future refactor that intentionally moves the literal somewhere
else (e.g. into a constant) would trigger a false positive. The
gate's failure message points to the load-bearing constraint
(must appear in sshArgs); operator can adjust.
2. `eicFileWriteTimeout` constant kept as an alias for back-compat
with prior tests. Documented as intentional + safe to remove on
the next pass.
3. The resolver tests pin the runtime → prefix map values
(`/home/ubuntu/.hermes`, `/configs`, etc.). A future runtime
addition that ships a new prefix needs the test updated. This
is intentional — silent prefix changes orphan saved files, so a
test failure on map edit IS the right signal.
## Follow-up (RFC #2312 subtask 2)
Long-term the right fix is to drop EIC entirely and HTTP-forward to
the workspace's own URL (RFC #2312). That's a substantially larger
refactor across 5 surfaces (chat upload, files, templates, plugins,
terminal) and out of scope for this bug-fix PR. Tracked separately
under that RFC.
Refs #2999.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 15-min sweeper has been deleting stale e2e orgs but not the
orphan tunnels left behind when the org-delete cascade half-fails
(CP transient 5xx after the org row is gone but before the CF
tunnel delete completes). Result: tunnels accumulate in CF until
manual operator cleanup.
Add a final step that POSTs `/cp/admin/orphan-tunnels/cleanup`
every tick. Best-effort — failure doesn't fail the workflow; next
tick re-attempts. Output reports deleted_count + failed count for
ops visibility.
This is the catch-all for the orphan-tunnel class. The proper
upstream fix (transactional org delete) lives in CP and tracks as
issue #2989. Until that lands, the sweeper bounded-time-to-cleanup
keeps the leak from escalating.
Note: PR #492 (cf-tunnel silent-success fix) makes this step
actually effective — pre-fix DeleteTunnel silent-succeeded on
1022, so the cleanup endpoint reported success without deleting.
Post-fix the cleanup chains CleanupTunnelConnections + retry on
1022, which actually clears stuck-connector orphans.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Mirrors molecule-controlplane#494: the canonical EPHEMERAL_PREFIXES
list now lives in molecule-controlplane/internal/slugs/ephemeral.go,
where redeploy-fleet reads it to skip in-flight test tenants. The
sweep workflow keeps a Python copy because GHA Python can't import
Go, but a comment now points engineers updating the list to update
both files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #2990 root cause: the resolver SQL added `name` to the SELECT for
DisplayName plumbing, but the e2e test's sqlmock fixture
(expectChainQueryRoot at swap_test.go:216) still scripts the
3-column shape. Three e2e tests fail with:
sql: expected 3 destination arguments in Scan, not 4
Fix: bump the fixture to 4 columns (id, name, parent_id, depth) and
pass an empty name. The e2e tests don't assert on label rendering —
they pin the namespace string flow ("workspace:root-1" etc), which
is unchanged. Empty name is fine: ReadableNamespaces still emits the
correct namespace strings; only DisplayName is empty.
Caught by CI's Platform (Go) check on PR #2990 — would have been a
silent missed-coverage case in the resolver_test.go run because that
package doesn't import the e2e package.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
User feedback on the v2 Memory tab redesign: on a root workspace, the
namespace dropdown showed three indistinguishable entries:
Workspace (30ba7f0b)
Team (30ba7f0b) (team)
Org (30ba7f0b-b303-4a20-aefe-3a4a675b8aa4) (org)
For a root workspace, the resolver collapses workspace==team==org IDs
(resolver.go:113-122 derive() degenerate case). The previous
shortID(8)-truncated UUID label scheme made all three look identical
even though the three concepts (private / team-shared / org-wide)
remain semantically distinct.
## Backend — Resolver returns DisplayName
- SQL chain query now SELECTs workspaces.name (COALESCE → "" on NULL)
- chainNode carries .name through walk
- deriveNames() computes the display name for each namespace,
mirroring derive():
workspace: self.name
team: parent.name (or self.name if root — degenerate)
org: chain[end].name (root of tree)
- Namespace struct gets a new DisplayName field, omitempty wire-shape
## Backend — Handler renders label from DisplayName when present
- memories_v2.go:namespaceLabelWithName(name, kind, displayName) is
the new SSOT label generator. Falls back to the UUID-prefix shape
when displayName is empty so callers without name plumbing keep
working unchanged.
- namespacesToViews now plumbs Namespace.DisplayName into the label.
- Old namespaceLabel(name, kind) is preserved as a thin wrapper
around namespaceLabelWithName(_, _, "") for back-compat.
- Custom namespaces ignore displayName by design — operator-defined
suffixes ARE the chosen label; a name override would surprise.
## Frontend — drop redundant `(kind)` suffix
Pre-fix: "Team (mac laptop) (team)" — kind shown twice.
Post-fix: "Team (mac laptop)" — the prefix already conveys the kind.
## Test coverage
Resolver (3 new tests):
- DisplayName_Root: workspace name propagates to all 3 namespaces
- DisplayName_Child: workspace=self.name, team=parent.name, org=root.name
- DisplayName_EmptyOnNULL: COALESCE → "" → empty fallback
Handler (3 new tests):
- NamespaceLabelWithName_PrefersDisplayName: workspace/team/org/custom paths
- NamespaceLabelWithName_FallsBackToUUIDPrefix: empty displayName → legacy shape
- NamespacesToViews_PassesDisplayNameThrough: full integration on root case
Canvas: existing 30 tests still pass; suffix drop is rendering-only.
memories_v2.go function coverage: **14/14 = 100%**
- namespaceLabelWithName: 100%
- namespacesToViews: 100%
- (all 11 pre-existing functions stay at 100%)
## SSOT
The "what is this namespace called" question now has one source of
truth: namespace.Resolver.ReadableNamespaces sets DisplayName from the
canonical workspace.name column. The handler is a renderer; the
canvas is a consumer. No name-lookup logic duplicated across the
three layers.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Closes the silent-block failure mode that left 25 commits — including
the Memory v2 redesign and the reno-stars data-loss fix — wedged on
staging for 12+ hours behind a single missing review. The auto-promote
workflow opened the PR + armed auto-merge, but main's branch protection
required a human review and nobody noticed until a user reported
"still seeing old memory tab".
## Detection logic — `scripts/check-stale-promote-pr.sh`
Reads open PRs `base=main head=staging` and alarms on:
- `mergeStateStatus == BLOCKED`
- `reviewDecision == REVIEW_REQUIRED`
- createdAt older than `STALE_HOURS` (default 4h)
Other BLOCKED reasons (DIRTY, BEHIND, failed checks) are NOT alarmed —
those are the author's signal-to-fix. This script targets the specific
"no human reviewed yet" wedge.
Output:
- `::warning` per stale PR (visible in workflow summary + Actions UI)
- PR comment (idempotent via marker-string detection; one alarm
per PR, never re-spammed)
- Exit code = count of stale PRs (capped at 125)
Logic in a script (not inline workflow YAML) so it's:
- **Unit-testable** — tests/test-check-stale-promote-pr.sh exercises
every branch with stubbed fixture JSON + frozen clock. 23 tests
covering: empty list, single stale, just-under-threshold, wrong
reviewDecision, wrong mergeStateStatus, mixed list (only matching
PRs alarm), custom threshold via --stale-hours, exit-code-counts-
matching-PRs, --help, unknown arg → 64, missing repo → 2.
- **Operator-runnable ad-hoc** — `scripts/check-stale-promote-pr.sh`
works from any shell with `gh` + `jq`.
- **SSOT** — one detector, the workflow YAML is just schedule +
invocation surface. Future sibling workflows that need the same
check call the same script.
## Workflow — `.github/workflows/auto-promote-stale-alarm.yml`
Triggers:
- cron `27 * * * *` (hourly, off-the-hour to dodge cron herd)
- workflow_dispatch with `stale_hours` + `post_comment` overrides
Concurrency: `auto-promote-stale-alarm` group, cancel-in-progress=false
(idempotent script; no benefit to cancelling a running scan).
Permissions: `contents: read` + `pull-requests: write` (post comments).
Sparse checkout — only fetches `scripts/check-stale-promote-pr.sh`.
No node_modules, no go modules, no slow setup steps. Workflow runs
in <30s on a clean repo.
## Why "alarm + comment" not "auto-approve"
Considered options in issue #2975:
1. Slack/email alert — picked.
2. Bot-account auto-approve via molecule-ops — circumvents the
human-review gate that branch protection encodes.
3. Trusted-promote bypass via CODEOWNERS — needs Org Admin config
change; out of scope for a workflow PR.
The comment-on-PR pattern picks (1) without external dependencies
(no Slack token, no email config). Subscribers get notified via
GitHub's existing PR notification delivery; the warning shows up in
the Actions feed.
## Why this won't false-positive on legitimate slow reviews
Threshold is 4h. Most legitimate gates clear in <1h, so 4× headroom
is plenty for slow CI. The comment is idempotent (one alarm per PR,
never re-posted) — adding noise stops at 1 comment regardless of
how long the PR sits.
## Test plan
- [x] `bash scripts/test-check-stale-promote-pr.sh` — 23/23 pass
- [x] `python3 -c 'yaml.safe_load(...)'` clean
- [x] `bash -n` clean on both scripts
- [ ] Live verification: dispatch the workflow once main has caught up,
confirm it correctly reports zero stale PRs
Reported on production reno-stars 2026-05-05 (browser console):
/workspaces/d76977b1-…/files/config.yaml:1
Failed to load resource: the server responded with a status of 404
The workspace was an external-runtime mac-mini-style agent that
doesn't use the platform's config.yaml template — every Config tab
open issued a GET that 404d cleanly, and the existing catch block
fell into the runtime-manages-own-config branch + populated the
form from workspace metadata. Functionally correct, but the request
fired anyway, surfaced as a 404 in DevTools, and burned an RTT.
Fix: branch on RUNTIMES_WITH_OWN_CONFIG BEFORE the fetch — when the
workspace's runtime is one of those (external, hermes), skip the
GET, populate the form from workspace metadata directly, set
loading=false, return. Same code path as the existing 404-catch
fallback, just skipping the wasted request.
Behavior preserved for runtimes that DO use the template
(claude-code, etc.): unchanged GET → parse → setConfig flow.
Tests: 24/24 existing ConfigTab tests pass; no behavioral change for
the documented runtimes. tsc clean.
Refs reno-stars production 2026-05-05.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes#2973 — the followup test gap I flagged on PR #2968's review.
Pre-merge #2968 added the platform-pending: URI scheme branch to
resolveAttachmentHref + introduced the isPlatformAttachment SSOT
helper, but the existing uploads.test.ts only covered the older
workspace: / file:/// / absolute-path branches. The new branch shipped
on prod-impact (live console error on reno-stars) with manual post-
deploy verification; the regression gate was filed as a followup
(#2973) so a future canvas refactor can't silently re-break the
poll-mode chat-attachment download path.
Adds 15 new test cases across two existing describe blocks:
resolveAttachmentHref — platform-pending: scheme (poll-mode uploads):
- well-formed platform-pending:<wsid>/<fileid> resolves to the
/pending-uploads/<file>/content endpoint
- uses the URI's wsid, NOT the chat workspace_id (cross-workspace
forwarding case — pinning the explicit decision from #2968's
commit message so a regression that flipped this would mis-route
the download to the wrong workspace's pending-uploads store)
- defensive fallback to raw URI on missing slash, empty fileID,
empty wsid (so a future "helpful" change can't synthesize a
broken /pending-uploads// path)
- regression test against the EXACT production repro from #2968's
body (reno-stars, 2026-05-05 console error)
isPlatformAttachment:
- positive cases for platform-pending: (well-formed and malformed),
workspace:<allowed-root>, file:///<allowed-root>, absolute paths
under allowed roots
- NEGATIVE cases for HTTPS/HTTP URLs to other origins (auth-leak
class regression — a helper that always returned true would
attach workspace tokens to third-party requests), non-allowlisted
roots like /etc/passwd or /var/log/x, empty string, and
unrecognised schemes (s3://, ftp://)
All 21 tests pass. The 6 pre-existing tests are unchanged. The 15
new tests are the regression gate that #2973 asked for.
Verification:
- pnpm exec vitest run src/components/tabs/chat/__tests__/uploads.test.ts
→ 21 passed
The drift gate caught the new SSOT parser module — without registration
the wheel ships it un-rewritten and runtime imports fail. Same pattern
as inbox_uploads, a2a_tools_delegation, a2a_tools_rbac registrations.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously Phase 3 only checked the workspace-server's poll-mode short-circuit
emit shape ({"status":"queued","delivery_mode":"poll","method":"..."}); the
matching client-side classification was tested in isolation against fixture
dicts in test_a2a_response.py.
This phase closes the loop by piping the actual on-the-wire response from a
real workspace-server back through the wheel's a2a_response.parse() and
asserting it classifies as the Queued variant with the right method +
delivery_mode. A regression in EITHER the server emit shape OR the client
parser will now fail this E2E, eliminating the gap that allowed the original
"unexpected response shape" production bug to ship despite green unit tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported on production 2026-05-05:
agent plugin tab Plugins
0 installed
+ Install Plugin
this part should be default compact
Pre-fix: SkillsTab always rendered the Plugins section as a full
rounded-xl panel with vertical chrome — even when zero plugins were
installed and the registry browser was closed. The empty state
gave a lot of vertical real estate for content that's just "0
installed + Install button".
Fix: when installed.length === 0 AND registry closed AND initial
load completed, collapse the section into a single inline pill
("Plugins · 0 installed · + Install Plugin"). The full panel
re-mounts when:
- installed.length > 0 (a plugin landed → expand to surface the list)
- showRegistry === true (user clicked + Install Plugin → registry opens)
- !installedLoaded (avoid flash; the loading shell shows instead
until the first /plugins fetch resolves)
Accessibility:
- Compact pill: aria-label="Plugins (none installed)" + button
aria-expanded="false" + aria-controls="plugins-section"
- Full panel: button aria-expanded={showRegistry} + same aria-controls
- Section gets id="plugins-section" so the aria-controls reference
resolves once the section mounts
External workspaces: this is a pure canvas-frontend layout change —
applies to ALL workspace runtimes (external, claude-code, hermes,
langchain, codex, third-party MCP). No server-side change needed.
Tests
-----
SkillsTab.compactEmpty.test.tsx (4 tests):
- Compact pill renders when installed=0, registry closed, loaded
- Full panel renders when installed > 0
- Click + Install Plugin from compact → expands to full panel
(verified via aria-controls target id appearing in the DOM)
- During initial load (installedLoaded=false), compact pill does
NOT render — avoids a compact→full flash as the load completes
Per memory feedback_oss_design_philosophy.md: the SkillsTab is the
only tab that needs compact-empty today, but the pattern is
extractable into a shared EmptyStateCompactWrapper if Schedules /
Memories / Approvals adopt the same affordance later. Don't generalise
until the third use case (per the same memory, "every refactor toward
OSS plugin shape" without premature abstraction).
Verified
- tsc --noEmit clean
- All 4 tests pass
Refs #2971.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduce ``workspace/a2a_response.py`` as the single source of truth for
the wire shapes the workspace-server proxy can return at
``/workspaces/<id>/a2a``:
* ``Result`` — JSON-RPC success
* ``Error`` — JSON-RPC error or platform-level error (with
restart-in-progress metadata when present)
* ``Queued`` — poll-mode short-circuit envelope: the platform
queued the message into the target's inbox, the
target will fetch via /activity poll
* ``Malformed`` — anything the parser can't classify (logged at
WARNING so a future server change is loud)
``send_a2a_message`` (in ``a2a_client.py``) now dispatches via
``a2a_response.parse(data)`` instead of inline ``"result" in data`` /
``"error" in data`` sniffing. The Queued variant returns a new
``_A2A_QUEUED_PREFIX`` sentinel so callers can distinguish "delivered
async, no synchronous reply" from both success-with-text and failure.
reno-stars production data caught two intermittent failures that
both reduced to the same root cause:
1. **File transfer announce silently failed** — when CEO Ryan PC
(poll-mode external molecule-mcp) sent the harmi.zip
announcement to Reno Stars Business Intelligent (also poll-mode
external), ``send_a2a_message`` saw the platform's poll-queued
envelope ``{"status":"queued","delivery_mode":"poll","method":"..."}``,
didn't recognize it as the synthetic delivery-acknowledgement
it is, and returned ``[A2A_ERROR] unexpected response shape``.
The agent fell back to a chunk-shipping path; receiver did get
the file but operator-facing logs showed a failure that didn't
actually fail.
2. **Duplicated agent comm** — same bug, inverted direction. d76
delegated to 67d, send_a2a_message returned the unexpected-shape
error, delegate_task wrapped it as DELEGATION FAILED, the calling
agent retried with sharper wording, the recipient saw the same
request twice and self-reported "二次请求 — 我先不执行".
External molecule-mcp standalone runtimes are inherently poll-mode
(they have no public URL), so every external↔external A2A pair was
hitting this on every send. The pre-fix client only handled JSON-RPC
``result``/``error`` keys and treated the queued envelope (which has
neither) as malformed. RFC #2339 PR 2 added the queued envelope on
the server side; the client never caught up.
When ``send_a2a_message`` returns the ``_A2A_QUEUED_PREFIX`` sentinel,
``tool_delegate_task`` now transparently falls back to
``_delegate_sync_via_polling`` (RFC #2829 PR-5's durable
``/delegate`` + ``/delegations`` polling path, which DOES work for
poll-mode peers because the platform's executeDelegation goroutine
writes to the inbox queue and the result row arrives when the target
picks it up + replies). The agent gets a real synchronous reply
instead of the empty queued sentinel.
* ``test_a2a_response.py`` — 62 tests, **100% line coverage** on
the parser (verified via ``coverage run --source=a2a_response``).
Includes adversarial-input fuzzing across ~25 pathological
payloads — parser must never raise.
* ``test_a2a_client.py::TestSendA2AMessagePollMode`` — 4 tests for
the new Queued/Error wiring in ``send_a2a_message``.
* ``test_delegation_sync_via_polling.py::TestPollModeAutoFallback``
— 3 tests for the auto-fallback in ``tool_delegate_task``,
including negative cases (push-mode reply must NOT trigger
fallback; genuine error must NOT silently retry).
* **Verified all new tests FAIL on pre-fix source** by stashing
a2a_client.py + a2a_tools_delegation.py and re-running — 5
failures including ImportError for the missing
``_A2A_QUEUED_PREFIX``.
Per the operator-debuggability directive:
* INFO at every Queued classification (expected variant; operator
sees normal poll-mode-peer queueing in log stream).
* INFO at the auto-fallback decision in ``tool_delegate_task``
so a future operator can correlate "send returned queued →
falling back to polling path" without reading the source.
* WARNING at every Malformed classification (server contract
drift; operator MUST see this immediately).
* Existing transient-retry WARNING preserved.
* Mirror Go-side typed model in workspace-server. The wire shape
is documented in ``a2a_response.py``'s module docstring with
file:line pointers to the canonical emitters; a future PR can
introduce ``models/a2a_response.go`` without changing wire
behavior. The fixture corpus in ``test_a2a_response.py`` is
designed so a one-sided edit breaks CI.
* ``send_message_to_user`` and ``chat_upload_receive`` use a
different endpoint (``/notify``) and aren't affected by this
bug; their parsing stays unchanged.
* 135 tests pass across ``test_a2a_response.py`` +
``test_a2a_client.py`` + ``test_delegation_sync_via_polling.py``
+ ``test_a2a_tools_impl.py``.
* ``coverage run --source=a2a_response -m pytest`` reports 100%
line coverage with 0 missing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
workspace-server's a2a_proxy poll-mode short-circuit returns
{status: "queued", delivery_mode: "poll", method: <a2a_method>}
when the peer has no URL to dispatch to (poll-mode peers, including
every external molecule-mcp standalone runtime). The bare
send_a2a_message parser only knew about JSON-RPC {result, error}
keys, so this envelope fell through to the "unexpected response shape"
error path. Two production symptoms on the reno-stars tenant traced
to it:
1. File transfer logged as failed when it actually succeeded —
operator-facing logs showed an A2A_ERROR but the receiving
workspace did get the chunked file via the agent's fallback path.
2. delegate_task retried after the false failure → peer received
duplicate delegations → conversation got confused, the second
peer self-diagnosed in a notify ("⚠️ Peer 二次请求 — 我先不执行").
Add a third branch to the parser, BETWEEN the existing JSON-RPC
{result, error} cases and the catch-all "unexpected" fallback. The
queued envelope is delivery-acknowledged-but-pending-consumption —
not an error — so it returns a clean success string the agent can
render as a normal outcome. The success string includes "queued"
and "poll" so an operator scanning logs sees the routing path
without parsing JSON.
Defensive: the new branch only fires when BOTH status="queued" AND
delivery_mode="poll" are present. A partial envelope (one key
missing) still falls through to the catch-all, so a future server
bug that emits a malformed shape gets surfaced instead of silently
swallowed.
Tests:
- test_poll_queued_envelope_returns_success_string — pins the canonical
envelope returns a non-error string. Discriminating: verified to FAIL
on old code (returned [A2A_ERROR] string), PASS on new.
- test_poll_queued_envelope_with_other_method — pins the parser doesn't
hardcode message/send. Discriminating: also FAILS on old code.
- test_status_queued_without_poll_mode_still_falls_through — pins both
keys are required (defensive against future server bugs).
12 existing tests in TestSendA2AMessage still pass — no regression.
Scope: hotfix for the bare send_a2a_message path. The full SSOT
typed-A2AResponse refactor (#158-#163, parents under #2967) covers the
broader vocabulary alignment between Go server and Python client. This
PR ends the production symptoms now without preempting that work.
Followup to PR #2966. The user reported the about:blank symptom on
reno-stars and the browser console showed:
Failed to launch 'platform-pending:d76977b1-…/bb0dcaf3-…' because
the scheme does not have a registered handler.
So the agent's "download link" was a `platform-pending:<wsid>/<file_id>`
URI — the canonical reference for poll-mode chat uploads (see
workspace-server/internal/handlers/chat_files.go:690 +
workspace/inbox_uploads.py). PR #2966 only handled `workspace:`,
`file:///`, and absolute container paths; the platform-pending
scheme fell through to the raw URI which the browser couldn't
navigate to.
Fix
---
- `resolveAttachmentHref`: added a `platform-pending:` branch that
resolves to `${PLATFORM_URL}/workspaces/<wsid>/pending-uploads/
<file_id>/content`. Uses the wsid from the URI, NOT the chat's
workspace_id — these can differ when a file is forwarded across
workspaces (cross-workspace delegation, agent forwarding).
- New `isPlatformAttachment(uri)` helper — single source of truth
for "this URI requires our auth headers, route through
downloadChatFile". Used by both `downloadChatFile` (chip click)
and ChatTab's markdown-link override.
- ChatTab.tsx markdown-link override now imports
`isPlatformAttachment` instead of duplicating the scheme list.
Pre-fix this list was duplicated and missed `platform-pending:`.
Tests
-----
The 4 IME tests still pass; tsc clean. The platform-pending resolution
is exercised via the `isPlatformAttachment` SSOT helper (any URI
reaching `downloadChatFile` or the markdown override goes through
it). A dedicated test for the URL shape would need a more elaborate
fixture; manual verification on staging post-deploy is the practical
gate.
Reported on production reno-stars 2026-05-05.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two production-reported regressions in the same chat surface, fixed
in one focused PR.
Issue 1 — IME composition + Enter sends half-typed message
----------------------------------------------------------
ChatTab's textarea onKeyDown was:
if (e.key === "Enter" && !e.shiftKey) {
e.preventDefault();
sendMessage();
}
For agents typing CJK / Japanese / Korean via the system IME, Enter
commits the candidate selection — not a newline, not a send. With
the old check, every IME-commit Enter accidentally sent the
half-typed message ("你好" + half-typed-pinyin + Enter to commit
the next candidate → message goes out before the user finishes).
Fix: guard on `event.nativeEvent.isComposing` AND `e.keyCode !== 229`.
The latter covers older Safari / WebKit-based mobile browsers that
delay setting isComposing on the composition-end Enter.
Issue 2 — markdown links land at about:blank
---------------------------------------------
ReactMarkdown's default `<a>` rendering passes the agent-supplied
href directly to the DOM with no target / scheme handling:
- http(s) → navigates the canvas tab away (canvas state lost)
- workspace://path / file:///workspace/... / /workspace/... →
browser hits unhandled-protocol click → about:blank, no
download (the reported bug)
Fix: ReactMarkdown `components.a` override:
- In-container paths (workspace:, file:///{workspace,configs,home,
plugins}, bare /{workspace,configs,...}) → preventDefault, route
through downloadChatFile (same auth path the AttachmentChip
uses). Filename is derived from the path's last segment.
- External (http/https/mailto/unknown scheme) → target="_blank"
rel="noopener noreferrer" so canvas state survives.
Tests
-----
ChatTab.imeAndLinks.test.tsx (4 tests):
- Enter with isComposing=true → does NOT send, input preserved
- Enter with keyCode=229 (older-Safari IME) → does NOT send
- Enter with no IME signal → DOES send (happy path intact)
- Shift+Enter → does NOT send (newline path intact)
The link-component override is exercised through the full ChatTab
render — the IME tests are jsdom-only and don't load chat history
with markdown messages, so the link test would need a more elaborate
fixture. Manual verification on staging post-deploy is the practical
gate; if the link test grows critical the AttachmentViews-style chip
test can extend.
Verified:
- tsc --noEmit clean
- 4/4 IME tests pass
Reported on production 2026-05-05.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pre-RFC-#2945, every BroadcastOnly / RecordAndBroadcast call site
passed a bare string literal:
h.broadcaster.BroadcastOnly(workspaceID, "AGENT_MESSAGE", payload)
29 producers (Go, ~30 call sites in handlers/, scheduler/, registry/,
bundle/) and ~30 canvas consumers (TS store + listeners) duplicated
the same string with no shared definition. A producer renaming an
event silently broke every consumer — same drift class that produced
the reno-stars data-loss regression on the persistence side. PR-A
fixed the persistence-side SSOT (AgentMessageWriter); PR-B fixes the
event-name SSOT.
What this PR ships
internal/events/types.go
- EventType typed string + 29 named constants covering the full
taxonomy (chat / lifecycle / agent assignment / delegation /
task / approval / auth).
- Grouped semantically; new constants must be added here AND
mirrored in canvas/src/lib/ws-events.ts (parity gate landing
in PR-B-2 follow-up).
- AllEventTypes slice — authoritative list for the snapshot
test + the cross-language parity gate.
internal/events/types_test.go (3 tests)
- TestAllEventTypes_IsSnapshot: pins the canonical list. Adding
a new constant without updating AllEventTypes (or vice versa)
fails with a one-line diff.
- TestEventType_NoEmptyConstants: catches accidentally-empty
values (typo in types.go: const X EventType = ...).
- TestEventType_AllUppercaseSnakeCase: pins the wire format that
canvas TS switch statements assume (no kebab-case, no mixed
case, no leading/trailing/double underscores).
agent_message_writer.go (single migration)
- Demonstrates the constant-usage shape:
events.EventAgentMessage → "AGENT_MESSAGE"
- Other ~30 call sites stay on bare strings for now (this PR
narrow); the migration happens in PR-B-1 follow-up. Both
shapes (constant + bare string) co-exist on the wire — the
typed version is just the recommended path for new code.
Why ship this in stages
1. PR-B (this): types + tests + first migration → MERGEABLE NOW,
low risk.
2. PR-B-1 (follow-up): migrate the remaining ~30 call sites to
constants. Mechanical, low-risk.
3. PR-B-2 (follow-up): canvas/src/lib/ws-events.ts mirror + cross-
language parity gate. Touches both repos.
Per memory feedback_oss_design_philosophy.md (every refactor toward
OSS plugin shape) — this surface is now plugin-safe: external
implementations can import the events package and get the same
named taxonomy without copying strings.
Verified
- go vet ./internal/events/ clean
- go build ./... clean
- TestAllEventTypes_IsSnapshot + TestEventType_* all pass
- TestAgentMessageWriter_* (the only call site touched) still green
Refs RFC #2945, PR #2949 (PR-A SSOT), PR #2944 (reno-stars).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous byte-slice form `s[:previewCap]` could split a multi-byte
codepoint at byte 4096, producing invalid UTF-8. Postgres JSONB rejects
the row → ledger insert silently fails → audit gap on dashboards while
activity_logs continues to record the event.
Walk the string by rune index and stop at the last boundary that fits
inside the cap. ASCII-only strings still hit the cap exactly; CJK/emoji
strings stop slightly under, never over.
Mirrors the truncatePreviewRunes fix shipped for agent_message_writer
in #2959. Followup: deduplicate into a shared helper once both have
landed.
Tests: 2 regression tests using utf8.ValidString — one with an all-3-byte
rune string just over the cap, one with a single multi-byte rune sitting
exactly on the boundary. Verified on the previous byte-slice impl: both
new tests would fail (invalid UTF-8 + truncation past cap by 1 byte).
Two issues caught in five-axis self-review of #2956:
## 1. Drop speculative source_workspace_id rendering
The panel rendered a "from peer" badge based on
`propagation.source_workspace_id`, claiming it surfaced cross-
workspace propagation. But the OpenAPI spec at
docs/api-protocol/memory-plugin-v1.yaml documents `propagation` as
"Opaque metadata the plugin stores and returns. Reserved for future
cross-namespace propagation semantics" — and a grep across
workspace-server/internal/memory/ confirms NO writer in the codebase
populates that key. The badge would never render against real data.
Violates "don't design for hypothetical future requirements" from
the project conventions. Drop the field from MemoryV2, the row badge,
the test fixtures, and the JSDoc. When propagation gains a concrete
shape, re-add backed by an actual writer.
## 2. Tighten 503 detection — match the literal contract string
Pre-fix detection: `msg.includes('503') || msg.toLowerCase().includes('plugin is not configured')`
False-positives on any unrelated 503 + on any error mentioning
"plugin" + "configured" in any order.
Post-fix: `msg.includes('MEMORY_PLUGIN_URL')` — the env var name is a
hard-coded literal in workspace-server/internal/handlers/memories_v2.go's
available() error, so this is a pinned cross-layer contract. Drift
between the Go error message and the canvas detection now fails
loud (TestMemoriesV2_PluginUnwired_All503 asserts the env var name
in the response body; the canvas test asserts the same).
Extracted as a named export `isPluginUnavailableError` so the
detection is unit-testable and reusable. Added 4 direct tests:
contract-string match, generic-503 false-negative, 401 false-
negative, non-Error inputs.
## Test results
- 30 component tests pass (was 26; +4 for isPluginUnavailableError)
- Coverage on MemoryInspectorPanel.tsx: 100% lines, 100% functions
(branch coverage up to 85.9% from 84.7% — speculative-field
branches no longer count)
- Full canvas suite: 1277/1277 pass across 91 files
Self-review caught after #2954 landed: check_register() POSTed to
/registry/register with agent_card.name="doctor-probe". The endpoint
is an UPSERT, so the doctor probe overwrites the workspace's actual
agent_card metadata until the real agent's next register call. An
operator running `molecule-mcp doctor` against a live workspace
would see their canvas briefly display "doctor-probe" as the agent
name — invisible production-disruption.
Switches to POST /registry/heartbeat. heartbeat only updates
last_heartbeat_at (and clears awaiting_agent if needed) — the same
work a normal molecule-mcp boot does every 20s in steady state, so
the doctor's extra heartbeat is indistinguishable from background
traffic.
Function renamed check_register → check_token_auth to match what
it actually does. check_register kept as back-compat alias so any
external test/import still resolves.
Also unified the duplicated token-resolution paths into a single
_resolve_token() returning (value, source_label). Pre-fix:
check_register and _resolve_token_summary read env in parallel
ladders — a future env-var addition would have to touch both.
New tests:
- test_check_token_auth_uses_heartbeat_endpoint: mocks urlopen,
asserts the URL ends in /registry/heartbeat AND does NOT
contain /registry/register. Pins the load-bearing invariant
so a future refactor can't silently re-route through register.
- test_resolve_token_returns_value_and_label_for_env: pins the
consolidated resolver returns both pieces of info from the
same source-decision.
- test_resolve_token_returns_none_when_missing: missing-env
happy path.
Verification:
- 13/13 tests pass (10 existing + 3 new)
- Manual stripped-env run still renders 4 FAIL + 2 WARN with
actionable hints, exit 1.
Refs molecule-core#2934 item 6 (doctor side-effect fix-up).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Self-review of PR #2949 surfaced two pre-existing defects that the
SSOT consolidation inherited from the original /notify handler. Both
are addressable in a small follow-up; shipping them as a separate PR
keeps the consolidation and the bug-fix individually reviewable.
Critical: byte-slice preview truncation produces invalid UTF-8
-------------------------------------------------------------
Pre-fix:
if len(preview) > 80 {
preview = preview[:80] + "…"
}
`len()` returns BYTES; `preview[:80]` slices on a byte boundary. For
agent-authored chat in CJK / emoji / accented characters, byte 80
lands mid-codepoint → invalid UTF-8 → Postgres JSONB rejects → INSERT
fails → activity_log row never written → message vanishes from chat
history on the next reload. The persistence-failure log fires but
operators have to grep to find it, and the user-visible regression
mode is identical to reno-stars.
Fix: extract `truncatePreviewRunes(s, maxRunes)` that walks the rune
boundary using `for i := range s` (Go's range over string yields rune
start indices). Cap at 80 RUNES not bytes — UI-friendly count, not
storage count.
Important: workspace-lookup error path swallows real DB errors
--------------------------------------------------------------
Pre-fix:
if err := w.db.QueryRowContext(...).Scan(&wsName); err != nil {
return ErrWorkspaceNotFound
}
Conflates `sql.ErrNoRows` (legit not-found → caller 404) with real
DB errors (connection drop, query timeout, pool exhaustion → caller
should 503). During a Postgres outage every notify call surfaced as
"workspace not found" — masking the actual incident in alerting and
making the symptom indistinguishable from "you typed a bad workspace
ID".
Fix: distinguish via `errors.Is(err, sql.ErrNoRows)` and wrap
non-not-found errors with `fmt.Errorf("agent_message: workspace
lookup: %w", err)`. Callers' existing fallback path (return 500 /
return error wrapped) handles the new shape correctly without any
changes — verified by running existing TestNotify_* and
TestMCPHandler_SendMessage_* tests.
Tests added (3 new, 11 total writer tests)
------------------------------------------
- TestTruncatePreviewRunes_RuneBoundary: 8-case table — ASCII, CJK,
exactly-at-max, emoji prefix. Asserts both correct visible output
AND `utf8.ValidString` on every result so the bug shape (invalid
UTF-8) can't recur.
- TestAgentMessageWriter_Send_NonASCIIMessagePersists: end-to-end
with a 200-rune CJK message (exceeds the 80-rune cap, would have
hit the byte-slice bug). Pins the INSERT summary contains valid
UTF-8 with exactly 80-rune body + ellipsis.
- TestAgentMessageWriter_Send_DBErrorOnLookupReturnsWrapped: pins the
DB-outage path returns a wrapped non-ErrWorkspaceNotFound error so
alerting can distinguish 404 from 503. Verified via mock
ExpectQuery returning a transient error.
Verified
--------
- `go vet ./internal/handlers/` clean
- `go build ./...` clean
- All 14 writer + caller tests pass (8 original + 3 new + AST gate +
TestNotify_* + TestMCPHandler_SendMessage_* sibling tests)
Per memory feedback_assert_exact_not_substring.md: every new test
asserts boundary behavior directly (UTF-8 validity, exact rune count,
errors.Is comparison) rather than substring-match in stringified
output.
Refs RFC #2945, PR #2949, PR #2944.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The External Connect modal's Codex and OpenClaw tabs were rendering
this MCP server config:
command = "python3"
args = ["-m", "molecule_runtime.a2a_mcp_server"]
That spawns the bare MCP dispatcher with no presence wiring. The
``molecule-mcp`` console-script wrapper (mcp_cli.main) is what calls
``POST /registry/register`` at startup and runs the 20s heartbeat
thread alongside the MCP stdio loop. Without the wrapper, the canvas
flips the workspace back to ``awaiting_agent`` (OFFLINE) within
60-90s — even while tools work — because nothing is heartbeating.
Operator-side this looks like: the workspace is registered and tools
work fine when invoked, but the canvas shows "offline" / "Restart"
CTA, peer agents see the workspace as awaiting_agent in list_peers
output, and inbound A2A delivery silently fails the readiness check.
A new external-Codex operator (#2957) hit this and spent debugging
time on what should have been a copy-paste install.
Fix: switch both Codex and OpenClaw templates to
``command = "molecule-mcp"`` / ``args = []``, matching the universal
MCP template that already handles this correctly. Inline comment in
each template explains the wrapper-vs-bare-module tradeoff so a
future template author doesn't regress to the shorter form.
Hermes-channel intentionally still spawns the bare module — the
hermes plugin owns the platform plugin path and runs its own
register_platform/heartbeat code in-process; double-heartbeating
would race. Universal/Codex/OpenClaw all need the wrapper.
Regression gate: TestExternalMcpTemplates_UseMoleculeMcpWrapper
asserts the three templates that must use the wrapper actually do,
and explicitly fails on the old ``-m molecule_runtime.a2a_mcp_server``
shape. Verified the test FAILS on pre-fix source by stashing only
external_connection.go and re-running.
Source: molecule-core#2957 issue 1 (item 4 of the report — the
``(codex returned empty output)`` / opaque-canvas-error / stale-
session items live in codex-channel-molecule and are tracked
separately).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>