molecule-core

Author	SHA1	Message	Date
Hongming Wang	656a02fae4	fix(textutil): SSOT for rune-safe string truncation, fix 3 audit-gap bugs Closes #2962. ## Why Six per-package `truncate` helpers had drifted into independent re-implementations of the same idea. Three of them (delegation.go, memory/client/client.go, memory-backfill/verify.go) used `s[:max] + "…"` byte-slice form, which on a multi-byte codepoint at byte `max` produces invalid UTF-8 → Postgres `text`/`jsonb` rejects the INSERT silently → `delegation` / `activity_logs` row never lands → audit gap. Three other helpers (delegation_ledger.go #2962, agent_message_writer.go #2959, scheduler.go #2026) had each been fixed in isolation with three slightly different rune-safe shapes — confirming this is a class of bug, not a single instance. ## What New package `internal/textutil` with three rune-safe functions: - `TruncateBytes(s, maxBytes)` — byte-cap, "…" marker. Used by 5 callers writing into byte-bounded columns / log lines. - `TruncateBytesNoMarker(s, maxBytes)` — byte-cap, no marker. Used by delegation_ledger.go where the storage already conveys "preview" and an extra ellipsis would push the result over the column cap. - `TruncateRunes(s, maxRunes)` — rune-cap, "…" marker. Used by agent_message_writer.go where the cap is in display chars (UI summary), not bytes. All three guarantee `utf8.ValidString(out)` for any `utf8.ValidString(in)`. Inputs already invalid go through `sanitizeUTF8` at the call site boundary (scheduler.go preserved this defense-in-depth). ## Migration map
| Old | New | Behavior change |
|---|---|---|
| `delegation_ledger.truncatePreview` | `textutil.TruncateBytesNoMarker(s, 4096)` | none |
| `agent_message_writer.truncatePreviewRunes` | `textutil.TruncateRunes(s, n)` | none |
| `scheduler.truncate` | `textutil.TruncateBytes(s, n)` | "..." → "…" (3 bytes either way; single-glyph display) |
| `delegation.truncate` | `textutil.TruncateBytes(s, n)` | bug fix + ellipsis swap |
| `memory/client.truncate` | `textutil.TruncateBytes(s, n)` | bug fix |
| `memory-backfill.truncate` | `textutil.TruncateBytes(s, n)` | bug fix |
Five separate `truncate*` helpers + their per-package tests removed. Net: 12 files / +427 / -255. ## Tests - `internal/textutil/truncate_test.go` — 27 table-test cases + 145 fuzz-invariant cases asserting `utf8.ValidString` and byte-cap invariants on every output. - `delegation_ledger_test.go TestLedgerInsert_TruncatesOversizedPreview` strengthened with `capValidUTF8Matcher` so the SQL-write argument is asserted to be valid UTF-8 + within cap (not just `AnyArg()`). Mutation-tested: replacing the SSOT call with byte-slice form makes this test fail loud. ## Compatibility - All callers internal; no external API surface change. - Ellipsis swap "..." → "…": same byte budget (3 bytes), single-glyph display. No alerting/grep on either marker in this codebase (verified). Canvas renders both correctly. - DB column widths unchanged (4096 / 80 / 200 / 256 / 300 — all preserved in the migrations). ## Security Fixes a silent INSERT-failure mode that hid `activity_logs` / `delegations` rows containing peer-controlled text. The class of input that triggered it (CJK, emoji, accented Latin) is normal user content, not malicious — but the symptom (audit gap) makes incident reconstruction harder. Helper is pure-function over `string`; no secrets / PII / auth handling involved. Untrusted input is handled identically to before, just rune-aligned now. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 23:01:21 -07:00
Hongming Wang	c53155ec5f	Merge pull request #3014 from Molecule-AI/test/cross-table-atomicity-integ-149-followup test(chat-uploads): integration test for cross-table atomicity (#149 follow-up)	2026-05-06 05:05:49 +00:00
Hongming Wang	debe29c889	ci(handlers-postgres-integration): apply legacy .sql migrations too The migration-replay step globbed only .up.sql, silently skipping the older flat-naming migrations (001_workspaces.sql, 009_activity_logs.sql, etc.). Fine while no integration test depended on those tables; broke when the #149 cross-table atomicity test came in needing both workspaces (FK target for activity_logs) and activity_logs themselves. Switch to globbing .sql + sorted lex-order, excluding .down.sql so up/down pairs don't undo themselves mid-run. Add a sanity check for workspaces + activity_logs + pending_uploads alongside the existing delegations gate so a future migration drift fails loud instead of silently skipping the regressed test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 22:02:24 -07:00
Hongming Wang	7a39a08837	test(chat-uploads): integration test for cross-table atomicity (#149 follow-up) Adds two real-Postgres tests under //go:build integration: - TestIntegration_PollUpload_AtomicRollback_AcrossBothTables exercises the helpers in the same Tx shape uploadPollMode does (PutBatchTx + LogActivityTx + Rollback) and asserts COUNT(*)=0 on BOTH pending_uploads AND activity_logs after the rollback. Failure injection: NUL byte in `summary` triggers lib/pq protocol rejection on the second activity insert — same trick the existing PutBatch AtomicRollback test uses. - TestIntegration_PollUpload_HappyPath_AcrossBothTables is the positive counterpart — Commit lands N rows in both tables. Coverage rationale (post-PR-3010 review): - sqlmock unit test (TestPollUpload_AtomicRollbackOnActivityInsertFailure) proved the handler calls Begin/Exec/Exec-fail/Rollback in order. - Existing PutBatch integration test proved Postgres honors rollback for pending_uploads alone. - New tests close the cross-table gap: prove LogActivityTx + PutBatchTx + real Postgres MVCC compose correctly under rollback. A regression that made LogActivityTx silently route through db.DB instead of the passed tx would still pass the sqlmock test (the Begin/Commit/Rollback shape would look right) but would fail this integration test (the activity_logs row would survive the rollback). Verified locally: postgres:15-alpine + all migrations applied, both tests pass in 0.1s. Skips cleanly without INTEGRATION_DB_URL — CI already runs this file via the Handlers Postgres Integration job. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 21:57:56 -07:00
Hongming Wang	bb9bf85dbd	Merge pull request #3011 from Molecule-AI/rfc-2872-workspaces-uniq-toctou fix(workspace-server): close TOCTOU race on workspaces(parent_id, name) (#2872 Critical 1)	2026-05-06 04:51:01 +00:00
Hongming Wang	ff21bbb876	Merge staging into rfc-2872-workspaces-uniq-toctou to clear BEHIND	2026-05-05 21:46:33 -07:00
Hongming Wang	da3cb4c098	fix(workspace-server): close TOCTOU race on workspaces(parent_id, name) (#2872 Critical 1) ## Bug `/org/import` had no per-tenant mutex, advisory lock, or DB-level uniqueness on (parent_id, name). The pattern was lookup-then-insert: existingID, existing, err := h.lookupExistingChild(...) // SELECT if existing { return /* skip */ } db.DB.ExecContext(ctx, `INSERT INTO workspaces ...`) // INSERT Two concurrent admin POSTs (rapid double-click in canvas, retry-after-timeout, two operators on the same template) both saw "not found" in the SELECT and both INSERT'd the same (parent_id, name). Captured impact: tenant-hongming accumulated 72 stale child workspaces in 4 days from repeated org-template spawns of the same template (see #2857 phase 4 sweeper for the cleanup; #2872 for the prevention RFC). ## Fix Two-layer fix — DB-level backstop AND application-level happy path: 1. Migration `20260506000000_workspaces_unique_parent_name.up.sql` ```sql CREATE UNIQUE INDEX CONCURRENTLY IF NOT EXISTS workspaces_parent_name_uniq ON workspaces ( COALESCE(parent_id, '00000000-0000-0000-0000-000000000000'::uuid), name ) WHERE status != 'removed'; ``` * COALESCE(parent_id, sentinel) collapses NULLs so root workspaces also collide pairwise. * `WHERE status != 'removed'` lets a tombstoned row be replaced by a same-named re-import (preserves existing org-import semantics). * CONCURRENTLY avoids the write-blocking SHARE lock a plain CREATE INDEX takes on production tenants under live traffic; IF NOT EXISTS makes the migration resumable. * Down migration drops CONCURRENTLY symmetrically. 2. `org_import.go` swap Replace lookup-then-insert with `INSERT ... ON CONFLICT DO NOTHING RETURNING id`. On the skip path (RETURNING returns 0 rows → sql.ErrNoRows), re-select the existing id to recurse children: INSERT INTO workspaces (...) VALUES (...) 
ON CONFLICT (COALESCE(parent_id, ...), name) WHERE status != 'removed' DO NOTHING RETURNING id; The ON CONFLICT target predicate matches the partial-index predicate exactly — required for Postgres to consider the index applicable. Existing `lookupExistingChild` helper kept (still used on the skip path); semantics unchanged. ## Test coverage * AST gate refreshed to assert the workspaces INSERT contains the ON CONFLICT pattern (`onConflictDoNothingRE`) instead of the now-obsolete "lookup-before-insert" ordering. Per behavior-based gating (memory: feedback_behavior_based_ast_gates.md), the new gate pins the actual TOCTOU-resolution behavior. * Companion `TestGate_FailsWhenInsertOmitsOnConflict` proves the gate catches the bug shape on synthetic source. * All existing `lookupExistingChild` unit tests (no-rows, found, nil-parent, DB error, wrapped no-rows) still pass — helper is unchanged and still load-bearing on the skip path. * Live Postgres E2E coverage runs via the existing "Handlers Postgres Integration" CI job, which applies migrations to a real PG and exercises the INSERT path. ## Why ship the migration + swap together (not stacked) The migration alone provides a DB-level backstop, but without the handler swap a UNIQUE-violation surfaces as a 500 to the user. The handler swap alone has no enforceable target until the migration applies. Shipped together they give graceful skip + atomic backstop. Migration is CONCURRENTLY + IF NOT EXISTS, safe to apply even on tenants where the sweeper (#2860) hasn't run yet — the index just declines to build until conflicting rows are reconciled. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 21:43:49 -07:00
Hongming Wang	ef9bd1e0e2	Merge pull request #3010 from Molecule-AI/fix/activity-row-tx-atomicity-149 fix(chat-uploads): activity rows commit atomically with PutBatch (#149)	2026-05-06 04:37:55 +00:00
Hongming Wang	b759548822	fix(chat-uploads): activity rows commit atomically with PutBatch Closes #149. uploadPollMode for poll-mode chat uploads previously committed N pending_uploads rows in one Tx (PutBatch), then wrote N activity_logs rows individually outside any Tx. A per-row failure on activity row K left rows 1..K-1 committed and pending_uploads orphaned until the 24h TTL — not data-loss because the platform's fetcher handled the half-state cleanly, but the user never saw file K in the canvas and the inconsistency surfaced as an "uploaded but invisible" complaint class. Thread one Tx through PutBatchTx + N × LogActivityTx + Commit so all or none commit. Broadcasts are deferred until after Commit — emitting an ACTIVITY_LOGGED event for a row that ends up rolled back would paint a ghost message into the canvas's optimistic UI. A new LogActivityTx returns a commitHook the caller invokes post-Commit; the existing fire-and-forget LogActivity is unchanged for the 4 other production callers (a2a_proxy_helpers + activity.go report path). Storage interface gains PutBatchTx; PostgresStorage.PutBatch is refactored to share the validation + insert path. inMemStorage and fakeSweepStorage delegate or no-op for PutBatchTx (the in-mem fake can't model Tx state — DB-level atomicity is verified by the existing real-Postgres integration test for PutBatch + the new unit test asserting the Go handler calls Rollback on activity-insert failure). Tests: - TestPollUpload_AtomicRollbackOnActivityInsertFailure pins the new contract via sqlmock — second activity insert errors → Rollback expected, Commit must NOT be called. - TestLogActivityTx_DefersBroadcastUntilCommitHook + _InsertError_NoHook_NoBroadcast + _NilTx_Errors cover the new API. - TestPutBatchTx_HappyPath / _EmptyItems / _ValidationFails / _PerRowErrorPropagates cover Tx-aware storage layer. 
- 7 existing TestPollUpload_* tests updated to mock Begin + Commit (or Begin + Rollback for failure paths) since the handler now opens a Tx around PutBatch + activity inserts. All workspace-server tests pass; integration tag also clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 21:34:28 -07:00
Hongming Wang	cce2050b6a	Merge pull request #2997 from Molecule-AI/rfc-2991-pr-1-image-preview-lightbox feat(canvas/chat): inline image preview + fullscreen lightbox (RFC #2991 PR-1)	2026-05-06 04:28:03 +00:00
Hongming Wang	e87df906bd	Merge staging into rfc-2991-pr-1 to clear BEHIND (post PR-2993 + PR-3005)	2026-05-05 21:24:20 -07:00
Hongming Wang	c60e2b5fa2	chore(canvas/chat): drop unused downloadChatFile import in AttachmentImage github-code-quality bot flagged this as the last unresolved review thread blocking the merge queue. The function is referenced in comments but never called from this file (download is dispatched via the lightbox / AttachmentChip path). Removing the import resolves the bot thread and clears the staging branch-protection 'all conversations resolved' gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 21:23:46 -07:00
Hongming Wang	143fbb91ff	Merge pull request #3005 from Molecule-AI/ux/files-tab-drag-drop-upload ux(canvas/files): drag-drop upload to target folder (#2999 PR-D)	2026-05-06 03:52:10 +00:00
Hongming Wang	1b29b24e83	Merge staging into rfc-2991-pr-1 to clear BEHIND state	2026-05-05 20:50:55 -07:00
Hongming Wang	6033179f48	Merge pull request #3006 from Molecule-AI/rfc-2991-pr-3-pdf-text-preview feat(canvas/chat): inline PDF + text/code preview (RFC #2991 PR-3)	2026-05-05 20:49:53 -07:00
Hongming Wang	ab1acff2d2	ux(canvas/files): drag-drop upload to target folder (#2999 PR-D) User asked for VSCode-style drag-drop upload (#2999): "drag local to upload to target folder just like vscode does". Today the only upload path is the toolbar's Upload button (folder picker). Drag-drop lets users grab files from Finder/Explorer and drop them directly on a specific subdirectory in the tree. 1. New `uploadDataTransferItems(items, targetDir)` in `useFilesApi` — walks the HTML5 DataTransferItemList via `webkitGetAsEntry()`, recursing folders to a flat (relativePath, file) list, then PUTs each via the existing /files/<path> endpoint. The walker (also exported via `__testables`) calls `readEntries()` in a loop until empty so multi-batch folders (browsers cap each call at ~100 entries) aren't silently truncated. 2. `uploadFiles` (folder-picker path) gained an optional `targetDir` parameter. Same prefixing semantics so future surfaces (e.g. an "upload here" toolbar button on a row) can reuse it. 3. `FileTree` directory rows gained `onDragOver` / `onDragEnter` / `onDragLeave` / `onDrop` handlers + a hover-target highlight (accent-tinted background + outline). dragLeave uses `currentTarget.contains(relatedTarget)` to suppress the flicker that fires when the cursor crosses any child of the row (icon, label, ✕ button) — without this the highlight strobes on every sub-element transition. 4. `FilesTab` wraps the tree column in an outer drop zone for "drop on root" — drops outside any specific subdir row land at root. The empty-state placeholder copy now includes a "drag files here to upload" hint when the active root is /configs (the only writable root today). 5. Both the row drop and the root drop are gated on `root === "/configs"` (the same gate that already blocks the toolbar's New / Upload / Clear). Other roots ignore the drag entirely (no highlight, no drop), so the user doesn't get a misleading drag affordance followed by a "switch root" toast. 
`dragDropUpload.test.tsx` (9 tests, two layers): Walker tests (pure function, no DOM): - `walkEntry` collects a single dropped file with correct relpath. - `walkEntry` walks a folder + preserves folder name in the path. - Multi-batch loop: a fake reader that emits two batches of 2 + an empty terminator must yield 4 files. A walker that called readEntries once would see only 2 — this is the load-bearing assertion against silent folder truncation. - Nested directories: outer/inner/file.md → "outer/inner/file.md". FileTree drag-drop wiring (DOM): - `dragover` on a directory row preventDefault's (load-bearing — without it the drop event never fires). - `drop` on a directory row fires `onDropToTarget(path, items)`. - `drop` on a FILE row does NOT fire (only directories are valid drop targets). - `drop` with no DataTransferItems does NOT fire (defensive guard against text-only drags). - `dragenter` adds the highlight class to the directory row. 1. The 1MB per-file size cap is inherited from the existing `uploadFiles`. A user dropping a 5MB skill bundle silently skips the file (the loop's `continue` on `file.size > 1_000_000`). Same behavior as the toolbar Upload, so consistent if not great. Surfacing skipped-files would be a UX improvement tracked separately — not load-bearing for this PR. 2. Drop-zone highlight on the column wrapper uses an outline that sits inside the column's overflow-y-auto scroll container. If the user drags onto a row that's mid-scroll, the highlight may clip slightly at the scroll boundary. Cosmetic only; the drop still works. 3. The `?root=` query is NOT passed on the underlying writeFile call (matches the existing uploadFiles behavior). On a backend without #2999 PR-A, this means uploads always land in /configs regardless of selected root — but we already gated drop on `root === "/configs"` so the practical effect is nil today. Once PR-A merges and the canvas threads ?root= through writes (separate follow-up), drops on /home etc. 
would be enableable by lifting the canDelete-style gate. - `npx tsc --noEmit` clean - 177/177 canvas tab tests pass - Manual on local dev: drag a file from Finder onto /configs/skills row → file appears under /configs/skills/<name>. Drag a folder of 3 files onto root area → 3 files uploaded with folder structure preserved. Drag onto /home tree → no highlight, no drop. Refs #2999. Pairs with PR-A (backend EIC) — without PR-A the tree is empty on SaaS and there's nothing to drop ONTO; PR-D still works on self-hosted today. 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-05 20:47:47 -07:00
Hongming Wang	19df43e3da	Merge pull request #2993 from Molecule-AI/rfc-2945-pr-b-1-migrate-bare-event-strings refactor(events): migrate 18 producers to typed EventType constants (RFC #2945 PR-B-1)	2026-05-06 03:45:47 +00:00
Hongming Wang	dcece2762b	feat(canvas/chat): inline PDF + text/code preview (RFC #2991 PR-3) Adds two new arms to the AttachmentPreview kind dispatcher: * PDF — chip in the bubble, click opens the shared AttachmentLightbox with a browser-native <embed type="application/pdf"> at 95vw/90vh. Fetch+Blob+ObjectURL auth path matches AttachmentImage / Video. PDF.js not pulled in; browser viewer is good enough for the desktop chat MVP (Slack/Linear/Notion all gate full-page PDF behind a click for the same reason). Falls back to AttachmentChip on fetch error. * Text/code/JSON/YAML — first 10 lines in monospace <pre><code> right in the bubble, "Show all N lines" expands to full content, with a filename + ⬇ download header. Streams up to 256 KB then marks truncated and offers a download chip; large logs don't crash the bubble. No syntax highlighting in v1 — shiki adds 200-500 KB and is pure polish. Coverage: 5 new dispatch tests (PDF success → embed in lightbox, PDF fetch fail → chip fallback, text inline render, text long content → Show-all-N-lines expand button, text fetch fail → chip fallback). All 19 AttachmentPreview tests pass; tsc --noEmit clean. Stacked on rfc-2991-pr-1-image-preview-lightbox (PR-2 already merged into PR-1's branch). PR-1 ships first; this rebases onto staging once it lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 20:43:46 -07:00
Hongming Wang	57bfa40990	Merge pull request #3004 from Molecule-AI/ux/files-tab-context-menu ux(canvas/files): right-click context menu — Open / Download / Delete (#2999 PR-C)	2026-05-06 03:37:16 +00:00
Hongming Wang	d88fbb90fb	ux(canvas/files): right-click context menu — Open / Download / Delete (#2999 PR-C) ## Why User asked for a VSCode-style right-click menu on file rows (#2999): "right click to have a menu to download". Today the only download affordance is the toolbar's Export-all (bulk JSON dump), and the inline ✕ button is the only delete UX (small click target, easy to miss). ## Fix 1. New `FileTreeContextMenu` component — fixed-position popover with Open / Download / Delete items composed per-row (files get all three; directories get Delete only since "open a directory in the editor" doesn't apply). Esc + outside-click + Tab + scroll dismiss. ↓/↑ arrow keys rove focus between menu items. role=menu + role=menuitem + autofocus on first item for a11y. 2. Menu state lifted to the top-level `FileTree` (not per-row) so opening a second row's menu auto-closes the first — only one menu open at a time, matching VSCode/Theia. Pinned by the `replaces the first` test. 3. New `downloadFileByPath(path)` in `useFilesApi` — fetches via the existing GET /workspaces/<id>/files/<path>?root= endpoint and triggers a browser download. Distinct from the existing `handleDownloadFile` which downloads the in-editor buffer (round-trips unsaved edits to disk); the context-menu download targets arbitrary tree rows the user hasn't opened. 4. `canDelete` prop threaded from FilesTab → FileTree → menu → item. Same gate as the toolbar (Clear/New/Upload all gated to /configs); context menu's Delete renders as disabled with a muted background on other roots, matching the "feature exists but isn't applicable here" pattern. ## Test coverage `FileTreeContextMenu.test.tsx` (8 tests): - File row → menu opens with Open + Download + Delete. - Directory row → menu opens with Delete only. - Click Download → onDownload(path) fires + menu closes. - Click Delete (canDelete=true) → onDelete(path) fires. - Click Delete (canDelete=false) → onDelete NOT called + menu stays open (disabled-state UX). 
- Esc dismisses. - Outside-click (mousedown on document.body) dismisses. - Opening second context menu replaces the first (only-one-open invariant). Each test uses fireEvent + screen.getByRole, so they fail on a deleted-code regression — none would pass on the pre-PR shape. ## Three weakest spots (hostile self-review) 1. The menu is positioned at `clientX/clientY` without viewport clamping. If the user right-clicks at the very bottom-right of the panel, part of the menu may overflow off-screen. VSCode handles this by flipping the anchor; we don't yet. Acceptable v1 because the FilesTab is fixed-width (≤ side-panel width) and the menu is small (140×~80px); the overflow would be a few pixels of one item. Filed as a follow-up. 2. Auto-focus on the first item shifts keyboard focus away from the row that opened the menu. Closing with Esc returns focus to the body, not the row. Same behavior as TerminalTab's placeholder + the canvas's other context menus; consistent isn't ideal but at least uniform. Documented inline. 3. The download request reuses the API client's 15s default timeout — large config files (multi-MB skill bundles) on a slow connection could time out. Same risk applies to the existing toolbar Export. If we see real download failures we can add a `timeoutMs` override at the call site without touching the menu. ## Verification - `npx tsc --noEmit` clean - 176/176 canvas tab tests pass - Manual on local dev: right-click a config.yaml row → menu opens → click Download → file lands in Downloads. Right-click on /home root → Delete renders disabled. Refs #2999. Pairs with PR-A (backend EIC) — without PR-A the tree is empty and there's nothing to right-click on a SaaS workspace. 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-05 20:26:04 -07:00
Hongming Wang	2e6bed71b9	Merge pull request #3003 from Molecule-AI/ux/files-tab-external-not-available ux(canvas/files): "Files not available" banner for external runtimes (#2999 PR-B)	2026-05-06 03:24:45 +00:00
Hongming Wang	030377bb84	Merge pull request #3002 from Molecule-AI/fix/files-eic-list-delete-symmetry fix(workspace files API): EIC parity for ListFiles + DeleteFile (#2999 PR-A)	2026-05-06 03:22:45 +00:00
Hongming Wang	f93957e982	ux(canvas/files): "Files not available" banner for external runtimes (#2999 PR-B) ## Why Reported by user (issue #2999): external workspaces (mac laptop, mac mini, hermes-on-home-server — runtime="external") render the FilesTab identically to the SaaS empty-listing bug, showing "0 files / No config files yet" even though the platform doesn't actually own the filesystem of these workspaces. Visually indistinguishable from the broken state, reads as a bug. ## Fix Mirror the affordance TerminalTab adopted in PR #2830 for runtimes without a TTY: 1. New `NotAvailablePanel` in `canvas/src/components/tabs/FilesTab/` — folder-with-slash icon + "Files not available" headline + body text that names the runtime and points the user at Chat. 2. `FilesTab` now takes optional `data?: WorkspaceNodeData`. When `data.runtime` is in `RUNTIMES_WITHOUT_FILES` (currently just "external"), early-return the placeholder before mounting the useFilesApi hook. Mirrors TerminalTab's prop shape exactly so the review pattern is uniform across tabs. 3. SidePanel passes `node.data` to FilesTab (matches existing pattern for ChatTab / TerminalTab). ## Test coverage `FilesTab.notAvailable.test.tsx` (4 tests): - external runtime → banner renders with runtime name + Chat-tab guidance copy. - external runtime → NO `/files` API request fires (asserted by inspecting the mocked api.get call log). - claude-code runtime → no banner, normal mount proceeds (toolbar's root selector is the discriminator). - data prop omitted → falls through to normal mount (back-compat with any caller that doesn't thread data through, e.g. legacy tests). Each branch is independent and discriminating — none would pass on a code-deleted version of the early-return. ## Three weakest spots (hostile self-review) 1. `RUNTIMES_WITHOUT_FILES` is a hardcoded set in this file. If a future runtime joins (e.g. a "byok-claude" that runs on user hardware), someone has to remember to add it here. 
Reviewed alternatives: pull from a runtime-capabilities registry — same shape as `RUNTIMES_WITHOUT_TERMINAL` already in TerminalTab. We chose the parallel pattern over a new abstraction; consolidating into a shared registry can land if/when a third tab grows the same gate (rule of three). Documented inline. 2. The placeholder is a static panel — no retry, no "report bug" link. Same as TerminalTab's. Acceptable because the absence is intentional, not transient. 3. Chat-tab guidance is hardcoded English. No i18n in canvas yet; matches the rest of the codebase. Will move with the i18n migration when that lands. ## Verification - `npx tsc --noEmit` clean - 54/54 canvas tab + SidePanel tests pass - Will be live-verified on staging post-merge: open Files tab on an external workspace (mac laptop) → expect placeholder; open on a platform-owned workspace (Hongming Personal Brand Agent) → expect normal tree (assuming PR-A also lands). Refs #2999. Pairs with PR-A (backend EIC fix) — without PR-A the platform-owned path still shows "0 files" because the backend never returns rows. 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-05 20:21:45 -07:00
Hongming Wang	b530c147de	Merge pull request #3000 from Molecule-AI/rfc-2991-pr-2-video-audio-preview feat(canvas/chat): inline video + audio HTML5 players (RFC #2991 PR-2)	2026-05-05 20:18:36 -07:00
Hongming Wang	f39b595a9c	fix(workspace files API): EIC parity for ListFiles + DeleteFile (closes #2999 PR-A) ## User-visible bug Canvas Files tab returns "0 files / No config files yet" for every SaaS workspace, every root (/configs, /home, /workspace, /plugins). Reported by user (canvas screenshot, hongming.moleculesai.app, Hongming Personal Brand Agent — claude-code, T4, online). ## Root cause `ListFiles` (templates.go) was missing the SSH-via-EIC branch that ReadFile (PR #2785) and WriteFile (PR #1702) already have. On SaaS, dockerCli is nil → findContainer returns "" → falls through to host-side resolveTemplateDir which only matches baked-in template names. For a user-named workspace it matches nothing, so the handler silently returns []fileEntry{}. DeleteFile had the same gap — right-click delete (introduced in PR-C of this issue) would silently no-op once #1 was fixed. ## Fix 1. Extracted shared EIC plumbing into `withEICTunnel` (closure-based, single SSOT for keypair → key push → tunnel → port-wait → cleanup). Refactored writeFileViaEIC + readFileViaEIC to use it. Added listFilesViaEIC + deleteFileViaEIC on the same scaffold. The `LogLevel=ERROR` shim from PR #2822 now lives in one `eicSSHSession.sshArgs()` helper instead of being duplicated per helper — the next time we need to tweak ssh options, one place. 2. Factored remote shell strings into pure functions (buildInstallShell / buildCatShell / buildRmShell / buildFindShell + parseFindOutput) so the wire shape can be pinned without booting a real EIC tunnel. 3. Refactored `resolveWorkspaceFilePath(runtime, root, relPath)` to honor `?root=`. New rule: `/configs` (or empty / unrecognized) → runtime managed-config dir via workspaceFilePathPrefix (preserves the v1 ReadFile/WriteFile behaviour where canvas's Config tab GETs/PUTs config.yaml without specifying a root and lands in the right per-runtime dir); `/home`, `/workspace`, `/plugins` → literal absolute path on the EC2 host. 
List/Read/Write/Delete now agree on what file a tree row points to — pre-fix List would say "/home contents" but Read/Write would route to /configs. 4. ListFiles + DeleteFile dispatch on instance_id != "" → EIC helper. Errors from the EIC path produce 500 (not silent fall-through to local-Docker, which would mask the failure as "0 files" — the exact user-visible symptom). 5. Added ?root= validation gate to WriteFile + DeleteFile so an out-of-allowlist root is rejected before the resolver runs. ## Test coverage - TestResolveWorkspaceFilePath_RuntimeIndirection — pins the /configs → runtime prefix translation per-runtime (hermes, claude-code, langgraph, external, unknown). Catches the regression where a future edit accidentally drops the runtime indirection. - TestResolveWorkspaceFilePath_LiteralRoots — pins /home, /workspace, /plugins as literal pass-through regardless of runtime. Catches the symmetric regression where the literal roots start getting rewritten to the runtime prefix (which would mean the FilesTab "/home" selector silently routes to /configs on hermes). - TestResolveWorkspaceRootPath — directory-only translation used by listFilesViaEIC, same indirection rules. - TestSSHArgs_HardenedFlags — pins the centralised ssh option set (LogLevel=ERROR + hardening). Catches drift in the one-place-where-ssh-flags-live. - TestEicSSHSessionSingleSourceForSSHFlags — behaviour-based AST gate (per memory). Counts s.sshArgs() callers (must be ≥4 — list/read/write/delete) and asserts LogLevel=ERROR appears exactly once in the source. Fires if anyone copy-pastes a raw ssh args slice instead of going through the helper. - TestBuildInstallShell / TestBuildCatShell / TestBuildRmShell / TestBuildFindShell — pure-function tests pinning the remote command shape. Catches regression like "rm -f silently becomes rm -rf" or "find loses node_modules pruning" without needing a real EC2. 
- TestBuildFindShell_DepthForwarding — catches a regression where the helper hard-codes a depth instead of using the caller's value. - TestParseFindOutput / TestParseFindOutput_EmptyInput — pin the TYPE|SIZE|REL parser. Empty-input case explicitly returns [] not nil so the JSON wire shape stays a list. - TestListFiles_EICDispatch_Success / Error — sqlmock-driven handler test. Verifies instance_id != "" routes to listFilesViaEIC and surfaces errors as 500 (does NOT silently fall through to local-Docker, which is the exact regression-mode of the original bug). - TestListFiles_EICBranch_NotTakenForSelfHosted — back-compat guard: instance_id == "" must NOT enter the EIC branch (would break self-hosted operators). - TestDeleteFile_EICDispatch_Success / Error — same shape for DeleteFile. - TestListFiles_RootValidation / TestDeleteFile_RootValidation — ?root=/etc must 400 before any DB query or EIC call. ## Verification - `go build ./...` clean - `go test ./...` clean (full workspace-server suite) - Will be live-verified against staging on hongming.moleculesai.app after merge: open Files tab → expect populated /home + /configs + /workspace listings (not "0 files"); right-click delete on /configs/old.yaml → expect file removed on the EC2 host. ## Three weakest spots (hostile self-review) 1. The LogLevel=ERROR drift gate counts source occurrences. A future refactor that intentionally moves the literal somewhere else (e.g. into a constant) would trigger a false positive. The gate's failure message points to the load-bearing constraint (must appear in sshArgs); operator can adjust. 2. `eicFileWriteTimeout` constant kept as an alias for back-compat with prior tests. Documented as intentional + safe to remove on the next pass. 3. The resolver tests pin the runtime → prefix map values (`/home/ubuntu/.hermes`, `/configs`, etc.). A future runtime addition that ships a new prefix needs the test updated. 
This is intentional — silent prefix changes orphan saved files, so a test failure on map edit IS the right signal. ## Follow-up (RFC #2312 subtask 2) Long-term the right fix is to drop EIC entirely and HTTP-forward to the workspace's own URL (RFC #2312). That's a substantially larger refactor across 5 surfaces (chat upload, files, templates, plugins, terminal) and out of scope for this bug-fix PR. Tracked separately under that RFC. Refs #2999. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 20:18:05 -07:00
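The root handling described in this commit can be sketched as follows. This is a minimal Python illustration, not the Go resolver itself: the function name, the shape of the runtime map, and the `/configs` default for unknown runtimes are assumptions. Only the four allowed roots and the hermes prefix `/home/ubuntu/.hermes` are quoted in the message above.

```python
from posixpath import join

# Roots accepted in ?root=, as listed in the commit message.
ALLOWED_ROOTS = ("/configs", "/home", "/workspace", "/plugins")

# Hypothetical runtime -> prefix map; only the hermes value appears in the commit.
RUNTIME_CONFIG_PREFIX = {"hermes": "/home/ubuntu/.hermes"}
DEFAULT_CONFIG_PREFIX = "/configs"  # assumed fallback for other runtimes


def resolve_workspace_file_path(runtime: str, root: str, rel: str) -> str:
    """Map (runtime, root, relative path) to an absolute path on the host."""
    # Validation gate: reject out-of-allowlist roots before any translation.
    if root not in ALLOWED_ROOTS:
        raise ValueError(f"root not allowed: {root}")
    base = root
    # Only /configs goes through the runtime indirection;
    # /home, /workspace and /plugins pass through literally.
    if root == "/configs":
        base = RUNTIME_CONFIG_PREFIX.get(runtime, DEFAULT_CONFIG_PREFIX)
    return join(base, rel.lstrip("/"))
```

With this shape, List, Read, Write and Delete would all call the same function, so a tree row and the file it opens cannot disagree about the root.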
Hongming Wang	95fdf86187	feat(canvas/chat): inline video + audio HTML5 native players (RFC #2991 PR-2) Second specialized renderer pair landing under RFC #2991. Stacks on PR-1 (#2997) — extends the AttachmentPreview dispatcher with video/ audio cases. Why HTML5-native (not custom JS player) --------------------------------------- - Browser vendors ship hardware-accelerated decoders, captions, pinch + scrub UX, and fullscreen UI. We get all of it for free. - Native fullscreen via the <video> control bar — no AttachmentLightbox needed for video (the browser's built-in fullscreen handles it). - Mobile-friendly without us writing the touch handlers. Auth model ---------- Identical to AttachmentImage (PR-1): platform-auth URIs need our cookie/token, so we fetch the bytes, wrap in a Blob, hand the browser an ObjectURL via <video src=> / <audio src=>. External http(s) URIs skip the fetch. Memory caveat: a Blob holds the entire media in JS memory until the bubble unmounts. The server's 25MB single-file cap (chat_files.go) bounds this; v2 can switch to MediaSource + streaming if larger files become a real shape. Failure modes ------------- - Fetch failure (404, 403, network) → AttachmentChip fallback. - Bytes that aren't valid media (corrupt, wrong Content-Type) → <video onError> / <audio onError> swap to chip. Tests ----- 5 new component tests in AttachmentPreview.test.tsx (now 14 total): - kind=video → <video controls> with blob URL src - kind=video fetch fails → falls back to chip - kind=video extension fallback (no mime) → routes to video path - kind=audio → <audio controls> + filename label visible - kind=audio fetch fails → falls back to chip The preview-kind unit tests from PR-1 (49 cases) already cover the MIME → video / audio dispatch logic; this PR's component tests pin the rendered DOM shape (controls attribute, blob URL src, fallback behavior). Hostile self-review ------------------- 1. 
Memory bound: 25MB cap protects us today; documented future migration path (MediaSource). 2. iOS Safari autoplay: playsInline pinned on <video> so mobile doesn't auto-fullscreen on play. 3. Captions accessibility: <track kind="captions" /> placeholder so the element is tagged correctly even though we don't have caption files yet (forward-compatible). Verified - tsc --noEmit clean - 173 chat tests green (49 unit + 14 component + 110 pre-existing) Stacks on PR-1 (#2997). PR-3 (PDF + text/code) is the final piece. Refs RFC #2991, PR #2997 (PR-1). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 20:10:19 -07:00
Hongming Wang	04f7a07add	feat(canvas/chat): inline image preview + fullscreen lightbox (RFC #2991 PR-1) First specialized renderer landing under RFC #2991 — chat attachment preview. Adds the dispatch infrastructure that PR-2 (video/audio) and PR-3 (PDF/text) will extend. Architecture (RFC #2991 Phase 2 design) --------------------------------------- - preview-kind.ts: pure helper that maps mimeType (+ extension fallback for missing/generic MIME) to one of: image \| video \| audio \| pdf \| text \| file. Single source of truth; the dispatch axis for every attachment renderer. - AttachmentPreview.tsx: SSOT dispatch component. ChatTab no longer imports kind-specific components — it imports AttachmentPreview, which switches on the kind and renders the right child. - AttachmentImage.tsx: inline thumbnail (max 240×180) + click → lightbox. Auth-aware: for platform URIs (workspace: / platform-pending: / etc) the bytes are fetched via JS-injected headers, wrapped in a Blob, served as ObjectURL — bare <img src> would not include the cookie/token. - AttachmentLightbox.tsx: shared fullscreen modal (image now; PDF will use it in PR-3). Esc / backdrop click / X button to close, focus trap on close button, focus restoration on close. - AttachmentChip retained as the kind=file fallback. No breaking change for existing renderable shapes. External-workspace coverage --------------------------- The wire shape (ChatAttachment.mimeType + uri) is identical for internal + external workspaces — both go through AgentMessageWriter (PR #2949). External claude-code agents that attach images via send_message_to_user automatically get the new preview surface; no runtime-side change needed. Failure modes ------------- - Fetch failure (404, 403, network) → AttachmentChip fallback so the user still gets a working download. Pinned by tests. - Decoded as non-image (corrupt bytes, wrong Content-Type) → onError on the <img> swaps to AttachmentChip. Pinned by tests. 
- Non-platform URIs (http/https external image hosts) → skip the auth-fetch flow, use the raw URL via resolveAttachmentHref. Pinned by extension-fallback tests. Tests ----- preview-kind.test.ts (49 cases): - Strict MIME match across image/video/audio/pdf/text/unknown - Extension fallback when MIME is missing or application/octet-stream - URL with query string + fragment → strip before parsing - MIME wins over extension (regression: don't render image-named zip) - SVG is image (not text) despite being XML - Non-canonical MIME like application/javascript → text AttachmentPreview.test.tsx (9 component tests): - Dispatch: kind=file → chip, kind=image → image path - Loading state shows placeholder, NOT chip (proves dispatch routed) - Extension fallback (no mimeType) routes to image path - Fetch fail (404) and network error → fall back to chip - Image success: <img> renders ObjectURL, click opens lightbox - Lightbox: Esc closes, backdrop click closes, content click doesn't - Universal fallback: unknown MIME → chip even when extension hints at a renderable kind Hostile self-review (3 weakest spots, addressed) ------------------------------------------------ 1. <img> auth: bare <img src="/chat/download?..."> would NOT include our auth headers. Resolved via fetch+Blob+ObjectURL pattern. Pinned by the image-success test (asserts src === "blob:test-url"). 2. Server-side allowed-roots mismatch: pre-fix tests used /tmp/ paths which the server doesn't allow. Caught when the dispatch test fell into the non-platform path. Updated tests to use /workspace/ subpaths matching templates.go's allowedRoots. 3. Bundle size creep: each kind component adds bytes. Lightbox is currently always-bundled. Lazy-loading is plausible but defer until measured-needed. Verified - tsc --noEmit clean - 168 chat tests green (49 unit + 9 component + 110 pre-existing) PR-2 (video + audio) and PR-3 (PDF + text) extend the dispatch in AttachmentPreview.tsx with their own kind-specific components. 
Refs RFC #2991. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 19:39:37 -07:00
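The dispatch rules listed for `preview-kind.ts` (MIME first, extension fallback only for missing or generic MIME, query string and fragment stripped) can be sketched in Python. The real helper is TypeScript; the function names and the extension table below are illustrative assumptions, not the shipped lists.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical extension table; the real list in preview-kind.ts may differ.
EXT_KIND = {
    "png": "image", "jpg": "image", "jpeg": "image", "gif": "image", "svg": "image",
    "mp4": "video", "webm": "video",
    "mp3": "audio", "wav": "audio",
    "pdf": "pdf",
    "txt": "text", "md": "text", "json": "text",
}
TEXT_LIKE_MIMES = {"application/json", "application/javascript"}
GENERIC_MIMES = {"", "application/octet-stream"}


def kind_from_mime(mime: str) -> str:
    if mime.startswith("image/"):
        return "image"  # includes image/svg+xml: SVG is an image, not text
    if mime.startswith("video/"):
        return "video"
    if mime.startswith("audio/"):
        return "audio"
    if mime == "application/pdf":
        return "pdf"
    if mime.startswith("text/") or mime in TEXT_LIKE_MIMES:
        return "text"
    return "file"  # universal fallback, rendered as the chip


def preview_kind(mime_type: Optional[str], uri: str) -> str:
    mime = (mime_type or "").split(";")[0].strip().lower()
    if mime not in GENERIC_MIMES:
        # A specific MIME wins over the extension (image-named zip stays a file).
        return kind_from_mime(mime)
    # Extension fallback: urlsplit drops the query string and fragment.
    name = urlsplit(uri).path.rsplit("/", 1)[-1]
    if "." not in name:
        return "file"
    return EXT_KIND.get(name.rsplit(".", 1)[-1].lower(), "file")
```

Keeping the helper pure makes the 49-case table test cheap: each case is a `(mime, uri)` pair and an expected kind.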
hongmingwang-moleculeai	3dfeb180ab	Merge pull request #2995 from Molecule-AI/fix/sweep-add-orphan-tunnel-cleanup-2987 chore(sweep): add orphan-tunnel cleanup step (#2987 / #340)	2026-05-06 02:38:39 +00:00
Hongming Wang	88ff0d770b	chore(sweep): add orphan-tunnel cleanup step (#2987 / #340) The 15-min sweeper has been deleting stale e2e orgs but not the orphan tunnels left behind when the org-delete cascade half-fails (CP transient 5xx after the org row is gone but before the CF tunnel delete completes). Result: tunnels accumulate in CF until manual operator cleanup. Add a final step that POSTs `/cp/admin/orphan-tunnels/cleanup` every tick. Best-effort: failure doesn't fail the workflow; the next tick re-attempts. Output reports deleted_count + failed count for ops visibility. This is the catch-all for the orphan-tunnel class. The proper upstream fix (transactional org delete) lives in CP and is tracked as issue #2989. Until that lands, the sweeper's bounded time-to-cleanup keeps the leak from escalating. Note: PR #492 (cf-tunnel silent-success fix) makes this step actually effective. Pre-fix, DeleteTunnel silent-succeeded on 1022, so the cleanup endpoint reported success without deleting. Post-fix, the cleanup chains CleanupTunnelConnections + retry on 1022, which actually clears stuck-connector orphans. 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-05 19:36:20 -07:00
Hongming Wang	86b8d8d744	Merge pull request #2982 from Molecule-AI/fix-config-skip-yaml-for-external-runtime fix(canvas/config): skip config.yaml fetch for external/hermes runtimes	2026-05-06 02:22:14 +00:00
Hongming Wang	9b9419ad5e	Merge pull request #2992 from Molecule-AI/chore/ssot-pointer-sweep-workflow chore(sweep): note SSOT for ephemeral prefixes lives in CP	2026-05-06 02:20:35 +00:00
Hongming Wang	a19ee90556	chore(sweep): note SSOT for ephemeral prefixes lives in CP Mirrors molecule-controlplane#494: the canonical EPHEMERAL_PREFIXES list now lives in molecule-controlplane/internal/slugs/ephemeral.go, where redeploy-fleet reads it to skip in-flight test tenants. The sweep workflow keeps a Python copy because GHA Python can't import Go, but a comment now points engineers updating the list to update both files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 19:18:13 -07:00
hongmingwang-moleculeai	bd0580f4af	Merge pull request #2990 from Molecule-AI/fix/memory-v2-namespace-labels-2988 fix(memory-v2): namespace labels use display names not UUID prefixes (#2988)	2026-05-06 02:13:30 +00:00
Hongming Wang	64e58fb390	test(memory-v2-e2e): update expectChainQueryRoot for new name column PR #2990 root cause: the resolver SQL added `name` to the SELECT for DisplayName plumbing, but the e2e test's sqlmock fixture (expectChainQueryRoot at swap_test.go:216) still scripts the 3-column shape. Three e2e tests fail with: sql: expected 3 destination arguments in Scan, not 4 Fix: bump the fixture to 4 columns (id, name, parent_id, depth) and pass an empty name. The e2e tests don't assert on label rendering — they pin the namespace string flow ("workspace:root-1" etc), which is unchanged. Empty name is fine: ReadableNamespaces still emits the correct namespace strings; only DisplayName is empty. Caught by CI's Platform (Go) check on PR #2990 — would have been a silent missed-coverage case in the resolver_test.go run because that package doesn't import the e2e package. 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-05 19:10:18 -07:00
Hongming Wang	9ceda9d81f	refactor(events): migrate 18 files to typed EventType constants (RFC #2945 PR-B-1) Mechanical migration of bare event-name strings in BroadcastOnly / RecordAndBroadcast call sites to the typed constants from internal/events/types.go (RFC #2945 PR-B). Wire format unchanged (both shapes serialize to identical WSMessage.Event literals); pinned by TestAllEventTypes_IsSnapshot in #2965. Migrated (18 files, scope: handlers/, scheduler/, registry/, bundle/, channels/): - handlers/{approvals,a2a_proxy_helpers,a2a_queue,activity,agent, delegation,external_rotate,org_import,registry,workspace, workspace_bootstrap,workspace_crud,workspace_provision_shared, workspace_restart}.go - channels/manager.go (caught by hostile-reviewer pass — initial scope missed channels/, found via grep on the post-migration tree) - scheduler/scheduler.go - registry/provisiontimeout.go - bundle/importer.go Hostile self-review (3 weakest spots, addressed) ------------------------------------------------ 1. Missed call sites — initial scope omitted channels/. Post-migration `grep -rEn 'BroadcastOnly\([^,]+,[^,]"[A-Z_]+"\|RecordAndBroadcast\([^,]+,[^,]"[A-Z_]+"' internal/` found 2 stragglers in channels/manager.go. Migrated. Final grep on the same pattern returns only the docstring example in types.go (intentional). 2. gofmt drift — auto-import injection produced non-canonical import ordering. `gofmt -w` applied ONLY to the 18 modified files (NOT the whole tree, to avoid sweeping unrelated pre-existing drift into this PR's diff). Three pre-existing un-gofmt'd files in handlers/ (a2a_proxy.go, a2a_proxy_test.go, a2a_queue_test.go) left as-is — they're unchanged by this PR and their drift predates it. 3. Wire format — paranoia check: do the constants serialize to the exact strings consumers (canvas TS, hermes plugin, anything parsing WSMessage.Event) expect? Yes. Pinned by the snapshot test. The migration is name-only; not a single character of wire output changes. 
Verified - go build ./... clean - go vet ./internal/... clean - gofmt -l on the 5 migrated package dirs: only pre-existing files - Full tests: handlers/, channels/, scheduler/, registry/, events/, bundle/ all green (5 ok, 0 fail) PR-B-2 (canvas TS mirror + cross-language parity gate) remains as the final piece of RFC #2945 PR-B. Tracked separately so this PR stays mechanical + reviewable. Refs RFC #2945, PR #2965 (PR-B types). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 19:05:03 -07:00
Hongming Wang	b6310d7ebf	fix(memory-v2): namespace dropdown labels use display names not UUID prefixes (#2988 ) User feedback on the v2 Memory tab redesign: on a root workspace, the namespace dropdown showed three indistinguishable entries: Workspace (30ba7f0b) Team (30ba7f0b) (team) Org (30ba7f0b-b303-4a20-aefe-3a4a675b8aa4) (org) For a root workspace, the resolver collapses workspace==team==org IDs (resolver.go:113-122 derive() degenerate case). The previous shortID(8)-truncated UUID label scheme made all three look identical even though the three concepts (private / team-shared / org-wide) remain semantically distinct. ## Backend — Resolver returns DisplayName - SQL chain query now SELECTs workspaces.name (COALESCE → "" on NULL) - chainNode carries .name through walk - deriveNames() computes the display name for each namespace, mirroring derive(): workspace: self.name team: parent.name (or self.name if root — degenerate) org: chain[end].name (root of tree) - Namespace struct gets a new DisplayName field, omitempty wire-shape ## Backend — Handler renders label from DisplayName when present - memories_v2.go:namespaceLabelWithName(name, kind, displayName) is the new SSOT label generator. Falls back to the UUID-prefix shape when displayName is empty so callers without name plumbing keep working unchanged. - namespacesToViews now plumbs Namespace.DisplayName into the label. - Old namespaceLabel(name, kind) is preserved as a thin wrapper around namespaceLabelWithName(_, _, "") for back-compat. - Custom namespaces ignore displayName by design — operator-defined suffixes ARE the chosen label; a name override would surprise. ## Frontend — drop redundant `(kind)` suffix Pre-fix: "Team (mac laptop) (team)" — kind shown twice. Post-fix: "Team (mac laptop)" — the prefix already conveys the kind. 
## Test coverage Resolver (3 new tests): - DisplayName_Root: workspace name propagates to all 3 namespaces - DisplayName_Child: workspace=self.name, team=parent.name, org=root.name - DisplayName_EmptyOnNULL: COALESCE → "" → empty fallback Handler (3 new tests): - NamespaceLabelWithName_PrefersDisplayName: workspace/team/org/custom paths - NamespaceLabelWithName_FallsBackToUUIDPrefix: empty displayName → legacy shape - NamespacesToViews_PassesDisplayNameThrough: full integration on root case Canvas: existing 30 tests still pass; suffix drop is rendering-only. memories_v2.go function coverage: 14/14 = 100% - namespaceLabelWithName: 100% - namespacesToViews: 100% - (all 11 pre-existing functions stay at 100%) ## SSOT The "what is this namespace called" question now has one source of truth: namespace.Resolver.ReadableNamespaces sets DisplayName from the canonical workspace.name column. The handler is a renderer; the canvas is a consumer. No name-lookup logic duplicated across the three layers. 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-05 18:46:50 -07:00
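The name derivation and label fallback described in this commit can be sketched in Python. The Go code in `resolver.go` and `memories_v2.go` is the source of truth. The chain ordering (self first, root last), the function names, and the simplified 8-character fallback are assumptions, and custom namespaces are left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ChainNode:
    id: str
    name: str  # "" when the DB column is NULL (the commit uses COALESCE)


def derive_names(chain: List[ChainNode]) -> Dict[str, str]:
    """Assumes chain[0] is the workspace itself and chain[-1] the tree root."""
    me = chain[0]
    parent = chain[1] if len(chain) > 1 else me  # root workspace: team collapses to self
    root = chain[-1]
    return {"workspace": me.name, "team": parent.name, "org": root.name}


def namespace_label(kind: str, namespace_id: str, display_name: str = "") -> str:
    prefix = kind.capitalize()
    if display_name:
        # Post-fix shape: the prefix already conveys the kind, so no "(team)" suffix.
        return f"{prefix} ({display_name})"
    # Simplified stand-in for the legacy UUID-prefix shape.
    return f"{prefix} ({namespace_id[:8]})"
```

For a root workspace all three display names are equal, which is expected: the `Workspace` / `Team` / `Org` prefix is what distinguishes the rows in the dropdown.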
molecule-ai[bot]	d75b73e713	Merge pull request #2981 from Molecule-AI/staging staging → main: auto-promote `9dd2988`	2026-05-05 18:13:50 -07:00
Hongming Wang	0886dbc923	Merge pull request #2978 from Molecule-AI/fix-plugins-compact-empty-state feat(canvas/skills): compact-empty layout for Plugins section (#2971)	2026-05-06 01:12:09 +00:00
Hongming Wang	7420631c32	Merge pull request #2983 from Molecule-AI/feat/auto-promote-stale-alarm-2975 feat(ops): hourly alarm for auto-promote PR stuck on REVIEW_REQUIRED (#2975)	2026-05-06 00:58:49 +00:00
Hongming Wang	caf19e8980	feat(ops): hourly alarm for auto-promote PR stuck on REVIEW_REQUIRED (#2975 ) Closes the silent-block failure mode that left 25 commits — including the Memory v2 redesign and the reno-stars data-loss fix — wedged on staging for 12+ hours behind a single missing review. The auto-promote workflow opened the PR + armed auto-merge, but main's branch protection required a human review and nobody noticed until a user reported "still seeing old memory tab". ## Detection logic — `scripts/check-stale-promote-pr.sh` Reads open PRs `base=main head=staging` and alarms on: - `mergeStateStatus == BLOCKED` - `reviewDecision == REVIEW_REQUIRED` - createdAt older than `STALE_HOURS` (default 4h) Other BLOCKED reasons (DIRTY, BEHIND, failed checks) are NOT alarmed — those are the author's signal-to-fix. This script targets the specific "no human reviewed yet" wedge. Output: - `::warning` per stale PR (visible in workflow summary + Actions UI) - PR comment (idempotent via marker-string detection; one alarm per PR, never re-spammed) - Exit code = count of stale PRs (capped at 125) Logic in a script (not inline workflow YAML) so it's: - Unit-testable — tests/test-check-stale-promote-pr.sh exercises every branch with stubbed fixture JSON + frozen clock. 23 tests covering: empty list, single stale, just-under-threshold, wrong reviewDecision, wrong mergeStateStatus, mixed list (only matching PRs alarm), custom threshold via --stale-hours, exit-code-counts- matching-PRs, --help, unknown arg → 64, missing repo → 2. - Operator-runnable ad-hoc — `scripts/check-stale-promote-pr.sh` works from any shell with `gh` + `jq`. - SSOT — one detector, the workflow YAML is just schedule + invocation surface. Future sibling workflows that need the same check call the same script. 
## Workflow — `.github/workflows/auto-promote-stale-alarm.yml` Triggers: - cron `27 * * * *` (hourly, off-the-hour to dodge cron herd) - workflow_dispatch with `stale_hours` + `post_comment` overrides Concurrency: `auto-promote-stale-alarm` group, cancel-in-progress=false (idempotent script; no benefit to cancelling a running scan). Permissions: `contents: read` + `pull-requests: write` (post comments). Sparse checkout — only fetches `scripts/check-stale-promote-pr.sh`. No node_modules, no go modules, no slow setup steps. Workflow runs in <30s on a clean repo. ## Why "alarm + comment" not "auto-approve" Considered options in issue #2975: 1. Slack/email alert — picked. 2. Bot-account auto-approve via molecule-ops — circumvents the human-review gate that branch protection encodes. 3. Trusted-promote bypass via CODEOWNERS — needs Org Admin config change; out of scope for a workflow PR. The comment-on-PR pattern picks (1) without external dependencies (no Slack token, no email config). Subscribers get notified via GitHub's existing PR notification delivery; the warning shows up in the Actions feed. ## Why this won't false-positive on legitimate slow reviews Threshold is 4h. Most legitimate gates clear in <1h, so 4× headroom is plenty for slow CI. The comment is idempotent (one alarm per PR, never re-posted) — adding noise stops at 1 comment regardless of how long the PR sits. ## Test plan - [x] `bash scripts/test-check-stale-promote-pr.sh` — 23/23 pass - [x] `python3 -c 'yaml.safe_load(...)'` clean - [x] `bash -n` clean on both scripts - [ ] Live verification: dispatch the workflow once main has caught up, confirm it correctly reports zero stale PRs	2026-05-05 17:55:27 -07:00
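The detection rule in `scripts/check-stale-promote-pr.sh` (BLOCKED, REVIEW_REQUIRED, older than the threshold, exit code capped at 125) can be sketched in Python with an injected clock. The real detector is a bash script using `gh` and `jq`; the function names here are hypothetical, and the input is assumed to be the PR JSON list with `mergeStateStatus`, `reviewDecision` and `createdAt` fields.

```python
from datetime import datetime, timedelta
from typing import List


def stale_promote_prs(prs: List[dict], now: datetime, stale_hours: float = 4) -> List[dict]:
    """Return PRs wedged on a missing human review for longer than stale_hours."""
    cutoff = now - timedelta(hours=stale_hours)
    stale = []
    for pr in prs:
        # GitHub timestamps end in "Z"; normalise for datetime.fromisoformat.
        created = datetime.fromisoformat(pr["createdAt"].replace("Z", "+00:00"))
        if (
            pr.get("mergeStateStatus") == "BLOCKED"
            and pr.get("reviewDecision") == "REVIEW_REQUIRED"
            and created < cutoff
        ):
            stale.append(pr)
    return stale


def exit_code(stale_count: int) -> int:
    # Exit code reports the number of stale PRs, capped at 125.
    return min(stale_count, 125)
```

Passing `now` as an argument plays the same role as the frozen clock in the shell tests: the just-under-threshold case becomes deterministic.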
Hongming Wang	38bc27df0d	fix(canvas/config): skip config.yaml fetch for external / hermes runtimes — eliminate 404 console noise Reported on production reno-stars 2026-05-05 (browser console): /workspaces/d76977b1-…/files/config.yaml:1 Failed to load resource: the server responded with a status of 404 The workspace was an external-runtime mac-mini-style agent that doesn't use the platform's config.yaml template. Every Config tab open issued a GET that returned a clean 404, and the existing catch block fell into the runtime-manages-own-config branch and populated the form from workspace metadata. Functionally correct, but the request fired anyway, surfaced as a 404 in DevTools, and burned an RTT. Fix: branch on RUNTIMES_WITH_OWN_CONFIG BEFORE the fetch. When the workspace's runtime is one of those (external, hermes), skip the GET, populate the form from workspace metadata directly, set loading=false, return. Same code path as the existing 404-catch fallback, just skipping the wasted request. Behavior preserved for runtimes that DO use the template (claude-code, etc.): unchanged GET → parse → setConfig flow. Tests: 24/24 existing ConfigTab tests pass; no behavioral change for the documented runtimes. tsc clean. Refs reno-stars production 2026-05-05. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 17:55:24 -07:00
Hongming Wang	6748035720	Merge pull request #2980 from Molecule-AI/test/canvas-resolve-attachment-href-2973 test(canvas/chat): cover platform-pending: branch + isPlatformAttachment (#2973)	2026-05-06 00:54:17 +00:00
Hongming Wang	c74d0ecc94	test(canvas/chat): cover platform-pending: branch + isPlatformAttachment (#2973 ) Closes #2973 — the followup test gap I flagged on PR #2968's review. Pre-merge #2968 added the platform-pending: URI scheme branch to resolveAttachmentHref + introduced the isPlatformAttachment SSOT helper, but the existing uploads.test.ts only covered the older workspace: / file:/// / absolute-path branches. The new branch shipped on prod-impact (live console error on reno-stars) with manual post- deploy verification; the regression gate was filed as a followup (#2973) so a future canvas refactor can't silently re-break the poll-mode chat-attachment download path. Adds 15 new test cases across two existing describe blocks: resolveAttachmentHref — platform-pending: scheme (poll-mode uploads): - well-formed platform-pending:<wsid>/<fileid> resolves to the /pending-uploads/<file>/content endpoint - uses the URI's wsid, NOT the chat workspace_id (cross-workspace forwarding case — pinning the explicit decision from #2968's commit message so a regression that flipped this would mis-route the download to the wrong workspace's pending-uploads store) - defensive fallback to raw URI on missing slash, empty fileID, empty wsid (so a future "helpful" change can't synthesize a broken /pending-uploads// path) - regression test against the EXACT production repro from #2968's body (reno-stars, 2026-05-05 console error) isPlatformAttachment: - positive cases for platform-pending: (well-formed and malformed), workspace:<allowed-root>, file:///<allowed-root>, absolute paths under allowed roots - NEGATIVE cases for HTTPS/HTTP URLs to other origins (auth-leak class regression — a helper that always returned true would attach workspace tokens to third-party requests), non-allowlisted roots like /etc/passwd or /var/log/x, empty string, and unrecognised schemes (s3://, ftp://) All 21 tests pass. The 6 pre-existing tests are unchanged. 
The 15 new tests are the regression gate that #2973 asked for. Verification: - pnpm exec vitest run src/components/tabs/chat/__tests__/uploads.test.ts → 21 passed	2026-05-05 17:51:28 -07:00
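The `platform-pending:` branch these tests pin can be sketched in Python. The real `resolveAttachmentHref` is TypeScript. The function name is hypothetical, and the `/workspaces/<wsid>` prefix on the resolved path is an assumption: the commit only names the `/pending-uploads/<file>/content` suffix.

```python
PENDING_SCHEME = "platform-pending:"


def resolve_pending_href(uri: str) -> str:
    """Resolve platform-pending:<wsid>/<fileid>; fall back to the raw URI if malformed."""
    if not uri.startswith(PENDING_SCHEME):
        return uri
    wsid, sep, file_id = uri[len(PENDING_SCHEME):].partition("/")
    if not sep or not wsid or not file_id:
        # Defensive fallback: never synthesise a broken /pending-uploads// path.
        return uri
    # The workspace id comes from the URI itself, not from the chat's workspace,
    # so a forwarded attachment still points at the store that holds the bytes.
    # Path prefix is an assumed shape.
    return f"/workspaces/{wsid}/pending-uploads/{file_id}/content"
```

Returning the raw URI on malformed input keeps the failure visible as a bad link instead of a request to the wrong endpoint.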
Hongming Wang	9dd29882e2	Merge pull request #2979 from Molecule-AI/fix/a2a-poll-mode-response-shape-2967 feat(a2a): SSOT typed-variant response parser + auto-fallback for poll-mode peers (#2967)	2026-05-06 00:41:43 +00:00
Hongming Wang	e342d0c5a7	fix(build): register a2a_response in TOP_LEVEL_MODULES The drift gate caught the new SSOT parser module — without registration the wheel ships it un-rewritten and runtime imports fail. Same pattern as inbox_uploads, a2a_tools_delegation, a2a_tools_rbac registrations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 17:34:05 -07:00
Hongming Wang	166ad20cd7	test(e2e): Phase 3.5 — wheel parser classifies real server response (#2967) Previously Phase 3 only checked the workspace-server's poll-mode short-circuit emit shape ({"status":"queued","delivery_mode":"poll","method":"..."}); the matching client-side classification was tested in isolation against fixture dicts in test_a2a_response.py. This phase closes the loop by piping the actual on-the-wire response from a real workspace-server back through the wheel's a2a_response.parse() and asserting it classifies as the Queued variant with the right method + delivery_mode. A regression in EITHER the server emit shape OR the client parser will now fail this E2E, eliminating the gap that allowed the original "unexpected response shape" production bug to ship despite green unit tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 17:31:45 -07:00
Hongming Wang	4a2dda7cac	feat(canvas/skills): compact-empty layout for Plugins section (#2971 ) Reported on production 2026-05-05: agent plugin tab Plugins 0 installed + Install Plugin this part should be default compact Pre-fix: SkillsTab always rendered the Plugins section as a full rounded-xl panel with vertical chrome — even when zero plugins were installed and the registry browser was closed. The empty state gave a lot of vertical real estate for content that's just "0 installed + Install button". Fix: when installed.length === 0 AND registry closed AND initial load completed, collapse the section into a single inline pill ("Plugins · 0 installed · + Install Plugin"). The full panel re-mounts when: - installed.length > 0 (a plugin landed → expand to surface the list) - showRegistry === true (user clicked + Install Plugin → registry opens) - !installedLoaded (avoid flash; the loading shell shows instead until the first /plugins fetch resolves) Accessibility: - Compact pill: aria-label="Plugins (none installed)" + button aria-expanded="false" + aria-controls="plugins-section" - Full panel: button aria-expanded={showRegistry} + same aria-controls - Section gets id="plugins-section" so the aria-controls reference resolves once the section mounts External workspaces: this is a pure canvas-frontend layout change — applies to ALL workspace runtimes (external, claude-code, hermes, langchain, codex, third-party MCP). No server-side change needed. 
Tests ----- SkillsTab.compactEmpty.test.tsx (4 tests): - Compact pill renders when installed=0, registry closed, loaded - Full panel renders when installed > 0 - Click + Install Plugin from compact → expands to full panel (verified via aria-controls target id appearing in the DOM) - During initial load (installedLoaded=false), compact pill does NOT render — avoids a compact→full flash as the load completes Per memory feedback_oss_design_philosophy.md: the SkillsTab is the only tab that needs compact-empty today, but the pattern is extractable into a shared EmptyStateCompactWrapper if Schedules / Memories / Approvals adopt the same affordance later. Don't generalise until the third use case (per the same memory, "every refactor toward OSS plugin shape" without premature abstraction). Verified - tsc --noEmit clean - All 4 tests pass Refs #2971. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 17:26:32 -07:00
Hongming Wang	8b9f809966	fix(a2a): SSOT response parser — handle poll-mode queued envelope (#2967 ) Introduce ``workspace/a2a_response.py`` as the single source of truth for the wire shapes the workspace-server proxy can return at ``/workspaces/<id>/a2a``: * ``Result`` — JSON-RPC success * ``Error`` — JSON-RPC error or platform-level error (with restart-in-progress metadata when present) * ``Queued`` — poll-mode short-circuit envelope: the platform queued the message into the target's inbox, the target will fetch via /activity poll * ``Malformed`` — anything the parser can't classify (logged at WARNING so a future server change is loud) ``send_a2a_message`` (in ``a2a_client.py``) now dispatches via ``a2a_response.parse(data)`` instead of inline ``"result" in data`` / ``"error" in data`` sniffing. The Queued variant returns a new ``_A2A_QUEUED_PREFIX`` sentinel so callers can distinguish "delivered async, no synchronous reply" from both success-with-text and failure. reno-stars production data caught two intermittent failures that both reduced to the same root cause: 1. File transfer announce silently failed — when CEO Ryan PC (poll-mode external molecule-mcp) sent the harmi.zip announcement to Reno Stars Business Intelligent (also poll-mode external), ``send_a2a_message`` saw the platform's poll-queued envelope ``{"status":"queued","delivery_mode":"poll","method":"..."}``, didn't recognize it as the synthetic delivery-acknowledgement it is, and returned ``[A2A_ERROR] unexpected response shape``. The agent fell back to a chunk-shipping path; receiver did get the file but operator-facing logs showed a failure that didn't actually fail. 2. Duplicated agent comm — same bug, inverted direction. d76 delegated to 67d, send_a2a_message returned the unexpected-shape error, delegate_task wrapped it as DELEGATION FAILED, the calling agent retried with sharper wording, the recipient saw the same request twice and self-reported "二次请求 — 我先不执行". 
External molecule-mcp standalone runtimes are inherently poll-mode (they have no public URL), so every external↔external A2A pair was hitting this on every send. The pre-fix client only handled JSON-RPC ``result``/``error`` keys and treated the queued envelope (which has neither) as malformed. RFC #2339 PR 2 added the queued envelope on the server side; the client never caught up. When ``send_a2a_message`` returns the ``_A2A_QUEUED_PREFIX`` sentinel, ``tool_delegate_task`` now transparently falls back to ``_delegate_sync_via_polling`` (RFC #2829 PR-5's durable ``/delegate`` + ``/delegations`` polling path, which DOES work for poll-mode peers because the platform's executeDelegation goroutine writes to the inbox queue and the result row arrives when the target picks it up + replies). The agent gets a real synchronous reply instead of the empty queued sentinel. * ``test_a2a_response.py`` — 62 tests, 100% line coverage on the parser (verified via ``coverage run --source=a2a_response``). Includes adversarial-input fuzzing across ~25 pathological payloads — parser must never raise. * ``test_a2a_client.py::TestSendA2AMessagePollMode`` — 4 tests for the new Queued/Error wiring in ``send_a2a_message``. * ``test_delegation_sync_via_polling.py::TestPollModeAutoFallback`` — 3 tests for the auto-fallback in ``tool_delegate_task``, including negative cases (push-mode reply must NOT trigger fallback; genuine error must NOT silently retry). * Verified all new tests FAIL on pre-fix source by stashing a2a_client.py + a2a_tools_delegation.py and re-running — 5 failures including ImportError for the missing ``_A2A_QUEUED_PREFIX``. Per the operator-debuggability directive: * INFO at every Queued classification (expected variant; operator sees normal poll-mode-peer queueing in log stream). * INFO at the auto-fallback decision in ``tool_delegate_task`` so a future operator can correlate "send returned queued → falling back to polling path" without reading the source. 
* WARNING at every Malformed classification (server contract drift; operator MUST see this immediately). * Existing transient-retry WARNING preserved. * Mirror Go-side typed model in workspace-server. The wire shape is documented in ``a2a_response.py``'s module docstring with file:line pointers to the canonical emitters; a future PR can introduce ``models/a2a_response.go`` without changing wire behavior. The fixture corpus in ``test_a2a_response.py`` is designed so a one-sided edit breaks CI. * ``send_message_to_user`` and ``chat_upload_receive`` use a different endpoint (``/notify``) and aren't affected by this bug; their parsing stays unchanged. * 135 tests pass across ``test_a2a_response.py`` + ``test_a2a_client.py`` + ``test_delegation_sync_via_polling.py`` + ``test_a2a_tools_impl.py``. * ``coverage run --source=a2a_response -m pytest`` reports 100% line coverage with 0 missing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 17:21:28 -07:00
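The four-variant classification this commit introduces can be sketched as follows. This is not the contents of `workspace/a2a_response.py`: the dataclass fields, the precedence of `result` over `error`, and the logger name are assumptions. Only the variant names and the queued envelope keys (`status`, `delivery_mode`, `method`) come from the message above.

```python
import logging
from dataclasses import dataclass
from typing import Any, Union

log = logging.getLogger("a2a_response_sketch")  # hypothetical logger name


@dataclass
class Result:
    payload: Any


@dataclass
class Error:
    message: str


@dataclass
class Queued:
    method: str
    delivery_mode: str


@dataclass
class Malformed:
    raw: Any


Variant = Union[Result, Error, Queued, Malformed]


def parse(data: Any) -> Variant:
    """Classify a proxy response without raising on unexpected input."""
    if not isinstance(data, dict):
        log.warning("malformed a2a response: %r", data)
        return Malformed(data)
    if "result" in data:  # JSON-RPC success (assumed to take precedence)
        return Result(data["result"])
    if "error" in data:  # JSON-RPC or platform-level error
        err = data["error"]
        message = err.get("message", "") if isinstance(err, dict) else err
        return Error(str(message))
    if data.get("status") == "queued" and data.get("delivery_mode") == "poll":
        log.info("a2a message queued for poll-mode peer")
        return Queued(method=str(data.get("method", "")), delivery_mode="poll")
    log.warning("malformed a2a response: %r", data)
    return Malformed(data)
```

A caller can then branch with `isinstance(variant, Queued)` and switch to the polling path, instead of sniffing for `"result" in data` inline.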
molecule-ai[bot]	a869bc1536	Merge pull request #2963 from Molecule-AI/staging staging → main: auto-promote `7ee696e`	2026-05-05 17:21:02 -07:00
Hongming Wang	d3e115cb06	Merge pull request #2972 from Molecule-AI/fix/a2a-poll-queued-envelope-2967 fix(a2a-client): recognize poll-mode 'queued' envelope (#2967)	2026-05-06 00:05:27 +00:00


4486 Commits