molecule-core

Author	SHA1	Message	Date
Hongming Wang	68ee76c6b7	fix(canvas): add getState to useCanvasStore mock in ContextMenu keyboard test ContextMenu.tsx reads parent-workspace children via useCanvasStore.getState().nodes.filter(...) — a direct .getState() call, not the selector-calling form. The existing vi.mock exposed only the selector form, so rendering crashed with "TypeError: useCanvasStore.getState is not a function". Restructure the vi.mock factory to return Object.assign(fn, { getState: () => mockStore }) so both call shapes resolve. Factory body builds the function locally because vi.mock hoists above outer-scope variable declarations and can't reference `mockStore` via closure. Verified: all 15 tests in the file pass after the change. Unblocks the Canvas (Next.js) CI check on PR #1743 (staging→main sync). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 01:49:34 -07:00
molecule-ai[bot]	9d076b9c4d	Merge pull request #1684 from Molecule-AI/fix/missing-keys-modal-a11y-v2 fix(canvas/a11y): MissingKeysModal — backdrop aria-hidden, decorative SVGs, form labels	2026-04-23 02:54:46 +00:00
Molecule AI Core-FE	5157f80d19	fix(canvas): add type=button to ApprovalBanner action buttons (bug #1669 ) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-23 02:15:52 +00:00
Molecule AI Core-FE	382238daa3	test(canvas): relax setPendingDelete assertion to use expect.objectContaining Staging added hasChildren/children fields to workspace store shape. Test assertion updated to use objectContaining to avoid false negatives. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-23 00:59:38 +00:00
Molecule AI Core-FE	66c6b83ab2	test(canvas): add ActivityTab and MissingKeysModal component tests - ActivityTab.test.tsx: 27 tests covering filter bar (aria-pressed states, API reload), loading/error/empty states, ActivityRow content (type badges, method, duration_ms, summary, error styling), A2A flow indicators, auto-refresh Live/Paused toggle, refresh button, activity count - MissingKeysModal.component.test.tsx: 25 tests covering visibility, ARIA semantics (role=dialog, aria-modal, aria-labelledby), content, keyboard (Escape, Enter), save flow (disabled/.../Saved/error), Add Keys & Deploy gate, Cancel + backdrop click, Open Settings button - MissingKeysModal.test.tsx: refactored to preflight logic only (7 tests); component rendering now covered in component test file 863 tests passing (+3 net). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-23 00:58:56 +00:00
airenostars	7a89704b6e	fix(build): add missing fmt import + fix canvas Dockerfile GID (#1487 ) * docs(canary-release): flag as aspirational; link to current state The canary-release.md doc describes the pipeline as if the fleet is running — referring to AWS account 004947743811 and a configured MoleculeStagingProvisioner role. Reality as of 2026-04-22: no canary tenants are provisioned, the 3 GH Actions secrets are empty, and canary-verify.yml has failed 7/7 times in a row. Added a top-of-doc ⚠️ state note that: 1. Clarifies this is intended design, not deployed reality. 2. Notes the AWS account ID is historical / unverified. 3. Explains that merges currently rely on manual promote-latest. 4. Cross-links to molecule-controlplane/docs/canary-tenants.md for the Phase 1 work that's shipped, the Phase 2 stand-up plan, and the "should we even do this now?" decision framework. 5. Asks whoever lands Phase 2 to reconcile the two docs. No behaviour change — doc-only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(build): add missing fmt import in a2a_proxy.go, fix canvas Dockerfile GID - a2a_proxy.go: missing "fmt" import caused build failure (8 undefined references at lines 743-775). Likely dropped during a recent merge. - canvas/Dockerfile: GID 1000 already in use in node base image. Changed to dynamic group/user creation with fallback. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Hongming Wang <hongmingwangrabbit@gmail.com>	2026-04-22 21:10:58 +00:00
Molecule AI Core-UIUX	116526bff3	fix(canvas/a11y): orgs/page.tsx — form labels, error announcements, checkout banner - CreateOrgForm: replace bare <span> labels with <label htmlFor> + input id (WCAG 1.3.1 — programmatic label association); add aria-describedby hint for slug field - Error state: add role=alert on error <p> (WCAG 4.1.3 — Status Messages) - CheckoutBanner: add role=status + aria-live=polite (WCAG 4.1.3); restore decorative ✓ with aria-hidden=true Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-22 21:06:20 +00:00
Molecule AI Core-UIUX	d6dbf23172	test(canvas/a11y): add WCAG 2.1 accessibility tests for ConsoleModal and DeleteCascadeConfirmDialog ConsoleModal: role=dialog, aria-modal, aria-labelledby, backdrop aria-hidden, error role=alert, accessible button names DeleteCascadeConfirmDialog: role=dialog, aria-modal, aria-labelledby, backdrop aria-hidden, SVG aria-hidden, disabled state, keyboard interactions (Escape, Enter), accessible names Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-22 20:39:48 +00:00
Molecule AI Core-UIUX	8bb0fe70ff	fix(canvas/a11y): DeleteCascadeConfirmDialog backdrop aria-hidden (WCAG 4.1.2) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-22 20:36:05 +00:00
Molecule AI Core-UIUX	a322dd0056	fix(canvas/a11y): unaudited components — backdrop/semantic a11y gaps - ConsoleModal.tsx: backdrop div aria-hidden; error div role=alert (WCAG 4.1.2) - ProvisioningTimeout.tsx: warning SVG aria-hidden; cancel-dialog backdrop aria-hidden (WCAG 4.1.2) - TermsGate.tsx: backdrop aria-hidden; dialog role=dialog+aria-modal+aria-labelledby; error role=alert - TopBar.tsx: replace non-semantic role=banner div with <header>; logo emoji aria-hidden - FilesToolbar.tsx: aria-label on select dropdown; aria-label on all icon buttons (New, Upload, Export, Clear, Refresh, file input) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-22 20:07:49 +00:00
Molecule AI Core-UIUX	c6e7ccb289	fix(canvas/a11y): MissingKeysModal — backdrop aria-hidden, decorative SVGs - Backdrop div: add aria-hidden="true" so screen readers skip it (WCAG 4.1.2) - Warning triangle SVG (header): add aria-hidden="true" (decorative icon) - Saved-badge checkmark SVG: add aria-hidden="true" (decorative icon) - Add MissingKeysModal.a11y.test.tsx: 14 tests covering role=dialog, aria-modal, aria-labelledby, backdrop aria-hidden, SVG aria-hidden, focus-on-open (WCAG 2.4.3), Escape key handler (WCAG 2.1.2), accessible button names Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-22 19:40:18 +00:00
Molecule AI Core-UIUX	e211a25ccd	fix(canvas/a11y): dialog aria-modal, icon-button labels, focus management - CookieConsent.tsx: add aria-modal="true" (WCAG 2.1.1) - ConsoleModal.tsx: add useRef + requestAnimationFrame focus management on open - ConversationTraceModal.tsx: remove redundant aria-describedby={undefined} - FileTree.tsx: add aria-label to directory/file delete buttons (WCAG 4.1.2) - FileEditor.tsx: add aria-label to download button (WCAG 4.1.2) - ScheduleTab.tsx: add aria-label to Run Now, Edit, Delete icon buttons - form-inputs.tsx: add aria-label to tag removal button Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-22 19:03:00 +00:00
Molecule AI Fullstack (floater)	ea5e018f76	Merge main into staging to sync	2026-04-22 18:15:52 +00:00
molecule-ai[bot]	6bd1691446	Merge pull request #1594 from Molecule-AI/fix/canvas-a11y-clean fix(canvas/a11y): aria-hidden on decorative SVGs + MissingKeysModal semantics	2026-04-22 18:11:12 +00:00
Molecule AI Core-FE	236158d4a4	fix(canvas/a11y): add aria-hidden to decorative SVGs + MissingKeysModal semantics - DeleteCascadeConfirmDialog: aria-hidden on warning triangle SVG (button already has adjacent text content; icon is purely decorative) - Toolbar: aria-hidden on 4 decorative SVGs (stop-all, restart-pending, search, help) — buttons all have aria-label/aria-expanded/text - MissingKeysModal: role="dialog" aria-modal="true" aria-labelledby on container, id="missing-keys-title" on heading, requestAnimationFrame focus management via useRef (replaces autoFocus={index===0}) - CreateWorkspaceDialog: remove redundant aria-describedby={undefined} WCAG 2.1 SC 1.1.1 — screen readers skip purely-presentational icons. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-22 17:40:43 +00:00
Hongming Wang	5f96a832e7	fix(canvas): drop node:20-alpine default user before creating canvas uid 1000 publish-canvas-image has been failing on every main push since 2026-04-21 at `addgroup -g 1000 canvas` because node:20-alpine already ships a `node` user/group at uid/gid 1000. Same collision workspace-server/Dockerfile.tenant already fixes with `deluser --remove-home node` before `addgroup`. Copying that pattern here so the workflow goes green again and canvas images publish to ghcr. No runtime behaviour change — canvas still runs as non-root uid 1000. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 09:42:02 -07:00
Hongming Wang	359dc615e9	fix(canvas+templates): fetch runtime dropdown from /templates registry (#1526 ) * fix(canvas+templates): fetch runtime dropdown from /templates registry Canvas hardcoded 6 runtime options, drifting from manifest.json which already registers hermes + gemini-cli as first-class workspace templates. A Hermes workspace had runtime=hermes in its DB row but Config showed "LangGraph (default)" — the HTML select fell back to its first option because "hermes" wasn't listed, and saving would clobber the runtime back to empty. Now: - GET /templates returns the runtime field from each cloned template's config.yaml (previously dropped on the floor) - ConfigTab fetches /templates on mount, dedupes non-empty runtimes, and renders them as <option>s. Falls back to the static list if the fetch fails (offline, older backend), so the control never renders empty. Adding a template to manifest.json now flows through automatically — no canvas PR required. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(canvas+templates): model + required-env suggestions from template Extends the dropdown fix so Model and Required Env also flow from the template registry instead of being free-form fields the user has to remember. Template config.yaml now declares: runtime_config: model: <default> models: - id: nous-hermes-3-70b name: Nous Hermes 3 70B (Nous Portal) required_env: [HERMES_API_KEY] - id: nousresearch/hermes-3-llama-3.1-70b name: Hermes 3 70B (via OpenRouter) required_env: [OPENROUTER_API_KEY] Platform: GET /templates now returns runtime + model + models[] per template (was previously dropping runtime + ignoring runtime_config). Canvas: - Runtime dropdown built from /templates (was hardcoded 6 options) - Model input becomes a datalist combobox; free-form input still allowed since model names rotate faster than templates - Required Env Vars default to the selected model's required_env, labelled "(suggested)" so the user knows it's template-driven - Everything falls back to a static list when /templates is unreachable, so offline editing still works Follow-up: add models[] to the other 7 template repos (claude-code, crewai, autogen, deepagents, openclaw, gemini-cli, langgraph). This PR updates the platform + canvas; the Hermes template config update goes in a separate PR against its own repo. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(canvas): commit required_env on model change; add backend tests Review turned up that the \"Required Env Vars (suggested)\" display was cosmetic-only — users picking a different model saw the new env suggestion in the TagList, but the values never made it into state, so Save serialized an empty (or stale) required_env and the workspace ran with the wrong auth check. Canvas fixes: - Model input onChange now commits the matched modelSpec's required_env to state — but only when the prior required_env was empty or matched the previous modelSpec's list (i.e. user hadn't manually edited). User-typed envs always win. - Dropped the display-only fallback in TagList values; shows only what's actually in state. - New \"Template suggests X, Apply\" hint button covers the edge case where state and template differ (existing workspace whose required_env lags the template's current recommendation). - datalist option key now includes index so template authors shipping duplicate model ids don't trigger a silent React key collision. - Small arraysEqual helper. Backend tests: - TestTemplatesList_RuntimeAndModelsRegistry — asserts /templates response carries runtime + models[] with per-model required_env. - TestTemplatesList_LegacyTopLevelModel — asserts older templates with top-level model: still surface correctly, with empty Models[]. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 15:07:46 +00:00
Hongming Wang	e88ab70251	fix(canvas): stop infinite re-render on ContextMenu mount ContextMenu's children selector ran .filter() inside the Zustand hook, returning a brand-new array reference on every render. useSyncExternalStore under the hood compares snapshots with Object.is — a new array always differs, so React kept scheduling re-renders, hit the 50-update depth cap, and crashed with minified error #185. Observed as "Application error: a client-side exception" on every SaaS tenant once a session cookie resolved. Caught in dev mode where the build emits the clear warning: The result of getSnapshot should be cached to avoid an infinite loop at ContextMenu (src/components/ContextMenu.tsx:26:34) Fix: select the stable nodes array once, derive children via useMemo outside the store subscription. Same output, no new reference per render. Manually verified: dev bundle served through a cloudflared tunnel to a live tenant, ContextMenu component mounts cleanly, remaining console errors are all unrelated (localhost API 401s from the dev server pointing at its own origin). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 21:47:32 -07:00
molecule-ai[bot]	64ccf8e179	fix: CWE-78 rm scope, go vet failures, delegation idempotency * refactor: split 4 oversized handler files into focused sub-files - org.go (1099 lines) → org.go + org_import.go + org_helpers.go - mcp.go (1001 lines) → mcp.go + mcp_tools.go - workspace.go (934 lines) → workspace.go + workspace_crud.go - a2a_proxy.go (825 lines) → a2a_proxy.go + a2a_proxy_helpers.go No functional changes — same package, same exports, same tests. All files stay under 635 lines. Note: isSafeURL and isPrivateOrMetadataIP are duplicated between mcp_tools.go and a2a_proxy_helpers.go — this is a pre-existing issue from the original mcp.go and a2a_proxy.go, not introduced by this split. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(runtime+scheduler): increment/decrement active_tasks counter (refs #1386) * docs(tutorials): add Self-Hosted AI Agents guide — Docker, Fly Machines, bare metal * docs: add Remote Agents feature + Phase 30 blog links to docs index * docs(marketing): update Phase 30 brief — Action 5 complete, docs/index.md update noted * docs(api-ref): add workspace file copy API reference (#1281) Documents TemplatesHandler.copyFilesToContainer (container_files.go): - Endpoint overview: PUT /workspaces/:id/files/path - Parameter descriptions for all four function parameters - CWE-22 path traversal protection (PRs #1267/1270/1271) - Defense-in-depth: validateRelPath at handler + archive boundary - Full error code table (400/404/500) - curl example with success and path-traversal rejection cases Also covers: writeViaEphemeral routing, findContainer fallback, allowed roots allow-list, and related links to platform-api.md. Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> fix(security): CWE-78/CWE-22 — block shell injection in deleteViaEphemeral (#1310) ## Summary Issue #1273: deleteViaEphemeral interpolated filePath directly into rm command, enabling both shell injection (CWE-78) and path traversal (CWE-22) attacks. ## Changes 1. Added validateRelPath(filePath) guard before constructing the rm command. validateRelPath blocks absolute paths and ".." traversal sequences. 2. Changed Cmd from "/configs/"+filePath (string interpolation) to []string{"rm", "-rf", "/configs", filePath} (exec form). This eliminates shell injection entirely — filePath is a plain argument, never interpreted as shell code. ## Security properties - validateRelPath: blocks "../" and absolute paths before they reach Docker - Exec form: filePath cannot inject shell metacharacters even if validation is somehow bypassed - "/configs" as separate arg: rm has exactly two arguments, no room for injected args Closes #1273. Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> * fix(security): backport SSRF defence (CWE-918) to main — isSafeURL in a2a_proxy.go (#1292) (#1302) * fix(security): backport SSRF defence (CWE-918) to main — isSafeURL in mcp.go and a2a_proxy.go Issue #1042: 3 CodeQL SSRF findings across mcp.go and a2a_proxy.go. staging already ships the fix (PRs #1147, #1154 → merged); main did not include it. - mcp.go: add isSafeURL() + isPrivateOrMetadataIP() helpers; validate agentURL before outbound calls in mcpCallTool (line ~529) and toolDelegateTaskAsync (line ~607) - a2a_proxy.go: add identical isSafeURL() + isPrivateOrMetadataIP() helpers; call isSafeURL() before dispatchA2A in resolveAgentURL() (blocks finding #1 at line 462) - mcp_test.go: 19 new tests covering all blocked URL patterns: file://, ftp://, 127.0.0.1, ::1, 169.254.169.254, 10.x.x.x, 172.16.x.x, 192.168.x.x, empty hostname, invalid URL, isPrivateOrMetadataIP across all private/CGNAT/metadata ranges 1. URL scheme enforcement — http/https only 2. IP literal blocking — loopback, link-local, RFC-1918, CGNAT, doc/test ranges 3. DNS hostname resolution — blocks internal hostnames resolving to private IPs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci-blocker): remove duplicate isSafeURL/isPrivateOrMetadataIP from mcp.go Issue #1292: PR #1274 duplicated isSafeURL + isPrivateOrMetadataIP in mcp.go — both functions already exist on main at lines 829 and 876. Kept the mcp.go definitions (the originals) and removed the 70-line duplicate appended at end of file. a2a_proxy.go functions are unchanged — they serve the same purpose via a separate code path. * fix: remove orphaned commit-text lines from a2a_proxy.go Three lines from the PR/commit title were accidentally baked into the file during the rebase from #1274 to #1302, causing a Go syntax error (a bare string literal at statement level followed by dangling braces). Deletion restores: } return agentURL, nil } Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app> * fix(canvas/test): patch test regressions from PR #1243 + proximity hitbox fix (#1313) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. ContextMenu.keyboard.test.tsx — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. orgs-page.test.tsx — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct (#1324) (#1327) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. ContextMenu.keyboard.test.tsx — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. orgs-page.test.tsx — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct Fixes #1324 — TypeScript strict mode flags budget.budget_used as possibly undefined in the progressPct ternary, even though the outer condition checks budget_limit > 0. Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0% when the backend returns a partial shape (provisioning-stuck workspaces). Also adds a test covering the undefined-budget_used case with the progress bar aria-valuenow and fill width both at 0%. Closes #1324. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct (issue #1324) (#1329) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. ContextMenu.keyboard.test.tsx — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. orgs-page.test.tsx — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct Fixes #1324 — TypeScript strict mode flags budget.budget_used as possibly undefined in the progressPct ternary, even though the outer condition checks budget_limit > 0. Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0% when the backend returns a partial shape (provisioning-stuck workspaces). Also adds a test covering the undefined-budget_used case with the progress bar aria-valuenow and fill width both at 0%. Closes #1324. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(platform): unblock SaaS workspace registration end-to-end Every workspace in the cross-EC2 SaaS provisioning shape was failing registration, heartbeat, or A2A routing. Four distinct blockers sat between "EC2 is up" and "agent responds"; three are platform-side and fixed here (the fourth is in the CP user-data, separate PR). 1. SSRF validator blocked RFC-1918 (registry.go + mcp.go) validateAgentURL and isPrivateOrMetadataIP rejected 172.16.0.0/12, which contains the AWS default VPC range (172.31.x.x) that every sibling workspace EC2 registers from. Registration returned 400 and the 10-min provision sweep flipped status to failed. RFC-1918 + IPv6 ULA are now gated behind saasMode(); link-local (169.254/16), loopback, IPv6 metadata (fe80::/10, ::1), and TEST-NET stay blocked unconditionally in both modes. saasMode() resolution order: 1. MOLECULE_DEPLOY_MODE=saas\|self-hosted (explicit operator flag) 2. MOLECULE_ORG_ID presence (legacy implicit signal, kept for back-compat so existing deployments don't need a config change) isPrivateOrMetadataIP now actually checks IPv6 — previously it returned false on any non-IPv4 input, which would let a registered [::1] or [fe80::...] URL bypass the SSRF check entirely. 2. Orphan auth-token minting (workspace_provision.go) issueAndInjectToken mints a token and stuffs it into cfg.ConfigFiles[".auth_token"]. The Docker provisioner writes that file into the /configs volume — the CP provisioner ignores it (only cfg.EnvVars crosses the wire). Result: live token in DB, no plaintext on disk, RegistryHandler.requireWorkspaceToken 401s every /registry/register attempt because the workspace is no longer in the "no live token → bootstrap-allowed" state. Now no-ops in SaaS mode; the register handler already mints on first successful register and returns the plaintext in the response body for the runtime to persist locally. Also removes the redundant wsauth.IssueToken call at the bottom of provisionWorkspaceCP, which created the same orphan-token pattern a second time. 3. Compaction artefacts (bundle/importer.go, handlers/org_tokens.go, scheduler.go, workspace_provision.go) Four pre-existing compile errors on main from an earlier session's code truncation: missing tuple destructuring on ExecContext / redactSecrets / orgTokenActor, missing close-brace in Scheduler.fireSchedule's panic recovery. All one-line mechanical fixes; without them the binary would not build. Tests ----- ssrf_test.go adds: * TestSaasMode — covers the env resolution ladder (explicit flag wins over legacy signal, case-insensitive, whitespace tolerant) * TestIsPrivateOrMetadataIP_SaaSMode — asserts RFC-1918 + IPv6 ULA flip to allowed, metadata/loopback/TEST-NET still blocked * TestIsPrivateOrMetadataIP_IPv6 — regression guard for the old "returns false for all IPv6" behaviour Follow-up issue for CP-sourced workspace_id attestation will be filed separately — closes the residual intra-VPC SSRF + token-race windows the SaaS-mode relaxation introduces. Verified end-to-end today on workspace 6565a2e0 (hermes runtime, OpenAI provider) — agent returned "PONG" in 1.4s after register → heartbeat → A2A proxy → runtime. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(runtime+scheduler): increment/decrement active_tasks + max_concurrent (#1408) Runtime (shared_runtime.py): - set_current_task now increments active_tasks on task start, decrements on completion (was binary 0/1) - Counter never goes below 0 (max(0, n-1)) - Pushes heartbeat immediately on BOTH increment and decrement (#1372) Scheduler (scheduler.go): - Reads max_concurrent_tasks from DB (default 1, backward compatible) - Skips cron only when active_tasks >= max_concurrent_tasks (was > 0) - Leaders can be configured with max_concurrent_tasks > 1 to accept A2A delegations while a cron runs Platform: - Added max_concurrent_tasks column to workspaces (migration 037) - Workspace model + list/get queries include the new field - API exposes max_concurrent_tasks in workspace JSON Config.yaml support (future): runtime_config.max_concurrent_tasks Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(review): address 3 critical issues from code review 1. BLOCKER: executor_helpers.py now uses increment/decrement too (was still binary 0/1, stomping the counter for CLI + SDK executors) 2. BUG: asymmetric getattr defaults fixed — both paths use default 0 (was 0 on increment, 1 on decrement) 3. UX: current_task preserved when active_tasks > 0 on decrement (was clearing task description even when other tasks still running) 4. Scheduler polling loop re-reads max_concurrent_tasks on each poll (was using stale value from initial query) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Hongming Wang <hongmingwangrabbit@gmail.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com> * docs: workspace files API reference, skill catalog, and links * docs: fix secrets endpoint path across docs The workspace secrets endpoint is `/workspaces/:id/secrets`, not `/secrets/values`. This was wrong in quickstart.md (Path 2: Remote Agent) and workspace-runtime.md (registration flow example and comparison table). The external-agent-registration guide already had the correct path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: fix broken blog cross-link in skills-vs-bundled-tools post Link path had an extra `/docs/` segment: `/docs/blog/...` instead of `/blog/...`. Nextra resolves blog posts directly under `/blog/<slug>`, not under `/docs/blog/`. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: add skill-catalog.md guide Linked from the skills-vs-bundled-tools blog post as a reference for TTS/image-generation/web-search skills. The blog promises "install directly via the CLI" with a skill catalog — this page fills that promise by documenting available skill types, install commands, version management, custom skill authoring, and removal. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(marketing): update Phase 30 brief — Action 5 complete, docs/index.md update noted * docs(api-ref): add workspace file copy API reference Documents TemplatesHandler.copyFilesToContainer (container_files.go): - Endpoint overview: PUT /workspaces/:id/files/path - Parameter descriptions for all four function parameters - CWE-22 path traversal protection (PRs #1267/1270/1271) - Defense-in-depth: validateRelPath at handler + archive boundary - Full error code table (400/404/500) - curl example with success and path-traversal rejection cases Also covers: writeViaEphemeral routing, findContainer fallback, allowed roots allow-list, and related links to platform-api.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> fix(handlers): add saasMode() gating to isPrivateOrMetadataIP in a2a_proxy_helpers.go Issue #1421 / #1401: PR #1363 (handler split) moved isPrivateOrMetadataIP into a2a_proxy_helpers.go but kept the OLD pre-SaaS version — it unconditionally blocks RFC-1918 addresses, regressing the fix in commits `1125a02` / `cf10733`. The A2A proxy path now has the same SaaS-gated logic as registry.go: - Cloud metadata (169.254/16, fe80::/10, ::1) always blocked in both modes - RFC-1918 (10/8, 172.16/12, 192.168/16) + IPv6 ULA (fc00::/7) blocked in self-hosted, allowed in SaaS cross-EC2 mode - IPv6 addresses now properly checked (previous version returned false for all) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(marketing): Discord adapter Day 2 Reddit + HN community copy * fix(tests): supply events.Broadcaster pointer to captureBroadcaster Cannot use captureBroadcaster as events.Broadcaster when the struct embeds events.Broadcaster as a value — must initialize as a named field. Fixes go vet error in workspace_provision_test.go: cannot use broadcaster (captureBroadcaster) as events.Broadcaster value Merge pull request #1429 from fix/canvas-tooltip-clear-timer Without this, a 400ms setTimeout from onFocus/onMouseEnter that fires after onBlur will re-show a tooltip the user just dismissed. The setShow(false) in onBlur closes the tooltip immediately but leaves the timer pending — Tab-blur followed by timer-fire would re-show it. Fix: add clearTimeout(timerRef.current) at the top of onBlur, mirroring the pattern already used in onMouseLeave and onFocus. Refs: PR #1367 (a11y keyboard support — this was a pre-existing gap) Co-authored-by: Molecule AI App-FE <app-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): add missing children:[] to setPendingDelete expectation (#1426) PR #1252 (cascade-delete UX) updated setPendingDelete to pass a children array for cascade-warning rendering. The keyboard-a11y test assertion was not updated to match. Test: clicking 'Delete' hoists state to the store and closes the menu Co-authored-by: Molecule AI Core-QA <core-qa@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): add children:[] to setPendingDelete + \' entity fix (closes #1380) (#1427) * ci: retry — trigger fresh runner allocation * fix(canvas/test): add children:[] to setPendingDelete assertion setPendingDelete now includes children:[] (PR #1383 extended the pendingDelete type). The keyboard accessibility test at line 225 used exact object matching which omitted the new field, causing a failure after staging merged #1383. Issue: #1380 * fix(canvas): replace ' HTML entity with straight apostrophe JSX does not entity-decode ' — it renders the literal text "'" instead of "'". Found at line 157 (payment confirmed) and line 321 (empty org list). Replaced with a straight apostrophe, which JSX handles correctly. Ref: issue #1375 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: DevOps Engineer <devops@molecule.ai> Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * Merge pull request #1430 from fix/1421-saas-ssrf-helpers Issue #1421 / #1401: PR #1363 (handler split) moved isPrivateOrMetadataIP into a2a_proxy_helpers.go but kept the OLD pre-SaaS version — it unconditionally blocks RFC-1918 addresses, regressing the fix in commits `1125a02` / `cf10733`. The A2A proxy path now has the same SaaS-gated logic as registry.go: - Cloud metadata (169.254/16, fe80::/10, ::1) always blocked in both modes - RFC-1918 (10/8, 172.16/12, 192.168/16) + IPv6 ULA (fc00::/7) blocked in self-hosted, allowed in SaaS cross-EC2 mode - IPv6 addresses now properly checked (previous version returned false for all) Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(P0): CWE-22 path traversal in copyFilesToContainer + ContextMenu test Issue #1434 — CWE-22 Path Traversal Regression: PR #1280 (`dc218212`) correctly used cleaned path in tar header. PR #1363 (`e9615af`) regressed to using uncleaned `name`. Fix: use `clean` in filepath.Join AND add defence-in-depth escape check. Issue #1422 — ContextMenu Test Regression: PR #1340 expanded pendingDelete store type to include `children:[]`. Test assertion missing the field — add `children:[]` to match. Note: ssrf.go created (shared isSafeURL/isPrivateOrMetadataIP) to prepare for the handler-split refactor fix — current branch has no build error, but the shared file will prevent regression when PR #1363 is merged. isSafeURL/isPrivateOrMetadataIP retained in both files for now to avoid breaking callers while the split is finalized. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: resolve 3 go vet failures + add idempotency_key to delegate_task_async - workspace_provision_test.go: add missing mock := setupTestDB(t) to TestSeedInitialMemories_Truncation — mock was referenced but never declared, causing "undefined: mock" vet error - orgtoken/tokens_test.go: discard unused orgID return value with _ in Validate call — "declared and not used" vet error - a2a_tools.py: delegate_task_async now sends idempotency_key (SHA-256 of workspace_id + task) to POST /workspaces/:id/delegate, fixing duplicate task execution when an agent restarts mid-delegation (#1456) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: airenostars <airenostars@gmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Hongming Wang <hongmingwangrabbit@gmail.com> Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com> Co-authored-by: Molecule AI Community Manager <community-manager@agents.moleculesai.app> Co-authored-by: Molecule AI App-FE <app-fe@agents.moleculesai.app> Co-authored-by: Molecule AI Core-QA <core-qa@agents.moleculesai.app> Co-authored-by: DevOps Engineer <devops@molecule.ai> Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app> Co-authored-by: Molecule AI Dev Lead <dev-lead@agents.moleculesai.app>	2026-04-21 18:22:30 +00:00
molecule-ai[bot]	38e9eba59a	fix(P0): CWE-22 path traversal in copyFilesToContainer + ContextMenu test Issue #1434 — CWE-22 Path Traversal Regression: PR #1280 (`dc218212`) correctly used cleaned path in tar header. PR #1363 (`e9615af`) regressed to using uncleaned `name`. Fix: use `clean` in filepath.Join AND add defence-in-depth escape check. Issue #1422 — ContextMenu Test Regression: PR #1340 expanded pendingDelete store type to include `children:[]`. Test assertion missing the field — add `children:[]` to match. Note: ssrf.go created (shared isSafeURL/isPrivateOrMetadataIP) to prepare for the handler-split refactor fix — current branch has no build error, but the shared file will prevent regression when PR #1363 is merged. isSafeURL/isPrivateOrMetadataIP retained in both files for now to avoid breaking callers while the split is finalized. Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 16:56:47 +00:00
Hongming Wang	a14cf863d1	Merge pull request #1445 from Molecule-AI/fix/tenant-dockerfile-uid-conflict fix(tenant-image): remove node user so canvas uid 1000 can be created	2026-04-21 08:58:09 -07:00
Molecule AI SDK Lead	e9615af169	Merge origin/main into staging: resolve conflicts with main's test + security fixes Conflicts resolved (took main's versions): - canvas/src/app/__tests__/orgs-page.test.tsx (act() wrappers, PR #1350) - canvas/src/components/Canvas.tsx (100px proximity threshold, PR #1357) - canvas/src/components/__tests__/ContextMenu.keyboard.test.tsx (hasChildren fix) - workspace-server/internal/handlers/container_files.go (CWE-22/CWE-78 fixes, PRs #1281/#1310) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 12:25:42 +00:00
Hongming Wang	6bd674e412	fix(e2e): CP DELETE /cp/admin/tenants body uses 'confirm', not 'confirm_token' Verified against live staging: the admin endpoint returns 400 'confirm field must equal the URL slug' when the body key is 'confirm_token'. Every workflow's safety-net teardown step + the main harness + the Playwright teardown all had the wrong key. Fixed all six call sites. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 04:50:28 -07:00
Hongming Wang	d7193dfa34	feat(e2e): pivot to admin-bearer-only auth + add sanity self-check workflow Reduces required secret surface from 2 (session cookie + admin token) to 1 (admin token). Pairs with molecule-controlplane#202 which adds: - POST /cp/admin/orgs — server-to-server org creation - GET /cp/admin/orgs/:slug/admin-token — per-tenant bearer fetch With those endpoints live, CI doesn't need to scrape a browser WorkOS session cookie. CP admin bearer (Railway CP_ADMIN_API_TOKEN) drives provision + tenant-token retrieval + teardown through a single credential. Changes ------- test_staging_full_saas.sh: admin bearer for provision/teardown, fetched per-tenant token drives all tenant API calls. Added E2E_INTENTIONAL_FAILURE=1 toggle that poisons the tenant token after provisioning so the teardown path gets exercised when the happy-path isn't. canvas/e2e/staging-setup.ts: same pivot; exports STAGING_TENANT_TOKEN instead of STAGING_SESSION_COOKIE. canvas/e2e/staging-tabs.spec.ts: context.setExtraHTTPHeaders with Authorization: Bearer on every page request, no cookie handling. All three workflows (e2e-staging-saas, canary-staging, e2e-staging-canvas): drop MOLECULE_STAGING_SESSION_COOKIE env + verification step. One secret to set. NEW e2e-staging-sanity.yml: weekly Mon 06:00 UTC. Runs the harness with E2E_INTENTIONAL_FAILURE=1 and inverts the pass condition — rc=1 is green, rc=0 (unexpected success) or rc=4 (leak) open a priority-high issue labelled e2e-safety-net. This is the answer to 'how do we know the teardown path still works when nothing else has failed recently.' STAGING_SAAS_E2E.md refreshed: single-secret setup, sanity workflow documented, canvas workflow added to the coverage matrix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 04:34:11 -07:00
Hongming Wang	f4700858ac	feat(e2e): canary + canvas Playwright workflows; delegation mechanics Three additions on top of `187a9bf`: 1. Canary (.github/workflows/canary-staging.yml) 30-min cron that runs the full-SaaS harness in E2E_MODE=canary: one hermes workspace + one A2A PONG + teardown. ~8-min wall clock vs ~20-min for the full run. Alerting is self-contained: opens a single 'Canary failing' issue on first failure, comments on subsequent failures (no issue spam), auto-closes the issue on the next green run. Labels: canary-staging, bug. Safety-net teardown step sweeps e2e-YYYYMMDD-canary-* orgs tagged today so a runner cancel can't leak EC2. 2. Canvas Playwright (canvas/e2e/staging-*.ts + playwright.staging.config.ts + .github/workflows/e2e-staging-canvas.yml) staging-setup.ts provisions a fresh org + hermes workspace (same lifecycle as the bash harness, just in TypeScript). staging-tabs.spec.ts clicks through all 13 workspace-panel tabs (chat, activity, details, skills, terminal, config, schedule, channels, files, memory, traces, events, audit) and asserts each renders without crashing and without 'Failed to load' error toasts. Known SaaS gaps (Files empty, Terminal disconnects, Peers 401) are documented in #1369 and whitelisted so they don't fail the test — the gate is 'no hard crash', not 'no issues'. staging-teardown.ts deletes the org via DELETE /cp/admin/tenants/:slug. playwright.staging.config.ts separates staging from local tests so pnpm test in dev doesn't try to provision against staging. Retries=2 and timeouts are longer; workers=1 because the setup provisions one shared workspace. Workflow uploads HTML report + screenshots on failure for 14 days. 3. Delegation mechanics (tests/e2e/test_staging_full_saas.sh section 10) Parent → child proxy test: POST /workspaces/CHILD/a2a with X-Source-Workspace-Id=PARENT and verify the child responds + child activity log captures PARENT as source. Intentionally LLM-free: the mechanics regression is what matters; prompt-driven delegation correctness belongs in canvas-driven tests. Also reorders teardown step to 11/11 since delegation is 10/11. Mode gating: E2E_MODE=canary -> skips child workspace, HMA memory, peers, activity, delegation (steps 6, 9, 10 no-op). Full-lifecycle still runs every piece. Validated both paths via 'bash -n' syntax check after each edit. Secrets requirement unchanged (same two secrets as `187a9bf`): MOLECULE_STAGING_SESSION_COOKIE, MOLECULE_STAGING_ADMIN_TOKEN. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 04:15:10 -07:00
molecule-ai[bot]	00bd73f8c8	fix(canvas): a11y fixes + budget_used TypeScript guard + orgs-page test fix (#1367 ) * fix(canvas/a11y): mark StatusDot as aria-hidden — decorative element StatusDot is purely decorative; the status is already conveyed via aria-label on parent elements (WorkspaceNode, SidePanel header, etc.). Marking it aria-hidden="true" prevents screen readers from announcing the empty div as "img" with no alt text. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): guard budget_used optional field with ?? 0 in progress calc TypeScript error in CI: 'budget.budget_used' is possibly 'undefined' when used in the progress percentage calculation. The field is optional per BudgetData interface, so ?? 0 is the correct guard. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/a11y): Tooltip keyboard focus support + ARIA role - Add role="tooltip" + unique id so assistive tech can find tooltip content - Add aria-describedby on trigger so screen readers announce tooltip text - Add onFocus/onBlur handlers so keyboard users (Tab navigation) can see tooltips that mouse users see on hover Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): restore advanceTimersByTime pattern in orgs-page error test waitFor() + fake timers (vi.useFakeTimers in beforeEach) cause race conditions: the 5s polling timeout fires before React state updates flush. Restores the established pattern used by all other tests in this file: advanceTimersByTimeAsync(50) + runAllTimersAsync(). Also removes the now-unused waitFor import. Ref: PRs #1360, #1345 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 11:08:24 +00:00
molecule-ai[bot]	bde456a893	feat(canvas/e2e): add Playwright test for context-menu → delete confirm flow (#1344 ) Issue #1138: Add Playwright E2E for context-menu → delete confirm flow. The unit test (ContextMenu.keyboard.test.tsx) only exercises the store setter — it can't catch the portal/race bug from PR #1133 where the portal-rendered ConfirmDialog was closed by the menu's outside-click handler before onConfirm fired. This E2E test covers: - Right-click workspace node → context menu opens - Click Delete → ConfirmDialog appears (not swallowed) - Click Confirm → dialog closes, node disappears, DELETE /workspaces/:id fires - Click Cancel → dialog closes, node remains Requires: platform on :8080, canvas on :3000. Closes #1138. Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app>	2026-04-21 08:11:48 +00:00
molecule-ai[bot]	f2e4f71fee	fix(canvas/test): restore waitFor in orgs-page error test + add getState mock (#1341 ) Issue #1268: orgs-page error state test — replace vi.advanceTimersByTimeAsync(50) with waitFor polling. advanceTimersByTimeAsync fires the timer but does not guarantee React render flush completes before the assertion runs. Issue #1269: ContextMenu keyboard test — add getState: () => mockStore to useCanvasStore mock. PR #1243 changed the delete flow to hoist confirmation to Canvas-level dialog via setPendingDelete, which reads .nodes via useCanvasStore.getState() — the mock was missing getState. Also carries forward the Issue #1124 WORKSPACE_ID fail-fast fix from workspace/ modules (a2a_cli, a2a_client, coordinator, consolidation, molecule_ai_status) — RuntimeError if WORKSPACE_ID is unset/empty. Co-authored-by: Molecule AI Core Platform Lead <core-platform-lead@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 07:52:15 +00:00
molecule-ai[bot]	b21b3d163f	fix(canvas): add ?? 0 guard for optional budget_used in progressPct (#1324 ) (#1327 ) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. ContextMenu.keyboard.test.tsx — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. orgs-page.test.tsx — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct Fixes #1324 — TypeScript strict mode flags budget.budget_used as possibly undefined in the progressPct ternary, even though the outer condition checks budget_limit > 0. Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0% when the backend returns a partial shape (provisioning-stuck workspaces). Also adds a test covering the undefined-budget_used case with the progress bar aria-valuenow and fill width both at 0%. Closes #1324. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 07:21:27 +00:00
molecule-ai[bot]	45715aa8a5	fix(canvas/test): patch test regressions from PR #1243 + proximity hitbox fix (#1313 ) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. ContextMenu.keyboard.test.tsx — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. orgs-page.test.tsx — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 07:06:57 +00:00
molecule-ai[bot]	c0d5e528a4	fix(canvas): cascade-delete UX — require checkbox before Delete All (#1314 ) * fix(canvas/test): restore test regressions from PR #1243 PR #1243 introduced two regressions in the canvas vitest suite: 1. ContextMenu.keyboard.test.tsx: the setPendingDelete call now passes `{hasChildren, id, name}` (not just `{id, name}`). Updated the keyboard-a11y test assertion to match the new store shape. 2. orgs-page.test.tsx: mockFetch.mockResolvedValueOnce() returned a plain object that didn't match the two-argument (url, options) call signature used by the component's fetch wrapper. Switched to mockImplementationOnce returning a rejected Promise — matching real fetch's rejection contract — and added runAllTimersAsync after advanceTimersByTimeAsync(50) to flush React state updates. 54 test files · 813 tests · all passing Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): replace bounding-box intersection with distance threshold for nest detection ReactFlow's getIntersectingNodes uses bounding-box overlap detection, which fires the drag-over state whenever any part of two nodes' position rectangles overlap — even when the dragged node is far from the target. This made the "Nest Workspace" dialog appear from large distances. Fix: scan all nodes on each drag tick and set dragOverNodeId to the closest node within NEST_PROXIMITY_THRESHOLD (150 px, center-to-center). This matches the intuitive behavior: nest only when the node is actually dropped near another. Constants: - NEST_PROXIMITY_THRESHOLD = 150px (~60% of a collapsed node's width) - DEFAULT_NODE_WIDTH = 245px (mid-range of min/max node widths) - DEFAULT_NODE_HEIGHT = 110px Also removed the unused getIntersectingNodes import (was causing duplicate identifier error when both onNodeDrag and the zoom handler called useReactFlow in the same component scope). Closes #1052. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): cascade-delete UX — show child count and require checkbox before Delete All Issue #1137: with ?confirm=true always sent, a single confirmation silently cascades — a team lead with 20 children gets nuked on one click. Changes: - store/canvas.ts: pendingDelete type now includes children: {id, name}[] - ContextMenu.tsx: passes child list to setPendingDelete on Delete click - DeleteCascadeConfirmDialog.tsx: new component — shows child names, a cascade warning, and requires the operator to tick a checkbox before Delete All activates. Disabled by default; only enables after checkbox. - Canvas.tsx: conditionally renders DeleteCascadeConfirmDialog for hasChildren workspaces, or plain ConfirmDialog for leaf workspaces. confirmDelete requires cascadeConfirmChecked=true when hasChildren. - ContextMenu.keyboard.test.tsx: updated setPendingDelete assertion to include children:[] (no children in the test fixture). 813 tests pass. Closes #1137. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 07:06:45 +00:00
molecule-ai[bot]	04c3bc6eb1	fix(canvas): cascade-delete UX — warn before deleting workspace with children (PR #1252 ) - Store: pendingDelete now carries `hasChildren: boolean` (computed from nodes.some(parentId === nodeId)) - ContextMenu: passes hasChildren into setPendingDelete - Canvas: dialog title changes to "Delete Workspace and Children" with ⚠️ message when hasChildren; confirms with "Delete All" Refs: #1137 Co-authored-by: Molecule AI Fullstack (floater) <fullstack-floater@agents.moleculesai.app>	2026-04-21 03:25:12 +00:00
molecule-ai[bot]	221d8b2384	fix(canvas): guard undefined lastErrorRate and period dates in metrics (PR #1250 ) - DetailsTab: use `(data.lastErrorRate ?? 0)` instead of bare multiply to prevent NaN% when the field is absent on pre-provisioning workspaces. - WorkspaceUsage: make formatPeriod accept optional start/end strings; return "—" for undefined so the usage period shows blank rather than "Invalid Date" for provisioning/partial workspaces. Refs: #1139 Co-authored-by: Molecule AI Fullstack (floater) <fullstack-floater@agents.moleculesai.app>	2026-04-21 03:22:17 +00:00
molecule-ai[bot]	883cb7ebc3	fix(canvas): eliminate flaky timer state between orgs-page tests (#1207 ) (#1243 ) Root cause: tests used try/finally { vi.useRealTimers() / vi.useFakeTimers() } back-and-forth. When any test's finally-block called vi.useFakeTimers(), subsequent tests inherited fake timer state causing 50ms real setTimeouts to not fire and mockFetch to accumulate calls across test boundaries. Fix: consolidate timer management to beforeEach/afterEach hooks. - beforeEach: vi.useFakeTimers() — all tests start from known fake state - afterEach: cleanup() + vi.useRealTimers() — restore real timers for next test - Individual tests: use vi.advanceTimersByTimeAsync(50) instead of real setTimeout Also removed duplicate afterEach(cleanup()) and unused waitFor import. Closes #1207. Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 03:07:28 +00:00
molecule-ai[bot]	014295d57f	fix(issue-1207): eliminate orgs-page test flakiness (#1235 ) * fix(auth): F1094 — requireCallerOwnsOrg reads org_id not created_by (#1200) Root cause: requireCallerOwnsOrg (org_plugin_allowlist.go:116) was reading org_api_tokens.created_by to determine caller's org workspace ID. But created_by is a provenance label ("session", "admin-token", "org-token:<prefix>") — never a UUID. The equality check callerOrg != targetOrgID always failed → every org-token caller got 403 on /orgs/:id/plugins/allowlist routes. Fix: - Migration 036: adds org_id UUID column (nullable) to org_api_tokens with index. Existing pre-migration tokens get org_id=NULL → deny by default (safer than cross-org access). - orgtoken.Issue: takes new orgID param; stores in org_id column. - orgtoken.OrgIDByTokenID: new helper reads org_id for a token ID. Returns ("", nil) for NULL/unanchored tokens. - requireCallerOwnsOrg: now calls OrgIDByTokenID instead of reading created_by. Pre-migration tokens with org_id=NULL get callerOrg="" → denied (safer). - orgTokenActor (org_tokens.go): returns (createdBy, orgID) pair. Token minted via another org token gets its org_id set at mint time. Session/ADMIN_TOKEN callers get orgID="". - orgtoken.Token struct: adds OrgID field for list display. - orgtoken.List: selects org_id alongside other columns. - Updated existing tests for new Issue signature. - Added 10 regression tests covering: happy path, unanchored denial, cross-org denial, session bypass, DB error denial. 🤖 Generated with [Claude Code](https://claude.ai/claude-code) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(security): replace err.Error() leaks with prod-safe messages (#1206) - workspace_provision.go: provisionWorkspace, provisionWorkspaceCP — replaced 7 err.Error() calls with "provisioning failed" in both Broadcast payloads and last_sample_error DB column. Full error preserved in server-side log.Printf. - plugins_install_pipeline.go: resolveAndStage — replaced 5 err.Error() calls with generic messages: "invalid plugin source" "plugin source not supported" "invalid plugin name" "staged plugin exceeds size limit" "plugin manifest integrity check failed" Risk mitigated: DB errors (pq: connection refused, pq: deadlock), OS errors, and internal paths no longer leak in HTTP JSON responses or WebSocket broadcasts. Added regression tests (workspace_provision_test.go): - TestProvisionWorkspace_NoInternalErrorsInBroadcast - TestProvisionWorkspaceCP_NoInternalErrorsInBroadcast - TestResolveAndStage_NoInternalErrorsInHTTPErr Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(F1089): log panic-recovery UPDATE errors in scheduler The panic defer blocks in tick() and fireSchedule() now capture and log errors from the db.DB.ExecContext call that advances next_run_at after a panic. Previously, a DB failure during panic recovery was silent — the log line for the panic itself appeared but any subsequent UPDATE failure was invisible, risking unnoticed scheduler drift. context.Background() was already used (F1089 comment in place); this commit adds the missing error capture + log.Printf on exec failure. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(issue-1207): eliminate orgs-page test flakiness Three root causes addressed: 1. Duplicate afterEach blocks (lines 97-103) — two identical afterEach(() => { cleanup(); }) blocks collapsed to one. 2. Fake-timer isolation gap — if a polling test failed before its finally-block ran, vi.useFakeTimers() persisted globally. The next non-polling test's setTimeout(50) then hung indefinitely (fake timers don't advance without vi.advanceTimersByTime), causing waitFor/async timeouts. Fixed by calling vi.useRealTimers() unconditionally in beforeEach (guaranteed clean slate) and afterEach (even when a test fails before its own finally). 3. mockFetch.callHistory now cleared via mockReset() in beforeEach, preventing "expected 2 calls but got N" failures from carry-over between polling tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Dev Lead <dev-lead@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 02:47:25 +00:00
molecule-ai[bot]	4e39201d76	fix(canvas): show toast when clipboard API unavailable in ConsoleModal (#1199 ) (#1231 ) Use explicit navigator.clipboard check instead of optional chaining so the no-op case is handled explicitly. When clipboard API is unavailable (non-HTTPS context) show a toast: "Copy requires HTTPS — please select and copy manually". Production is always HTTPS so this only affects local dev with http:// canvas. Closes #1199. Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 02:45:29 +00:00
molecule-ai[bot]	fcd3a6eaf0	fix(test): align ssrf_test.go localhost test cases with isSafeURL behaviour (#1192 ) * feat(canvas): rewrite MemoryInspectorPanel to match backend API Issue #909 (chunk 3 of #576). The existing MemoryInspectorPanel used the wrong API endpoint (/memory instead of /memories) and wrong field names (key/value/version instead of id/content/scope/namespace/created_at). It also lacked LOCAL/TEAM/GLOBAL scope tabs and a namespace filter. Changes: - Fix endpoint: GET /workspaces/:id/memories with ?scope= query param - Fix MemoryEntry type to match actual API: id, content, scope, namespace, created_at, similarity_score - Add LOCAL/TEAM/GLOBAL scope tabs - Add namespace filter input - Remove Edit functionality (no update endpoint in backend) - Delete uses DELETE /workspaces/:id/memories/:id (by id, not key) - Full rewrite of 27 tests to match new API and UI structure - Uses ConfirmDialog (not native dialogs) for delete confirmation - All dark zinc theme (no light colors) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: tighten types + improve provision-timeout message (#1135, #1136) #1135 — TypeScript: make BudgetData.budget_used and WorkspaceMetrics fields optional to match actual partial-response shapes from provisioning- stuck workspaces. Runtime already guarded with ?? 0. #1136 — provisiontimeout.go: replace misleading "check required env vars" hint (preflight catches that case upfront) with accurate message about container starting but failing to call /registry/register. 🤖 Generated with [Claude Code](https://claude.com/claude-code) * fix(test): align ssrf_test.go localhost test cases with isSafeURL behaviour isSafeURL blocks 127.0.0.1 via ip.IsLoopback() even in dev environments. The test cases `wantErr: false` for localhost were incorrect — the test would fail when go test runs. Fix by changing wantErr to true for both localhost test cases. Rationale: loopback blocking at this layer is intentional. Access control is enforced by WorkspaceAuth + CanCommunicate at the A2A routing layer, not by the URL validation. Opening this would widen the SSRF attack surface without adding real dev flexibility. Closes: ssrf_test.go inconsistency reported 2026-04-21 Co-Authored-By: Claude Sonnet 4.7 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 02:08:45 +00:00
Hongming Wang	c033f3b051	chore(canvas): review nits on ConsoleModal + Peers empty state Post-review cleanup for the #1178 / #1189 bootstrap-watcher flow: - ConsoleModal status-code matching uses \b regex anchors instead of raw substrings. Before, any error message containing "501" inside a longer digit run ("15012") would false-match into the self-hosted branch. Unlikely in practice but cheap to tighten. - Peers empty-state copy now explains WHY the list is empty on offline / failed / provisioning workspaces instead of rendering the same "No reachable peers" text used for healthy workspaces with zero siblings. Online workspaces unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 18:02:22 -07:00
Hongming Wang	3b339d13e6	fix(canvas): skip /registry/:id/peers fetch when workspace not online The peers endpoint requires a workspace-scoped bearer token (see validateDiscoveryCaller in handlers/discovery.go — designed for agent-to-agent calls). The canvas session doesn't hold that token, so every Details-tab open for a provisioning / failed / offline workspace fired a 401 that cluttered devtools and lit up the error banner even though the real UX here is "no peers — the workspace hasn't booted." Gate the fetch on status ∈ {online, degraded} and render an empty Peers list for everything else. Follow-up: give the canvas a way to see peers for any workspace (admin session should be enough). Tracked separately — this fix just quiets the noise on the common case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 17:38:25 -07:00
Hongming Wang	48900e34b0	feat(canvas): show last_sample_error + EC2 console output on failed workspaces Part 3 of 3 for the "fail fast + comprehensive logs" UX. Platform PR #1168 and controlplane #181 ship the server-side; this PR surfaces the data in the canvas. Two changes: 1. DetailsTab renders `last_sample_error` in a dedicated Error section when the workspace is failed (or degraded with an error). Before, the only trace of why a workspace failed was a generic banner — users had to click "View Logs", which opened the terminal tab (the post-boot log, empty on a runtime crash). Now the actual Python traceback is inline. A "View console output" button in the same section opens the full serial console in a modal. 2. New ConsoleModal component. Fetches GET /workspaces/:id/console (platform → CP → ec2:GetConsoleOutput). Portal-rendered above the canvas with Copy / Close / Esc handlers. Renders a friendly message on 501 (self-hosted deploys without CP) and 404 (instance terminated). 3. ProvisioningTimeout's "View Logs" button now opens the console modal instead of the (usually empty) terminal tab — when a workspace is stuck in provisioning, the cloud-init + user-data trace is what the user actually needs. Tests cover the closed-state no-fetch, happy-path fetch, 501/404 messaging, and Close/Escape wiring. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 17:22:15 -07:00
molecule-ai[bot]	b1433ee8e6	Merge pull request #1171 from Molecule-AI/staging chore: fast-forward staging with main review-cleanup commits	2026-04-21 00:16:58 +00:00
molecule-ai[bot]	45cf87c7b8	test(BatchActionBar): add hasFailedBatch success-reset test (#1170 ) CP-QA approved. 34-line test for BatchActionBar retry state reset after successful batch action.	2026-04-21 00:12:37 +00:00
molecule-ai[bot]	45f5b47487	fix(security): add USER directive before ENTRYPOINT in all tenant images (#1155 ) Closes: #177 (CRITICAL — Dockerfile runs as root) Dockerfiles changed: - workspace-server/Dockerfile (platform-only): addgroup/adduser + USER platform - workspace-server/Dockerfile.tenant (combined Go+Canvas): addgroup/adduser + USER canvas + chown canvas:canvas on canvas dir so non-root node process can read it - canvas/Dockerfile (canvas standalone): addgroup/adduser + USER canvas - workspace-server/entrypoint-tenant.sh: update header comment (no longer starts as root; both processes now start non-root) The entrypoint no longer needs a root→non-root handoff since both the Go platform and Canvas node run as non-root by default. The 'canvas' user owns /app and /platform, so volume mounts owned by the host's canvas user work without needing a root init step. Co-authored-by: Molecule AI CP-BE <cp-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-20 23:51:33 +00:00
Molecule AI CP-BE	696bd86322	Merge branch 'fix/canvas-orgs-page-tests' into staging	2026-04-20 23:47:02 +00:00
Molecule AI CP-BE	223584c66d	Merge remote-tracking branch 'origin/staging' into fix/canvas-orgs-page-tests	2026-04-20 23:46:17 +00:00
Molecule AI CP-BE	acf03cd057	fix(security): suppress raw response body from user-facing billing errors billing.ts (startCheckout, openBillingPortal): replace raw res.text() in thrown Error with a safe status-only message. The response body from /cp/billing/* routes can contain Stripe API error detail (invalid key, card decline message, raw Stripe envelope) that should not reach clients. orgs/page.tsx (createOrg): same fix — raw body → safe message. Full body is logged server-side for debugging. Closes: #91 (CWE-209 — Stripe key echoed in error) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-20 23:23:25 +00:00
Molecule AI Core-FE	a339cde8d5	fix(canvas): restore 12 orgs-page tests broken by TermsGate fetch + mock leak Root causes: 1. TermsGate (rendered inside OrgsPage Shell) fetches /cp/auth/terms-status before OrgsPage fetches /cp/orgs, consuming the first mockResponseOnce slot — leaving /cp/orgs with no mock and throwing TypeError. Fix: mock TermsGate as a pass-through component in vi.mock. 2. Non-polling tests used mockFetchSession.mockResolvedValueOnce() which exhausted after one call; React 18 concurrent re-renders call fetchSession() multiple times, causing subsequent calls to return undefined. Fix: use mockResolvedValue() (persistent) for fetchSession. 3. vi.clearAllMocks() in beforeEach kept mockResolvedValueOnce from previous tests from leaking BUT the vi.fn() mock implementation was already reset by mockFetchSession.mockReset() in beforeEach. Tests were passing stale persistent mocks from previous tests. Fix: mockFetchSession.mockReset() in beforeEach + mockResolvedValue in each test. 4. Polling tests used vi.useFakeTimers() without shouldAdvanceTime, which prevented React's useEffect from calling fetch() (0 calls). Fix: use vi.useFakeTimers({ shouldAdvanceTime: true }) + await vi.advanceTimersByTimeAsync() to advance time during await. 5. Unmount test unmounted before effects fired (with shouldAdvanceTime). Fix: flush microtasks with await vi.advanceTimersByTimeAsync(0) before unmount so the effect runs and schedules the poll timer. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-20 23:12:34 +00:00
Hongming Wang	fc3ae5a63a	chore: code-review cleanup on today's shipped PRs Three nits identified during post-merge review of #1119, #1133: 1. ContextMenu.tsx imported `removeNode` from the canvas store but stopped using it when the delete-confirm flow moved to Canvas in #1133. Also removed the now-unused mock entry in the keyboard test so the test inventory matches the real call list. 2. Preflight's YAML parse failure was a silent pass — defensible since the in-container preflight owns the schema, but invisible to ops if a template ships malformed YAML. Log at WARN so the signal surfaces without blocking the provision. 3. formatMissingEnvError rendered its slice via %q, producing `["A" "B"]` which is Go-literal-looking and ugly in a user-facing error. Join with ", " instead. Test updated to assert the new format. No behavioural changes beyond the log line; fixes are review nits, not bug fixes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 16:04:57 -07:00
Hongming Wang	0ddf266e54	fix(canvas): delete workspace dialog race with context menu close Clicking "Delete" in the workspace context menu did nothing for stuck workspaces. The confirm dialog was rendered via portal as a child of ContextMenu. ContextMenu's outside-click handler checks whether the click target is inside its ref — but the portal puts the dialog in document.body, outside the ref. So clicking the dialog's Confirm counted as "outside", closed the menu, unmounted the dialog mid-click, and the onConfirm handler never ran. Hoist the pending-delete state to the canvas store and render the confirm dialog at the Canvas level (same pattern as the existing pendingNest dialog). The dialog now outlives ContextMenu, so the outside-click close is harmless. Close the context menu on the Delete click itself rather than waiting for the dialog to resolve. Add a regression test covering the new flow and add the standard ?confirm=true query param so the backend's child-cascade guard is consulted correctly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 15:50:30 -07:00
Hongming Wang	c3f7447e86	fix: harden stuck-provisioning UX — details crash, preflight, sweeper Workspaces stuck in status='provisioning' previously surfaced in three bad ways: 1. Details tab crashed with `Cannot read properties of undefined (reading 'toLocaleString')`. `BudgetSection` + `WorkspaceUsage` assumed full response shapes but a provisioning-stuck workspace returns partial `{}`. Guard each deep field with `?? 0` and cover the partial-response case with regression tests. 2. Missing required env vars failed silently 15+ minutes later as a cosmetic "Provisioning Timeout" banner. The in-container preflight catches them but by then the container has already crashed without calling /registry/register, so the workspace sat in 'provisioning' forever. Mirror the preflight server-side: parse config.yaml's `runtime_config.required_env` before launch, fail fast with a WORKSPACE_PROVISION_FAILED event naming the missing vars. 3. No backend timeout ever flipped a stuck workspace to 'failed'. Add a registry sweeper (10m default, env-overridable) that detects workspaces stuck past the window, flips them to 'failed', and emits WORKSPACE_PROVISION_TIMEOUT. Race-safe: the UPDATE re-checks the status + age predicate so a concurrent register/restart wins. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 14:51:39 -07:00

1 2 3 4 5

204 Commits