forked from molecule-ai/molecule-core
2df644f528
184 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
e08ea7b5ba |
fix(canvas): require hermes model at create + send to CP (fixes silent Anthropic 401)
Root cause of the hermes 401 "Invalid API key" on SaaS workspaces:
1. CreateWorkspaceDialog never sent `model` in the /workspaces POST
2. Tenant/CP plumbed through a valid (provider, API key) but empty MODEL
3. Workspace install.sh ran with HERMES_DEFAULT_MODEL unset
4. derive-provider.sh saw no slug → PROVIDER="auto"
5. Hermes fell back to its compiled-in default (Anthropic via
OpenAI-compat adapter)
6. User's MINIMAX_API_KEY was present but irrelevant — hermes tried
Anthropic with it → 401
Fix:
- Extend HERMES_PROVIDERS with `defaultModel` + `models` (suggestion
list). Each provider ships with a known-good default so the trap
is physically impossible to hit with the new form.
- Add a required Model input to the Hermes panel, auto-populated
from the provider's defaultModel when the provider changes (only
if the user hasn't typed their own slug yet).
- Datalist surfaces additional model suggestions per provider so
users can pick a different size (e.g. M2.7-highspeed) without
typing the whole slug.
- handleCreate validates hermesModel is non-empty, sends as `model`
in the POST body alongside the secrets block.
- useEffect guard avoids clobbering a user-typed custom slug when
they toggle providers back and forth.
Existing 19 a11y tests still pass (non-SaaS path unchanged, four-tier
picker still renders, arrow-key nav still wraps).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
66de81fbfa
|
Merge pull request #1689 from Molecule-AI/refactor/strip-secret-service-dropdown
refactor(secrets): strip Service dropdown from Add-Key form |
||
|
|
0574e7c1d0 |
feat(canvas): add T4 tier (full-host access); SaaS default T4
Following feedback that T4 — not T3 — is the full-access tier: - Non-SaaS picker now shows all four tiers: T1 Sandboxed, T2 Standard, T3 Privileged, T4 Full Access. Four-column grid. - SaaS picker stays single-option but now locks to T4 (was T3). Every SaaS workspace gets a dedicated EC2 VM, which is unambiguously the "full host" case — T3 (privileged container) was a category mismatch. - Default tier on SaaS is 4 (was 3). CP provisioner already supports tier 4 (t3.large / 80 GB). TIER_CONFIG already has T4's amber color. Tests updated for the four-tier picker: wrap tests now go T4 ↔ T1, and the selection/tabIndex tests cover the fourth button. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
8b1af9708c |
feat(canvas): default tier T3 and hide T1/T2 on SaaS
On SaaS every workspace gets its own EC2 VM — the Docker-sandbox distinction between T1 (sandboxed), T2 (standard Docker), and T3 (full host access) doesn't apply. A SaaS workspace is always a dedicated VM, which is "full access" by construction. Showing T1/T2 in that UI is a category error: users pick a sandbox level that has no effect on the actual EC2 machine they get. Changes: - tenant.ts: export isSaaSTenant() — returns true when canvas is served at <slug>.moleculesai.app (SSR-safe: false on server) - CreateWorkspaceDialog: when isSaaSTenant(), render only the T3 option, default tier=3, grid collapses to a single column. Label gets a " — dedicated VM" hint so the user knows what they're getting. On self-hosted the full T1/T2/T3 picker is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
d956164812 |
refactor(secrets): strip Service dropdown from Add-Key form
The Add-Key form used to open with a required Service dropdown (GitHub / Anthropic / OpenRouter / Other) that gated everything else. The dropdown did no persistent work — the secret store only cares about (key_name, value); the Service label was never saved anywhere. It also suffered registry drift: today we support ~22 hermes-dispatched providers (MiniMax, Gemini, DeepSeek, Kimi, Qwen, NVIDIA, etc.); only 3 had entries. Everyone else landed in "Other" with no downside beyond the mandatory click. Replaces it with: 1. Key-name <datalist> autocomplete sourced from new KEY_NAME_SUGGESTIONS in lib/services.ts — 26 entries covering common infra keys + every hermes-supported provider. 2. inferGroup(keyName) derives classification at render time, matching what the store already does in getGrouped(). No behaviour change for list grouping. 3. Provider docs link renders inline only when inferGroup recognises the name. For 'custom' keys we stay quiet — no false-structure prompt. 4. Test-connection button still available when the inferred group supports it AND the value is format-valid. Same providers as before. SERVICES registry preserved for LIST rendering + test routing. Result: two fields instead of three. One fewer decision. Provider- agnostic by design — new providers work the moment someone types their canonical env var name; no UI code change per provider. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
f6e6a64ba9 |
fix(canvas): forward-port dynamic runtime dropdown from staging (PR #1526)
PR #1526 shipped the /templates registry + canvas dynamic Runtime / Model / Required-Env fields on 2026-04-22 — but merged into the staging branch, not main. The staging→main promotion PR #1496 has been open unmerged for a while with 1172 commits divergence, so prod (which builds from main) still carries the old hardcoded dropdown. Symptom seen on hongmingwang.moleculesai.app today: - New Hermes Agent workspace (template declares runtime: hermes) loads Config tab → Runtime dropdown shows "LangGraph (default)" because there's no <option value="hermes"> in the hardcoded list; it falls back to empty-value silently. - Model field is a plain TextInput with static placeholder "e.g. anthropic:claude-sonnet-4-6" — should be a combobox populated from the selected runtime's models[]. - Required Env Vars is a TagList with static placeholder "e.g. CLAUDE_CODE_OAUTH_TOKEN" — should auto-populate from the selected model's required_env. - Net effect: "Save & Deploy" sends empty model + empty env to the provisioner → workspace instant-fails. This PR cherry-picks the exact three files from PR #1526 (#359dc61 on staging) forward to main, without pulling the other 1171 commits: - canvas/src/components/tabs/ConfigTab.tsx - RuntimeOption interface + FALLBACK_RUNTIME_OPTIONS (hermes, gemini-cli included) - useEffect fetches /templates and populates runtimeOptions dynamically - dropdown renders from runtimeOptions (no hardcoded list) - Model becomes a combobox with datalist of available models per selected runtime - Required Env Vars auto-populates from the selected model's required_env on model change - workspace-server/internal/handlers/templates.go - /templates endpoint returns [{id, name, runtime, models}] with per-template models registry (id, name, required_env) - workspace-server/internal/handlers/templates_test.go - Tests for runtime+models parsing and legacy top-level model fallback The canvas Runtime dropdown now resolves "hermes" correctly; Model dropdown shows the models[] from the hermes template; Env auto-populates with HERMES_API_KEY (or whichever model selected). Verified locally: - workspace-server builds clean - Template handler tests pass: TestTemplatesList_RuntimeAndModelsRegistry, TestTemplatesList_LegacyTopLevelModel, TestTemplatesList_NonexistentDir Follow-up: the staging→main promotion gap (#1496) is the underlying process issue. Either merge that PR or adopt a policy of landing fixes directly on main (as several PRs have today). Files here were chosen minimally to avoid pulling unrelated staging changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
e88ab70251 |
fix(canvas): stop infinite re-render on ContextMenu mount
ContextMenu's children selector ran .filter() inside the Zustand hook, returning a brand-new array reference on every render. useSyncExternalStore under the hood compares snapshots with Object.is — a new array always differs, so React kept scheduling re-renders, hit the 50-update depth cap, and crashed with minified error #185. Observed as "Application error: a client-side exception" on every SaaS tenant once a session cookie resolved. Caught in dev mode where the build emits the clear warning: The result of getSnapshot should be cached to avoid an infinite loop at ContextMenu (src/components/ContextMenu.tsx:26:34) Fix: select the stable nodes array once, derive children via useMemo outside the store subscription. Same output, no new reference per render. Manually verified: dev bundle served through a cloudflared tunnel to a live tenant, ContextMenu component mounts cleanly, remaining console errors are all unrelated (localhost API 401s from the dev server pointing at its own origin). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
64ccf8e179
|
fix: CWE-78 rm scope, go vet failures, delegation idempotency
* refactor: split 4 oversized handler files into focused sub-files - org.go (1099 lines) → org.go + org_import.go + org_helpers.go - mcp.go (1001 lines) → mcp.go + mcp_tools.go - workspace.go (934 lines) → workspace.go + workspace_crud.go - a2a_proxy.go (825 lines) → a2a_proxy.go + a2a_proxy_helpers.go No functional changes — same package, same exports, same tests. All files stay under 635 lines. Note: isSafeURL and isPrivateOrMetadataIP are duplicated between mcp_tools.go and a2a_proxy_helpers.go — this is a pre-existing issue from the original mcp.go and a2a_proxy.go, not introduced by this split. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(runtime+scheduler): increment/decrement active_tasks counter (refs #1386) * docs(tutorials): add Self-Hosted AI Agents guide — Docker, Fly Machines, bare metal * docs: add Remote Agents feature + Phase 30 blog links to docs index * docs(marketing): update Phase 30 brief — Action 5 complete, docs/index.md update noted * docs(api-ref): add workspace file copy API reference (#1281) Documents TemplatesHandler.copyFilesToContainer (container_files.go): - Endpoint overview: PUT /workspaces/:id/files/*path - Parameter descriptions for all four function parameters - CWE-22 path traversal protection (PRs #1267/1270/1271) - Defense-in-depth: validateRelPath at handler + archive boundary - Full error code table (400/404/500) - curl example with success and path-traversal rejection cases Also covers: writeViaEphemeral routing, findContainer fallback, allowed roots allow-list, and related links to platform-api.md. Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(security): CWE-78/CWE-22 — block shell injection in deleteViaEphemeral (#1310) ## Summary Issue #1273: deleteViaEphemeral interpolated filePath directly into rm command, enabling both shell injection (CWE-78) and path traversal (CWE-22) attacks. ## Changes 1. Added validateRelPath(filePath) guard before constructing the rm command. validateRelPath blocks absolute paths and ".." traversal sequences. 2. Changed Cmd from "/configs/"+filePath (string interpolation) to []string{"rm", "-rf", "/configs", filePath} (exec form). This eliminates shell injection entirely — filePath is a plain argument, never interpreted as shell code. ## Security properties - validateRelPath: blocks "../" and absolute paths before they reach Docker - Exec form: filePath cannot inject shell metacharacters even if validation is somehow bypassed - "/configs" as separate arg: rm has exactly two arguments, no room for injected args Closes #1273. Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> * fix(security): backport SSRF defence (CWE-918) to main — isSafeURL in a2a_proxy.go (#1292) (#1302) * fix(security): backport SSRF defence (CWE-918) to main — isSafeURL in mcp.go and a2a_proxy.go Issue #1042: 3 CodeQL SSRF findings across mcp.go and a2a_proxy.go. staging already ships the fix (PRs #1147, #1154 → merged); main did not include it. - mcp.go: add isSafeURL() + isPrivateOrMetadataIP() helpers; validate agentURL before outbound calls in mcpCallTool (line ~529) and toolDelegateTaskAsync (line ~607) - a2a_proxy.go: add identical isSafeURL() + isPrivateOrMetadataIP() helpers; call isSafeURL() before dispatchA2A in resolveAgentURL() (blocks finding #1 at line 462) - mcp_test.go: 19 new tests covering all blocked URL patterns: file://, ftp://, 127.0.0.1, ::1, 169.254.169.254, 10.x.x.x, 172.16.x.x, 192.168.x.x, empty hostname, invalid URL, isPrivateOrMetadataIP across all private/CGNAT/metadata ranges 1. URL scheme enforcement — http/https only 2. IP literal blocking — loopback, link-local, RFC-1918, CGNAT, doc/test ranges 3. DNS hostname resolution — blocks internal hostnames resolving to private IPs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci-blocker): remove duplicate isSafeURL/isPrivateOrMetadataIP from mcp.go Issue #1292: PR #1274 duplicated isSafeURL + isPrivateOrMetadataIP in mcp.go — both functions already exist on main at lines 829 and 876. Kept the mcp.go definitions (the originals) and removed the 70-line duplicate appended at end of file. a2a_proxy.go functions are unchanged — they serve the same purpose via a separate code path. * fix: remove orphaned commit-text lines from a2a_proxy.go Three lines from the PR/commit title were accidentally baked into the file during the rebase from #1274 to #1302, causing a Go syntax error (a bare string literal at statement level followed by dangling braces). Deletion restores: } return agentURL, nil } Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app> * fix(canvas/test): patch test regressions from PR #1243 + proximity hitbox fix (#1313) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct (#1324) (#1327) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct Fixes #1324 — TypeScript strict mode flags budget.budget_used as possibly undefined in the progressPct ternary, even though the outer condition checks budget_limit > 0. Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0% when the backend returns a partial shape (provisioning-stuck workspaces). Also adds a test covering the undefined-budget_used case with the progress bar aria-valuenow and fill width both at 0%. Closes #1324. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct (issue #1324) (#1329) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct Fixes #1324 — TypeScript strict mode flags budget.budget_used as possibly undefined in the progressPct ternary, even though the outer condition checks budget_limit > 0. Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0% when the backend returns a partial shape (provisioning-stuck workspaces). Also adds a test covering the undefined-budget_used case with the progress bar aria-valuenow and fill width both at 0%. Closes #1324. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(platform): unblock SaaS workspace registration end-to-end Every workspace in the cross-EC2 SaaS provisioning shape was failing registration, heartbeat, or A2A routing. Four distinct blockers sat between "EC2 is up" and "agent responds"; three are platform-side and fixed here (the fourth is in the CP user-data, separate PR). 1. SSRF validator blocked RFC-1918 (registry.go + mcp.go) validateAgentURL and isPrivateOrMetadataIP rejected 172.16.0.0/12, which contains the AWS default VPC range (172.31.x.x) that every sibling workspace EC2 registers from. Registration returned 400 and the 10-min provision sweep flipped status to failed. RFC-1918 + IPv6 ULA are now gated behind saasMode(); link-local (169.254/16), loopback, IPv6 metadata (fe80::/10, ::1), and TEST-NET stay blocked unconditionally in both modes. saasMode() resolution order: 1. MOLECULE_DEPLOY_MODE=saas|self-hosted (explicit operator flag) 2. MOLECULE_ORG_ID presence (legacy implicit signal, kept for back-compat so existing deployments don't need a config change) isPrivateOrMetadataIP now actually checks IPv6 — previously it returned false on any non-IPv4 input, which would let a registered [::1] or [fe80::...] URL bypass the SSRF check entirely. 2. Orphan auth-token minting (workspace_provision.go) issueAndInjectToken mints a token and stuffs it into cfg.ConfigFiles[".auth_token"]. The Docker provisioner writes that file into the /configs volume — the CP provisioner ignores it (only cfg.EnvVars crosses the wire). Result: live token in DB, no plaintext on disk, RegistryHandler.requireWorkspaceToken 401s every /registry/register attempt because the workspace is no longer in the "no live token → bootstrap-allowed" state. Now no-ops in SaaS mode; the register handler already mints on first successful register and returns the plaintext in the response body for the runtime to persist locally. Also removes the redundant wsauth.IssueToken call at the bottom of provisionWorkspaceCP, which created the same orphan-token pattern a second time. 3. Compaction artefacts (bundle/importer.go, handlers/org_tokens.go, scheduler.go, workspace_provision.go) Four pre-existing compile errors on main from an earlier session's code truncation: missing tuple destructuring on ExecContext / redactSecrets / orgTokenActor, missing close-brace in Scheduler.fireSchedule's panic recovery. All one-line mechanical fixes; without them the binary would not build. Tests ----- ssrf_test.go adds: * TestSaasMode — covers the env resolution ladder (explicit flag wins over legacy signal, case-insensitive, whitespace tolerant) * TestIsPrivateOrMetadataIP_SaaSMode — asserts RFC-1918 + IPv6 ULA flip to allowed, metadata/loopback/TEST-NET still blocked * TestIsPrivateOrMetadataIP_IPv6 — regression guard for the old "returns false for all IPv6" behaviour Follow-up issue for CP-sourced workspace_id attestation will be filed separately — closes the residual intra-VPC SSRF + token-race windows the SaaS-mode relaxation introduces. Verified end-to-end today on workspace 6565a2e0 (hermes runtime, OpenAI provider) — agent returned "PONG" in 1.4s after register → heartbeat → A2A proxy → runtime. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(runtime+scheduler): increment/decrement active_tasks + max_concurrent (#1408) Runtime (shared_runtime.py): - set_current_task now increments active_tasks on task start, decrements on completion (was binary 0/1) - Counter never goes below 0 (max(0, n-1)) - Pushes heartbeat immediately on BOTH increment and decrement (#1372) Scheduler (scheduler.go): - Reads max_concurrent_tasks from DB (default 1, backward compatible) - Skips cron only when active_tasks >= max_concurrent_tasks (was > 0) - Leaders can be configured with max_concurrent_tasks > 1 to accept A2A delegations while a cron runs Platform: - Added max_concurrent_tasks column to workspaces (migration 037) - Workspace model + list/get queries include the new field - API exposes max_concurrent_tasks in workspace JSON Config.yaml support (future): runtime_config.max_concurrent_tasks Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(review): address 3 critical issues from code review 1. BLOCKER: executor_helpers.py now uses increment/decrement too (was still binary 0/1, stomping the counter for CLI + SDK executors) 2. BUG: asymmetric getattr defaults fixed — both paths use default 0 (was 0 on increment, 1 on decrement) 3. UX: current_task preserved when active_tasks > 0 on decrement (was clearing task description even when other tasks still running) 4. Scheduler polling loop re-reads max_concurrent_tasks on each poll (was using stale value from initial query) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Hongming Wang <hongmingwangrabbit@gmail.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com> * docs: workspace files API reference, skill catalog, and links * docs: fix secrets endpoint path across docs The workspace secrets endpoint is `/workspaces/:id/secrets`, not `/secrets/values`. This was wrong in quickstart.md (Path 2: Remote Agent) and workspace-runtime.md (registration flow example and comparison table). The external-agent-registration guide already had the correct path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: fix broken blog cross-link in skills-vs-bundled-tools post Link path had an extra `/docs/` segment: `/docs/blog/...` instead of `/blog/...`. Nextra resolves blog posts directly under `/blog/<slug>`, not under `/docs/blog/`. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: add skill-catalog.md guide Linked from the skills-vs-bundled-tools blog post as a reference for TTS/image-generation/web-search skills. The blog promises "install directly via the CLI" with a skill catalog — this page fills that promise by documenting available skill types, install commands, version management, custom skill authoring, and removal. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(marketing): update Phase 30 brief — Action 5 complete, docs/index.md update noted * docs(api-ref): add workspace file copy API reference Documents TemplatesHandler.copyFilesToContainer (container_files.go): - Endpoint overview: PUT /workspaces/:id/files/*path - Parameter descriptions for all four function parameters - CWE-22 path traversal protection (PRs #1267/1270/1271) - Defense-in-depth: validateRelPath at handler + archive boundary - Full error code table (400/404/500) - curl example with success and path-traversal rejection cases Also covers: writeViaEphemeral routing, findContainer fallback, allowed roots allow-list, and related links to platform-api.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> * fix(handlers): add saasMode() gating to isPrivateOrMetadataIP in a2a_proxy_helpers.go Issue #1421 / #1401: PR #1363 (handler split) moved isPrivateOrMetadataIP into a2a_proxy_helpers.go but kept the OLD pre-SaaS version — it unconditionally blocks RFC-1918 addresses, regressing the fix in commits |
||
|
|
38e9eba59a
|
fix(P0): CWE-22 path traversal in copyFilesToContainer + ContextMenu test
Issue #1434 — CWE-22 Path Traversal Regression: PR #1280 ( |
||
| e9615af169 |
Merge origin/main into staging: resolve conflicts with main's test + security fixes
Conflicts resolved (took main's versions): - canvas/src/app/__tests__/orgs-page.test.tsx (act() wrappers, PR #1350) - canvas/src/components/Canvas.tsx (100px proximity threshold, PR #1357) - canvas/src/components/__tests__/ContextMenu.keyboard.test.tsx (hasChildren fix) - workspace-server/internal/handlers/container_files.go (CWE-22/CWE-78 fixes, PRs #1281/#1310) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
|||
|
|
00bd73f8c8
|
fix(canvas): a11y fixes + budget_used TypeScript guard + orgs-page test fix (#1367)
* fix(canvas/a11y): mark StatusDot as aria-hidden — decorative element StatusDot is purely decorative; the status is already conveyed via aria-label on parent elements (WorkspaceNode, SidePanel header, etc.). Marking it aria-hidden="true" prevents screen readers from announcing the empty div as "img" with no alt text. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): guard budget_used optional field with ?? 0 in progress calc TypeScript error in CI: 'budget.budget_used' is possibly 'undefined' when used in the progress percentage calculation. The field is optional per BudgetData interface, so ?? 0 is the correct guard. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/a11y): Tooltip keyboard focus support + ARIA role - Add role="tooltip" + unique id so assistive tech can find tooltip content - Add aria-describedby on trigger so screen readers announce tooltip text - Add onFocus/onBlur handlers so keyboard users (Tab navigation) can see tooltips that mouse users see on hover Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): restore advanceTimersByTime pattern in orgs-page error test waitFor() + fake timers (vi.useFakeTimers in beforeEach) cause race conditions: the 5s polling timeout fires before React state updates flush. Restores the established pattern used by all other tests in this file: advanceTimersByTimeAsync(50) + runAllTimersAsync(). Also removes the now-unused waitFor import. Ref: PRs #1360, #1345 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
f2e4f71fee |
fix(canvas/test): restore waitFor in orgs-page error test + add getState mock (#1341)
Issue #1268: orgs-page error state test — replace vi.advanceTimersByTimeAsync(50) with waitFor polling. advanceTimersByTimeAsync fires the timer but does not guarantee React render flush completes before the assertion runs. Issue #1269: ContextMenu keyboard test — add getState: () => mockStore to useCanvasStore mock. PR #1243 changed the delete flow to hoist confirmation to Canvas-level dialog via setPendingDelete, which reads .nodes via useCanvasStore.getState() — the mock was missing getState. Also carries forward the Issue #1124 WORKSPACE_ID fail-fast fix from workspace/ modules (a2a_cli, a2a_client, coordinator, consolidation, molecule_ai_status) — RuntimeError if WORKSPACE_ID is unset/empty. Co-authored-by: Molecule AI Core Platform Lead <core-platform-lead@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
b21b3d163f |
fix(canvas): add ?? 0 guard for optional budget_used in progressPct (#1324) (#1327)
* fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct Fixes #1324 — TypeScript strict mode flags budget.budget_used as possibly undefined in the progressPct ternary, even though the outer condition checks budget_limit > 0. Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0% when the backend returns a partial shape (provisioning-stuck workspaces). Also adds a test covering the undefined-budget_used case with the progress bar aria-valuenow and fill width both at 0%. Closes #1324. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
45715aa8a5 |
fix(canvas/test): patch test regressions from PR #1243 + proximity hitbox fix (#1313)
* fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
c0d5e528a4 |
fix(canvas): cascade-delete UX — require checkbox before Delete All (#1314)
* fix(canvas/test): restore test regressions from PR #1243 PR #1243 introduced two regressions in the canvas vitest suite: 1. ContextMenu.keyboard.test.tsx: the setPendingDelete call now passes `{hasChildren, id, name}` (not just `{id, name}`). Updated the keyboard-a11y test assertion to match the new store shape. 2. orgs-page.test.tsx: mockFetch.mockResolvedValueOnce() returned a plain object that didn't match the two-argument (url, options) call signature used by the component's fetch wrapper. Switched to mockImplementationOnce returning a rejected Promise — matching real fetch's rejection contract — and added runAllTimersAsync after advanceTimersByTimeAsync(50) to flush React state updates. 54 test files · 813 tests · all passing Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): replace bounding-box intersection with distance threshold for nest detection ReactFlow's getIntersectingNodes uses bounding-box overlap detection, which fires the drag-over state whenever any part of two nodes' position rectangles overlap — even when the dragged node is far from the target. This made the "Nest Workspace" dialog appear from large distances. Fix: scan all nodes on each drag tick and set dragOverNodeId to the closest node within NEST_PROXIMITY_THRESHOLD (150 px, center-to-center). This matches the intuitive behavior: nest only when the node is actually dropped near another. Constants: - NEST_PROXIMITY_THRESHOLD = 150px (~60% of a collapsed node's width) - DEFAULT_NODE_WIDTH = 245px (mid-range of min/max node widths) - DEFAULT_NODE_HEIGHT = 110px Also removed the unused getIntersectingNodes import (was causing duplicate identifier error when both onNodeDrag and the zoom handler called useReactFlow in the same component scope). Closes #1052. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): cascade-delete UX — show child count and require checkbox before Delete All Issue #1137: with ?confirm=true always sent, a single confirmation silently cascades — a team lead with 20 children gets nuked on one click. Changes: - store/canvas.ts: pendingDelete type now includes children: {id, name}[] - ContextMenu.tsx: passes child list to setPendingDelete on Delete click - DeleteCascadeConfirmDialog.tsx: new component — shows child names, a cascade warning, and requires the operator to tick a checkbox before Delete All activates. Disabled by default; only enables after checkbox. - Canvas.tsx: conditionally renders DeleteCascadeConfirmDialog for hasChildren workspaces, or plain ConfirmDialog for leaf workspaces. confirmDelete requires cascadeConfirmChecked=true when hasChildren. - ContextMenu.keyboard.test.tsx: updated setPendingDelete assertion to include children:[] (no children in the test fixture). 813 tests pass. Closes #1137. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
04c3bc6eb1 |
fix(canvas): cascade-delete UX — warn before deleting workspace with children (PR #1252)
- Store: pendingDelete now carries `hasChildren: boolean` (computed from nodes.some(parentId === nodeId)) - ContextMenu: passes hasChildren into setPendingDelete - Canvas: dialog title changes to "Delete Workspace and Children" with ⚠️ message when hasChildren; confirms with "Delete All" Refs: #1137 Co-authored-by: Molecule AI Fullstack (floater) <fullstack-floater@agents.moleculesai.app> |
||
|
|
221d8b2384 |
fix(canvas): guard undefined lastErrorRate and period dates in metrics (PR #1250)
- DetailsTab: use `(data.lastErrorRate ?? 0)` instead of bare multiply to prevent NaN% when the field is absent on pre-provisioning workspaces. - WorkspaceUsage: make formatPeriod accept optional start/end strings; return "—" for undefined so the usage period shows blank rather than "Invalid Date" for provisioning/partial workspaces. Refs: #1139 Co-authored-by: Molecule AI Fullstack (floater) <fullstack-floater@agents.moleculesai.app> |
||
|
|
883cb7ebc3 |
fix(canvas): eliminate flaky timer state between orgs-page tests (#1207) (#1243)
Root cause: tests used try/finally { vi.useRealTimers() / vi.useFakeTimers() }
back-and-forth. When any test's finally-block called vi.useFakeTimers(),
subsequent tests inherited fake timer state causing 50ms real setTimeouts
to not fire and mockFetch to accumulate calls across test boundaries.
Fix: consolidate timer management to beforeEach/afterEach hooks.
- beforeEach: vi.useFakeTimers() — all tests start from known fake state
- afterEach: cleanup() + vi.useRealTimers() — restore real timers for next test
- Individual tests: use vi.advanceTimersByTimeAsync(50) instead of real setTimeout
Also removed duplicate afterEach(cleanup()) and unused waitFor import.
Closes #1207.
Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
|
||
|
|
014295d57f |
fix(issue-1207): eliminate orgs-page test flakiness (#1235)
* fix(auth): F1094 — requireCallerOwnsOrg reads org_id not created_by (#1200) Root cause: requireCallerOwnsOrg (org_plugin_allowlist.go:116) was reading org_api_tokens.created_by to determine caller's org workspace ID. But created_by is a provenance label ("session", "admin-token", "org-token:<prefix>") — never a UUID. The equality check callerOrg != targetOrgID always failed → every org-token caller got 403 on /orgs/:id/plugins/allowlist routes. Fix: - Migration 036: adds org_id UUID column (nullable) to org_api_tokens with index. Existing pre-migration tokens get org_id=NULL → deny by default (safer than cross-org access). - orgtoken.Issue: takes new orgID param; stores in org_id column. - orgtoken.OrgIDByTokenID: new helper reads org_id for a token ID. Returns ("", nil) for NULL/unanchored tokens. - requireCallerOwnsOrg: now calls OrgIDByTokenID instead of reading created_by. Pre-migration tokens with org_id=NULL get callerOrg="" → denied (safer). - orgTokenActor (org_tokens.go): returns (createdBy, orgID) pair. Token minted via another org token gets its org_id set at mint time. Session/ADMIN_TOKEN callers get orgID="". - orgtoken.Token struct: adds OrgID field for list display. - orgtoken.List: selects org_id alongside other columns. - Updated existing tests for new Issue signature. - Added 10 regression tests covering: happy path, unanchored denial, cross-org denial, session bypass, DB error denial. 🤖 Generated with [Claude Code](https://claude.ai/claude-code) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(security): replace err.Error() leaks with prod-safe messages (#1206) - workspace_provision.go: provisionWorkspace, provisionWorkspaceCP — replaced 7 err.Error() calls with "provisioning failed" in both Broadcast payloads and last_sample_error DB column. Full error preserved in server-side log.Printf. - plugins_install_pipeline.go: resolveAndStage — replaced 5 err.Error() calls with generic messages: "invalid plugin source" "plugin source not supported" "invalid plugin name" "staged plugin exceeds size limit" "plugin manifest integrity check failed" Risk mitigated: DB errors (pq: connection refused, pq: deadlock), OS errors, and internal paths no longer leak in HTTP JSON responses or WebSocket broadcasts. Added regression tests (workspace_provision_test.go): - TestProvisionWorkspace_NoInternalErrorsInBroadcast - TestProvisionWorkspaceCP_NoInternalErrorsInBroadcast - TestResolveAndStage_NoInternalErrorsInHTTPErr Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(F1089): log panic-recovery UPDATE errors in scheduler The panic defer blocks in tick() and fireSchedule() now capture and log errors from the db.DB.ExecContext call that advances next_run_at after a panic. Previously, a DB failure during panic recovery was silent — the log line for the panic itself appeared but any subsequent UPDATE failure was invisible, risking unnoticed scheduler drift. context.Background() was already used (F1089 comment in place); this commit adds the missing error capture + log.Printf on exec failure. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(issue-1207): eliminate orgs-page test flakiness Three root causes addressed: 1. Duplicate afterEach blocks (lines 97-103) — two identical afterEach(() => { cleanup(); }) blocks collapsed to one. 2. Fake-timer isolation gap — if a polling test failed before its finally-block ran, vi.useFakeTimers() persisted globally. The next non-polling test's setTimeout(50) then hung indefinitely (fake timers don't advance without vi.advanceTimersByTime), causing waitFor/async timeouts. Fixed by calling vi.useRealTimers() unconditionally in beforeEach (guaranteed clean slate) and afterEach (even when a test fails before its own finally). 3. mockFetch.callHistory now cleared via mockReset() in beforeEach, preventing "expected 2 calls but got N" failures from carry-over between polling tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Dev Lead <dev-lead@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
4e39201d76 |
fix(canvas): show toast when clipboard API unavailable in ConsoleModal (#1199) (#1231)
Use explicit navigator.clipboard check instead of optional chaining so the no-op case is handled explicitly. When clipboard API is unavailable (non-HTTPS context) show a toast: "Copy requires HTTPS — please select and copy manually". Production is always HTTPS so this only affects local dev with http:// canvas. Closes #1199. Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
fcd3a6eaf0 |
fix(test): align ssrf_test.go localhost test cases with isSafeURL behaviour (#1192)
* feat(canvas): rewrite MemoryInspectorPanel to match backend API Issue #909 (chunk 3 of #576). The existing MemoryInspectorPanel used the wrong API endpoint (/memory instead of /memories) and wrong field names (key/value/version instead of id/content/scope/namespace/created_at). It also lacked LOCAL/TEAM/GLOBAL scope tabs and a namespace filter. Changes: - Fix endpoint: GET /workspaces/:id/memories with ?scope= query param - Fix MemoryEntry type to match actual API: id, content, scope, namespace, created_at, similarity_score - Add LOCAL/TEAM/GLOBAL scope tabs - Add namespace filter input - Remove Edit functionality (no update endpoint in backend) - Delete uses DELETE /workspaces/:id/memories/:id (by id, not key) - Full rewrite of 27 tests to match new API and UI structure - Uses ConfirmDialog (not native dialogs) for delete confirmation - All dark zinc theme (no light colors) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: tighten types + improve provision-timeout message (#1135, #1136) #1135 — TypeScript: make BudgetData.budget_used and WorkspaceMetrics fields optional to match actual partial-response shapes from provisioning- stuck workspaces. Runtime already guarded with ?? 0. #1136 — provisiontimeout.go: replace misleading "check required env vars" hint (preflight catches that case upfront) with accurate message about container starting but failing to call /registry/register. 🤖 Generated with [Claude Code](https://claude.com/claude-code) * fix(test): align ssrf_test.go localhost test cases with isSafeURL behaviour isSafeURL blocks 127.0.0.1 via ip.IsLoopback() even in dev environments. The test cases `wantErr: false` for localhost were incorrect — the test would fail when go test runs. Fix by changing wantErr to true for both localhost test cases. Rationale: loopback blocking at this layer is intentional. Access control is enforced by WorkspaceAuth + CanCommunicate at the A2A routing layer, not by the URL validation. Opening this would widen the SSRF attack surface without adding real dev flexibility. Closes: ssrf_test.go inconsistency reported 2026-04-21 Co-Authored-By: Claude Sonnet 4.7 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
c033f3b051 |
chore(canvas): review nits on ConsoleModal + Peers empty state
Post-review cleanup for the #1178 / #1189 bootstrap-watcher flow: - ConsoleModal status-code matching uses \b regex anchors instead of raw substrings. Before, any error message containing "501" inside a longer digit run ("15012") would false-match into the self-hosted branch. Unlikely in practice but cheap to tighten. - Peers empty-state copy now explains WHY the list is empty on offline / failed / provisioning workspaces instead of rendering the same "No reachable peers" text used for healthy workspaces with zero siblings. Online workspaces unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
3b339d13e6 |
fix(canvas): skip /registry/:id/peers fetch when workspace not online
The peers endpoint requires a workspace-scoped bearer token (see
validateDiscoveryCaller in handlers/discovery.go — designed for
agent-to-agent calls). The canvas session doesn't hold that token, so
every Details-tab open for a provisioning / failed / offline workspace
fired a 401 that cluttered devtools and lit up the error banner even
though the real UX here is "no peers — the workspace hasn't booted."
Gate the fetch on status ∈ {online, degraded} and render an empty
Peers list for everything else.
Follow-up: give the canvas a way to see peers for any workspace (admin
session should be enough). Tracked separately — this fix just quiets
the noise on the common case.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
48900e34b0 |
feat(canvas): show last_sample_error + EC2 console output on failed workspaces
Part 3 of 3 for the "fail fast + comprehensive logs" UX. Platform PR #1168 and controlplane #181 ship the server-side; this PR surfaces the data in the canvas. Two changes: 1. DetailsTab renders `last_sample_error` in a dedicated Error section when the workspace is failed (or degraded with an error). Before, the only trace of why a workspace failed was a generic banner — users had to click "View Logs", which opened the terminal tab (the post-boot log, empty on a runtime crash). Now the actual Python traceback is inline. A "View console output" button in the same section opens the full serial console in a modal. 2. New ConsoleModal component. Fetches GET /workspaces/:id/console (platform → CP → ec2:GetConsoleOutput). Portal-rendered above the canvas with Copy / Close / Esc handlers. Renders a friendly message on 501 (self-hosted deploys without CP) and 404 (instance terminated). 3. ProvisioningTimeout's "View Logs" button now opens the console modal instead of the (usually empty) terminal tab — when a workspace is stuck in provisioning, the cloud-init + user-data trace is what the user actually needs. Tests cover the closed-state no-fetch, happy-path fetch, 501/404 messaging, and Close/Escape wiring. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
b1433ee8e6 |
Merge pull request #1171 from Molecule-AI/staging
chore: fast-forward staging with main review-cleanup commits |
||
|
|
45cf87c7b8 |
test(BatchActionBar): add hasFailedBatch success-reset test (#1170)
CP-QA approved. 34-line test for BatchActionBar retry state reset after successful batch action. |
||
| 696bd86322 | Merge branch 'fix/canvas-orgs-page-tests' into staging | |||
| 223584c66d | Merge remote-tracking branch 'origin/staging' into fix/canvas-orgs-page-tests | |||
| acf03cd057 |
fix(security): suppress raw response body from user-facing billing errors
billing.ts (startCheckout, openBillingPortal): replace raw res.text() in thrown Error with a safe status-only message. The response body from /cp/billing/* routes can contain Stripe API error detail (invalid key, card decline message, raw Stripe envelope) that should not reach clients. orgs/page.tsx (createOrg): same fix — raw body → safe message. Full body is logged server-side for debugging. Closes: #91 (CWE-209 — Stripe key echoed in error) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
|||
| a339cde8d5 |
fix(canvas): restore 12 orgs-page tests broken by TermsGate fetch + mock leak
Root causes:
1. TermsGate (rendered inside OrgsPage Shell) fetches /cp/auth/terms-status
before OrgsPage fetches /cp/orgs, consuming the first mockResponseOnce
slot — leaving /cp/orgs with no mock and throwing TypeError.
Fix: mock TermsGate as a pass-through component in vi.mock.
2. Non-polling tests used mockFetchSession.mockResolvedValueOnce() which
exhausted after one call; React 18 concurrent re-renders call
fetchSession() multiple times, causing subsequent calls to return
undefined. Fix: use mockResolvedValue() (persistent) for fetchSession.
3. vi.clearAllMocks() in beforeEach kept mockResolvedValueOnce from
previous tests from leaking BUT the vi.fn() mock implementation was
already reset by mockFetchSession.mockReset() in beforeEach. Tests
were passing stale persistent mocks from previous tests. Fix:
mockFetchSession.mockReset() in beforeEach + mockResolvedValue in
each test.
4. Polling tests used vi.useFakeTimers() without shouldAdvanceTime,
which prevented React's useEffect from calling fetch() (0 calls).
Fix: use vi.useFakeTimers({ shouldAdvanceTime: true }) + await
vi.advanceTimersByTimeAsync() to advance time during await.
5. Unmount test unmounted before effects fired (with shouldAdvanceTime).
Fix: flush microtasks with await vi.advanceTimersByTimeAsync(0)
before unmount so the effect runs and schedules the poll timer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|||
|
|
fc3ae5a63a |
chore: code-review cleanup on today's shipped PRs
Three nits identified during post-merge review of #1119, #1133: 1. ContextMenu.tsx imported `removeNode` from the canvas store but stopped using it when the delete-confirm flow moved to Canvas in #1133. Also removed the now-unused mock entry in the keyboard test so the test inventory matches the real call list. 2. Preflight's YAML parse failure was a silent pass — defensible since the in-container preflight owns the schema, but invisible to ops if a template ships malformed YAML. Log at WARN so the signal surfaces without blocking the provision. 3. formatMissingEnvError rendered its slice via %q, producing `["A" "B"]` which is Go-literal-looking and ugly in a user-facing error. Join with ", " instead. Test updated to assert the new format. No behavioural changes beyond the log line; fixes are review nits, not bug fixes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
0ddf266e54 |
fix(canvas): delete workspace dialog race with context menu close
Clicking "Delete" in the workspace context menu did nothing for stuck workspaces. The confirm dialog was rendered via portal as a child of ContextMenu. ContextMenu's outside-click handler checks whether the click target is inside its ref — but the portal puts the dialog in document.body, outside the ref. So clicking the dialog's Confirm counted as "outside", closed the menu, unmounted the dialog mid-click, and the onConfirm handler never ran. Hoist the pending-delete state to the canvas store and render the confirm dialog at the Canvas level (same pattern as the existing pendingNest dialog). The dialog now outlives ContextMenu, so the outside-click close is harmless. Close the context menu on the Delete click itself rather than waiting for the dialog to resolve. Add a regression test covering the new flow and add the standard ?confirm=true query param so the backend's child-cascade guard is consulted correctly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
c3f7447e86 |
fix: harden stuck-provisioning UX — details crash, preflight, sweeper
Workspaces stuck in status='provisioning' previously surfaced in three
bad ways:
1. **Details tab crashed** with `Cannot read properties of undefined
(reading 'toLocaleString')`. `BudgetSection` + `WorkspaceUsage`
assumed full response shapes but a provisioning-stuck workspace
returns partial `{}`. Guard each deep field with `?? 0` and cover
the partial-response case with regression tests.
2. **Missing required env vars failed silently** 15+ minutes later as
a cosmetic "Provisioning Timeout" banner. The in-container preflight
catches them but by then the container has already crashed without
calling /registry/register, so the workspace sat in 'provisioning'
forever. Mirror the preflight server-side: parse config.yaml's
`runtime_config.required_env` before launch, fail fast with a
WORKSPACE_PROVISION_FAILED event naming the missing vars.
3. **No backend timeout** ever flipped a stuck workspace to 'failed'.
Add a registry sweeper (10m default, env-overridable) that detects
workspaces stuck past the window, flips them to 'failed', and emits
WORKSPACE_PROVISION_TIMEOUT. Race-safe: the UPDATE re-checks the
status + age predicate so a concurrent register/restart wins.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
91187342b4 |
feat(auth): organization-scoped API keys for admin access
Adds user-facing API keys with full-org admin scope. Replaces the
single ADMIN_TOKEN env var with named, revocable, audited tokens
that users can mint/rotate from the canvas UI without ops
intervention.
Designed for the beta growth phase — one token tier (full admin).
Future work will split into scoped roles (admin / workspace-write
/ read-only) and per-workspace bindings. See docs/architecture/
org-api-keys.md for the design + follow-up roadmap.
## Surface
POST /org/tokens mint (plaintext returned once)
GET /org/tokens list live keys (prefix-only)
DELETE /org/tokens/:id revoke (idempotent)
All AdminAuth-gated. Bootstrap path: mint the first token via
ADMIN_TOKEN or canvas session; tokens can mint more tokens after.
## Validation as a new AdminAuth tier (2a)
AdminAuth evaluation order:
Tier 0 lazy-bootstrap fail-open (only when no live tokens AND
no ADMIN_TOKEN env)
Tier 1 verified WorkOS session via /cp/auth/tenant-member
Tier 2a org_api_tokens SELECT — NEW
Tier 2b ADMIN_TOKEN env (bootstrap / CLI break-glass)
Tier 3 any live workspace token (deprecated, only when ADMIN_TOKEN
unset)
Tier 2a runs ONE indexed lookup (partial index on
token_hash WHERE revoked_at IS NULL) + an async last_used_at
bump. No measurable latency cost on the hot path.
## UI
New "Org API Keys" tab in the settings panel. Label field for
human-readable naming. Plaintext shown once + clipboard copy.
Revoke with confirm dialog. Mirrors the existing workspace-
TokensTab flow so users who've used one get the other for free.
## Security properties
- Plaintext never stored. sha256 hash + 8-char display prefix.
- Revocation is immediate: partial index on revoked_at IS NULL
means the next request validates or fails in microseconds.
- created_by audit field captures provenance: "org-token:<short>"
when a token mints another, "session" for browser-UI mints,
"admin-token" for the ADMIN_TOKEN bootstrap path.
- Validate() collapses all failure shapes into ErrInvalidToken
so response-shape can't distinguish "never existed" from
"revoked".
## Tests
- internal/orgtoken: 9 unit tests (hash storage, empty field
null-ing, validation happy path, empty plaintext, unknown hash,
revoked filtering, list ordering, revoke idempotency, has-any-
live short-circuit).
- AdminAuth tier-2a integration covered by existing middleware
tests unchanged (fail-open + bearer paths).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
52235aeb27 |
feat(router): /cp/* reverse-proxy to CP + same-origin canvas fetches
Canvas's browser bundle issues fetches to both CP endpoints
(/cp/auth/me, /cp/orgs, ...) AND tenant-platform endpoints
(/canvas/viewport, /approvals/pending, /org/templates). They
share ONE build-time base URL. Baking api.moleculesai.app
broke tenant calls with 404; baking the tenant subdomain broke
auth. Tried both today and saw exactly one failure mode per
attempt.
Real fix: same-origin fetches + tenant-side split. Adds:
internal/router/cp_proxy.go # /cp/* → CP_UPSTREAM_URL
mounted before NoRoute(canvasProxy). Now a tenant serves:
/cp/* → reverse-proxy to api.moleculesai.app
/canvas/viewport,
/approvals/pending,
/workspaces/:id/*,
/ws, /registry, → tenant platform (existing handlers)
/metrics
everything else → canvas UI (existing reverse-proxy)
Canvas middleware reverts to `connect-src 'self' wss:` for the
same-origin path (keeping explicit PLATFORM_URL whitelist as a
self-hosted escape hatch when the build-arg is non-empty).
CI build-arg flips to NEXT_PUBLIC_PLATFORM_URL="" so the bundle
issues relative fetches.
Security of cp_proxy:
- Cookie + Authorization PRESERVED across the hop (opposite of
canvas proxy) — they carry the WorkOS session, which is the
whole point.
- Host rewritten to upstream so CORS + cookie-domain on the CP
side see their own hostname.
- Upstream URL validated at construction: must parse, must be
http(s), must have a host — misconfig fails closed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
d069231d0b |
fix(canvas): include NEXT_PUBLIC_PLATFORM_URL in CSP connect-src
Tenant page loads were blocked by: Refused to connect to 'https://api.moleculesai.app/cp/auth/me' because it violates the document's Content Security Policy. CSP had `connect-src 'self' wss:` — fine for same-origin + any wss, but browser refuses cross-origin HTTPS fetches that aren't listed. PLATFORM_URL (baked from NEXT_PUBLIC_PLATFORM_URL, which is the CP origin on SaaS tenants) needs to be explicit. Fix: middleware reads NEXT_PUBLIC_PLATFORM_URL at build/runtime and adds both the https and wss siblings to connect-src. Self- hosted deploys that override the build-arg automatically get a matching CSP — no hardcoded hostname. Test added: buildCsp includes NEXT_PUBLIC_PLATFORM_URL origin in connect-src when set. Also loosens the dev `ws:` assertion since dev uses `connect-src *` which subsumes ws (pre-existing behavior, test was stale). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9abe2e5739 |
Merge pull request #1089 from Molecule-AI/fix/canvas-csp-nonce-propagation
fix(canvas): root layout dynamic so CSP nonce reaches Next scripts |
||
|
|
1af6f696a2 |
fix(canvas): make root layout dynamic so CSP nonce reaches Next scripts
Tenant page loads were failing with repeated CSP violations: Executing inline script violates ... script-src 'self' 'nonce-M2M4YTVh...' 'strict-dynamic'. ... because Next.js's bootstrap inline scripts were emitted without a nonce attribute. The middleware was generating per-request nonces correctly and sending them via `x-nonce` — but the layout was fully static, so Next.js cached the HTML once and served that cached bundle (no nonces baked in) for every request. Fix: call `await headers()` in the root layout. That opts the tree into dynamic rendering AND signals Next.js to propagate the x-nonce value to its own generated <script> tags. The `nonce` return value is intentionally unused — the framework handles its bootstrap scripts automatically once the read happens. Future code that adds third-party <Script> components (analytics, etc.) should pass the returned nonce explicitly. Verified against live tenant: before this change every /_next/ chunk script tag in the HTML had no nonce attribute; expected after deploy is `<script nonce="..." src="/_next/...">` on each. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
35df23850e |
fix(canvas): CSP_DEV_MODE + admin token for local Docker (#1052 follow-up)
Three changes that keep getting lost on nuke+rebuild: 1. middleware.ts: read CSP_DEV_MODE env to relax CSP in local Docker 2. api.ts: send NEXT_PUBLIC_ADMIN_TOKEN header (AdminAuth on /workspaces) 3. Dockerfile: accept NEXT_PUBLIC_ADMIN_TOKEN as build arg All three are required for the canvas to work in local Docker where canvas (port 3000) fetches from platform (port 8080) cross-origin. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
7a2a17591c |
chore(canvas): remove dead /waitlist page (lives in molecule-app)
#1080 added /waitlist to canvas, but canvas isn't served at app.moleculesai.app — it backs the tenant subdomains (acme.moleculesai.app etc.). The real /waitlist lives in the separate molecule-app repo, which is what the CP auth callback redirects to. molecule-app#12 has the real page + contact form wiring to /cp/waitlist/request. This canvas copy was never reachable and would only diverge. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
b7149c5dda |
feat(canvas): /waitlist page with contact form
Adds the user-facing half of the beta-gate: a page at /waitlist that the CP auth callback redirects users to when their email isn't on the allowlist. Collects email + optional name + use-case and POSTs to /cp/waitlist/request (backend landed in controlplane #150). ## Behavior - No auto-pre-fill of email from URL query (CP's #145 dropped the ?email= param for the privacy reason; this test guards against a future regression on the client side). - Client-side validates email shape for instant feedback; backend re-validates. - Three UI states after submit: success → "your request is in" banner, form hidden dedup → softer "already on file" banner when backend returns dedup=true (same 200, no 409 to avoid enumeration) error → inline banner with backend message or network fallback ## Tests 9 tests in __tests__/waitlist-page.test.tsx covering: - default render + a11y (role=button, role=status, role=alert) - URL-pre-fill privacy regression guard - HTML5 + JS validation (empty, malformed) - successful POST with trimmed body - dedup branch - non-2xx with + without error field - network rejection Follow-up to the beta-gate rollout on controlplane #145 / #150. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
a3e70b739c |
Merge pull request #949 from Molecule-AI/feat/canvas-batch-operations
feat(canvas): batch operations — multi-select + restart/pause/delete |
||
|
|
fc0e02d692 |
Merge pull request #1011 from Molecule-AI/test/qa-coverage-orgs-page-and-api-timeout
test(canvas): QA coverage — orgs page polling + API timeout |
||
|
|
67c12b119e |
Merge pull request #1016 from Molecule-AI/fix/a11y-workspace-node
fix(a11y): WorkspaceNode font floor, contrast, focus rings |
||
| 875d4ab0a5 |
fix(gate-1): remove unused fireEvent import (#1011)
Mechanical lint fix. github-code-quality[bot] flagged unused import on line 18 — fireEvent is imported but never referenced in the test file. Removing it clears the code quality gate without changing any test behaviour. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
|||
|
|
da2a2b795f |
fix(a11y): WorkspaceNode font floor, contrast, focus rings (Cycle 10)
C1: skills badge spans text-[7px]→text-[10px]; "+N more" overflow
text-[7px] text-zinc-500→text-[10px] text-zinc-400
C2: Team section label text-[7px] text-zinc-600→text-[10px] text-zinc-400
H4: status label text-[9px]→text-[10px]; active-tasks count
text-[9px] text-amber-300/80→text-[10px] text-amber-300 (remove opacity
modifier per design-system contrast rule); current-task text
text-[9px] text-amber-300/70→text-[10px] text-amber-300
L1: add focus-visible:ring-2 focus-visible:ring-blue-500/70 to the Restart
button (independently Tab-focusable inside role="button" wrapper) and to
the Extract-from-team button in TeamMemberChip; TeamMemberChip
role="button" div already has the focus ring (COVERED, no change)
762/762 tests pass · build clean
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
||
|
|
4db8d30ece |
test(canvas): cover /orgs 5s polling on in-flight orgs
The test docstring promised polling coverage but I'd only wired the
describe-block header, not the actual tests. Closing that gap — vitest
fake timers drive three cases:
- `provisioning` org → 2nd fetch fires after 5.1s advance
- all `running` → no 2nd fetch even after 10s advance
- `awaiting_payment` org, unmount before timer fires → no post-unmount
fetch (cleanup correctly clears the pollTimer)
The unmount case is the meaningful one: without it a fast nav-away
leaves the 5s interval chasing the CP forever. page.tsx L97-99 does
clear the timer; the test pins the contract.
Local baseline on origin/staging tip
|
||
|
|
e70cd1c2aa |
test(canvas): pin AbortSignal timeout regression + cover /orgs landing page
Two independent test additions that harden the surface freshly landed on
staging via PRs #982 (canvas fetch timeout), #992 (/orgs landing), #994
(post-checkout redirect to /orgs).
canvas/src/lib/__tests__/api.test.ts (+74 lines, 7 new tests)
- GET/POST/PATCH/PUT/DELETE each pass an AbortSignal to fetch
- TimeoutError (DOMException name=TimeoutError) propagates to the caller
- Each request installs its own signal — no shared module-level controller
that would allow one slow request to cancel an unrelated fast one
This is the hardening nit I flagged in my APPROVE-w/-nit review of
fix/canvas-api-fetch-timeout. Landing as a follow-up now that #982 is in
staging.
canvas/src/app/__tests__/orgs-page.test.tsx (+251 lines, new file, 10 tests)
- Auth guard: signed-out → redirectToLogin and no /cp/orgs fetch
- Error state: failed /cp/orgs → Error message + Retry button
- Empty list: CreateOrgForm renders
- CTA by status:
running → "Open" link targets {slug}.moleculesai.app
awaiting_payment → "Complete payment" → /pricing?org=<slug>
failed → "Contact support" mailto
- Post-checkout: ?checkout=success renders CheckoutBanner AND
history.replaceState scrubs the query param
- Fetch contract: /cp/orgs called with credentials:include + AbortSignal
Local baseline on origin/staging tip
|
||
|
|
dd5654b803 |
feat(canvas): ToS gate modal + us-east-2 data residency notice
Wraps /orgs in a TermsGate that polls /cp/auth/terms-status on mount and overlays a blocking modal when the current terms version hasn't been accepted yet. "I agree" POSTs /cp/auth/accept-terms and dismisses the modal; the backend records IP + UA as GDPR Art. 7 proof-of-consent. Also adds a short data residency notice under the page header: workspaces run in AWS us-east-2 (Ohio, US). An EU region selector is a future lift once the infra is provisioned there. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
18894bebe8 |
feat(canvas): Phase 5 — credit balance pill + low-balance banner
Adds the UI surface for the credit system to /orgs: - CreditsPill next to each org row. Tone shifts from zinc → amber at 10% of plan to red at zero. - LowCreditsBanner appears under the pill for running orgs when the balance crosses thresholds: overage_used > 0 → "overage active", balance <= 0 → "out of credits, upgrade", trial tail → "trial almost out". - Pure helpers extracted to lib/credits.ts so formatCredits, pillTone, and bannerKind are unit-tested without jsdom. Backend List query now returns credits_balance / plan_monthly_credits / overage_used_credits / overage_cap_credits so no second round-trip is needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |