molecule-core

Author	SHA1	Message	Date
Hongming Wang	46a8d24b2d	feat(workspace): persist CP-returned EC2 instance_id on provision Foundation for the EIC-based terminal handler (#1528). The tenant's workspace-server needs to map workspace_id → EC2 instance_id to open an SSH session, but CPProvisioner.Start returned the instance id only for logging — it was never written anywhere. This PR adds the column and writes it at provision time. Scope kept intentionally small: no terminal code yet. The follow-up PR will consume this column from the terminal handler. What's here: - migrations/038_workspace_instance_id — nullable TEXT column on workspaces, partial index on non-null for fast lookup - workspace_provision.go — UPDATE after CPProvisioner.Start; failure logs but doesn't fail provisioning (row just lacks instance_id and terminal falls back to the existing not-reachable error) - docs/infra/workspace-terminal.md — full design for the terminal flow: EIC vs SSM comparison, IAM policy JSON, SG rules, key lifetime, failure modes, rollout checklist Refs: #1528 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 17:56:15 -07:00
Hongming Wang	73464a21dd	fix(restart): support SaaS control-plane provisioner (unblocks Platform Go build too) (#1512 ) Squash-merge fix/restart (PR #1512): remove SSRF helpers from a2a_proxy_helpers.go since ssrf.go on main now owns these functions, resolving duplicate symbol build failures. Author: HongmingWang-Rabbit. Approved by molecule-ai. Mergeable, UNSTABLE (likely due to pending head branch changes).	2026-04-21 22:56:01 +00:00
Hongming Wang	2133e5601f	Merge pull request #1491 from Molecule-AI/feat/e2e-staging-saas-cicd fix(e2e): 9 follow-ups to make staging E2E actually green end-to-end	2026-04-21 11:39:07 -07:00
Hongming Wang	bd020d84be	ci(e2e): wire MOLECULE_STAGING_OPENAI_KEY into workflow env The harness needs E2E_OPENAI_API_KEY set for Hermes workspaces to boot — without it the runtime crashes with "No provider API key found" and workspaces never hit online. Preflight step fails fast with a clear error if the repo secret is missing, so CI doesn't burn 10 minutes on a foregone conclusion. Repo secret to add: Settings → Secrets → Actions → MOLECULE_STAGING_OPENAI_KEY. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 11:24:59 -07:00
molecule-ai[bot]	64ccf8e179	fix: CWE-78 rm scope, go vet failures, delegation idempotency * refactor: split 4 oversized handler files into focused sub-files - org.go (1099 lines) → org.go + org_import.go + org_helpers.go - mcp.go (1001 lines) → mcp.go + mcp_tools.go - workspace.go (934 lines) → workspace.go + workspace_crud.go - a2a_proxy.go (825 lines) → a2a_proxy.go + a2a_proxy_helpers.go No functional changes — same package, same exports, same tests. All files stay under 635 lines. Note: isSafeURL and isPrivateOrMetadataIP are duplicated between mcp_tools.go and a2a_proxy_helpers.go — this is a pre-existing issue from the original mcp.go and a2a_proxy.go, not introduced by this split. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(runtime+scheduler): increment/decrement active_tasks counter (refs #1386) * docs(tutorials): add Self-Hosted AI Agents guide — Docker, Fly Machines, bare metal * docs: add Remote Agents feature + Phase 30 blog links to docs index * docs(marketing): update Phase 30 brief — Action 5 complete, docs/index.md update noted * docs(api-ref): add workspace file copy API reference (#1281) Documents TemplatesHandler.copyFilesToContainer (container_files.go): - Endpoint overview: PUT /workspaces/:id/files/path - Parameter descriptions for all four function parameters - CWE-22 path traversal protection (PRs #1267/1270/1271) - Defense-in-depth: validateRelPath at handler + archive boundary - Full error code table (400/404/500) - curl example with success and path-traversal rejection cases Also covers: writeViaEphemeral routing, findContainer fallback, allowed roots allow-list, and related links to platform-api.md. Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> fix(security): CWE-78/CWE-22 — block shell injection in deleteViaEphemeral (#1310) ## Summary Issue #1273: deleteViaEphemeral interpolated filePath directly into rm command, enabling both shell injection (CWE-78) and path traversal (CWE-22) attacks. ## Changes 1. Added validateRelPath(filePath) guard before constructing the rm command. validateRelPath blocks absolute paths and ".." traversal sequences. 2. Changed Cmd from "/configs/"+filePath (string interpolation) to []string{"rm", "-rf", "/configs", filePath} (exec form). This eliminates shell injection entirely — filePath is a plain argument, never interpreted as shell code. ## Security properties - validateRelPath: blocks "../" and absolute paths before they reach Docker - Exec form: filePath cannot inject shell metacharacters even if validation is somehow bypassed - "/configs" as separate arg: rm has exactly two arguments, no room for injected args Closes #1273. Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> * fix(security): backport SSRF defence (CWE-918) to main — isSafeURL in a2a_proxy.go (#1292) (#1302) * fix(security): backport SSRF defence (CWE-918) to main — isSafeURL in mcp.go and a2a_proxy.go Issue #1042: 3 CodeQL SSRF findings across mcp.go and a2a_proxy.go. staging already ships the fix (PRs #1147, #1154 → merged); main did not include it. - mcp.go: add isSafeURL() + isPrivateOrMetadataIP() helpers; validate agentURL before outbound calls in mcpCallTool (line ~529) and toolDelegateTaskAsync (line ~607) - a2a_proxy.go: add identical isSafeURL() + isPrivateOrMetadataIP() helpers; call isSafeURL() before dispatchA2A in resolveAgentURL() (blocks finding #1 at line 462) - mcp_test.go: 19 new tests covering all blocked URL patterns: file://, ftp://, 127.0.0.1, ::1, 169.254.169.254, 10.x.x.x, 172.16.x.x, 192.168.x.x, empty hostname, invalid URL, isPrivateOrMetadataIP across all private/CGNAT/metadata ranges 1. URL scheme enforcement — http/https only 2. IP literal blocking — loopback, link-local, RFC-1918, CGNAT, doc/test ranges 3. DNS hostname resolution — blocks internal hostnames resolving to private IPs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci-blocker): remove duplicate isSafeURL/isPrivateOrMetadataIP from mcp.go Issue #1292: PR #1274 duplicated isSafeURL + isPrivateOrMetadataIP in mcp.go — both functions already exist on main at lines 829 and 876. Kept the mcp.go definitions (the originals) and removed the 70-line duplicate appended at end of file. a2a_proxy.go functions are unchanged — they serve the same purpose via a separate code path. * fix: remove orphaned commit-text lines from a2a_proxy.go Three lines from the PR/commit title were accidentally baked into the file during the rebase from #1274 to #1302, causing a Go syntax error (a bare string literal at statement level followed by dangling braces). Deletion restores: } return agentURL, nil } Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app> * fix(canvas/test): patch test regressions from PR #1243 + proximity hitbox fix (#1313) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. ContextMenu.keyboard.test.tsx — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. orgs-page.test.tsx — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct (#1324) (#1327) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. ContextMenu.keyboard.test.tsx — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. orgs-page.test.tsx — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct Fixes #1324 — TypeScript strict mode flags budget.budget_used as possibly undefined in the progressPct ternary, even though the outer condition checks budget_limit > 0. Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0% when the backend returns a partial shape (provisioning-stuck workspaces). Also adds a test covering the undefined-budget_used case with the progress bar aria-valuenow and fill width both at 0%. Closes #1324. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct (issue #1324) (#1329) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. ContextMenu.keyboard.test.tsx — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. orgs-page.test.tsx — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct Fixes #1324 — TypeScript strict mode flags budget.budget_used as possibly undefined in the progressPct ternary, even though the outer condition checks budget_limit > 0. Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0% when the backend returns a partial shape (provisioning-stuck workspaces). Also adds a test covering the undefined-budget_used case with the progress bar aria-valuenow and fill width both at 0%. Closes #1324. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(platform): unblock SaaS workspace registration end-to-end Every workspace in the cross-EC2 SaaS provisioning shape was failing registration, heartbeat, or A2A routing. Four distinct blockers sat between "EC2 is up" and "agent responds"; three are platform-side and fixed here (the fourth is in the CP user-data, separate PR). 1. SSRF validator blocked RFC-1918 (registry.go + mcp.go) validateAgentURL and isPrivateOrMetadataIP rejected 172.16.0.0/12, which contains the AWS default VPC range (172.31.x.x) that every sibling workspace EC2 registers from. Registration returned 400 and the 10-min provision sweep flipped status to failed. RFC-1918 + IPv6 ULA are now gated behind saasMode(); link-local (169.254/16), loopback, IPv6 metadata (fe80::/10, ::1), and TEST-NET stay blocked unconditionally in both modes. saasMode() resolution order: 1. MOLECULE_DEPLOY_MODE=saas\|self-hosted (explicit operator flag) 2. MOLECULE_ORG_ID presence (legacy implicit signal, kept for back-compat so existing deployments don't need a config change) isPrivateOrMetadataIP now actually checks IPv6 — previously it returned false on any non-IPv4 input, which would let a registered [::1] or [fe80::...] URL bypass the SSRF check entirely. 2. Orphan auth-token minting (workspace_provision.go) issueAndInjectToken mints a token and stuffs it into cfg.ConfigFiles[".auth_token"]. The Docker provisioner writes that file into the /configs volume — the CP provisioner ignores it (only cfg.EnvVars crosses the wire). Result: live token in DB, no plaintext on disk, RegistryHandler.requireWorkspaceToken 401s every /registry/register attempt because the workspace is no longer in the "no live token → bootstrap-allowed" state. Now no-ops in SaaS mode; the register handler already mints on first successful register and returns the plaintext in the response body for the runtime to persist locally. Also removes the redundant wsauth.IssueToken call at the bottom of provisionWorkspaceCP, which created the same orphan-token pattern a second time. 3. Compaction artefacts (bundle/importer.go, handlers/org_tokens.go, scheduler.go, workspace_provision.go) Four pre-existing compile errors on main from an earlier session's code truncation: missing tuple destructuring on ExecContext / redactSecrets / orgTokenActor, missing close-brace in Scheduler.fireSchedule's panic recovery. All one-line mechanical fixes; without them the binary would not build. Tests ----- ssrf_test.go adds: * TestSaasMode — covers the env resolution ladder (explicit flag wins over legacy signal, case-insensitive, whitespace tolerant) * TestIsPrivateOrMetadataIP_SaaSMode — asserts RFC-1918 + IPv6 ULA flip to allowed, metadata/loopback/TEST-NET still blocked * TestIsPrivateOrMetadataIP_IPv6 — regression guard for the old "returns false for all IPv6" behaviour Follow-up issue for CP-sourced workspace_id attestation will be filed separately — closes the residual intra-VPC SSRF + token-race windows the SaaS-mode relaxation introduces. Verified end-to-end today on workspace 6565a2e0 (hermes runtime, OpenAI provider) — agent returned "PONG" in 1.4s after register → heartbeat → A2A proxy → runtime. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(runtime+scheduler): increment/decrement active_tasks + max_concurrent (#1408) Runtime (shared_runtime.py): - set_current_task now increments active_tasks on task start, decrements on completion (was binary 0/1) - Counter never goes below 0 (max(0, n-1)) - Pushes heartbeat immediately on BOTH increment and decrement (#1372) Scheduler (scheduler.go): - Reads max_concurrent_tasks from DB (default 1, backward compatible) - Skips cron only when active_tasks >= max_concurrent_tasks (was > 0) - Leaders can be configured with max_concurrent_tasks > 1 to accept A2A delegations while a cron runs Platform: - Added max_concurrent_tasks column to workspaces (migration 037) - Workspace model + list/get queries include the new field - API exposes max_concurrent_tasks in workspace JSON Config.yaml support (future): runtime_config.max_concurrent_tasks Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(review): address 3 critical issues from code review 1. BLOCKER: executor_helpers.py now uses increment/decrement too (was still binary 0/1, stomping the counter for CLI + SDK executors) 2. BUG: asymmetric getattr defaults fixed — both paths use default 0 (was 0 on increment, 1 on decrement) 3. UX: current_task preserved when active_tasks > 0 on decrement (was clearing task description even when other tasks still running) 4. Scheduler polling loop re-reads max_concurrent_tasks on each poll (was using stale value from initial query) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Hongming Wang <hongmingwangrabbit@gmail.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com> * docs: workspace files API reference, skill catalog, and links * docs: fix secrets endpoint path across docs The workspace secrets endpoint is `/workspaces/:id/secrets`, not `/secrets/values`. This was wrong in quickstart.md (Path 2: Remote Agent) and workspace-runtime.md (registration flow example and comparison table). The external-agent-registration guide already had the correct path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: fix broken blog cross-link in skills-vs-bundled-tools post Link path had an extra `/docs/` segment: `/docs/blog/...` instead of `/blog/...`. Nextra resolves blog posts directly under `/blog/<slug>`, not under `/docs/blog/`. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: add skill-catalog.md guide Linked from the skills-vs-bundled-tools blog post as a reference for TTS/image-generation/web-search skills. The blog promises "install directly via the CLI" with a skill catalog — this page fills that promise by documenting available skill types, install commands, version management, custom skill authoring, and removal. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(marketing): update Phase 30 brief — Action 5 complete, docs/index.md update noted * docs(api-ref): add workspace file copy API reference Documents TemplatesHandler.copyFilesToContainer (container_files.go): - Endpoint overview: PUT /workspaces/:id/files/path - Parameter descriptions for all four function parameters - CWE-22 path traversal protection (PRs #1267/1270/1271) - Defense-in-depth: validateRelPath at handler + archive boundary - Full error code table (400/404/500) - curl example with success and path-traversal rejection cases Also covers: writeViaEphemeral routing, findContainer fallback, allowed roots allow-list, and related links to platform-api.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> fix(handlers): add saasMode() gating to isPrivateOrMetadataIP in a2a_proxy_helpers.go Issue #1421 / #1401: PR #1363 (handler split) moved isPrivateOrMetadataIP into a2a_proxy_helpers.go but kept the OLD pre-SaaS version — it unconditionally blocks RFC-1918 addresses, regressing the fix in commits `1125a02` / `cf10733`. The A2A proxy path now has the same SaaS-gated logic as registry.go: - Cloud metadata (169.254/16, fe80::/10, ::1) always blocked in both modes - RFC-1918 (10/8, 172.16/12, 192.168/16) + IPv6 ULA (fc00::/7) blocked in self-hosted, allowed in SaaS cross-EC2 mode - IPv6 addresses now properly checked (previous version returned false for all) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(marketing): Discord adapter Day 2 Reddit + HN community copy * fix(tests): supply events.Broadcaster pointer to captureBroadcaster Cannot use captureBroadcaster as events.Broadcaster when the struct embeds events.Broadcaster as a value — must initialize as a named field. Fixes go vet error in workspace_provision_test.go: cannot use broadcaster (captureBroadcaster) as events.Broadcaster value Merge pull request #1429 from fix/canvas-tooltip-clear-timer Without this, a 400ms setTimeout from onFocus/onMouseEnter that fires after onBlur will re-show a tooltip the user just dismissed. The setShow(false) in onBlur closes the tooltip immediately but leaves the timer pending — Tab-blur followed by timer-fire would re-show it. Fix: add clearTimeout(timerRef.current) at the top of onBlur, mirroring the pattern already used in onMouseLeave and onFocus. Refs: PR #1367 (a11y keyboard support — this was a pre-existing gap) Co-authored-by: Molecule AI App-FE <app-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): add missing children:[] to setPendingDelete expectation (#1426) PR #1252 (cascade-delete UX) updated setPendingDelete to pass a children array for cascade-warning rendering. The keyboard-a11y test assertion was not updated to match. Test: clicking 'Delete' hoists state to the store and closes the menu Co-authored-by: Molecule AI Core-QA <core-qa@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): add children:[] to setPendingDelete + \' entity fix (closes #1380) (#1427) * ci: retry — trigger fresh runner allocation * fix(canvas/test): add children:[] to setPendingDelete assertion setPendingDelete now includes children:[] (PR #1383 extended the pendingDelete type). The keyboard accessibility test at line 225 used exact object matching which omitted the new field, causing a failure after staging merged #1383. Issue: #1380 * fix(canvas): replace ' HTML entity with straight apostrophe JSX does not entity-decode ' — it renders the literal text "'" instead of "'". Found at line 157 (payment confirmed) and line 321 (empty org list). Replaced with a straight apostrophe, which JSX handles correctly. Ref: issue #1375 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: DevOps Engineer <devops@molecule.ai> Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * Merge pull request #1430 from fix/1421-saas-ssrf-helpers Issue #1421 / #1401: PR #1363 (handler split) moved isPrivateOrMetadataIP into a2a_proxy_helpers.go but kept the OLD pre-SaaS version — it unconditionally blocks RFC-1918 addresses, regressing the fix in commits `1125a02` / `cf10733`. The A2A proxy path now has the same SaaS-gated logic as registry.go: - Cloud metadata (169.254/16, fe80::/10, ::1) always blocked in both modes - RFC-1918 (10/8, 172.16/12, 192.168/16) + IPv6 ULA (fc00::/7) blocked in self-hosted, allowed in SaaS cross-EC2 mode - IPv6 addresses now properly checked (previous version returned false for all) Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(P0): CWE-22 path traversal in copyFilesToContainer + ContextMenu test Issue #1434 — CWE-22 Path Traversal Regression: PR #1280 (`dc218212`) correctly used cleaned path in tar header. PR #1363 (`e9615af`) regressed to using uncleaned `name`. Fix: use `clean` in filepath.Join AND add defence-in-depth escape check. Issue #1422 — ContextMenu Test Regression: PR #1340 expanded pendingDelete store type to include `children:[]`. Test assertion missing the field — add `children:[]` to match. Note: ssrf.go created (shared isSafeURL/isPrivateOrMetadataIP) to prepare for the handler-split refactor fix — current branch has no build error, but the shared file will prevent regression when PR #1363 is merged. isSafeURL/isPrivateOrMetadataIP retained in both files for now to avoid breaking callers while the split is finalized. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: resolve 3 go vet failures + add idempotency_key to delegate_task_async - workspace_provision_test.go: add missing mock := setupTestDB(t) to TestSeedInitialMemories_Truncation — mock was referenced but never declared, causing "undefined: mock" vet error - orgtoken/tokens_test.go: discard unused orgID return value with _ in Validate call — "declared and not used" vet error - a2a_tools.py: delegate_task_async now sends idempotency_key (SHA-256 of workspace_id + task) to POST /workspaces/:id/delegate, fixing duplicate task execution when an agent restarts mid-delegation (#1456) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: airenostars <airenostars@gmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Hongming Wang <hongmingwangrabbit@gmail.com> Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com> Co-authored-by: Molecule AI Community Manager <community-manager@agents.moleculesai.app> Co-authored-by: Molecule AI App-FE <app-fe@agents.moleculesai.app> Co-authored-by: Molecule AI Core-QA <core-qa@agents.moleculesai.app> Co-authored-by: DevOps Engineer <devops@molecule.ai> Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app> Co-authored-by: Molecule AI Dev Lead <dev-lead@agents.moleculesai.app>	2026-04-21 18:22:30 +00:00
rabbitblood	ce52b67d62	fix(build): add missing fmt import to a2a_proxy.go Build broken on main since `d86b8fe` — a2a_proxy.go uses fmt.Errorf() (8 call sites) but the import was dropped during an isSafeURL refactor merge. CI fails with "undefined: fmt" at lines 743-775. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 11:17:54 -07:00
molecule-ai[bot]	859d676f70	fix(CI): correct BASE in detect-changes (PR/push race); catch RuntimeError in conftest (#1473 ) - ci.yml: replace if/else BASE assignment with GITHUB_BASE_REF default + pull_request base.sha override pattern. Prevents push events from overwriting the correct PR base SHA when both events fire together. - conftest.py: catch RuntimeError in addition to ImportError when importing coordinator.py, which raises RuntimeError at import time when WORKSPACE_ID is not set (before the ImportError guard). Co-authored-by: Molecule AI Release Manager <release-manager@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 18:15:45 +00:00
Hongming Wang	5e130b7e6f	fix(e2e): delegation raw curl missing X-Molecule-Org-Id Section 10's delegation call is a raw curl (not tenant_call, because it carries an additional X-Source-Workspace-Id). It was missing X-Molecule-Org-Id, which TenantGuard requires — so the tenant 404'd every delegation probe despite section 8's A2A call (via tenant_call) working correctly. Repro: staging run 2026-04-21T17:40Z had section 8 green (PONG) and section 10 red (rc=22) on the same workspace. Only difference was the missing header. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 10:41:17 -07:00
Hongming Wang	b8b3d5ce1f	fix(e2e): MODEL_PROVIDER is provider:model slug, not just provider workspace/config.py:258 reads MODEL_PROVIDER as the full model string (format 'provider:model', e.g. 'anthropic:claude-opus-4-7'). My prior 'openai' alone got parsed as the model name → 404 model_not_found. Use 'openai:gpt-4o' and also set OPENAI_BASE_URL to api.openai.com (default was openrouter.ai which takes different key format). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 10:33:27 -07:00
Hongming Wang	392282c518	fix(e2e): set MODEL_PROVIDER=openai for Hermes runtime Hermes's provider resolver checks ANTHROPIC_API_KEY first (resolution order puts anthropic before openai). Without MODEL_PROVIDER=openai explicitly set, Hermes defaults to claude-sonnet-4-6 against the OpenAI endpoint and 404s with model_not_found. Staging E2E run 2026-04-21T17:24Z hit this after every earlier fix landed (workspace online, A2A ready) — last remaining blocker for the happy path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 10:24:58 -07:00
Hongming Wang	5be20ac1cf	fix(e2e): inject OPENAI_API_KEY into workspace secrets Workspace runtimes (hermes, langgraph, etc.) crash at boot with 'No provider API key found' when no ANTHROPIC_API_KEY / OPENAI_API_KEY / etc. is set. Harness previously sent no secrets → workspace sat in provisioning for 10 min → harness timed out. Console log from staging run 2026-04-21T17:08Z showed the exact crash: ValueError: No Hermes provider API key found. Set any one of: ANTHROPIC_API_KEY, HERMES_API_KEY, NOUS_API_KEY, OPENROUTER_API_KEY, OPENAI_API_KEY, ... Read E2E_OPENAI_API_KEY from env and inject into both parent and child workspace POST bodies via the secrets field (persists as workspace_secret, materialises into container env). Empty key falls through — dev can still run smoke tests, workspace just won't reach online. For CI, a new repo secret MOLECULE_STAGING_OPENAI_KEY needs to be added and passed as E2E_OPENAI_API_KEY in the workflow env. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 10:18:14 -07:00
molecule-ai[bot]	d86b8feb36	Merge pull request #1469 from Molecule-AI/fix/main-build-dedupe-ssrf fix(core): resolve main build — remove duplicate SSRF function declarations	2026-04-21 17:06:43 +00:00
Molecule AI Core Platform Lead	8f8be17db4	fix(core): resolve main build — remove duplicate SSRF function declarations Build on origin/main (`38e9eba`) will fail go build with duplicate function declarations: ssrf.go:15 isSafeURL redeclared (a2a_proxy.go:741) ssrf.go:58 isPrivateOrMetadataIP redeclared (a2a_proxy.go:795) ssrf.go:84 validateRelPath redeclared (templates.go:65) a2a_proxy.go:14 "fmt" imported and not used Root cause: main was fast-forwarded to a CWE-22 fix commit that incorporated ssrf.go from the staging handler-split (PR #1457), but ssrf.go declares isSafeURL/isPrivateOrMetadataIP that already exist in a2a_proxy.go, and validateRelPath that already exists in templates.go. Fix: - Delete ssrf.go entirely — its isSafeURL/isPrivateOrMetadataIP are already in a2a_proxy.go; its validateRelPath is in templates.go. - Remove unused "fmt" import from a2a_proxy.go. - Add t.Setenv cleanup in TestIsPrivateOrMetadataIP and TestIsSafeURL so MOLECULE_DEPLOY_MODE=saas from TestIsPrivateOrMetadataIP_SaaSMode cannot leak into sibling tests. - Update stale file-location comments in ssrf_test.go. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 17:03:36 +00:00
molecule-ai[bot]	38e9eba59a	fix(P0): CWE-22 path traversal in copyFilesToContainer + ContextMenu test Issue #1434 — CWE-22 Path Traversal Regression: PR #1280 (`dc218212`) correctly used cleaned path in tar header. PR #1363 (`e9615af`) regressed to using uncleaned `name`. Fix: use `clean` in filepath.Join AND add defence-in-depth escape check. Issue #1422 — ContextMenu Test Regression: PR #1340 expanded pendingDelete store type to include `children:[]`. Test assertion missing the field — add `children:[]` to match. Note: ssrf.go created (shared isSafeURL/isPrivateOrMetadataIP) to prepare for the handler-split refactor fix — current branch has no build error, but the shared file will prevent regression when PR #1363 is merged. isSafeURL/isPrivateOrMetadataIP retained in both files for now to avoid breaking callers while the split is finalized. Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 16:56:47 +00:00
molecule-ai[bot]	deeea0d2bb	research: add enterprise-case-study-pipeline-targeting-brief.md	2026-04-21 16:46:57 +00:00
molecule-ai[bot]	6f470d088c	research: add enterprise-case-study-legal-clearance-brief.md	2026-04-21 16:46:56 +00:00
molecule-ai[bot]	f376c83d07	research: add crewai-competitive-proof-points-brief.md	2026-04-21 16:46:55 +00:00
Hongming Wang	a14cf863d1	Merge pull request #1445 from Molecule-AI/fix/tenant-dockerfile-uid-conflict fix(tenant-image): remove node user so canvas uid 1000 can be created	2026-04-21 08:58:09 -07:00
Hongming Wang	3fe90d1a59	fix(tenant-image): remove node user so canvas uid 1000 can be created node:20-alpine ships with a `node` user at uid/gid 1000. The Dockerfile tried `addgroup -g 1000 canvas` which fails with exit 1 because 1000 is already taken. Publish-workspace-server-image workflow has been red for hours — tenant image :latest stuck on a digest that predates the X-Molecule-Admin-Token CPProvisioner fix. Staging workspace provisioning 401'd because the stale tenant binary never sent the admin header. Delete node user+group first (tolerant of future base-image changes that might not ship it), then create canvas at 1000/1000 as before. Mounted volumes continue to expect uid 1000. Repro: publish-workspace-server-image workflow run 24731870797: "process addgroup -g 1000 canvas && adduser... exit code: 1". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 08:57:47 -07:00
molecule-ai[bot]	a49a7e005e	chore: force Platform(Go) CI run on main — validate go vet clean Triggering platform job explicitly after Python Lint & Test fix (#1431). This ensures go vet runs on the current main HEAD (`4675402` pre-stop serialization + `f2583c2` ci-trigger). Co-Authored-By: PM <pm@molecule.ai>	2026-04-21 15:43:19 +00:00
molecule-ai[bot]	f2583c2d37	chore: PM-triggered CI re-run	2026-04-21 15:40:21 +00:00
Hongming Wang	81c4c02547	fix(e2e): safety-net teardown only sweeps this run's orgs Previously matched every e2e-YYYYMMDD-* slug, which stomped parallel CI runs AND manual dev probes against staging. Incident 2026-04-21 15:02Z: this workflow's safety net deleted an unrelated manual tenant 1s after it hit 'running', timing out the dev run at 15min. Scope to f'e2e-{today}-{GITHUB_RUN_ID}-' so each run only cleans its own leftovers. Empty run_id (local invocation) keeps the old broader behaviour so dev safety-nets still sweep. Also fix: the previous filter used o.get('status') which doesn't exist on the admin API response. Now reads instance_status (the real field). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 08:16:12 -07:00
Hongming Wang	e9d111dbc6	fix(e2e): send X-Molecule-Org-Id on tenant calls TenantGuard middleware on the tenant platform returns 404 (not 403, by design — avoid leaking tenant existence to org scanners) when requests lack X-Molecule-Org-Id matching MOLECULE_ORG_ID. Harness hit this on POST /workspaces (section 5) despite having a valid Authorization bearer. - Capture org_id from admin-create response - Send X-Molecule-Org-Id on every tenant_call Confirmed via manual repro 2026-04-21T14:56Z: curl with Bearer but no org-id header → 404; with both headers → expected route reached. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 07:59:25 -07:00
Hongming Wang	37a02d6f5a	fix(e2e): derive tenant domain from CP URL (staging vs prod) Previous hardcode `$SLUG.moleculesai.app` only matched prod. Staging tenants live at `$SLUG.staging.moleculesai.app`, so the harness hit DNS for a nonexistent host and timed out at section 4 even after provisioning succeeded. Derive from CP URL: api.X → X, staging-api.X → staging.X. Override via MOLECULE_TENANT_DOMAIN for self-hosted setups. Confirmed gap on manual run 2026-04-21T14:40Z: section 2 passed in 2min but section 4 timed out at 3min on the wrong hostname. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 07:46:16 -07:00
Hongming Wang	a510573172	fix(e2e): poll instance_status not status in staging harness /cp/admin/orgs exposes `instance_status` (COALESCE'd from org_instances.status), NOT a top-level `status` field. The harness polled the wrong field and always read empty → timed out at 15min on a tenant that had actually provisioned successfully (confirmed 2026-04-21T14:22Z: EC2 launched, canary ok, but harness never saw status=running). No code change to the admin API — the field has never been named `status`. The harness just had a typo that happened to type-check (the Go struct hasn't changed, only the sh/py polling was wrong). Now the harness correctly reads `instance_status` and the main provision poll loop terminates on the expected transition. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 07:40:03 -07:00
molecule-ai[bot]	4675402e58	feat(workspace): pre-stop serialization for pause/resume (closes #1386 ) Add a pre-stop hook that captures agent state before container exit and writes a scrubbed snapshot to /configs/.agent_snapshot.json. On restart, the snapshot is loaded and the adapter's restore_state() is called before the A2A server starts. - New lib/pre_stop.py: build_snapshot / write_snapshot / read_snapshot / delete_snapshot + _scrub_value deep-scrubber (uses lib.snapshot_scrub to redact API keys, tokens, and sandbox output before persisting) - BaseAdapter.pre_stop_state(): captures _executor._session_id and recent transcript_lines; overridden by adapters with richer in-memory state - BaseAdapter.restore_state(): stores snapshot fields as adapter attrs for create_executor() to pick up - main.py: calls pre_stop serialization in finally block (after server serves) and restore_state() after adapter setup, before server starts - Added 12 unit tests covering scrub, read/write, adapter integration Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 12:40:44 +00:00
molecule-ai[bot]	7dd66c91e0	Merge pull request #1355 from Molecule-AI/staging Merge staging → main: Phase 30 Canvas + workspace PLATFORM_URL Docker defaults Summary of changes: - Canvas: 100px proximity threshold for nest dialog (#1052), context menu delete flow, BudgetSection null guard - Workspace Python: Docker-aware PLATFORM_URL defaults (host.docker.internal:8080 / localhost:8080), WORKSPACE_ID required guard - E2E: context-menu delete regression spec - Docs: Phase 30 blog posts, guides, remote-workspaces FAQ, API reference Security fixes included from main: - CWE-22/CWE-78 path traversal + shell injection protection (PRs #1281/#1310) - SSRF whitelist in SaaS mode, IPv6 bypass fix (#1302/#1364) - HMAC slice truncation guard (#1339/#1352/#1354) - INCIDENT_LOG credential redaction (#1359) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 12:26:13 +00:00
Molecule AI SDK Lead	e9615af169	Merge origin/main into staging: resolve conflicts with main's test + security fixes Conflicts resolved (took main's versions): - canvas/src/app/__tests__/orgs-page.test.tsx (act() wrappers, PR #1350) - canvas/src/components/Canvas.tsx (100px proximity threshold, PR #1357) - canvas/src/components/__tests__/ContextMenu.keyboard.test.tsx (hasChildren fix) - workspace-server/internal/handlers/container_files.go (CWE-22/CWE-78 fixes, PRs #1281/#1310) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 12:25:42 +00:00
molecule-ai[bot]	3d639b53d8	fix(tests): resolve remaining compaction artefacts — ExpectExpectations, mockResolver.Scheme, largeContent (#1366 )	2026-04-21 12:15:41 +00:00
molecule-ai[bot]	51d6271ed4	fix(tests): update orgTokenValidateQuery mock — Validate reads 3 columns (#1366 )	2026-04-21 12:15:36 +00:00
molecule-ai[bot]	cefe4c9dea	fix(tests): resolve compaction artefacts — Validate returns 4 values (#1366 )	2026-04-21 12:15:30 +00:00
Molecule AI Community Manager	7395ed92f6	docs(assets): add Phase 30 token lifecycle card + canvas fleet mockup - token-lifecycle-card.png: 4-step remote agent token lifecycle (Register → Token Cached → Heartbeat 30s → Revoke). Dark zinc, purple #7C52FF - canvas-fleet-mockup.png: Canvas UI showing mixed Docker + REMOTE fleet, 2 REMOTE agents with purple badges. LinkedIn cut asset. - social-copy.md: updated asset table with actual file paths Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 12:12:17 +00:00
Molecule AI Community Manager	2f66f8f8cd	docs(tutorials): add Social Channels Quickstart Parallel Discord + Telegram setup guide, ~10 min to slash-command bot. Companion to Discord adapter launch. Cross-links Lark tutorial, social-channels.md, remote-agent tutorial. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 11:52:13 +00:00
Hongming Wang	6bd674e412	fix(e2e): CP DELETE /cp/admin/tenants body uses 'confirm', not 'confirm_token' Verified against live staging: the admin endpoint returns 400 'confirm field must equal the URL slug' when the body key is 'confirm_token'. Every workflow's safety-net teardown step + the main harness + the Playwright teardown all had the wrong key. Fixed all six call sites. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 04:50:28 -07:00
Molecule AI Community Manager	6322e91873	docs(marketing): update Discord adapter posting guide — Day 2 prep - Add Reddit r/LocalLlama + r/MachineLearning copy sources - Add full Hacker News post body + guidelines - Add dev.to full post body + frontmatter - Add Discord server #announcements copy - Add coordination checklist with [BLOG_URL] placeholder flag - Update PR/status references Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 11:50:24 +00:00
Hongming Wang	858498fdd6	Merge pull request #1392 from Molecule-AI/fix/saas-review-response fix(platform): SaaS follow-up — saasMode typo fall-closed + revoke-in-both-modes + test fixes	2026-04-21 04:49:02 -07:00
molecule-ai[bot]	e26e542888	fix(docs): correct platform and canvas domains in org-scoped API keys blog post platform.moleculeai.ai -> platform.moleculesai.app canvas.moleculeai.ai -> canvas.moleculesai.app Spotted during docs PR review cycle.	2026-04-21 11:42:15 +00:00
Molecule AI Core-BE	eaadf72e2d	fix(test): resolve 4 compile errors in workspace_provision_test.go Issue #1366: Handlers test package broken on main. Changes: - Wrap orphaned largeContent declarations in TestSeedInitialMemories_ContentOverLimit (was outside any function) - ExpectExpectations → ExpectationsWereMet (3 occurrences, sqlmock API) - mockEnvMutator.Register(interface{}) → Register(provisionhook.EnvMutator) to match pkg/provisionhook Registry.Register signature - mockResolver missing Scheme() method (SourceResolver interface req) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 11:39:48 +00:00
Molecule AI Community Manager	657d07a3d8	docs(assets): add Discord adapter hero image for Day 2 campaign 1200×630 PNG, Discord dark theme, slash command /ask flow. Companion asset for Discord adapter announcement. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 11:38:34 +00:00
Hongming Wang	d7193dfa34	feat(e2e): pivot to admin-bearer-only auth + add sanity self-check workflow Reduces required secret surface from 2 (session cookie + admin token) to 1 (admin token). Pairs with molecule-controlplane#202 which adds: - POST /cp/admin/orgs — server-to-server org creation - GET /cp/admin/orgs/:slug/admin-token — per-tenant bearer fetch With those endpoints live, CI doesn't need to scrape a browser WorkOS session cookie. CP admin bearer (Railway CP_ADMIN_API_TOKEN) drives provision + tenant-token retrieval + teardown through a single credential. Changes ------- test_staging_full_saas.sh: admin bearer for provision/teardown, fetched per-tenant token drives all tenant API calls. Added E2E_INTENTIONAL_FAILURE=1 toggle that poisons the tenant token after provisioning so the teardown path gets exercised when the happy-path isn't. canvas/e2e/staging-setup.ts: same pivot; exports STAGING_TENANT_TOKEN instead of STAGING_SESSION_COOKIE. canvas/e2e/staging-tabs.spec.ts: context.setExtraHTTPHeaders with Authorization: Bearer on every page request, no cookie handling. All three workflows (e2e-staging-saas, canary-staging, e2e-staging-canvas): drop MOLECULE_STAGING_SESSION_COOKIE env + verification step. One secret to set. NEW e2e-staging-sanity.yml: weekly Mon 06:00 UTC. Runs the harness with E2E_INTENTIONAL_FAILURE=1 and inverts the pass condition — rc=1 is green, rc=0 (unexpected success) or rc=4 (leak) open a priority-high issue labelled e2e-safety-net. This is the answer to 'how do we know the teardown path still works when nothing else has failed recently.' STAGING_SAAS_E2E.md refreshed: single-secret setup, sanity workflow documented, canvas workflow added to the coverage matrix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 04:34:11 -07:00
Molecule AI Community Manager	b95421609a	docs(audio): add TTS narration for audit chain verification explainer 94s MP3 narration + script for HMAC audit ledger blog post. Companion audio asset. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 11:22:03 +00:00
molecule-ai[bot]	1e6d66c6ae	fix(tests): resolve all compaction artefacts in handlers test package (#1366 ) - ExpectExpectations -> ExpectationsWereMet (3 occurrences) - Add Scheme() to mockResolver (satisfies plugins.SourceResolver interface) - Wrap orphan largeContent in TestSeedInitialMemories_Truncation	2026-04-21 11:21:26 +00:00
Hongming Wang	8065d7ef03	fix(orgtoken): update Validate test mock to include org_id column Validate now SELECTs id/prefix/org_id; the test mock row only had two columns, so the actual query against sqlmock errored with 'invalid or revoked org api token' at runtime (the row couldn't Scan). Add org_id to the mocked row and assert it propagates to the 4th return value. This is a test-only change — the production code path already had the third column selected; CI was the canary. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 04:20:47 -07:00
molecule-ai[bot]	cc290c3255	fix(tests): add org_id to orgTokenValidateQuery mock — Validate reads 3 columns (#1366 )	2026-04-21 11:20:37 +00:00
molecule-ai[bot]	8dde18bc61	fix(tests): add orgID to Validate unpack — Validate returns 4 values (#1366 )	2026-04-21 11:19:59 +00:00
Hongming Wang	f4700858ac	feat(e2e): canary + canvas Playwright workflows; delegation mechanics Three additions on top of `187a9bf`: 1. Canary (.github/workflows/canary-staging.yml) 30-min cron that runs the full-SaaS harness in E2E_MODE=canary: one hermes workspace + one A2A PONG + teardown. ~8-min wall clock vs ~20-min for the full run. Alerting is self-contained: opens a single 'Canary failing' issue on first failure, comments on subsequent failures (no issue spam), auto-closes the issue on the next green run. Labels: canary-staging, bug. Safety-net teardown step sweeps e2e-YYYYMMDD-canary-* orgs tagged today so a runner cancel can't leak EC2. 2. Canvas Playwright (canvas/e2e/staging-*.ts + playwright.staging.config.ts + .github/workflows/e2e-staging-canvas.yml) staging-setup.ts provisions a fresh org + hermes workspace (same lifecycle as the bash harness, just in TypeScript). staging-tabs.spec.ts clicks through all 13 workspace-panel tabs (chat, activity, details, skills, terminal, config, schedule, channels, files, memory, traces, events, audit) and asserts each renders without crashing and without 'Failed to load' error toasts. Known SaaS gaps (Files empty, Terminal disconnects, Peers 401) are documented in #1369 and whitelisted so they don't fail the test — the gate is 'no hard crash', not 'no issues'. staging-teardown.ts deletes the org via DELETE /cp/admin/tenants/:slug. playwright.staging.config.ts separates staging from local tests so pnpm test in dev doesn't try to provision against staging. Retries=2 and timeouts are longer; workers=1 because the setup provisions one shared workspace. Workflow uploads HTML report + screenshots on failure for 14 days. 3. Delegation mechanics (tests/e2e/test_staging_full_saas.sh section 10) Parent → child proxy test: POST /workspaces/CHILD/a2a with X-Source-Workspace-Id=PARENT and verify the child responds + child activity log captures PARENT as source. Intentionally LLM-free: the mechanics regression is what matters; prompt-driven delegation correctness belongs in canvas-driven tests. Also reorders teardown step to 11/11 since delegation is 10/11. Mode gating: E2E_MODE=canary -> skips child workspace, HMA memory, peers, activity, delegation (steps 6, 9, 10 no-op). Full-lifecycle still runs every piece. Validated both paths via 'bash -n' syntax check after each edit. Secrets requirement unchanged (same two secrets as `187a9bf`): MOLECULE_STAGING_SESSION_COOKIE, MOLECULE_STAGING_ADMIN_TOKEN. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 04:15:10 -07:00
molecule-ai[bot]	00bd73f8c8	fix(canvas): a11y fixes + budget_used TypeScript guard + orgs-page test fix (#1367 ) * fix(canvas/a11y): mark StatusDot as aria-hidden — decorative element StatusDot is purely decorative; the status is already conveyed via aria-label on parent elements (WorkspaceNode, SidePanel header, etc.). Marking it aria-hidden="true" prevents screen readers from announcing the empty div as "img" with no alt text. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): guard budget_used optional field with ?? 0 in progress calc TypeScript error in CI: 'budget.budget_used' is possibly 'undefined' when used in the progress percentage calculation. The field is optional per BudgetData interface, so ?? 0 is the correct guard. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/a11y): Tooltip keyboard focus support + ARIA role - Add role="tooltip" + unique id so assistive tech can find tooltip content - Add aria-describedby on trigger so screen readers announce tooltip text - Add onFocus/onBlur handlers so keyboard users (Tab navigation) can see tooltips that mouse users see on hover Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): restore advanceTimersByTime pattern in orgs-page error test waitFor() + fake timers (vi.useFakeTimers in beforeEach) cause race conditions: the 5s polling timeout fires before React state updates flush. Restores the established pattern used by all other tests in this file: advanceTimersByTimeAsync(50) + runAllTimersAsync(). Also removes the now-unused waitFor import. Ref: PRs #1360, #1345 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 11:08:24 +00:00
Molecule AI Community Manager	e20ec33d33	docs(blog): add audit chain verification explainer HMAC-SHA256 immutable ledger architecture + PR #1339 panic fix. Companion to org-scoped API keys post. Enterprise/compliance audience. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 11:08:01 +00:00
Molecule AI Community Manager	9ef87a4f1e	docs(devrel): add Phase 30 hero video — 3 aspect ratio cuts Primary (16:9), social (9:16), and LinkedIn (1:1) cuts. 47.95s, 30fps H.264, dark zinc theme, burn-in captions, VO track. Assembled from: - marketing/assets/phase30-fleet-diagram.png - marketing/audio/phase30-video-vo.mp3 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 11:04:27 +00:00
Hongming Wang	187a9bf87a	feat(e2e): staging full-SaaS workflow — per-run org provision + leak-free teardown Dedicated CI/CD lane that exercises the whole SaaS cross-EC2 shape end to end, against live staging: 1. Accept terms / create org (POST /cp/orgs) — catches ToS gate, slug validation, billing/quota, member insert regressions. 2. Wait for tenant EC2 + cloudflared tunnel + TLS propagation (up to 15 min cold). 3. Provision a parent + child workspace via the tenant URL. 4. Wait both online (exercises the SaaS register + token bootstrap flow fixed in #1364). 5. A2A round-trip on parent — validates the full LLM loop (MCP tools, provider auth, JSON-RPC response shape, proxy SSRF gate). 6. HMA memory write + read — validates awareness namespace + scope routing. 7. Peers + activity smoke — route-registration regression guard. 8. Teardown via DELETE /cp/admin/tenants/:slug + leak assertion — a leaked org at teardown fails CI with exit 4. Why a dedicated workflow (not folded into ci.yml): - ~20 min wall clock per run (EC2 boot is the long pole). Too slow for every PR push. - Needs its own concurrency group (staging has an org-create quota and two overlapping runs would race on slug prefix). - Distinct secret surface (session cookie + admin bearer) — keep it off PR jobs that don't need them. Triggers: push to main (provisioning-critical paths only), PRs on the same paths, manual workflow_dispatch (with runtime + keep_org inputs), and 07:00 UTC nightly cron for drift detection. Belt-and-braces teardown: the script installs an EXIT trap, and the workflow has an always()-step that greps e2e-YYYYMMDD-* orgs created today and force-deletes them via the idempotent admin endpoint. Covers the case where GH cancels the runner before the trap fires. Docs: tests/e2e/STAGING_SAAS_E2E.md — what's covered, how to provision the two required secrets, local-dev notes, cost (~$0.007/run), known gaps (canvas UI + delegation + claude-code). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 03:54:09 -07:00

1 2 3 4 5 ...

1382 Commits