molecule-core

Author	SHA1	Message	Date
Hongming Wang	d46d558ca9	Merge pull request #2148 from Molecule-AI/test/canvas-lib-utils-runtime-names-1815 test(canvas): cover utils.cn + runtime-names.runtimeDisplayName (0% → 100%) (#1815)	2026-04-27 06:57:57 +00:00
Hongming Wang	e5e4eb4d2a	test(canvas): cover canvas-actions restart-pending helpers (25% → 100%) (#1815 ) [Molecule-Platform-Evolvement-Manager] Continues the #1815 coverage rollup. canvas-actions.ts was at 25% in the baseline run from #2147; this PR brings the file's two helpers to full coverage. 5 cases: markAllWorkspacesNeedRestart (3): - calls updateNodeData on every node with `{needsRestart: true}` - no-op when the canvas has zero workspaces - preserves call ordering — matters because the toolbar's Restart Pending pill observes per-node data changes incrementally; a refactor that shuffled iteration order would silently change which workspaces flash first markWorkspaceNeedsRestart (2): - targeted call: updateNodeData fires exactly once on the named id - defensive: regardless of how many other workspaces exist in the store, only the target workspace gets updated. Pre-this-test, a refactor that accidentally wired this function through the per-node iteration path of markAll would silently mark every workspace — pinning the cardinality here catches that. ## Mock strategy Standard pattern for canvas store: mock useCanvasStore as both the selector function AND a getState()-bearing object. updateNodeData is a vi.fn() spy so the test asserts on calls + args directly. ## Test plan - [x] All 5 cases pass locally (~132ms) - [x] No SUT changes — pure additive coverage - [ ] CI green ## #1815 progress - [x] Step 1+2: instrumentation + script (#2147) - [x] utils.ts + runtime-names.ts (#2148) - [x] canvas-actions.ts (this PR) - [ ] Remaining low-coverage targets: store/classNames.ts (17%), store/canvas.ts (73% — largest absolute gap by lines) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:47:49 -07:00
Hongming Wang	bfbbe57610	test(canvas): cover utils.cn + runtime-names.runtimeDisplayName (0% → 100%) (#1815 ) [Molecule-Platform-Evolvement-Manager] Closes two of the 0%-coverage files surfaced by the baseline run in PR #2147 (vitest coverage instrumentation). Both files are tiny utility helpers with high-touch read paths. ## utils.cn (8 cases) Wraps `twMerge(clsx(inputs))` — every conditionally-styled component flows through here. The load-bearing case is the last-wins Tailwind dedup: `cn("p-2", "p-4")` → "p-4". A regression that lost twMerge would silently double-apply utilities (cosmetically broken, breaks `:where()` rules + theme overrides). Cases: - single class unchanged - multiple positional classes joined - array input flattening (clsx) - object syntax with truthy/falsy keys - last-wins dedup on conflicting Tailwind utilities (the regression-locked guarantee) - non-conflicting utilities both survive (p-2 + m-4) - mixed input shapes (string + array + object + string) - nullish / empty inputs don't throw ## runtime-names.runtimeDisplayName (4 it.each cases + 3 it()) Friendly-name lookup that surfaces the workspace runtime in the chat indicator, details tab, and a few component labels. Cases: - known runtimes map to display strings (claude-code → Claude Code, langgraph → LangGraph, etc.) - unknown runtime falls back to input string verbatim (a NEW runtime not yet in the lookup still renders something operator-debuggable rather than a generic placeholder) - empty string falls back to "agent" (final default) - case-sensitivity pinned: "Claude-Code" / "LANGGRAPH" miss the lookup. The upstream slug is already normalized lowercase, so a future refactor that lowercases input "for safety" would silently change behavior — pinning the contract here. ## Test plan - [x] All 17 cases pass locally (~129ms) - [x] No SUT changes — pure additive coverage - [ ] CI green ## #1815 progress - [x] Step 1+2: coverage instrumentation + script (#2147) - [x] 0%-file gaps utils.ts + runtime-names.ts (this PR) - [ ] More 0%/low-coverage files: lib/canvas-actions.ts (25%), store/classNames.ts (17%) — separate PRs - [ ] Step 3b: thresholds + CI gate once baseline catches up 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:45:51 -07:00
Hongming Wang	ccb961a17b	Merge pull request #2096 from Molecule-AI/refactor/remove-canvas-hermes-runtime-profile-2054 refactor(canvas): remove RUNTIME_PROFILES.hermes — value flows server-side (#2054 phase 3)	2026-04-26 22:05:42 +00:00
Hongming Wang	0ce537750c	fix(ci): handle merge_group + shallow-clone BASE in secret-scan [Molecule-Platform-Evolvement-Manager] ## What was breaking Two distinct failure modes in `.github/workflows/secret-scan.yml`, both visible after PR #2115 / #2117 hit the merge queue: 1. `merge_group` events: the script reads `github.event.before / after` to determine BASE/HEAD. Those properties only exist on `push` events. On `merge_group` events both came back empty, the script fell through to "no BASE → scan entire tree" mode, and false-positived on `canvas/src/lib/validation/__tests__/secret-formats.test.ts` which contains a `ghp_xxxx…` literal as a masking-function fixture. (Run 24966890424 — exit 1, "matched: ghp_[A-Za-z0-9]{36,}".) 2. `push` events with shallow clone: `fetch-depth: 2` doesn't always cover BASE across true merge commits. When BASE is in the payload but absent from the local object DB, `git diff` errors out with `fatal: bad object <sha>` and the job exits 128. (Run 24966796278 — push at 20:53Z merging #2115.) ## Fixes - Add a dedicated fetch step for `merge_group.base_sha` (mirrors the existing pull_request base fetch) so the diff base is in the object DB before `git diff` runs. - Move event-specific SHAs into a step `env:` block so the script uses a clean `case` over `${{ github.event_name }}` instead of a single `if pull_request / else push` that left merge_group on the empty branch. - Add an on-demand fetch for the push-event BASE when it isn't in the shallow clone, plus a `git cat-file -e` guard before the diff so we fall through cleanly to the "scan entire tree" path if the fetch fails (correct, just slower) instead of exiting 128. ## Defense-in-depth `secret-formats.test.ts` had two literal continuous-string fixtures (`'ghp_xxxx…'`, `'github_pat_xxxx…'`). The ghp_ one matched the secret-scan regex. Switched both to the `'prefix_' + 'x'.repeat(N)` pattern already used elsewhere in the same file — runtime value is the same, but the literal source text no longer matches the regex even if the BASE detection ever falls back to tree-scan mode again. ## Test plan - [x] No remaining regex matches in the secret-formats.test.ts source - [x] YAML structure preserved - [ ] CI passes on this PR's pull_request scan (was already passing) - [ ] CI passes on this PR's merge_group scan (the new path) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 14:08:19 -07:00
rabbitblood	b8f24e93da	merge: sync staging into refactor/remove-canvas-hermes-runtime-profile-2054 (pickup #2099+#2107 TLS fixes)	2026-04-26 12:12:51 -07:00
rabbitblood	756aa00e1f	refactor(canvas): remove RUNTIME_PROFILES.hermes — value flows server-side now (#2054 phase 3) Closes the canvas-side loop on #2054. Phases 1+2 plumbed provision_timeout_ms from template manifest → workspace API → canvas socket → node-data → ProvisioningTimeout resolver. The template-hermes manifest declares provision_timeout_seconds: 720 (filed as a separate template-repo PR). With that flow live, the canvas-side hardcoded RUNTIME_PROFILES.hermes entry is redundant. Removed: - RUNTIME_PROFILES.hermes (was 720000ms hardcoded in canvas/src/lib/runtimeProfiles.ts) Doc updates: - RUNTIME_PROFILES jsdoc explains the map is now empty by design — new runtimes that need a non-default cold-boot threshold should declare runtime_config.provision_timeout_seconds in their template manifest, NOT add an entry here. Tests updated (3): - "returns hermes override when runtime = hermes" → "hermes returns default — value moved server-side post-#2054 phase 3". Asserts RUNTIME_PROFILES.hermes is undefined. - The two server-override tests now compare against DEFAULT_RUNTIME_PROFILE since hermes no longer has a profile entry. 19/19 pass locally. The end-state for hermes: workspace-server reads template manifest at request time → workspace API includes provision_timeout_ms: 720000 → canvas hydrate populates node.data.provisionTimeoutMs → ProvisioningTimeout resolver picks it up via overrides. Same effective threshold (720s), now declarative and one-edit-point per runtime. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 07:12:44 -07:00
Hongming Wang	8543bae83f	Merge branch 'staging' into fix/canvas-multilevel-layout-ux	2026-04-26 00:36:54 -07:00
Hongming Wang	5e36c6638c	feat(platform,canvas): classify "datastore unavailable" as 503 + dedicated UI User reported the canvas threw a generic "API GET /workspaces: 500 {auth check failed}" error when local Postgres + Redis were both down. Two problems: 1. The error code (500) and message ("auth check failed") said nothing useful. The actual condition was "platform can't reach its datastore to validate your token" — a Service Unavailable class, not Internal Server Error. 2. The canvas had no way to distinguish infra-down from a real auth bug, so it rendered the raw API string in the same generic-error overlay it uses for everything. Fix in two layers: Server (wsauth_middleware.go): - New abortAuthLookupError helper centralises all three sites that previously returned `500 {"error":"auth check failed"}` when HasAnyLiveTokenGlobal or orgtoken.Validate hit a DB error. - Now returns 503 + structured body `{"error": "...", "code": "platform_unavailable"}`. 503 is the correct semantic ("retry shortly, infra is unavailable") and the code field is the contract the canvas reads. - Body deliberately excludes the underlying DB error string — production hostnames / connection-string fragments must not leak into a user-visible error toast. Canvas (api.ts): - New PlatformUnavailableError class. api.ts inspects 503 responses for the platform_unavailable code and throws the typed error instead of the generic "API GET /…: 503 …" message. Generic 503s (upstream-busy, etc.) keep the legacy path so existing busy-retry UX isn't disrupted. Canvas (page.tsx): - New PlatformDownDiagnostic component renders when the initial hydration catches PlatformUnavailableError. Surfaces the actual condition with operator-actionable copy ("brew services start postgresql@14 / redis") + pointer to the platform log + a Reload button. Tests: - Go: TestAdminAuth_DatastoreError_Returns503PlatformUnavailable pins the response shape (status, code field, no DB-error leak) - Canvas: 5 tests for PlatformUnavailableError classification — typed throw on 503+code match, generic-Error fallback for 503-without-code (upstream busy), 500 stays generic, non-JSON body falls back to generic. 1015 canvas tests + full Go middleware suite pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 00:01:56 -07:00
Hongming Wang	5a3dbb95e1	fix(api): probe /cp/auth/me before redirecting on 401 The actual cause-fix for the staging-tabs E2E saga (#2073/#2074/#2075). Old behaviour: ANY 401 from any fetch on a SaaS tenant subdomain called redirectToLogin → window.location.href = AuthKit. This is wrong. Plenty of 401s don't mean "session is dead": - workspace-scoped endpoints (/workspaces/:id/peers, /plugins) require a workspace-scoped token, not the tenant admin bearer - resource-permission mismatches (user has tenant access but not this specific workspace) - misconfigured proxies returning 401 spuriously A single transient one of those yanked authenticated users back to AuthKit. Same bug yanked the staging-tabs E2E off the tenant origin mid-test for 6+ hours tonight, leading to the cascade of test-side mocks (#2073/#2074/#2075) that worked around the symptom without fixing the cause. This PR fixes it at the source. The new logic: - 401 on /cp/auth/* path → that IS the canonical session-dead signal → redirect (unchanged) - 401 on any other path with slug present → probe /cp/auth/me: probe 401 → session genuinely dead → redirect probe 200 → session fine, endpoint refused this token → throw a real Error, caller renders error state probe network err → assume session-fine (conservative) → throw real Error - slug empty (localhost / LAN / reserved subdomain) → throw without redirect (unchanged) The probe adds one extra fetch on a 401, only when slug is set and the path isn't already auth-scoped. That's rare and worthwhile — a transient probe round-trip is cheap; an unwanted auth redirect is a UX disaster. Tests: - api-401.test.ts rewritten with the full matrix: * /cp/auth/me 401 → redirect (no probe, that IS the signal) * non-auth 401 + probe 401 → redirect * non-auth 401 + probe 200 → throw, no redirect ← the fix * non-auth 401 + probe network err → throw, no redirect * empty slug paths (localhost/LAN/reserved) → throw, no probe - 43 tests in canvas/src/lib/__tests__/api*.test.ts all pass - tsc clean The staging-tabs E2E spec's universal-401 route handler stays as defense-in-depth (silences resource-load console noise + guards against panels without try/catch), but the comment now describes its role honestly: api.ts is the primary fix, the route is the safety net. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-25 23:49:28 -07:00
Hongming Wang	06c85bd185	Merge pull request #2045 from Molecule-AI/feat/flat-rate-pricing-1833 feat(canvas): flat-rate pricing — rename Starter→Team, Pro→Growth (Issue #1833)	2026-04-25 05:54:06 +00:00
Hongming Wang	5adc8a74d5	feat(canvas+org): env preflight, EmptyState parity, shared useTemplateDeploy hook Builds on #2061. Three internally-cohesive sub-features; easiest to read in order. ## 1. Org-level env preflight Server - `OrgTemplate` + `OrgWorkspace` gain `required_env: string[]` and `recommended_env: string[]` YAML fields. - `GET /org/templates` walks the tree and returns the tree-union (deduped, sorted) of both. `collectOrgEnv` dedup prefers required when the same key is declared at both tiers. - `POST /org/import` preflights against `global_secrets` WHERE `octet_length(encrypted_value) > 0` (empty-value rows used to be counted as "configured" and the per-container preflight still failed at start time). 412 Precondition Failed + `missing_env` list when required keys are absent. `force=true` bypasses with an audit log line. DB lookup failure now returns 500 (was: silent fall-through that defeated the guard). Env-var NAMES validated against `^[A-Z][A-Z0-9_]{0,127}$` so a malicious template can't ship pathological names into the UI or DB. Canvas - New `OrgImportPreflightModal`: red "Required" section (blocking) and yellow "Recommended" section (non-blocking, import stays enabled, shows live missing-count next to the Import button). - Per-key password input → `PUT /settings/secrets` → strike-through on save. Functional `setDrafts` throughout (no stale-closure clobbers on rapid successive saves). `useEffect` seed keyed on a sorted-join string signature so a parent re-render with a new array identity doesn't clobber typed inputs. - `TemplatePalette.handleImport` branches: zero env declarations → straight to import; any declarations → fetch configured global secret keys, open the modal. Tests (Go): `TestCollectOrgEnv_*` (5) cover union-across-levels, required-wins-over-recommended (including same-struct), dedup, empty, invalid-name rejection. ## 2. EmptyState parity with TemplatePalette The "Deploy your first agent" grid used to call `POST /workspaces` with no preflight while the sidebar palette ran `checkDeploySecrets` + `MissingKeysModal` first. Same template deployed two different ways → first-run users saw containers boot in `failed` state without guidance. Now both surfaces share one preflight + modal handshake. EmptyState's previous `interface Template` dropped `runtime`, `models`, and `required_env` — silently discarding exactly the fields the preflight needs. `Template` now lives in `deploy-preflight.ts` and is imported from there by both surfaces. ## 3. useTemplateDeploy hook With the preflight + modal wiring now duplicated across EmptyState + TemplatePalette + (going forward) any third surface, extracted the pattern into `canvas/src/hooks/useTemplateDeploy.tsx`: const { deploy, deploying, error, modal } = useTemplateDeploy({ canvasCoords: ..., // optional, default random onDeployed: (id) => ..., }); Closes three drift surfaces that the duplication had created: - `resolveRuntime` id→runtime fallback table (moved to `deploy-preflight.ts`). EmptyState had a narrower fallback that would have silently disagreed with the palette on any future id needing a non-identity mapping. - `checkDeploySecrets` call signature. One owner. - `MissingKeysModal` JSX wiring. One owner. Narrow try/catch around `checkDeploySecrets` so a preflight network failure clears `deploying` and surfaces via `setError` instead of stranding the button forever. `modal: ReactNode` (not a `renderModal()` function) — the previous memoization bought nothing since consumers called it inline every render. Named `MissingKeysInfo` interface for the state shape. ## 4. Viewport auto-fit user-pan gate fix During org deploy the canvas was meant to pan+zoom to follow each arriving workspace (`molecule:fit-deploying-org` event → debounced fitView). In practice the fit stayed stuck on wherever the first fit landed. Root cause: React Flow v12 fires `onMoveEnd` with a truthy `event` at the END of a programmatic `fitView` animation. The original "respect-user-pan" gate stamped `userPannedAtRef` in `onMoveEnd`, so our own fit completing looked like a user pan, and every subsequent auto-fit short-circuited for the rest of the deploy. Fix: stop trusting `onMoveEnd` for user-intent detection. Register explicit `wheel` + `pointerdown` listeners on `document` with capture phase and `target.closest('.react-flow__pane')` filter. Capture-phase immunity to `stopPropagation`; pane-filter rejects toolbar / modal / side-panel clicks (the old `window` fallback caught those). `onMoveEnd` simplified to only drive the debounced viewport save. Also: fit event dispatched on root arrivals (not just children), so the canvas centers on the just-landed root immediately instead of waiting ~2s for the first child. Animation 600ms → 400ms so successive per-arrival fits don't pile up visually. End-state fit stays at 1200ms — intentional asymmetry ("settling" vs "tracking"), documented in code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 15:15:33 -07:00
Hongming Wang	0b237ed9dd	refactor(canvas): extract runtime profiles to @/lib/runtimeProfiles Preparation for a "hundreds of runtimes" plugin ecosystem. Keeping the runtime-specific UX knobs in-line inside ProvisioningTimeout scales badly — every new runtime would require editing a component, not just adding a table entry. Other components (create-workspace dialog, workspace card tooltips, etc.) will want the same runtime metadata. Changes: - New file `canvas/src/lib/runtimeProfiles.ts` owns: * `RuntimeProfile` type — structural shape, every field optional so new runtimes can partially-fill without breaking consumers. * `DEFAULT_RUNTIME_PROFILE` — 2-min default floor (docker-fast). * `RUNTIME_PROFILES` — named overrides (currently: hermes 12 min). * `WorkspaceRuntimeOverrides` — interface for server-provided per-workspace overrides, so operators can tune via template manifest / workspace metadata without a canvas release. * `getRuntimeProfile()` — resolver with overrides → profile → default priority. * `provisionTimeoutForRuntime()` — convenience wrapper. - `ProvisioningTimeout.tsx` now delegates to the profile module. `DEFAULT_PROVISION_TIMEOUT_MS` re-exported for legacy test importers. - Tests: 16/16 (up from 9 before the first fix). Adds pinning for: * overrides > profile > default priority chain * "every entry in RUNTIME_PROFILES resolves to a number" contract * backward-compat export Adding a new slow runtime is now one table entry in `canvas/src/lib/runtimeProfiles.ts` with a mandatory `WHY` comment. Moving to server-driven profiles later is a ~10-line change (the resolver already threads WorkspaceRuntimeOverrides through). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 11:48:39 -07:00
Molecule AI Marketing Lead	de19cf9bae	fix(canvas): apply flat-rate pricing copy for Phase 34 launch (Issue #1833 ) Rename "Starter" → "Team", update tagline + pricing page hero copy to lead with flat-rate per-org positioning — deliberate wedge against Cursor/Windsurf per-seat pricing ($40/seat vs $29/org). PMM decision: Issue #1833. Approved by Marketing Lead 2026-04-24. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-24 17:54:23 +00:00
Hongming Wang	0ef5dad1b1	Merge pull request #1993 from Molecule-AI/fix/auth-redirect-loop-regression-tests test(auth): add regression tests for redirect loop guards	2026-04-24 06:57:12 +00:00
Hongming Wang	8c80175cd8	fix(canvas): subtree-aware layout + org-import reliability + UX polish Five tightly-related fixes surfaced while stress-testing org-template imports (Legal Team, Molecule Company, etc.) on a running control plane: 1) Org import was silently failing — INSERT wrote `collapsed` into the `workspaces` table but that column lives on `canvas_layouts` (005_canvas_layouts.sql). Every import returned 207 with 0 rows created, which `api.post` treated as success → green "Imported" toast + empty canvas. Moved the write to canvas_layouts; updated the workspace_crud PATCH path to UPSERT there too; refreshed the test mock. Added a client-side assertion that throws on 2xx-with-`error`-body so future partial-failures surface a red toast rather than lying about success. 2) Multi-level nested layout was collision-prone: children that were themselves parents (CTO → Dev Lead → 6 engineers) got the same leaf-sized grid slot as leaf siblings and clipped into each other. Added post-order `sizeOfSubtree` + sibling-size-aware `childSlotInGrid` on both the Go server and the TS client (kept in sync). `buildNodesAndEdges` now uses subtree sizes for both parent dimensions and the rescue heuristic. `setCollapsed` on expand now reads each child's actual rendered width/height instead of the leaf-count formula — a regression test covers the CTO/Dev Lead scenario. 3) Provisioning-timeout banner was unusable during large imports: a 30-workspace tree triggered 27 simultaneous "stuck" warnings 2 minutes in (server paces + provision concurrency = 3 guarantee tail items legitimately wait longer). Scaled threshold with concurrent count (base + 45s per queue slot beyond concurrency) and added a Dismiss (×) button per banner. 4) Auto pan-and-zoom on org ready: after the last workspace flips out of `provisioning`, canvas now fitView's with a 1.2s animation, 0.25 padding, `maxZoom: 0.8` and `minZoom: 0.25`. Without the zoom caps fitView was hitting the component's maxZoom=2 on small trees and zooming in instead of out. 5) Toolbar was visually busy: `+ N sub` count wrapped onto a second row on narrow viewports; status dot and workspace total were in separate border-delimited cells. Merged into one segment with `whitespace-nowrap`; A2A / Audit / Search / Help collapsed to icon-only 28px buttons with tooltip + aria-label (Figma/Linear pattern). Stop All / Restart Pending keep text — they're urgent. Also: - `api.{get,post,...}` accept an optional `{ timeoutMs }` so callers that hit intentionally-slow endpoints (org import paces 2s between siblings) don't trip the 15s default and report false aborts. - `WorkspaceNode` clamps role text to 2 lines so verbose descriptions don't unboundedly grow card height and break the grid. - `PARENT_HEADER_PADDING` bumped 44→130 to clear name + runtime + 2-line role + the currentTask banner that appears during the initial-prompt phase. Tests: 930 canvas tests + full Go handler suite pass. Added regressions for (i) 207 partial-success surfacing as throw, and (ii) setCollapsed sizing with nested-parent children. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 23:48:29 -07:00
Molecule AI Core-FE	e9be12210f	test(auth): add regression tests for redirect loop guards AuthGate now skips session fetch for /cp/auth/* paths, and redirectToLogin guards against re-setting window.location when already on an auth path. Both guards had no test coverage — a future refactor could silently reintroduce the redirect loop. Added: - AuthGate.test.tsx: 2 cases covering /cp/auth/login and /cp/auth/signup path skipping (no fetchSession call, no redirectToLogin call, children rendered) - auth.test.ts: 2 cases covering redirectToLogin early return for /cp/auth/login and /cp/auth/signup paths Fixes: Molecule-AI/molecule-core#1541 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-24 06:30:35 +00:00
Hongming Wang	f2a4b6e0d3	fix: dev-mode bypass for IP rate limiter + 429 retry on GET The 600-req/min/IP bucket is sized for SaaS where each tenant has a distinct client IP. On a local Docker setup every panel shares one IP — hydration (/workspaces + /templates + /org/templates + /approvals/pending) plus polling (A2A overlay + activity tabs + approvals + schedule + channels + audit trail) can burst past the bucket inside a minute, blanking the canvas with 429s. The user reported it after dragging workspaces — dragging itself is release-only (savePosition in onNodeDragStop), but the polling that's always running added onto startup tripped the limit. Two-layer fix: Server: RateLimiter.Middleware short-circuits when isDevModeFailOpen is true (MOLECULE_ENV=development + empty ADMIN_TOKEN), matching the Tier-1b hatch already applied to AdminAuth, WorkspaceAuth, and discovery. SaaS production keeps the bucket. Client: api.ts auto-retries a single 429 on idempotent GET requests, waiting the server-provided Retry-After (capped at 20s). Mutations (POST/PUT/PATCH/DELETE) never auto-retry to avoid double-applying. Users on SaaS hitting a legitimate rate-limit spike get one transparent recovery instead of an immediately-blank Canvas. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 20:44:09 -07:00
Hongming Wang	dc50a1c775	refactor(canvas): data-drive provider picker from template config.yaml The MissingKeysModal's provider list was hardcoded in deploy-preflight.ts as RUNTIME_PROVIDERS — a per-runtime map that duplicated what each template repo already declares in its config.yaml. That meant adding a new provider required changes in two places, and the UI could drift out of sync with the actual template (e.g. when a template adds a MiniMax or Kimi model, the picker wouldn't know). The single source of truth for "which env vars does this workspace need" is each template's config.yaml: * `runtime_config.models[].required_env` — per-model key list * `runtime_config.required_env` — runtime-level AND list Go /templates already returned `models`. This change: * Adds `required_env` alongside `models` on templateSummary so the canvas receives the full picture. * Rewrites deploy-preflight.ts to derive ProviderChoice[] from a template object via `providersFromTemplate(template)`: - groups `models[]` by unique required_env tuple - falls back to runtime_config.required_env when models is empty - decorates labels with model counts (e.g. "OpenRouter (14 models)") * `checkDeploySecrets(template, workspaceId?)` now takes a template object instead of a runtime string. Any-provider satisfaction still short-circuits preflight to ok=true. * MissingKeysModal receives `providers` directly; no more lookups. * TemplatePalette threads `template.models` + `template.required_env` into the preflight. Side effects: * Claude Code's dual-auth (OAuth token OR Anthropic API key) now surfaces as two picker options — its config.yaml already declared both, the UI just wasn't reading them. * Hermes picker now shows 8 provider options (Nous, OpenRouter, Anthropic, Gemini, DeepSeek, GLM, Kimi, Kilocode) instead of the hand-picked 3, matching its 35-model reality. Removed the legacy RUNTIME_PROVIDERS / RUNTIME_REQUIRED_KEYS / getRequiredKeys / findMissingKeys exports; MissingKeysModal.test.tsx deleted (its coverage is subsumed by the new template-driven deploy-preflight.test.ts). 58 modal-adjacent tests pass; full canvas suite 919 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 17:07:15 -07:00
Hongming Wang	baa7e1531f	feat(canvas): provider-picker MissingKeysModal for multi-provider runtimes Runtimes like Hermes and LangGraph accept any one of several LLM provider keys (OpenRouter OR OpenAI OR Anthropic OR Nous-native). Before this change, the missing-keys modal treated all supported providers as simultaneously required — a fresh user on Hermes was asked for three parallel API keys when any one suffices. Introduces RUNTIME_PROVIDERS in deploy-preflight.ts as the canonical per-runtime provider list (label, envVar, note). checkDeploySecrets now returns all alternatives as missingKeys when nothing is configured, so the modal can offer a picker. MissingKeysModal dispatches between two render paths: * ProviderPickerModal — radio list of supported providers, a single env input for the chosen one. Saving that one key satisfies the preflight. Activated whenever the runtime has ≥2 provider choices. * AllKeysModal — legacy parallel-inputs UX, all keys must be saved before deploy. Kept for single-provider runtimes (claude-code, gemini-cli) and callers that pass unrelated-key lists. Dual-mode preserves the pre-existing contract for every caller while fixing the multi-provider UX. All 930 canvas vitest tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 16:41:09 -07:00
Hongming Wang	b4719ad070	fix(canvas): Legend avoids TemplatePalette + silence WS handshake races ### Two unrelated but small UI fixes surfaced while testing the Canvas 1. Legend hidden under the open TemplatePalette. Legend is `fixed bottom-6 left-4 z-30`. TemplatePalette's drawer (when open) is `fixed top-0 left-0 w-[280px] z-30` — same z-index, same left-edge column. The Legend overlapped the palette's bottom 180 px. Published the palette-open state to the canvas store so the Legend can shift right (to `left-[296px]` — 280 px palette + 16 px gap) while the palette is open, animated via a 200 ms `transition-[left]` to match the palette's slide. Closes cleanly back to `left-4` when the palette is dismissed. Files: - `store/canvas.ts` — added `templatePaletteOpen` + `setTemplatePaletteOpen`. - `TemplatePalette.tsx` — calls `setTemplatePaletteOpen(open)` on every open/close transition via a new useEffect. - `Legend.tsx` — reads the flag and swaps `left-4` <-> `left-[296px]`. 2. "WebSocket is closed before the connection is established" spam. Two components (`ChatTab`, `AgentCommsPanel`) open their own short- lived WebSocket to tail the ACTIVITY_LOGGED stream. Their cleanup path called `ws.close()` unconditionally, which trips a browser console warning when React StrictMode re-runs the effect in dev and the handshake hasn't completed yet. Confirmed via DevTools console on the running canvas. Added a `closeWebSocketGracefully(ws)` helper in `lib/ws-close.ts`: - OPEN / CLOSING → close immediately (normal path). - CONNECTING → defer close to the 'open' listener so the browser sees a full handshake. Also wires an 'error' listener that cancels the queued close if the handshake fails (no double-close). - CLOSED → no-op. Both consumers now call the helper in their useEffect cleanup. Silences the warning without changing observable behaviour. ### Tests `canvas/src/lib/__tests__/ws-close.test.ts` — 5 cases with a fake WebSocket covering each readyState branch plus the error-before-open cancellation path. Full vitest suite: 927/927 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 16:03:01 -07:00
Hongming Wang	de99a22ffc	fix(quickstart): hotfixes discovered during live testing session Five additional breakages surfaced while testing the restored stack end-to-end (spin up Hermes template → click node → open side panel → configure secrets → send chat). Each fix is narrowly scoped and has matching unit or e2e tests so they don't regress. ### 1. SSRF defence blocked loopback A2A on self-hosted Docker handlers/ssrf.go was rejecting `http://127.0.0.1:<port>` workspace URLs as loopback, so POST /workspaces/:id/a2a returned 502 on every Canvas chat send in local-dev. The provisioner on self-hosted Docker publishes each container's A2A port on 127.0.0.1:<ephemeral> — that's the only reachable address for the platform-on-host path. Added `devModeAllowsLoopback()` — allows loopback only when MOLECULE_ENV ∈ {development, dev}. SaaS (MOLECULE_ENV=production) continues to block loopback; every other blocked range (metadata 169.254/16, TEST-NET, CGNAT, link-local) stays blocked in dev mode. Tests: 5 new tests in ssrf_test.go covering dev-mode loopback, dev-mode short-alias ("dev"), production still blocks loopback, dev-mode still blocks every other range, and a 9-case table test of the predicate with case/whitespace/typo variants. ### 2. canvas/src/lib/api.ts: 401 → login redirect broke localhost Every 401 called `redirectToLogin()` which navigates to `/cp/auth/login`. That route exists only on SaaS (mounted by the cp_proxy when CP_UPSTREAM_URL is set). On localhost it 404s — users landed on a blank "404 page not found" instead of seeing the actual error they should fix. Gated the redirect on the SaaS-tenant slug check: on <slug>.moleculesai.app, redirect unchanged; on any non-SaaS host (localhost, LAN IP, reserved subdomains like app.moleculesai.app), throw a real error so the calling component can render a retry affordance. Tests: 4 new vitest cases in a dedicated api-401.test.ts (needs jsdom for window.location.hostname) — SaaS redirects, localhost throws, LAN hostname throws, reserved apex throws. ### 3. SecretsSection rendered a hardcoded key list config/secrets-section.tsx shipped a fixed COMMON_KEYS list (Anthropic / OpenAI / Google / SERP / Model Override) regardless of what the workspace's template actually needed. A Hermes workspace declaring MINIMAX_API_KEY in required_env got five irrelevant slots and nothing for the key it actually needed. Made the slot list template-driven via a new `requiredEnv?: string[]` prop passed down from ConfigTab. Added `KNOWN_LABELS` for well-known names and `humanizeKeyName` to turn arbitrary SCREAMING_SNAKE_CASE into a readable label (e.g. MINIMAX_API_KEY → "Minimax API Key"). Acronyms (API, URL, ID, SDK, MCP, LLM, AI) stay uppercase. Legacy fallback preserved when required_env is empty. Tests: 8 new vitest cases covering known-label lookup, humanise fallback, acronym preservation, deduplication, and both fallback paths. ### 4. Confusing placeholder in Required Env Vars field The TagList in ConfigTab labelled "Required Env Vars (from template)" is a DECLARATION field — stores variable names. The placeholder "e.g. CLAUDE_CODE_OAUTH_TOKEN" suggested that, but users naturally typed the value of their API key into the field instead. The actual values go in the Secrets section further down the tab. Relabelled to "Required Env Var Names (from template)", changed the placeholder to "variable NAME (e.g. ANTHROPIC_API_KEY) — not the value", and added a one-line helper below pointing to Secrets. ### 5. Agent chat replies rendered 2-3 times Three delivery paths can fire for a single agent reply — HTTP response to POST /a2a, A2A_RESPONSE WS event, and a send_message_to_user WS push. Paths 2↔3 were already guarded by `sendingFromAPIRef`; path 1 had no guard. Hermes emits both the reply body AND a send_message_to_user with the same text, which manifested as duplicate bubbles with identical timestamps. Added `appendMessageDeduped(prev, msg, windowMs = 3000)` in chat/types.ts — dedupes on (role, content) within a 3s window. Threaded into all three setMessages call sites. The window is short enough that legitimate repeat messages ("hi", "hi") from a real user/agent a few seconds apart still render. Tests: 8 new vitest cases covering empty history, different content, duplicate within window, different roles, window elapsed, stale match, malformed timestamps, and custom window. ### 6. New end-to-end regression test tests/e2e/test_dev_mode.sh — 7 HTTP assertions that run against a live platform with MOLECULE_ENV=development and catch regressions on all the dev-mode escape hatches in a single pass: AdminAuth (empty DB + after-token), WorkspaceAuth (/activity, /delegations), AdminAuth on /approvals/pending, and the populated /org/templates response. Shellcheck-clean. ### Test sweep - `go test -race ./internal/handlers/ ./internal/middleware/ ./internal/provisioner/` — all pass - `npx vitest run` in canvas — 922/922 pass (up from 902) - `shellcheck --severity=warning infra/scripts/setup.sh tests/e2e/test_dev_mode.sh` — clean - `bash tests/e2e/test_dev_mode.sh` — 7/7 pass against a live platform + populated template registry ### SaaS parity Every relaxation remains conditional on MOLECULE_ENV=development. Production tenants run MOLECULE_ENV=production (enforced by the secrets-encryption strict-init path) and always set ADMIN_TOKEN, so none of these code paths fire on hosted SaaS. Behaviour on real tenants is byte-for-byte unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 14:57:18 -07:00
Hongming Wang	2c3eccf9d6	test(auth): provide window.location.pathname in redirectToLogin mocks The pathname.startsWith() loop-break added to redirectToLogin needs pathname on the mock Location object; tests were supplying only href. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 11:16:22 -07:00
rabbitblood	b360a4353f	fix(auth): redirect to app.moleculesai.app for login, not tenant subdomain Tenant subdomains (hongmingwang.moleculesai.app) proxy to EC2 platform which has no /cp/auth/* routes. Auth UI lives on app.moleculesai.app. Added getAuthOrigin() that detects SaaS tenant hosts and redirects to the app subdomain for login/signup. Non-SaaS hosts (localhost, dev) fall back to PLATFORM_URL as before. [Molecule-Platform-Evolvement-Manager] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 11:16:22 -07:00
rabbitblood	6730c7713d	fix(auth): redirect to login on 401 from any API call When session credentials expire mid-use, ALL API calls return 401. Previously this threw a generic error that crashed the UI with no recovery path. Now the API client intercepts 401 and redirects to login once (via redirectToLogin which already guards against loops). Combined with the AuthGate /cp/auth/* path guard, this gives the correct behavior: credentials lost → redirect to login → user logs in → return_to sends them back. [Molecule-Platform-Evolvement-Manager] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 11:16:22 -07:00
rabbitblood	edc42b2893	fix(auth): break infinite redirect loop on /cp/auth/login AuthGate redirected anonymous users to /cp/auth/login?return_to=<url>, but the login page itself triggered AuthGate, which redirected again with double-encoded return_to. Each redirect added another encoding layer until the URL exceeded 431 (Request Header Fields Too Large). Two guards: 1. redirectToLogin() returns early if already on /cp/auth/* path 2. AuthGate skips redirect check entirely for /cp/auth/* paths [Molecule-Platform-Evolvement-Manager] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 11:16:22 -07:00
Hongming Wang	66de81fbfa	Merge pull request #1689 from Molecule-AI/refactor/strip-secret-service-dropdown refactor(secrets): strip Service dropdown from Add-Key form	2026-04-22 18:46:02 -07:00
Hongming Wang	8b1af9708c	feat(canvas): default tier T3 and hide T1/T2 on SaaS On SaaS every workspace gets its own EC2 VM — the Docker-sandbox distinction between T1 (sandboxed), T2 (standard Docker), and T3 (full host access) doesn't apply. A SaaS workspace is always a dedicated VM, which is "full access" by construction. Showing T1/T2 in that UI is a category error: users pick a sandbox level that has no effect on the actual EC2 machine they get. Changes: - tenant.ts: export isSaaSTenant() — returns true when canvas is served at <slug>.moleculesai.app (SSR-safe: false on server) - CreateWorkspaceDialog: when isSaaSTenant(), render only the T3 option, default tier=3, grid collapses to a single column. Label gets a " — dedicated VM" hint so the user knows what they're getting. On self-hosted the full T1/T2/T3 picker is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 17:02:48 -07:00
Hongming Wang	d956164812	refactor(secrets): strip Service dropdown from Add-Key form The Add-Key form used to open with a required Service dropdown (GitHub / Anthropic / OpenRouter / Other) that gated everything else. The dropdown did no persistent work — the secret store only cares about (key_name, value); the Service label was never saved anywhere. It also suffered registry drift: today we support ~22 hermes-dispatched providers (MiniMax, Gemini, DeepSeek, Kimi, Qwen, NVIDIA, etc.); only 3 had entries. Everyone else landed in "Other" with no downside beyond the mandatory click. Replaces it with: 1. Key-name <datalist> autocomplete sourced from new KEY_NAME_SUGGESTIONS in lib/services.ts — 26 entries covering common infra keys + every hermes-supported provider. 2. inferGroup(keyName) derives classification at render time, matching what the store already does in getGrouped(). No behaviour change for list grouping. 3. Provider docs link renders inline only when inferGroup recognises the name. For 'custom' keys we stay quiet — no false-structure prompt. 4. Test-connection button still available when the inferred group supports it AND the value is format-valid. Same providers as before. SERVICES registry preserved for LIST rendering + test routing. Result: two fields instead of three. One fewer decision. Provider- agnostic by design — new providers work the moment someone types their canonical env var name; no UI code change per provider. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 16:41:43 -07:00
Molecule AI CP-BE	acf03cd057	fix(security): suppress raw response body from user-facing billing errors billing.ts (startCheckout, openBillingPortal): replace raw res.text() in thrown Error with a safe status-only message. The response body from /cp/billing/* routes can contain Stripe API error detail (invalid key, card decline message, raw Stripe envelope) that should not reach clients. orgs/page.tsx (createOrg): same fix — raw body → safe message. Full body is logged server-side for debugging. Closes: #91 (CWE-209 — Stripe key echoed in error) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-20 23:23:25 +00:00
rabbitblood	35df23850e	fix(canvas): CSP_DEV_MODE + admin token for local Docker (#1052 follow-up) Three changes that keep getting lost on nuke+rebuild: 1. middleware.ts: read CSP_DEV_MODE env to relax CSP in local Docker 2. api.ts: send NEXT_PUBLIC_ADMIN_TOKEN header (AdminAuth on /workspaces) 3. Dockerfile: accept NEXT_PUBLIC_ADMIN_TOKEN as build arg All three are required for the canvas to work in local Docker where canvas (port 3000) fetches from platform (port 8080) cross-origin. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-20 12:23:43 -07:00
molecule-ai[bot]	fc0e02d692	Merge pull request #1011 from Molecule-AI/test/qa-coverage-orgs-page-and-api-timeout test(canvas): QA coverage — orgs page polling + API timeout	2026-04-20 08:48:00 -07:00
qa-agent	e70cd1c2aa	test(canvas): pin AbortSignal timeout regression + cover /orgs landing page Two independent test additions that harden the surface freshly landed on staging via PRs #982 (canvas fetch timeout), #992 (/orgs landing), #994 (post-checkout redirect to /orgs). canvas/src/lib/__tests__/api.test.ts (+74 lines, 7 new tests) - GET/POST/PATCH/PUT/DELETE each pass an AbortSignal to fetch - TimeoutError (DOMException name=TimeoutError) propagates to the caller - Each request installs its own signal — no shared module-level controller that would allow one slow request to cancel an unrelated fast one This is the hardening nit I flagged in my APPROVE-w/-nit review of fix/canvas-api-fetch-timeout. Landing as a follow-up now that #982 is in staging. canvas/src/app/__tests__/orgs-page.test.tsx (+251 lines, new file, 10 tests) - Auth guard: signed-out → redirectToLogin and no /cp/orgs fetch - Error state: failed /cp/orgs → Error message + Retry button - Empty list: CreateOrgForm renders - CTA by status: running → "Open" link targets {slug}.moleculesai.app awaiting_payment → "Complete payment" → /pricing?org=<slug> failed → "Contact support" mailto - Post-checkout: ?checkout=success renders CheckoutBanner AND history.replaceState scrubs the query param - Fetch contract: /cp/orgs called with credentials:include + AbortSignal Local baseline on origin/staging tip `ede6597`: canvas vitest: 50 files / 778 tests, all green canvas build: clean, /orgs route present (2.83 kB / 105 kB first-load) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-19 19:14:54 +00:00
Hongming Wang	18894bebe8	feat(canvas): Phase 5 — credit balance pill + low-balance banner Adds the UI surface for the credit system to /orgs: - CreditsPill next to each org row. Tone shifts from zinc → amber at 10% of plan to red at zero. - LowCreditsBanner appears under the pill for running orgs when the balance crosses thresholds: overage_used > 0 → "overage active", balance <= 0 → "out of credits, upgrade", trial tail → "trial almost out". - Pure helpers extracted to lib/credits.ts so formatCredits, pillTone, and bannerKind are unit-tested without jsdom. Backend List query now returns credits_balance / plan_monthly_credits / overage_used_credits / overage_cap_credits so no second round-trip is needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 07:27:29 -07:00
Hongming Wang	26b89400ed	test(canvas): bump billing test for /orgs success_url	2026-04-19 04:26:01 -07:00
Hongming Wang	d77378294b	feat(canvas): post-checkout UX — Stripe success lands on /orgs with banner Two small polish items that together close the signup-to-running-tenant flow for real users: 1. Stripe success_url now points at /orgs?checkout=success instead of the current page (was pricing). The old behavior left people staring at plan cards with no indication payment went through — the new behavior drops them right onto their org list where they can watch the status flip. 2. /orgs shows a green "Payment confirmed, workspace spinning up" banner when it sees ?checkout=success, then clears the query param via replaceState so a reload doesn't show it again. 3. /orgs now polls every 5s while any org is awaiting_payment or provisioning. Users see the Stripe webhook's effect live — no manual refresh needed — and once every org settles the polling stops so idle tabs don't hammer /cp/orgs. Paired with PR #992 (the /orgs page itself) this makes the end-to-end flow on BILLING_REQUIRED=true deployments feel right: /pricing → Stripe → /orgs?checkout=success → banner → live poll → "Open" button when org.status transitions to running. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 04:18:32 -07:00
Hongming Wang	956c1fb9b9	fix(canvas): add 15s fetch timeout on API calls Pre-launch audit flagged api.ts as missing a timeout on every fetch. A slow or hung CP response would leave the UI spinning indefinitely with no way for the user to abort — effectively a client-side DoS. 15s is long enough for real CP queries (slowest observed is Stripe portal redirect at ~3s) and short enough that a stalled backend surfaces as a clear error with a retry affordance. Uses AbortSignal.timeout (widely supported since 2023) so the abort propagates through React Query / SWR consumers cleanly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 02:12:47 -07:00
Hongming Wang	16cb728461	chore(canvas): initialize shadcn/ui — components.json + cn utility Sets up shadcn/ui CLI so new components can be added with `npx shadcn add <component>`. Uses new-york style, zinc base color, no CSS variables (matches existing Tailwind-only approach). Adds clsx + tailwind-merge for the cn() utility. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-18 07:57:17 -07:00
Hongming Wang	6cecf44f45	fix(canvas): add hermes + gemini-cli to deploy preflight required keys Hermes requires OPENROUTER_API_KEY (or any of its 15 providers). Gemini CLI requires GOOGLE_API_KEY. Without these entries, the MissingKeysModal doesn't fire and workspaces start without keys, causing crash loops. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 21:45:54 -07:00
Hongming Wang	54bb543ff7	fix: code review findings — token UI, auth hardening, WS dedup 1. Settings panel: wire TokensTab into "API Tokens" tab (was imported but not rendered). Rename "API Keys" → "Secrets", add "API Tokens" tab. Fix docs link → doc.moleculesai.app/docs/tokens. 2. Referer match hardening: require exact host match or trailing slash to prevent evil.com subdomain bypass. Cache CANVAS_PROXY_URL at init time instead of per-request os.Getenv. 3. Extract shared deriveWsBaseUrl() to lib/ws-url.ts — eliminates duplicate 12-line derivation in socket.ts and TerminalTab.tsx. 4. Token list pagination: add ?limit= and ?offset= params (default 50, max 200) to GET /workspaces/:id/tokens. 507/507 canvas tests pass, Go build + vet clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 10:42:26 -07:00
Hongming Wang	a4b1886c35	fix(canvas): address all code review findings on PR #482 - Reconcile TIER_CONFIG/TIER_COLORS into single TIER_CONFIG with both `color` (pill style) and `border` (bordered badge style) fields - Remove TemplatePalette alias indirection (TIER_LABELS_SHARED → direct import) - Extract inline spinner SVGs to shared Spinner component (3 copies → 1) - Migrate status dot colors from 6 remaining files to shared tokens: SearchDialog, StatusDot, Legend, ContextMenu, Toolbar + add statusDotClass() - Add COMM_TYPE_LABELS to design-tokens, used by CommunicationOverlay sr-only - Update reduced-motion tests: components that delegate to design-tokens pass the guard check via import detection; add design-tokens.ts own test - 507/507 tests pass, build clean Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 07:48:47 -07:00
Hongming Wang	ac6d6ce5bd	fix(canvas): UX improvements — shared tokens, focus rings, loading spinners, a11y - Extract STATUS_CONFIG, TIER_CONFIG, TIER_COLORS to shared design-tokens.ts (eliminates 3 duplicate definitions across WorkspaceNode, EmptyState, TemplatePalette) - Add focus-visible:ring-2 ring-blue-500 to WorkspaceNode, SidePanel tabs, EmptyState buttons, TemplatePalette buttons (keyboard navigation now visible) - Replace "Loading..." text with animated spinner SVG in EmptyState, TemplatePalette sidebar, and OrgTemplatesSection - Add disabled:cursor-not-allowed + suppress hover styling when disabled on EmptyState template buttons and TemplatePalette deploy buttons - Brighten SidePanel tab hover from bg-zinc-800/20 to bg-zinc-800/40 and text from zinc-300 to zinc-200 - Add screen reader labels to CommunicationOverlay directional arrows and status icons (sr-only text for "sent", "received", "to", status) Fixes #422, #424, #427 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 07:35:44 -07:00
Hongming Wang	a363b56f25	feat(tenant): combined platform + canvas Docker image with reverse proxy Single-container tenant architecture: Go platform (:8080) + Canvas Node.js (:3000) in one Fly machine, with Go's NoRoute handler reverse- proxying non-API routes to the canvas. Browser only talks to :8080. Changes: platform/Dockerfile.tenant — multi-stage build (Go + Node + runtime). Bakes workspace-configs-templates/ + org-templates/ into the image. Build context: repo root. platform/entrypoint-tenant.sh — starts both processes, kills both if either exits. Fly health check on :8080 covers the Go binary; canvas health is implicit (proxy returns 502 if canvas is down). platform/internal/router/canvas_proxy.go — httputil.ReverseProxy that forwards unmatched routes to CANVAS_PROXY_URL (http://localhost:3000). Activated by NoRoute when CANVAS_PROXY_URL env is set. platform/internal/router/router.go — wire NoRoute → canvasProxy when CANVAS_PROXY_URL is present; no-op otherwise (local dev unchanged). platform/internal/middleware/securityheaders.go — relaxed CSP to allow Next.js inline scripts/styles/eval + WebSocket + data: URIs. The strict `default-src 'self'` was blocking all canvas rendering. canvas/src/lib/api.ts — changed `\|\|` to `??` for NEXT_PUBLIC_PLATFORM_URL so empty string means "same-origin" (combined image) instead of falling back to localhost:8080. canvas/src/components/tabs/TerminalTab.tsx — same `??` fix for WS URL. Verified: tenant machine boots, canvas renders, 8 runtime templates + 4 org templates visible, API routes work through the same port. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 02:46:47 -07:00
Hongming Wang	4b865fa755	feat(canvas): /pricing route with plan selector + Stripe checkout Adds a public /pricing route the apex + tenant canvas can both serve. Three-tier plan cards (Free, Starter, Pro) with per-plan CTA buttons that dispatch correctly regardless of the user's state: Free → redirect to signup Anonymous + paid → redirect to signup (Stripe opens post-auth) Authed + paid → POST /cp/billing/checkout, redirect to Stripe URL No tenant slug → inline error ("pick an org first") Network failures → surfaced in an ARIA alert banner Files: - src/lib/billing.ts — plan metadata + startCheckout + openBillingPortal wrappers over /cp/billing/{checkout,portal} - src/components/PricingTable.tsx — client component, lazy session probe on first CTA click (no probe for anonymous browsers) - src/app/pricing/page.tsx — server-rendered shell with SEO metadata, links to legal pages in the footer - Tests: 10 billing helper tests + 9 PricingTable tests (17 total, additional ones cover the plan-list canonical order) Design notes: - The pricing data (features + prices) is a static const in billing.ts, not fetched from the API. Changing prices requires a deploy — which we'd need to do anyway for tier definition changes. - PLAN_ID 'starter' is flagged highlighted=true so the middle card gets the 'Most popular' visual treatment. One source of truth; test locks it. - Session probe is lazy (first CTA click, not mount) so anonymous visitors don't generate a /cp/auth/me request just to read the page. AuthGate interaction: - On apex (no tenant slug), AuthGate passthrough — /pricing renders freely - On tenant subdomain, AuthGate still bounces anonymous users to login before reaching /pricing — this is the correct UX for the "I'm already logged in and want to upgrade my own org" flow Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 13:41:44 -07:00
Hongming Wang	043b8fe159	feat(canvas): AuthGate — redirect anonymous users to cp login (Phase F close) Wraps the canvas root so every tenant-subdomain request checks for a valid session and bounces to app.moleculesai.app/cp/auth/login with a return_to pointing back at the current URL. Local dev + vercel preview URLs + apex pass through unchanged. Files: - canvas/src/lib/auth.ts: fetchSession() probes /cp/auth/me (credentials:include for cross-origin cookie); returns Session on 200, null on 401 (anonymous, no throw), throws on 5xx so transient outages don't leak the UI. - canvas/src/lib/auth.ts: redirectToLogin() builds the cp login URL with window.location.href as return_to; CP's isSafeReturnTo check rejects cross-domain bounces. - canvas/src/components/AuthGate.tsx: client component wrapping children. State machine: loading → authenticated \| anonymous. In non-SaaS mode (no tenant slug) skips the gate entirely. - canvas/src/app/layout.tsx: wraps the root body in <AuthGate>. Tests: +6 auth.ts (200 / 401 null / 5xx throw / credentials:include / redirectToLogin href + signup variant). Full suite 453 green (was 447). Pairs with molecule-controlplane PR #16 (return_to cookie handshake on the cp side). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 20:37:26 -07:00
Hongming Wang	3abcee11b3	feat(canvas): SaaS cross-origin — slug header + cookie credentials (Phase F) Canvas will be served at <slug>.moleculesai.app (Vercel). API calls go cross-origin to https://app.moleculesai.app. This commit wires the client side: - canvas/src/lib/tenant.ts: getTenantSlug() derives the slug from window.location.hostname, case-insensitive, matching the control plane's reservedSubdomains list (app/www/api/admin/…). Server-side + localhost + vercel preview URLs + apex all return "" so local dev keeps working. - canvas/src/lib/api.ts: adds X-Molecule-Org-Slug header + sets credentials:"include" on every fetch. The control plane's CORS middleware allows the origin + credentials; the session cookie has Domain=.moleculesai.app so the browser ships it. - canvas/src/lib/api/secrets.ts: same treatment (secrets API uses its own fetch helper — shared slug+credentials logic applied). Tests: +6 (tenant.test.ts covers slug / reserved / case / non-SaaS / preview URL / apex). Full canvas suite 447/447 green. Not in this PR: - WS URL derivation for terminal/socket.ts (separate follow-up; WS needs its own slug-aware URL and the canvas terminal isn't used in SaaS launch day-one). - Next.js rewrites (decided against; cross-origin with credentials is cleaner than path-level rewrites for session cookies). Deploys to Vercel once merged — no manual config needed (env already set). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 20:08:39 -07:00
Hongming Wang	24fec62d7f	initial commit — Molecule AI platform Forked clean from public hackathon repo (Starfire-AgentTeam, BSL 1.1) with full rebrand to Molecule AI under github.com/Molecule-AI/molecule-monorepo. Brand: Starfire → Molecule AI. Slug: starfire / agent-molecule → molecule. Env vars: STARFIRE_* → MOLECULE_*. Go module: github.com/agent-molecule/platform → github.com/Molecule-AI/molecule-monorepo/platform. Python packages: starfire_plugin → molecule_plugin, starfire_agent → molecule_agent. DB: agentmolecule → molecule. History truncated; see public repo for prior commits and contributor attribution. Verified green: go test -race ./... (platform), pytest (workspace-template 1129 + sdk 132), vitest (canvas 352), build (mcp). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 11:55:37 -07:00

47 Commits