Commit Graph

524 Commits

Author SHA1 Message Date
Hongming Wang
06273b11ef fix(canvas/config): load runtime+model from workspace metadata + hide misleading config.yaml error for hermes
Canvas Config tab had 3 bugs visible on hermes workspaces (#1894):

1. Runtime dropdown showed "LangGraph (default)" even when the workspace's
   actual runtime was hermes — because the form only loaded runtime from
   config.yaml, and hermes doesn't use the platform's config.yaml template.
2. Model field was empty for the same reason.
3. "No config.yaml found" error appeared on hermes workspaces despite
   everything being fine — hermes manages its own config at
   ~/.hermes/config.yaml on the workspace host.

Worse, clicking Save with the empty form would silently flip `runtime`
back from `hermes` to `LangGraph (default)`.

## Fix

- loadConfig now always fetches workspace metadata (runtime + model)
  via GET /workspaces/:id and GET /workspaces/:id/model BEFORE attempting
  the config.yaml fetch. These act as the source of truth for runtime
  and model when config.yaml doesn't set them.
- RUNTIMES_WITH_OWN_CONFIG set lists runtimes that manage their own
  config outside the platform template (hermes, external). For these:
  - Missing config.yaml is NOT an error — no red banner shown.
  - An informational gray banner tells the user where to edit the
    runtime's config (e.g. "edit ~/.hermes/config.yaml via Terminal tab
    or the hermes CLI" for hermes).

Closes #1894.

Verified 2026-04-23 on user's hongmingwang tenant which runs hermes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:58:36 -07:00
Hongming Wang
de99a22ffc fix(quickstart): hotfixes discovered during live testing session
Five additional breakages surfaced while testing the restored stack
end-to-end (spin up Hermes template → click node → open side panel →
configure secrets → send chat). Each fix is narrowly scoped and has
matching unit or e2e tests so they don't regress.

### 1. SSRF defence blocked loopback A2A on self-hosted Docker

handlers/ssrf.go was rejecting `http://127.0.0.1:<port>` workspace
URLs as loopback, so POST /workspaces/:id/a2a returned 502 on every
Canvas chat send in local-dev. The provisioner on self-hosted Docker
publishes each container's A2A port on 127.0.0.1:<ephemeral> — that's
the only reachable address for the platform-on-host path.

Added `devModeAllowsLoopback()` — allows loopback only when
MOLECULE_ENV ∈ {development, dev}. SaaS (MOLECULE_ENV=production)
continues to block loopback; every other blocked range (metadata
169.254/16, TEST-NET, CGNAT, link-local) stays blocked in dev mode.

Tests: 5 new tests in ssrf_test.go covering dev-mode loopback,
dev-mode short-alias ("dev"), production still blocks loopback,
dev-mode still blocks every other range, and a 9-case table test of
the predicate with case/whitespace/typo variants.

### 2. canvas/src/lib/api.ts: 401 → login redirect broke localhost

Every 401 called `redirectToLogin()` which navigates to
`/cp/auth/login`. That route exists only on SaaS (mounted by the
cp_proxy when CP_UPSTREAM_URL is set). On localhost it 404s — users
landed on a blank "404 page not found" instead of seeing the actual
error they should fix.

Gated the redirect on the SaaS-tenant slug check: on
<slug>.moleculesai.app, redirect unchanged; on any non-SaaS host
(localhost, LAN IP, reserved subdomains like app.moleculesai.app),
throw a real error so the calling component can render a retry
affordance.

Tests: 4 new vitest cases in a dedicated api-401.test.ts (needs
jsdom for window.location.hostname) — SaaS redirects, localhost
throws, LAN hostname throws, reserved apex throws.

### 3. SecretsSection rendered a hardcoded key list

config/secrets-section.tsx shipped a fixed COMMON_KEYS list
(Anthropic / OpenAI / Google / SERP / Model Override) regardless of
what the workspace's template actually needed. A Hermes workspace
declaring MINIMAX_API_KEY in required_env got five irrelevant slots
and nothing for the key it actually needed.

Made the slot list template-driven via a new `requiredEnv?: string[]`
prop passed down from ConfigTab. Added `KNOWN_LABELS` for well-known
names and `humanizeKeyName` to turn arbitrary SCREAMING_SNAKE_CASE
into a readable label (e.g. MINIMAX_API_KEY → "Minimax API Key").
Acronyms (API, URL, ID, SDK, MCP, LLM, AI) stay uppercase. Legacy
fallback preserved when required_env is empty.

Tests: 8 new vitest cases covering known-label lookup, humanise
fallback, acronym preservation, deduplication, and both fallback
paths.

### 4. Confusing placeholder in Required Env Vars field

The TagList in ConfigTab labelled "Required Env Vars (from template)"
is a DECLARATION field — stores variable names. The placeholder
"e.g. CLAUDE_CODE_OAUTH_TOKEN" suggested that, but users naturally
typed the value of their API key into the field instead. The actual
values go in the Secrets section further down the tab.

Relabelled to "Required Env Var Names (from template)", changed the
placeholder to "variable NAME (e.g. ANTHROPIC_API_KEY) — not the
value", and added a one-line helper below pointing to Secrets.

### 5. Agent chat replies rendered 2-3 times

Three delivery paths can fire for a single agent reply — HTTP
response to POST /a2a, A2A_RESPONSE WS event, and a
send_message_to_user WS push. Paths 2↔3 were already guarded by
`sendingFromAPIRef`; path 1 had no guard. Hermes emits both the
reply body AND a send_message_to_user with the same text, which
manifested as duplicate bubbles with identical timestamps.

Added `appendMessageDeduped(prev, msg, windowMs = 3000)` in
chat/types.ts — dedupes on (role, content) within a 3s window.
Threaded into all three setMessages call sites. The window is short
enough that legitimate repeat messages ("hi", "hi") from a real
user/agent a few seconds apart still render.

Tests: 8 new vitest cases covering empty history, different content,
duplicate within window, different roles, window elapsed, stale
match, malformed timestamps, and custom window.

### 6. New end-to-end regression test

tests/e2e/test_dev_mode.sh — 7 HTTP assertions that run against a
live platform with MOLECULE_ENV=development and catch regressions
on all the dev-mode escape hatches in a single pass: AdminAuth
(empty DB + after-token), WorkspaceAuth (/activity, /delegations),
AdminAuth on /approvals/pending, and the populated
/org/templates response. Shellcheck-clean.

### Test sweep

- `go test -race ./internal/handlers/ ./internal/middleware/
  ./internal/provisioner/` — all pass
- `npx vitest run` in canvas — 922/922 pass (up from 902)
- `shellcheck --severity=warning infra/scripts/setup.sh
  tests/e2e/test_dev_mode.sh` — clean
- `bash tests/e2e/test_dev_mode.sh` — 7/7 pass against a live
  platform + populated template registry

### SaaS parity

Every relaxation remains conditional on MOLECULE_ENV=development.
Production tenants run MOLECULE_ENV=production (enforced by the
secrets-encryption strict-init path) and always set ADMIN_TOKEN, so
none of these code paths fire on hosted SaaS. Behaviour on real
tenants is byte-for-byte unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:57:18 -07:00
Hongming Wang
a93bd58b59 fix(quickstart): keep Canvas working post first workspace + hide SaaS cookie banner on localhost
Follow-up to the previous commit on this branch. Two additional fresh-clone
regressions surfaced during end-to-end verification, both affecting local
dev only and both landing inside the same SaaS-vs-local-dev seam:

### 1. Canvas 401-loops after first workspace creation

`GET /workspaces` is behind `AdminAuth` (router.go:121 — "C1: unauthenticated
workspace topology exposure"). The middleware has a Tier-1 fail-open branch
that only fires when *no* workspace tokens exist anywhere in the DB. The
moment a user creates their first workspace — via either the Canvas UI, the
API, or the e2e-api test suite — a token lands in the DB, Tier-1 closes, and
the Canvas (which has no bearer token in local dev: no WorkOS session, no
NEXT_PUBLIC_ADMIN_TOKEN baked in at build time) gets 401 on every list
call. The UI renders a stuck "API GET /workspaces: 401 admin auth required"
placeholder forever.

SaaS is unaffected because hosted provisioning always sets both
`ADMIN_TOKEN` and `MOLECULE_ENV=production`, and the Canvas there either
carries a WorkOS session cookie or `NEXT_PUBLIC_ADMIN_TOKEN` baked into
the JS bundle.

**Fix** (`workspace-server/internal/middleware/wsauth_middleware.go`): add
a narrow Tier-1b escape hatch that stays fail-open when *both*
`ADMIN_TOKEN` is unset *and* `MOLECULE_ENV` is explicitly a dev mode
("development" / "dev"). Production never hits it (SaaS sets
`MOLECULE_ENV=production`). Mirrors the existing convention in
`handlers/admin_test_token.go` which gates the e2e test-token endpoint on
`MOLECULE_ENV != "production"`.

Three new regression tests in `wsauth_middleware_test.go`:
- `TestAdminAuth_DevModeEscapeHatch_FailsOpenWithHasLiveTokens` — the
  happy path (dev mode, no admin token, tokens exist → 200)
- `TestAdminAuth_DevModeEscapeHatch_IgnoredWhenAdminTokenSet` — explicit
  `ADMIN_TOKEN` wins; dev mode does not silently re-open the gate
- `TestAdminAuth_DevModeEscapeHatch_IgnoredInProduction` — the
  SaaS-safety guarantee (production + no admin token + tokens exist → 401)

`.env.example` flipped to set `MOLECULE_ENV=development` by default so
new users get the dev-mode hatch automatically via `cp .env.example .env`.
SaaS provisioning overrides to `production`, consistent with the existing
convention used by the secrets-encryption strict-init path.

### 2. SaaS cookie/privacy banner rendered on localhost

`CookieConsent` mounted unconditionally in the root layout, so
`npm run dev` on localhost showed a "Cookies & your privacy" banner
pointing at `moleculesai.app/legal/privacy`. That banner is a
GDPR/ePrivacy compliance UI that only applies to the hosted SaaS
offering; self-hosted / local-dev / Vercel-preview hosts must not
see it.

**Fix** (`canvas/src/components/CookieConsent.tsx`): gate render on
`isSaaSTenant()`. Matches the convention used by `AuthGate` and the
workspace tier picker elsewhere in the codebase.

Tests (`canvas/src/components/__tests__/CookieConsent.test.tsx`):
existing tests now stub `window.location.hostname` to a SaaS
subdomain before rendering (required since `isSaaSTenant()` on jsdom's
default "localhost" would suppress the banner). Added two new tests
for the local-dev hide path:
- `does NOT render on local dev (non-SaaS hostname)`
- `does NOT render on a LAN hostname (192.168.*, *.local)`

### Verification

On a fresh-nuked DB with the updated branch:

1. `bash infra/scripts/setup.sh` — clean
2. `go run ./cmd/server` — "Applied 41 migrations", :8080 healthy,
   dev-mode hatch armed (`MOLECULE_ENV=development`)
3. `npm run dev` in canvas — :3000 renders, no cookie banner
4. `bash tests/e2e/test_api.sh` — **61 passed, 0 failed**
   (test suite creates tokens; GET /workspaces stays 200 under the hatch)
5. Browser at http://localhost:3000 AFTER the e2e run:
   - Canvas renders the workspace list (no 401 placeholder)
   - No cookie banner
6. `npx vitest run` — **902 tests passed** (900 prior + 2 new hide tests)
7. `go test -race ./internal/middleware/` — all passing (3 new
   dev-mode tests + existing Issue-180 / Issue-120 / Issue-684 suite),
   coverage 81.8%

### SaaS parity audit

Same principle as the rest of this branch: local must work without
weakening SaaS.

- Dev-mode hatch: conditional on `MOLECULE_ENV=development`.
  Production tenants always run `MOLECULE_ENV=production` (already
  enforced by the secrets-encryption `InitStrict` path in
  `internal/crypto/aes.go`). Branch is unreachable there.
- Cookie banner: gated on `isSaaSTenant()` which checks
  `NEXT_PUBLIC_SAAS_HOST_SUFFIX` (default `.moleculesai.app`). SaaS
  hosts still get the banner; every other host doesn't.

No change to SaaS behaviour. #1822 backend-parity tracker untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:55:33 -07:00
molecule-ai[bot]
f18e261353
Merge branch 'staging' into fix/auth-redirect-loop 2026-04-23 20:38:18 +00:00
Hongming Wang
9ad803a802
fix(quickstart): make README cp-paste flow bugless end-to-end (#1871)
Reproducing the README's quickstart on a clean clone surfaced seven
independent bugs between `git clone` and seeing the Canvas in a browser.
Each fix is minimal and local-dev-only — the SaaS/EC2 provisioner path
(issue #1822) is untouched.

Bugs fixed:

1. `infra/scripts/setup.sh` applied migrations via raw psql, bypassing
   the platform's `schema_migrations` tracker. The platform then re-ran
   every migration on first boot and crashed on non-idempotent ALTER
   TABLE statements (e.g. `036_org_api_tokens_org_id.up.sql`). Dropped
   the migration block — `workspace-server/internal/db/postgres.go:53`
   already tracks and skips applied files.

2. `.env.example` shipped `DATABASE_URL=postgres://USER:PASS@postgres:...`
   with literal `USER:PASS` placeholders and the Docker-internal hostname
   `postgres`. A `cp .env.example .env` followed by `go run ./cmd/server`
   on the host failed with `dial tcp: lookup postgres: no such host`.
   Replaced with working `dev:dev@localhost:5432` defaults that match
   `docker-compose.infra.yml`.

3. `docker-compose.infra.yml` and `docker-compose.yml` set
   `CLICKHOUSE_URL: clickhouse://...:9000/...`. Langfuse v2 rejects
   anything other than `http://` or `https://`, so the container
   crash-looped and returned HTTP 500. Switched to
   `http://...:8123` (HTTP interface) and added `CLICKHOUSE_MIGRATION_URL`
   for the migration-time native-protocol connection. Also removed
   `LANGFUSE_AUTO_CLICKHOUSE_MIGRATION_DISABLED` so migrations actually
   run.

4. `canvas/package.json` dev script crashed with `EADDRINUSE :::8080`
   when `.env` was sourced before `npm run dev` — Next.js reads `PORT`
   from env and the platform owns 8080. Pinned `dev` to
   `-p 3000` so sourced env can't hijack it. `start` left as-is because
   production `node server.js` (Dockerfile CMD) must respect `PORT`
   from the orchestrator.

5. README/CONTRIBUTING told users to clone `Molecule-AI/molecule-monorepo`
   — that repo 404s; the actual name is `molecule-core`. The Railway
   and Render deploy buttons had the same broken URL. Replaced in both
   English and Chinese READMEs and in CONTRIBUTING. Internal identifiers
   (Go module path, Docker network `molecule-monorepo-net`, Python helper
   `molecule-monorepo-status`) deliberately left alone — renaming those
   is an invasive refactor orthogonal to this fix.

6. README quickstart was missing `cp .env.example .env`. Users who went
   straight from `git clone` to `./infra/scripts/setup.sh` got a script
   that warned about an unset `ADMIN_TOKEN` (harmless) but then couldn't
   run the platform without figuring out the env setup on their own.
   Added the step in both READMEs and CONTRIBUTING. Deliberately NOT
   generating `ADMIN_TOKEN`/`SECRETS_ENCRYPTION_KEY` here — the e2e-api
   suite (`tests/e2e/test_api.sh`) assumes AdminAuth fallback mode
   (no server-side `ADMIN_TOKEN`), which is how CI runs it.

7. CI shellcheck only covered `tests/e2e/*.sh` — `infra/scripts/setup.sh`
   is in the critical path of every new-user onboarding but was never
   linted. Extended the `shellcheck` job and the `changes` filter to
   cover `infra/scripts/`. `scripts/` deliberately excluded until its
   pre-existing SC3040/SC3043 warnings are cleaned up separately.

Verification (fresh nuke-and-rebuild following the updated README):

- `docker compose -f docker-compose.infra.yml down -v` + `rm .env`
- `cp .env.example .env` → defaults work as-is
- `bash infra/scripts/setup.sh` — clean, no migration errors, all 6
  infra containers healthy
- `cd workspace-server && go run ./cmd/server` — "Applied 41 migrations
  (0 already applied)", platform on :8080/health 200
- `cd canvas && npm install && npm run dev` — Canvas on :3000/ 200
  even with `.env` sourced (PORT=8080 in env)
- `bash tests/e2e/test_api.sh` — **61 passed, 0 failed**
- `cd canvas && npx vitest run` — **900 tests passed**
- `cd canvas && npm run build` — production build clean
- `shellcheck --severity=warning infra/scripts/*.sh` — clean
- Langfuse `/api/public/health` 200 (was 500)

Scope notes:

- SaaS/EC2 parity (issue #1822): all files touched here are local-dev
  surface. Canvas container uses `node server.js` with `ENV PORT=3000`
  in `canvas/Dockerfile` — the `-p 3000` pin in `package.json` dev
  script only affects `npm run dev`, not the production CMD.
- Test coverage (issue #1821): project policy is tiered coverage floors,
  not a blanket 100% target. Files touched here are shell scripts,
  YAML, Markdown, and one package.json script — not classes covered
  by the coverage matrix.
- No overlap with open PRs — searched `setup.sh`, `quickstart`,
  `langfuse`, `clickhouse`, `migration`, `README`; nothing conflicts.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
2026-04-23 19:53:43 +00:00
Molecule AI Content Marketer
41e2e8768b docs(marketing): add Phase 34 video assets + manual posting package + chrome-devtools blog
- Add Phase 30 hero video (16x9 + captioned) to devrel demos
- Add Phase 30 screencasts (agents MD auto-generation, Cloudflare artifacts)
- Add manual-posting-package.md for field/manual social workflow
- Add chrome-devtools-mcp blog post draft (canvas/src/app/blog/)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-04-23 19:12:17 +00:00
Hongming Wang
2c3eccf9d6 test(auth): provide window.location.pathname in redirectToLogin mocks
The pathname.startsWith() loop-break added to redirectToLogin needs
pathname on the mock Location object; tests were supplying only href.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 11:16:22 -07:00
rabbitblood
b360a4353f fix(auth): redirect to app.moleculesai.app for login, not tenant subdomain
Tenant subdomains (hongmingwang.moleculesai.app) proxy to EC2 platform
which has no /cp/auth/* routes. Auth UI lives on app.moleculesai.app.

Added getAuthOrigin() that detects SaaS tenant hosts and redirects to
the app subdomain for login/signup. Non-SaaS hosts (localhost, dev)
fall back to PLATFORM_URL as before.

[Molecule-Platform-Evolvement-Manager]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-23 11:16:22 -07:00
rabbitblood
6730c7713d fix(auth): redirect to login on 401 from any API call
When session credentials expire mid-use, ALL API calls return 401.
Previously this threw a generic error that crashed the UI with no
recovery path. Now the API client intercepts 401 and redirects to
login once (via redirectToLogin which already guards against loops).

Combined with the AuthGate /cp/auth/* path guard, this gives the
correct behavior: credentials lost → redirect to login → user logs
in → return_to sends them back.

[Molecule-Platform-Evolvement-Manager]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-23 11:16:22 -07:00
rabbitblood
edc42b2893 fix(auth): break infinite redirect loop on /cp/auth/login
AuthGate redirected anonymous users to /cp/auth/login?return_to=<url>,
but the login page itself triggered AuthGate, which redirected again
with double-encoded return_to. Each redirect added another encoding
layer until the URL exceeded 431 (Request Header Fields Too Large).

Two guards:
1. redirectToLogin() returns early if already on /cp/auth/* path
2. AuthGate skips redirect check entirely for /cp/auth/* paths

[Molecule-Platform-Evolvement-Manager]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-23 11:16:22 -07:00
Hongming Wang
dc476153c1 Merge remote-tracking branch 'origin/staging' into promote/main-to-staging-2026-04-23
# Conflicts:
#	canvas/src/components/__tests__/ContextMenu.keyboard.test.tsx
2026-04-23 09:50:16 -07:00
8f7808642a fix(test): add getState to useCanvasStore mock in ContextMenu keyboard test
PR #1781 introduced useCanvasStore.getState() call in ContextMenu.tsx
(line 169) but the existing Vitest mock for useCanvasStore in the keyboard
test file lacked a getState method, causing:
  TypeError: useCanvasStore.getState is not a function

Fix: attach getState: () => mockStore to the mock using Object.assign
so the static method is available alongside the selector fn.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 16:43:08 +00:00
Hongming Wang
47dc72c6b3 chore: promote main → staging (52 commits, 2 conflicts resolved)
Brings the staging branch up to date with main's feature-fix stream so
every staging-targeted PR stops tripping on pre-existing rot. Before
this merge, staging had 30+ compile + test failures from fix PRs that
landed on main but never reached staging — primarily #1755's panic-
cascade + schema-drift alignments.

After this merge the handlers package goes from 30+ fails → 2 pre-
existing nil-docker test panics (TestCopyFilesToContainer_CWE22_
RejectsTraversal + TestDeleteViaEphemeral_F1085_RejectsTraversal),
both authored on staging and broken before this promotion. Tracked
separately; not a merge regression.

## Conflicts resolved

1. docs/marketing/campaigns/discord-adapter-announcement/announcement.md
   — deleted on main (9d0d213: "move sensitive strategy + research to
   internal repo"), modified on staging. Deletion wins: marketing
   content moved out of the public monorepo per that commit's intent.
   The content lives in the internal repo.

2. workspace-server/internal/handlers/container_files.go — staging's
   rmTarget version kept. Main's version had `Cmd: []string{"rm",
   "-rf", "/configs/" + filePath}` which concatenates raw filePath
   AFTER the prefix-check on rmTarget, defeating the path-traversal
   guard (a "../etc/passwd" input passes validation but the rm cmd
   then traverses). Staging's `Cmd: []string{"rm", "-rf", rmTarget}`
   uses the validated path. Keeping staging's more-secure variant.

## Includes build unblockers from #1769 / #1782
- terminal.go: malformed handleLocalConnect repaired
- terminal_test.go: missing braces in TestHandleConnect_RoutesToLocal
- workspace_crud.go: unused imports + duplicate strField block
- container_files_test.go: duplicate contains() removed (uses the one
  in workspace_provision_test.go, same package)

## Verification
- go build ./...  clean
- go vet ./...  clean
- go test -race ./... — 18/20 packages green; 2 test panics in
  internal/handlers are pre-existing on staging (documented above)
2026-04-23 08:51:01 -07:00
Hongming Wang
68ee76c6b7 fix(canvas): add getState to useCanvasStore mock in ContextMenu keyboard test
ContextMenu.tsx reads parent-workspace children via
useCanvasStore.getState().nodes.filter(...) — a direct .getState()
call, not the selector-calling form. The existing vi.mock exposed
only the selector form, so rendering crashed with
"TypeError: useCanvasStore.getState is not a function".

Restructure the vi.mock factory to return Object.assign(fn, {
getState: () => mockStore }) so both call shapes resolve. Factory body
builds the function locally because vi.mock hoists above outer-scope
variable declarations and can't reference `mockStore` via closure.

Verified: all 15 tests in the file pass after the change.

Unblocks the Canvas (Next.js) CI check on PR #1743 (staging→main sync).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 01:49:34 -07:00
Hongming Wang
d4cead5002 chore: extract ContextMenu Zustand fix + a2a_proxy local-docker SSRF bypass + workspace-server Dockerfile GID entrypoint
Three small, non-overlapping fixes extracted from closed PR #1664:

1. canvas/src/components/ContextMenu.tsx — Replace the useMemo-over-nodes
   pattern with a hashed-boolean selector (s.nodes.some(...)) so Zustand's
   useSyncExternalStore snapshot comparison is stable. Resolves React
   error #185 (infinite render loop). Moves the child-node list derivation
   into the delete handler via getState() so the render path no longer
   allocates a fresh array.

2. workspace-server/internal/handlers/a2a_proxy.go — Allow the
   Docker-bridge hostname path (ws-<id>:8000) to skip the SSRF guard in
   local-docker mode. Gated on !saasMode() so SaaS deployments keep the
   full private-IP blocklist (a remote workspace registration can't claim
   a ws-* hostname and reach a sensitive VPC IP).

3. workspace-server/Dockerfile — Add entrypoint.sh that discovers the
   docker.sock GID at boot and adds the platform user to that group, then
   exec's su-exec to drop privileges. Lets the platform container reach
   the host docker socket without running as root.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 20:00:16 -07:00
molecule-ai[bot]
9d076b9c4d
Merge pull request #1684 from Molecule-AI/fix/missing-keys-modal-a11y-v2
fix(canvas/a11y): MissingKeysModal — backdrop aria-hidden, decorative SVGs, form labels
2026-04-23 02:54:46 +00:00
5157f80d19 fix(canvas): add type=button to ApprovalBanner action buttons (bug #1669)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 02:15:52 +00:00
Hongming Wang
e08ea7b5ba fix(canvas): require hermes model at create + send to CP (fixes silent Anthropic 401)
Root cause of the hermes 401 "Invalid API key" on SaaS workspaces:

  1. CreateWorkspaceDialog never sent `model` in the /workspaces POST
  2. Tenant/CP plumbed through a valid (provider, API key) but empty MODEL
  3. Workspace install.sh ran with HERMES_DEFAULT_MODEL unset
  4. derive-provider.sh saw no slug → PROVIDER="auto"
  5. Hermes fell back to its compiled-in default (Anthropic via
     OpenAI-compat adapter)
  6. User's MINIMAX_API_KEY was present but irrelevant — hermes tried
     Anthropic with it → 401

Fix:

- Extend HERMES_PROVIDERS with `defaultModel` + `models` (suggestion
  list). Each provider ships with a known-good default so the trap
  is physically impossible to hit with the new form.
- Add a required Model input to the Hermes panel, auto-populated
  from the provider's defaultModel when the provider changes (only
  if the user hasn't typed their own slug yet).
- Datalist surfaces additional model suggestions per provider so
  users can pick a different size (e.g. M2.7-highspeed) without
  typing the whole slug.
- handleCreate validates hermesModel is non-empty, sends as `model`
  in the POST body alongside the secrets block.
- useEffect guard avoids clobbering a user-typed custom slug when
  they toggle providers back and forth.

Existing 19 a11y tests still pass (non-SaaS path unchanged, four-tier
picker still renders, arrow-key nav still wraps).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:59:49 -07:00
Hongming Wang
66de81fbfa
Merge pull request #1689 from Molecule-AI/refactor/strip-secret-service-dropdown
refactor(secrets): strip Service dropdown from Add-Key form
2026-04-22 18:46:02 -07:00
Hongming Wang
0574e7c1d0 feat(canvas): add T4 tier (full-host access); SaaS default T4
Following feedback that T4 — not T3 — is the full-access tier:

- Non-SaaS picker now shows all four tiers: T1 Sandboxed, T2 Standard,
  T3 Privileged, T4 Full Access. Four-column grid.
- SaaS picker stays single-option but now locks to T4 (was T3). Every
  SaaS workspace gets a dedicated EC2 VM, which is unambiguously the
  "full host" case — T3 (privileged container) was a category mismatch.
- Default tier on SaaS is 4 (was 3). CP provisioner already supports
  tier 4 (t3.large / 80 GB). TIER_CONFIG already has T4's amber color.

Tests updated for the four-tier picker: wrap tests now go T4 ↔ T1, and
the selection/tabIndex tests cover the fourth button.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:17:13 -07:00
382238daa3 test(canvas): relax setPendingDelete assertion to use expect.objectContaining
Staging added hasChildren/children fields to workspace store shape.
Test assertion updated to use objectContaining to avoid false negatives.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 00:59:38 +00:00
66c6b83ab2 test(canvas): add ActivityTab and MissingKeysModal component tests
- ActivityTab.test.tsx: 27 tests covering filter bar (aria-pressed states,
  API reload), loading/error/empty states, ActivityRow content (type badges,
  method, duration_ms, summary, error styling), A2A flow indicators,
  auto-refresh Live/Paused toggle, refresh button, activity count

- MissingKeysModal.component.test.tsx: 25 tests covering visibility,
  ARIA semantics (role=dialog, aria-modal, aria-labelledby), content,
  keyboard (Escape, Enter), save flow (disabled/.../Saved/error), Add Keys
  & Deploy gate, Cancel + backdrop click, Open Settings button

- MissingKeysModal.test.tsx: refactored to preflight logic only (7 tests);
  component rendering now covered in component test file

863 tests passing (+3 net).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 00:58:56 +00:00
Hongming Wang
8b1af9708c feat(canvas): default tier T3 and hide T1/T2 on SaaS
On SaaS every workspace gets its own EC2 VM — the Docker-sandbox
distinction between T1 (sandboxed), T2 (standard Docker), and T3
(full host access) doesn't apply. A SaaS workspace is always a
dedicated VM, which is "full access" by construction. Showing T1/T2
in that UI is a category error: users pick a sandbox level that has
no effect on the actual EC2 machine they get.

Changes:
- tenant.ts: export isSaaSTenant() — returns true when canvas is
  served at <slug>.moleculesai.app (SSR-safe: false on server)
- CreateWorkspaceDialog: when isSaaSTenant(), render only the T3
  option, default tier=3, grid collapses to a single column. Label
  gets a " — dedicated VM" hint so the user knows what they're
  getting. On self-hosted the full T1/T2/T3 picker is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:02:48 -07:00
Hongming Wang
d956164812 refactor(secrets): strip Service dropdown from Add-Key form
The Add-Key form used to open with a required Service dropdown
(GitHub / Anthropic / OpenRouter / Other) that gated everything
else. The dropdown did no persistent work — the secret store only
cares about (key_name, value); the Service label was never saved
anywhere. It also suffered registry drift: today we support ~22
hermes-dispatched providers (MiniMax, Gemini, DeepSeek, Kimi, Qwen,
NVIDIA, etc.); only 3 had entries. Everyone else landed in "Other"
with no downside beyond the mandatory click.

Replaces it with:

1. Key-name <datalist> autocomplete sourced from new
   KEY_NAME_SUGGESTIONS in lib/services.ts — 26 entries covering
   common infra keys + every hermes-supported provider.

2. inferGroup(keyName) derives classification at render time,
   matching what the store already does in getGrouped(). No
   behaviour change for list grouping.

3. Provider docs link renders inline only when inferGroup
   recognises the name. For 'custom' keys we stay quiet — no
   false-structure prompt.

4. Test-connection button still available when the inferred group
   supports it AND the value is format-valid. Same providers as
   before.

SERVICES registry preserved for LIST rendering + test routing.

Result: two fields instead of three. One fewer decision. Provider-
agnostic by design — new providers work the moment someone types
their canonical env var name; no UI code change per provider.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 16:41:43 -07:00
Hongming Wang
f6e6a64ba9 fix(canvas): forward-port dynamic runtime dropdown from staging (PR #1526)
PR #1526 shipped the /templates registry + canvas dynamic Runtime /
Model / Required-Env fields on 2026-04-22 — but merged into the
staging branch, not main. The staging→main promotion PR #1496 has
been open unmerged for a while with 1172 commits divergence, so
prod (which builds from main) still carries the old hardcoded
dropdown.

Symptom seen on hongmingwang.moleculesai.app today:

- New Hermes Agent workspace (template declares runtime: hermes) loads
  Config tab → Runtime dropdown shows "LangGraph (default)" because
  there's no <option value="hermes"> in the hardcoded list; it falls
  back to empty-value silently.
- Model field is a plain TextInput with static placeholder
  "e.g. anthropic:claude-sonnet-4-6" — should be a combobox populated
  from the selected runtime's models[].
- Required Env Vars is a TagList with static placeholder
  "e.g. CLAUDE_CODE_OAUTH_TOKEN" — should auto-populate from the
  selected model's required_env.
- Net effect: "Save & Deploy" sends empty model + empty env to the
  provisioner → workspace instant-fails.

This PR cherry-picks the exact three files from PR #1526 (#359dc61
on staging) forward to main, without pulling the other 1171
commits:

- canvas/src/components/tabs/ConfigTab.tsx
  - RuntimeOption interface + FALLBACK_RUNTIME_OPTIONS (hermes,
    gemini-cli included)
  - useEffect fetches /templates and populates runtimeOptions
    dynamically
  - dropdown renders from runtimeOptions (no hardcoded list)
  - Model becomes a combobox with datalist of available models
    per selected runtime
  - Required Env Vars auto-populates from the selected model's
    required_env on model change

- workspace-server/internal/handlers/templates.go
  - /templates endpoint returns [{id, name, runtime, models}] with
    per-template models registry (id, name, required_env)

- workspace-server/internal/handlers/templates_test.go
  - Tests for runtime+models parsing and legacy top-level model
    fallback

The canvas Runtime dropdown now resolves "hermes" correctly;
Model dropdown shows the models[] from the hermes template; Env
auto-populates with HERMES_API_KEY (or whichever model selected).

Verified locally:
  - workspace-server builds clean
  - Template handler tests pass: TestTemplatesList_RuntimeAndModelsRegistry,
    TestTemplatesList_LegacyTopLevelModel, TestTemplatesList_NonexistentDir

Follow-up: the staging→main promotion gap (#1496) is the
underlying process issue. Either merge that PR or adopt a policy
of landing fixes directly on main (as several PRs have today).
Files here were chosen minimally to avoid pulling unrelated staging
changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:28:38 -07:00
airenostars
7a89704b6e
fix(build): add missing fmt import + fix canvas Dockerfile GID (#1487)
* docs(canary-release): flag as aspirational; link to current state

The canary-release.md doc describes the pipeline as if the fleet is
running — referring to AWS account 004947743811 and a configured
MoleculeStagingProvisioner role. Reality as of 2026-04-22: no canary
tenants are provisioned, the 3 GH Actions secrets are empty, and
canary-verify.yml has failed 7/7 times in a row.

Added a top-of-doc ⚠️ state note that:

1. Clarifies this is intended design, not deployed reality.
2. Notes the AWS account ID is historical / unverified.
3. Explains that merges currently rely on manual promote-latest.
4. Cross-links to molecule-controlplane/docs/canary-tenants.md for
   the Phase 1 work that's shipped, the Phase 2 stand-up plan, and
   the "should we even do this now?" decision framework.
5. Asks whoever lands Phase 2 to reconcile the two docs.

No behaviour change — doc-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(build): add missing fmt import in a2a_proxy.go, fix canvas Dockerfile GID

- a2a_proxy.go: missing "fmt" import caused build failure (8 undefined
  references at lines 743-775). Likely dropped during a recent merge.
- canvas/Dockerfile: GID 1000 already in use in node base image.
  Changed to dynamic group/user creation with fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Hongming Wang <hongmingwangrabbit@gmail.com>
2026-04-22 21:10:58 +00:00
116526bff3 fix(canvas/a11y): orgs/page.tsx — form labels, error announcements, checkout banner
- CreateOrgForm: replace bare <span> labels with <label htmlFor> + input id
  (WCAG 1.3.1 — programmatic label association); add aria-describedby hint for slug field
- Error state: add role=alert on error <p> (WCAG 4.1.3 — Status Messages)
- CheckoutBanner: add role=status + aria-live=polite (WCAG 4.1.3);
  restore decorative ✓ with aria-hidden=true

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 21:06:20 +00:00
d6dbf23172 test(canvas/a11y): add WCAG 2.1 accessibility tests for ConsoleModal and DeleteCascadeConfirmDialog
ConsoleModal: role=dialog, aria-modal, aria-labelledby, backdrop aria-hidden, error role=alert, accessible button names
DeleteCascadeConfirmDialog: role=dialog, aria-modal, aria-labelledby, backdrop aria-hidden, SVG aria-hidden, disabled state, keyboard interactions (Escape, Enter), accessible names

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 20:39:48 +00:00
8bb0fe70ff fix(canvas/a11y): DeleteCascadeConfirmDialog backdrop aria-hidden (WCAG 4.1.2)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 20:36:05 +00:00
a322dd0056 fix(canvas/a11y): unaudited components — backdrop/semantic a11y gaps
- ConsoleModal.tsx: backdrop div aria-hidden; error div role=alert (WCAG 4.1.2)
- ProvisioningTimeout.tsx: warning SVG aria-hidden; cancel-dialog backdrop aria-hidden (WCAG 4.1.2)
- TermsGate.tsx: backdrop aria-hidden; dialog role=dialog+aria-modal+aria-labelledby; error role=alert
- TopBar.tsx: replace non-semantic role=banner div with <header>; logo emoji aria-hidden
- FilesToolbar.tsx: aria-label on select dropdown; aria-label on all icon buttons (New, Upload, Export, Clear, Refresh, file input)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 20:07:49 +00:00
c6e7ccb289 fix(canvas/a11y): MissingKeysModal — backdrop aria-hidden, decorative SVGs
- Backdrop div: add aria-hidden="true" so screen readers skip it (WCAG 4.1.2)
- Warning triangle SVG (header): add aria-hidden="true" (decorative icon)
- Saved-badge checkmark SVG: add aria-hidden="true" (decorative icon)
- Add MissingKeysModal.a11y.test.tsx: 14 tests covering role=dialog,
  aria-modal, aria-labelledby, backdrop aria-hidden, SVG aria-hidden,
  focus-on-open (WCAG 2.4.3), Escape key handler (WCAG 2.1.2),
  accessible button names

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 19:40:18 +00:00
e211a25ccd fix(canvas/a11y): dialog aria-modal, icon-button labels, focus management
- CookieConsent.tsx: add aria-modal="true" (WCAG 2.1.1)
- ConsoleModal.tsx: add useRef + requestAnimationFrame focus management on open
- ConversationTraceModal.tsx: remove redundant aria-describedby={undefined}
- FileTree.tsx: add aria-label to directory/file delete buttons (WCAG 4.1.2)
- FileEditor.tsx: add aria-label to download button (WCAG 4.1.2)
- ScheduleTab.tsx: add aria-label to Run Now, Edit, Delete icon buttons
- form-inputs.tsx: add aria-label to tag removal button

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 19:03:00 +00:00
Molecule AI Fullstack (floater)
ea5e018f76 Merge main into staging to sync 2026-04-22 18:15:52 +00:00
molecule-ai[bot]
6bd1691446
Merge pull request #1594 from Molecule-AI/fix/canvas-a11y-clean
fix(canvas/a11y): aria-hidden on decorative SVGs + MissingKeysModal semantics
2026-04-22 18:11:12 +00:00
236158d4a4 fix(canvas/a11y): add aria-hidden to decorative SVGs + MissingKeysModal semantics
- DeleteCascadeConfirmDialog: aria-hidden on warning triangle SVG (button
  already has adjacent text content; icon is purely decorative)
- Toolbar: aria-hidden on 4 decorative SVGs (stop-all, restart-pending,
  search, help) — buttons all have aria-label/aria-expanded/text
- MissingKeysModal: role="dialog" aria-modal="true" aria-labelledby on
  container, id="missing-keys-title" on heading, requestAnimationFrame
  focus management via useRef (replaces autoFocus={index===0})
- CreateWorkspaceDialog: remove redundant aria-describedby={undefined}

WCAG 2.1 SC 1.1.1 — screen readers skip purely-presentational icons.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 17:40:43 +00:00
Hongming Wang
5f96a832e7 fix(canvas): drop node:20-alpine default user before creating canvas uid 1000
publish-canvas-image has been failing on every main push since 2026-04-21
at `addgroup -g 1000 canvas` because node:20-alpine already ships a `node`
user/group at uid/gid 1000. Same collision workspace-server/Dockerfile.tenant
already fixes with `deluser --remove-home node` before `addgroup`.

Copying that pattern here so the workflow goes green again and canvas images
publish to ghcr. No runtime behaviour change — canvas still runs as non-root
uid 1000.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 09:42:02 -07:00
Hongming Wang
359dc615e9
fix(canvas+templates): fetch runtime dropdown from /templates registry (#1526)
* fix(canvas+templates): fetch runtime dropdown from /templates registry

Canvas hardcoded 6 runtime options, drifting from manifest.json which
already registers hermes + gemini-cli as first-class workspace templates.
A Hermes workspace had runtime=hermes in its DB row but Config showed
"LangGraph (default)" — the HTML select fell back to its first option
because "hermes" wasn't listed, and saving would clobber the runtime
back to empty.

Now:
- GET /templates returns the runtime field from each cloned template's
  config.yaml (previously dropped on the floor)
- ConfigTab fetches /templates on mount, dedupes non-empty runtimes, and
  renders them as <option>s. Falls back to the static list if the fetch
  fails (offline, older backend), so the control never renders empty.

Adding a template to manifest.json now flows through automatically — no
canvas PR required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(canvas+templates): model + required-env suggestions from template

Extends the dropdown fix so Model and Required Env also flow from
the template registry instead of being free-form fields the user
has to remember.

Template config.yaml now declares:

  runtime_config:
    model: <default>
    models:
      - id: nous-hermes-3-70b
        name: Nous Hermes 3 70B (Nous Portal)
        required_env: [HERMES_API_KEY]
      - id: nousresearch/hermes-3-llama-3.1-70b
        name: Hermes 3 70B (via OpenRouter)
        required_env: [OPENROUTER_API_KEY]

Platform: GET /templates now returns runtime + model + models[] per
template (was previously dropping runtime + ignoring runtime_config).

Canvas:
- Runtime dropdown built from /templates (was hardcoded 6 options)
- Model input becomes a datalist combobox; free-form input still
  allowed since model names rotate faster than templates
- Required Env Vars default to the selected model's required_env,
  labelled "(suggested)" so the user knows it's template-driven
- Everything falls back to a static list when /templates is
  unreachable, so offline editing still works

Follow-up: add models[] to the other 7 template repos (claude-code,
crewai, autogen, deepagents, openclaw, gemini-cli, langgraph). This
PR updates the platform + canvas; the Hermes template config update
goes in a separate PR against its own repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(canvas): commit required_env on model change; add backend tests

Review turned up that the \"Required Env Vars (suggested)\" display
was cosmetic-only — users picking a different model saw the new
env suggestion in the TagList, but the values never made it into
state, so Save serialized an empty (or stale) required_env and the
workspace ran with the wrong auth check.

Canvas fixes:
- Model input onChange now commits the matched modelSpec's required_env
  to state — but only when the prior required_env was empty or matched
  the previous modelSpec's list (i.e. user hadn't manually edited).
  User-typed envs always win.
- Dropped the display-only fallback in TagList values; shows only what's
  actually in state.
- New \"Template suggests X, Apply\" hint button covers the edge case
  where state and template differ (existing workspace whose required_env
  lags the template's current recommendation).
- datalist option key now includes index so template authors shipping
  duplicate model ids don't trigger a silent React key collision.
- Small arraysEqual helper.

Backend tests:
- TestTemplatesList_RuntimeAndModelsRegistry — asserts /templates
  response carries runtime + models[] with per-model required_env.
- TestTemplatesList_LegacyTopLevelModel — asserts older templates with
  top-level model: still surface correctly, with empty Models[].

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 15:07:46 +00:00
Hongming Wang
e88ab70251 fix(canvas): stop infinite re-render on ContextMenu mount
ContextMenu's children selector ran .filter() inside the Zustand
hook, returning a brand-new array reference on every render.
useSyncExternalStore under the hood compares snapshots with
Object.is — a new array always differs, so React kept scheduling
re-renders, hit the 50-update depth cap, and crashed with minified
error #185.

Observed as "Application error: a client-side exception" on every
SaaS tenant once a session cookie resolved. Caught in dev mode
where the build emits the clear warning:

  The result of getSnapshot should be cached to avoid an infinite loop
      at ContextMenu (src/components/ContextMenu.tsx:26:34)

Fix: select the stable nodes array once, derive children via
useMemo outside the store subscription. Same output, no new
reference per render.

Manually verified: dev bundle served through a cloudflared tunnel
to a live tenant, ContextMenu component mounts cleanly, remaining
console errors are all unrelated (localhost API 401s from the dev
server pointing at its own origin).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 21:47:32 -07:00
molecule-ai[bot]
64ccf8e179
fix: CWE-78 rm scope, go vet failures, delegation idempotency
* refactor: split 4 oversized handler files into focused sub-files

- org.go (1099 lines) → org.go + org_import.go + org_helpers.go
- mcp.go (1001 lines) → mcp.go + mcp_tools.go
- workspace.go (934 lines) → workspace.go + workspace_crud.go
- a2a_proxy.go (825 lines) → a2a_proxy.go + a2a_proxy_helpers.go

No functional changes — same package, same exports, same tests.
All files stay under 635 lines.

Note: isSafeURL and isPrivateOrMetadataIP are duplicated between
mcp_tools.go and a2a_proxy_helpers.go — this is a pre-existing issue
from the original mcp.go and a2a_proxy.go, not introduced by this split.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(runtime+scheduler): increment/decrement active_tasks counter (refs #1386)

* docs(tutorials): add Self-Hosted AI Agents guide — Docker, Fly Machines, bare metal

* docs: add Remote Agents feature + Phase 30 blog links to docs index

* docs(marketing): update Phase 30 brief — Action 5 complete, docs/index.md update noted

* docs(api-ref): add workspace file copy API reference (#1281)

Documents TemplatesHandler.copyFilesToContainer (container_files.go):
- Endpoint overview: PUT /workspaces/:id/files/*path
- Parameter descriptions for all four function parameters
- CWE-22 path traversal protection (PRs #1267/1270/1271)
- Defense-in-depth: validateRelPath at handler + archive boundary
- Full error code table (400/404/500)
- curl example with success and path-traversal rejection cases

Also covers: writeViaEphemeral routing, findContainer fallback,
allowed roots allow-list, and related links to platform-api.md.

Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): CWE-78/CWE-22 — block shell injection in deleteViaEphemeral (#1310)

## Summary
Issue #1273: deleteViaEphemeral interpolated filePath directly into
rm command, enabling both shell injection (CWE-78) and path traversal
(CWE-22) attacks.

## Changes
1. Added validateRelPath(filePath) guard before constructing the rm command.
   validateRelPath blocks absolute paths and ".." traversal sequences.
2. Changed Cmd from "/configs/"+filePath (string interpolation) to
   []string{"rm", "-rf", "/configs", filePath} (exec form). This
   eliminates shell injection entirely — filePath is a plain argument,
   never interpreted as shell code.

## Security properties
- validateRelPath: blocks "../" and absolute paths before they reach Docker
- Exec form: filePath cannot inject shell metacharacters even if validation
  is somehow bypassed
- "/configs" as separate arg: rm has exactly two arguments, no room for
  injected args

Closes #1273.

Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app>

* fix(security): backport SSRF defence (CWE-918) to main — isSafeURL in a2a_proxy.go (#1292) (#1302)

* fix(security): backport SSRF defence (CWE-918) to main — isSafeURL in mcp.go and a2a_proxy.go

Issue #1042: 3 CodeQL SSRF findings across mcp.go and a2a_proxy.go.
staging already ships the fix (PRs #1147, #1154 → merged); main did not include it.

- mcp.go: add isSafeURL() + isPrivateOrMetadataIP() helpers; validate
  agentURL before outbound calls in mcpCallTool (line ~529) and
  toolDelegateTaskAsync (line ~607)
- a2a_proxy.go: add identical isSafeURL() + isPrivateOrMetadataIP()
  helpers; call isSafeURL() before dispatchA2A in resolveAgentURL()
  (blocks finding #1 at line 462)
- mcp_test.go: 19 new tests covering all blocked URL patterns:
  file://, ftp://, 127.0.0.1, ::1, 169.254.169.254, 10.x.x.x,
  172.16.x.x, 192.168.x.x, empty hostname, invalid URL,
  isPrivateOrMetadataIP across all private/CGNAT/metadata ranges

1. URL scheme enforcement — http/https only
2. IP literal blocking — loopback, link-local, RFC-1918, CGNAT, doc/test ranges
3. DNS hostname resolution — blocks internal hostnames resolving to private IPs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci-blocker): remove duplicate isSafeURL/isPrivateOrMetadataIP from mcp.go

Issue #1292: PR #1274 duplicated isSafeURL + isPrivateOrMetadataIP in
mcp.go — both functions already exist on main at lines 829 and 876.
Kept the mcp.go definitions (the originals) and removed the 70-line
duplicate appended at end of file. a2a_proxy.go functions are
unchanged — they serve the same purpose via a separate code path.

* fix: remove orphaned commit-text lines from a2a_proxy.go

Three lines from the PR/commit title were accidentally baked into the
file during the rebase from #1274 to #1302, causing a Go syntax error
(a bare string literal at statement level followed by dangling braces).

Deletion restores:
  }
  return agentURL, nil
}

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app>
Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app>

* fix(canvas/test): patch test regressions from PR #1243 + proximity hitbox fix (#1313)

* fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled

With cancel-in-progress: false, pending CI runs accumulate in the
ci-staging concurrency group. New pushes create queued runs, but
GitHub dispatches multiple runs for the same SHA instead of replacing
the pending one. All runs get stuck/cancelled before completing.

Reverting to cancel-in-progress: true restores CI operation — runs
that are superseded are cancelled, freeing the concurrency slot for
the new run to proceed.

Runner availability (ubuntu-latest dispatch stall) is a separate
infra issue tracked independently.

* fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043)

Tar header names were built from raw map keys without validation. A malicious
server-side caller could embed "../" in a file name to escape the destPath
volume mount (/configs) and write files outside the intended directory.

Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks
before using it in the tar header, then join with destPath for the archive
header. Also guard parent-directory creation against traversal.

Closes #1043.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix

Two regressions introduced by PR #1243 (fix issue #1207):

1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives
   `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test
   expected only `{id, name}`. Added `hasChildren: false` to the assertion.

2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)`
   without `act()`. With fake timers, `setState` (synchronous) is flushed by
   `advanceTimersByTimeAsync`, but the React state update it triggers is a
   microtask — so the test saw stale render. Wrapping in `act(async () =>
   { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain
   before assertions run.

All 813 vitest tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add 100px proximity threshold to drag-to-nest detection

Fixes #1052 — previously, getIntersectingNodes() returned any node whose
bounding box overlapped the dragged node, regardless of actual pixel
distance. On a sparse canvas this triggered the "Nest Workspace" dialog
even when the dragged node was nowhere near any target.

The fix adds an on-node-drag proximity filter: only nodes within 100px
(center-to-center) of the dragged node are eligible as nest targets.
Distance is computed as squared Euclidean to avoid the sqrt overhead in
the hot drag path.

Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring
and confirming the regression is addressed in Canvas.tsx.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add ?? 0 guard for optional budget_used in progressPct (#1324) (#1327)

* fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled

With cancel-in-progress: false, pending CI runs accumulate in the
ci-staging concurrency group. New pushes create queued runs, but
GitHub dispatches multiple runs for the same SHA instead of replacing
the pending one. All runs get stuck/cancelled before completing.

Reverting to cancel-in-progress: true restores CI operation — runs
that are superseded are cancelled, freeing the concurrency slot for
the new run to proceed.

Runner availability (ubuntu-latest dispatch stall) is a separate
infra issue tracked independently.

* fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043)

Tar header names were built from raw map keys without validation. A malicious
server-side caller could embed "../" in a file name to escape the destPath
volume mount (/configs) and write files outside the intended directory.

Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks
before using it in the tar header, then join with destPath for the archive
header. Also guard parent-directory creation against traversal.

Closes #1043.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix

Two regressions introduced by PR #1243 (fix issue #1207):

1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives
   `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test
   expected only `{id, name}`. Added `hasChildren: false` to the assertion.

2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)`
   without `act()`. With fake timers, `setState` (synchronous) is flushed by
   `advanceTimersByTimeAsync`, but the React state update it triggers is a
   microtask — so the test saw stale render. Wrapping in `act(async () =>
   { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain
   before assertions run.

All 813 vitest tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add 100px proximity threshold to drag-to-nest detection

Fixes #1052 — previously, getIntersectingNodes() returned any node whose
bounding box overlapped the dragged node, regardless of actual pixel
distance. On a sparse canvas this triggered the "Nest Workspace" dialog
even when the dragged node was nowhere near any target.

The fix adds an on-node-drag proximity filter: only nodes within 100px
(center-to-center) of the dragged node are eligible as nest targets.
Distance is computed as squared Euclidean to avoid the sqrt overhead in
the hot drag path.

Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring
and confirming the regression is addressed in Canvas.tsx.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add ?? 0 guard for optional budget_used in progressPct

Fixes #1324 — TypeScript strict mode flags budget.budget_used as
possibly undefined in the progressPct ternary, even though the
outer condition checks budget_limit > 0.

Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0%
when the backend returns a partial shape (provisioning-stuck
workspaces). Also adds a test covering the undefined-budget_used
case with the progress bar aria-valuenow and fill width both at 0%.

Closes #1324.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add ?? 0 guard for optional budget_used in progressPct (issue #1324) (#1329)

* fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled

With cancel-in-progress: false, pending CI runs accumulate in the
ci-staging concurrency group. New pushes create queued runs, but
GitHub dispatches multiple runs for the same SHA instead of replacing
the pending one. All runs get stuck/cancelled before completing.

Reverting to cancel-in-progress: true restores CI operation — runs
that are superseded are cancelled, freeing the concurrency slot for
the new run to proceed.

Runner availability (ubuntu-latest dispatch stall) is a separate
infra issue tracked independently.

* fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043)

Tar header names were built from raw map keys without validation. A malicious
server-side caller could embed "../" in a file name to escape the destPath
volume mount (/configs) and write files outside the intended directory.

Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks
before using it in the tar header, then join with destPath for the archive
header. Also guard parent-directory creation against traversal.

Closes #1043.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix

Two regressions introduced by PR #1243 (fix issue #1207):

1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives
   `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test
   expected only `{id, name}`. Added `hasChildren: false` to the assertion.

2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)`
   without `act()`. With fake timers, `setState` (synchronous) is flushed by
   `advanceTimersByTimeAsync`, but the React state update it triggers is a
   microtask — so the test saw stale render. Wrapping in `act(async () =>
   { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain
   before assertions run.

All 813 vitest tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add 100px proximity threshold to drag-to-nest detection

Fixes #1052 — previously, getIntersectingNodes() returned any node whose
bounding box overlapped the dragged node, regardless of actual pixel
distance. On a sparse canvas this triggered the "Nest Workspace" dialog
even when the dragged node was nowhere near any target.

The fix adds an on-node-drag proximity filter: only nodes within 100px
(center-to-center) of the dragged node are eligible as nest targets.
Distance is computed as squared Euclidean to avoid the sqrt overhead in
the hot drag path.

Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring
and confirming the regression is addressed in Canvas.tsx.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add ?? 0 guard for optional budget_used in progressPct

Fixes #1324 — TypeScript strict mode flags budget.budget_used as
possibly undefined in the progressPct ternary, even though the
outer condition checks budget_limit > 0.

Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0%
when the backend returns a partial shape (provisioning-stuck
workspaces). Also adds a test covering the undefined-budget_used
case with the progress bar aria-valuenow and fill width both at 0%.

Closes #1324.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(platform): unblock SaaS workspace registration end-to-end

Every workspace in the cross-EC2 SaaS provisioning shape was failing
registration, heartbeat, or A2A routing. Four distinct blockers sat
between "EC2 is up" and "agent responds"; three are platform-side and
fixed here (the fourth is in the CP user-data, separate PR).

1. SSRF validator blocked RFC-1918 (registry.go + mcp.go)
   validateAgentURL and isPrivateOrMetadataIP rejected 172.16.0.0/12,
   which contains the AWS default VPC range (172.31.x.x) that every
   sibling workspace EC2 registers from. Registration returned 400 and
   the 10-min provision sweep flipped status to failed. RFC-1918 +
   IPv6 ULA are now gated behind saasMode(); link-local (169.254/16),
   loopback, IPv6 metadata (fe80::/10, ::1), and TEST-NET stay blocked
   unconditionally in both modes.

   saasMode() resolution order:
     1. MOLECULE_DEPLOY_MODE=saas|self-hosted (explicit operator flag)
     2. MOLECULE_ORG_ID presence (legacy implicit signal, kept for
        back-compat so existing deployments don't need a config change)

   isPrivateOrMetadataIP now actually checks IPv6 — previously it
   returned false on any non-IPv4 input, which would let a registered
   [::1] or [fe80::...] URL bypass the SSRF check entirely.

2. Orphan auth-token minting (workspace_provision.go)
   issueAndInjectToken mints a token and stuffs it into
   cfg.ConfigFiles[".auth_token"]. The Docker provisioner writes that
   file into the /configs volume — the CP provisioner ignores it
   (only cfg.EnvVars crosses the wire). Result: live token in DB, no
   plaintext on disk, RegistryHandler.requireWorkspaceToken 401s every
   /registry/register attempt because the workspace is no longer in
   the "no live token → bootstrap-allowed" state. Now no-ops in SaaS
   mode; the register handler already mints on first successful
   register and returns the plaintext in the response body for the
   runtime to persist locally.

   Also removes the redundant wsauth.IssueToken call at the bottom of
   provisionWorkspaceCP, which created the same orphan-token pattern
   a second time.

3. Compaction artefacts (bundle/importer.go, handlers/org_tokens.go,
   scheduler.go, workspace_provision.go)
   Four pre-existing compile errors on main from an earlier session's
   code truncation: missing tuple destructuring on ExecContext /
   redactSecrets / orgTokenActor, missing close-brace in
   Scheduler.fireSchedule's panic recovery. All one-line mechanical
   fixes; without them the binary would not build.

Tests
-----
ssrf_test.go adds:
  * TestSaasMode — covers the env resolution ladder (explicit flag
    wins over legacy signal, case-insensitive, whitespace tolerant)
  * TestIsPrivateOrMetadataIP_SaaSMode — asserts RFC-1918 + IPv6 ULA
    flip to allowed, metadata/loopback/TEST-NET still blocked
  * TestIsPrivateOrMetadataIP_IPv6 — regression guard for the old
    "returns false for all IPv6" behaviour

Follow-up issue for CP-sourced workspace_id attestation will be filed
separately — closes the residual intra-VPC SSRF + token-race windows
the SaaS-mode relaxation introduces.

Verified end-to-end today on workspace 6565a2e0 (hermes runtime, OpenAI
provider) — agent returned "PONG" in 1.4s after register → heartbeat →
A2A proxy → runtime.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(runtime+scheduler): increment/decrement active_tasks + max_concurrent (#1408)

Runtime (shared_runtime.py):
- set_current_task now increments active_tasks on task start, decrements
  on completion (was binary 0/1)
- Counter never goes below 0 (max(0, n-1))
- Pushes heartbeat immediately on BOTH increment and decrement (#1372)

Scheduler (scheduler.go):
- Reads max_concurrent_tasks from DB (default 1, backward compatible)
- Skips cron only when active_tasks >= max_concurrent_tasks (was > 0)
- Leaders can be configured with max_concurrent_tasks > 1 to accept
  A2A delegations while a cron runs

Platform:
- Added max_concurrent_tasks column to workspaces (migration 037)
- Workspace model + list/get queries include the new field
- API exposes max_concurrent_tasks in workspace JSON

Config.yaml support (future): runtime_config.max_concurrent_tasks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(review): address 3 critical issues from code review

1. BLOCKER: executor_helpers.py now uses increment/decrement too
   (was still binary 0/1, stomping the counter for CLI + SDK executors)

2. BUG: asymmetric getattr defaults fixed — both paths use default 0
   (was 0 on increment, 1 on decrement)

3. UX: current_task preserved when active_tasks > 0 on decrement
   (was clearing task description even when other tasks still running)

4. Scheduler polling loop re-reads max_concurrent_tasks on each poll
   (was using stale value from initial query)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Hongming Wang <hongmingwangrabbit@gmail.com>
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app>
Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app>
Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app>
Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com>

* docs: workspace files API reference, skill catalog, and links

* docs: fix secrets endpoint path across docs

The workspace secrets endpoint is `/workspaces/:id/secrets`, not
`/secrets/values`. This was wrong in quickstart.md (Path 2: Remote Agent)
and workspace-runtime.md (registration flow example and comparison table).
The external-agent-registration guide already had the correct path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: fix broken blog cross-link in skills-vs-bundled-tools post

Link path had an extra `/docs/` segment: `/docs/blog/...` instead of
`/blog/...`. Nextra resolves blog posts directly under `/blog/<slug>`,
not under `/docs/blog/`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add skill-catalog.md guide

Linked from the skills-vs-bundled-tools blog post as a reference
for TTS/image-generation/web-search skills. The blog promises
"install directly via the CLI" with a skill catalog — this page
fills that promise by documenting available skill types, install
commands, version management, custom skill authoring, and removal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(marketing): update Phase 30 brief — Action 5 complete, docs/index.md update noted

* docs(api-ref): add workspace file copy API reference

Documents TemplatesHandler.copyFilesToContainer (container_files.go):
- Endpoint overview: PUT /workspaces/:id/files/*path
- Parameter descriptions for all four function parameters
- CWE-22 path traversal protection (PRs #1267/1270/1271)
- Defense-in-depth: validateRelPath at handler + archive boundary
- Full error code table (400/404/500)
- curl example with success and path-traversal rejection cases

Also covers: writeViaEphemeral routing, findContainer fallback,
allowed roots allow-list, and related links to platform-api.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>

* fix(handlers): add saasMode() gating to isPrivateOrMetadataIP in a2a_proxy_helpers.go

Issue #1421 / #1401: PR #1363 (handler split) moved isPrivateOrMetadataIP
into a2a_proxy_helpers.go but kept the OLD pre-SaaS version — it
unconditionally blocks RFC-1918 addresses, regressing the fix in
commits 1125a02 / cf10733.

The A2A proxy path now has the same SaaS-gated logic as registry.go:
- Cloud metadata (169.254/16, fe80::/10, ::1) always blocked in both modes
- RFC-1918 (10/8, 172.16/12, 192.168/16) + IPv6 ULA (fc00::/7) blocked in
  self-hosted, allowed in SaaS cross-EC2 mode
- IPv6 addresses now properly checked (previous version returned false for all)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(marketing): Discord adapter Day 2 Reddit + HN community copy

* fix(tests): supply *events.Broadcaster pointer to captureBroadcaster

Cannot use *captureBroadcaster as *events.Broadcaster when the struct
embeds events.Broadcaster as a value — must initialize as a named field.

Fixes go vet error in workspace_provision_test.go:
  cannot use broadcaster (*captureBroadcaster) as *events.Broadcaster value

* Merge pull request #1429 from fix/canvas-tooltip-clear-timer

Without this, a 400ms setTimeout from onFocus/onMouseEnter that fires
after onBlur will re-show a tooltip the user just dismissed. The
setShow(false) in onBlur closes the tooltip immediately but leaves the
timer pending — Tab-blur followed by timer-fire would re-show it.

Fix: add clearTimeout(timerRef.current) at the top of onBlur, mirroring
the pattern already used in onMouseLeave and onFocus.

Refs: PR #1367 (a11y keyboard support — this was a pre-existing gap)

Co-authored-by: Molecule AI App-FE <app-fe@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas/test): add missing children:[] to setPendingDelete expectation (#1426)

PR #1252 (cascade-delete UX) updated setPendingDelete to pass a
children array for cascade-warning rendering. The keyboard-a11y test
assertion was not updated to match.

Test: clicking 'Delete' hoists state to the store and closes the menu

Co-authored-by: Molecule AI Core-QA <core-qa@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas/test): add children:[] to setPendingDelete + \&apos; entity fix (closes #1380) (#1427)

* ci: retry — trigger fresh runner allocation

* fix(canvas/test): add children:[] to setPendingDelete assertion

setPendingDelete now includes children:[] (PR #1383 extended the
pendingDelete type). The keyboard accessibility test at line 225 used
exact object matching which omitted the new field, causing a failure
after staging merged #1383.

Issue: #1380

* fix(canvas): replace &apos; HTML entity with straight apostrophe

JSX does not entity-decode &apos; — it renders the literal text
"&apos;" instead of "'".  Found at line 157 (payment confirmed) and
line 321 (empty org list).  Replaced with a straight apostrophe,
which JSX handles correctly.

Ref: issue #1375
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: DevOps Engineer <devops@molecule.ai>
Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* Merge pull request #1430 from fix/1421-saas-ssrf-helpers

Issue #1421 / #1401: PR #1363 (handler split) moved isPrivateOrMetadataIP
into a2a_proxy_helpers.go but kept the OLD pre-SaaS version — it
unconditionally blocks RFC-1918 addresses, regressing the fix in
commits 1125a02 / cf10733.

The A2A proxy path now has the same SaaS-gated logic as registry.go:
- Cloud metadata (169.254/16, fe80::/10, ::1) always blocked in both modes
- RFC-1918 (10/8, 172.16/12, 192.168/16) + IPv6 ULA (fc00::/7) blocked in
  self-hosted, allowed in SaaS cross-EC2 mode
- IPv6 addresses now properly checked (previous version returned false for all)

Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(P0): CWE-22 path traversal in copyFilesToContainer + ContextMenu test

Issue #1434 — CWE-22 Path Traversal Regression:
PR #1280 (dc218212) correctly used cleaned path in tar header.
PR #1363 (e9615af) regressed to using uncleaned `name`.
Fix: use `clean` in filepath.Join AND add defence-in-depth escape check.

Issue #1422 — ContextMenu Test Regression:
PR #1340 expanded pendingDelete store type to include `children:[]`.
Test assertion missing the field — add `children:[]` to match.

Note: ssrf.go created (shared isSafeURL/isPrivateOrMetadataIP) to
prepare for the handler-split refactor fix — current branch has no
build error, but the shared file will prevent regression when PR #1363
is merged. isSafeURL/isPrivateOrMetadataIP retained in both files
for now to avoid breaking callers while the split is finalized.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: resolve 3 go vet failures + add idempotency_key to delegate_task_async

- workspace_provision_test.go: add missing mock := setupTestDB(t) to
  TestSeedInitialMemories_Truncation — mock was referenced but never
  declared, causing "undefined: mock" vet error
- orgtoken/tokens_test.go: discard unused orgID return value with _ in
  Validate call — "declared and not used" vet error
- a2a_tools.py: delegate_task_async now sends idempotency_key (SHA-256
  of workspace_id + task) to POST /workspaces/:id/delegate, fixing
  duplicate task execution when an agent restarts mid-delegation (#1456)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: airenostars <airenostars@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
Co-authored-by: Hongming Wang <hongmingwangrabbit@gmail.com>
Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app>
Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app>
Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app>
Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app>
Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com>
Co-authored-by: Molecule AI Community Manager <community-manager@agents.moleculesai.app>
Co-authored-by: Molecule AI App-FE <app-fe@agents.moleculesai.app>
Co-authored-by: Molecule AI Core-QA <core-qa@agents.moleculesai.app>
Co-authored-by: DevOps Engineer <devops@molecule.ai>
Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app>
Co-authored-by: Molecule AI Dev Lead <dev-lead@agents.moleculesai.app>
2026-04-21 18:22:30 +00:00
molecule-ai[bot]
38e9eba59a
fix(P0): CWE-22 path traversal in copyFilesToContainer + ContextMenu test
Issue #1434 — CWE-22 Path Traversal Regression:
PR #1280 (dc218212) correctly used cleaned path in tar header.
PR #1363 (e9615af) regressed to using uncleaned `name`.
Fix: use `clean` in filepath.Join AND add defence-in-depth escape check.

Issue #1422 — ContextMenu Test Regression:
PR #1340 expanded pendingDelete store type to include `children:[]`.
Test assertion missing the field — add `children:[]` to match.

Note: ssrf.go created (shared isSafeURL/isPrivateOrMetadataIP) to
prepare for the handler-split refactor fix — current branch has no
build error, but the shared file will prevent regression when PR #1363
is merged. isSafeURL/isPrivateOrMetadataIP retained in both files
for now to avoid breaking callers while the split is finalized.

Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 16:56:47 +00:00
Hongming Wang
a14cf863d1
Merge pull request #1445 from Molecule-AI/fix/tenant-dockerfile-uid-conflict
fix(tenant-image): remove node user so canvas uid 1000 can be created
2026-04-21 08:58:09 -07:00
e9615af169 Merge origin/main into staging: resolve conflicts with main's test + security fixes
Conflicts resolved (took main's versions):
- canvas/src/app/__tests__/orgs-page.test.tsx (act() wrappers, PR #1350)
- canvas/src/components/Canvas.tsx (100px proximity threshold, PR #1357)
- canvas/src/components/__tests__/ContextMenu.keyboard.test.tsx (hasChildren fix)
- workspace-server/internal/handlers/container_files.go (CWE-22/CWE-78 fixes, PRs #1281/#1310)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 12:25:42 +00:00
Hongming Wang
6bd674e412 fix(e2e): CP DELETE /cp/admin/tenants body uses 'confirm', not 'confirm_token'
Verified against live staging: the admin endpoint returns 400 'confirm
field must equal the URL slug' when the body key is 'confirm_token'.
Every workflow's safety-net teardown step + the main harness + the
Playwright teardown all had the wrong key. Fixed all six call sites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 04:50:28 -07:00
Hongming Wang
d7193dfa34 feat(e2e): pivot to admin-bearer-only auth + add sanity self-check workflow
Reduces required secret surface from 2 (session cookie + admin token)
to 1 (admin token). Pairs with molecule-controlplane#202 which adds:
  - POST /cp/admin/orgs    — server-to-server org creation
  - GET /cp/admin/orgs/:slug/admin-token — per-tenant bearer fetch

With those endpoints live, CI doesn't need to scrape a browser WorkOS
session cookie. CP admin bearer (Railway CP_ADMIN_API_TOKEN) drives
provision + tenant-token retrieval + teardown through a single
credential.

Changes
-------
  test_staging_full_saas.sh: admin bearer for provision/teardown,
    fetched per-tenant token drives all tenant API calls. Added
    E2E_INTENTIONAL_FAILURE=1 toggle that poisons the tenant token
    after provisioning so the teardown path gets exercised when the
    happy-path isn't.

  canvas/e2e/staging-setup.ts: same pivot; exports STAGING_TENANT_TOKEN
    instead of STAGING_SESSION_COOKIE.
  canvas/e2e/staging-tabs.spec.ts: context.setExtraHTTPHeaders with
    Authorization: Bearer on every page request, no cookie handling.

  All three workflows (e2e-staging-saas, canary-staging,
    e2e-staging-canvas): drop MOLECULE_STAGING_SESSION_COOKIE env +
    verification step. One secret to set.

  NEW e2e-staging-sanity.yml: weekly Mon 06:00 UTC. Runs the harness
    with E2E_INTENTIONAL_FAILURE=1 and inverts the pass condition —
    rc=1 is green, rc=0 (unexpected success) or rc=4 (leak) open a
    priority-high issue labelled e2e-safety-net. This is the
    answer to 'how do we know the teardown path still works when
    nothing else has failed recently.'

STAGING_SAAS_E2E.md refreshed: single-secret setup, sanity workflow
documented, canvas workflow added to the coverage matrix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 04:34:11 -07:00
Hongming Wang
f4700858ac feat(e2e): canary + canvas Playwright workflows; delegation mechanics
Three additions on top of 187a9bf:

1. Canary (.github/workflows/canary-staging.yml)
   30-min cron that runs the full-SaaS harness in E2E_MODE=canary: one
   hermes workspace + one A2A PONG + teardown. ~8-min wall clock vs
   ~20-min for the full run.
   Alerting is self-contained: opens a single 'Canary failing' issue on
   first failure, comments on subsequent failures (no issue spam),
   auto-closes the issue on the next green run. Labels: canary-staging,
   bug. Safety-net teardown step sweeps e2e-YYYYMMDD-canary-* orgs
   tagged today so a runner cancel can't leak EC2.

2. Canvas Playwright (canvas/e2e/staging-*.ts + playwright.staging.config.ts
   + .github/workflows/e2e-staging-canvas.yml)
   staging-setup.ts provisions a fresh org + hermes workspace (same
   lifecycle as the bash harness, just in TypeScript). staging-tabs.spec.ts
   clicks through all 13 workspace-panel tabs (chat, activity, details,
   skills, terminal, config, schedule, channels, files, memory, traces,
   events, audit) and asserts each renders without crashing and without
   'Failed to load' error toasts. Known SaaS gaps (Files empty, Terminal
   disconnects, Peers 401) are documented in #1369 and whitelisted so
   they don't fail the test — the gate is 'no hard crash', not 'no
   issues'.
   staging-teardown.ts deletes the org via DELETE /cp/admin/tenants/:slug.
   playwright.staging.config.ts separates staging from local tests so
   pnpm test in dev doesn't try to provision against staging. Retries=2
   and timeouts are longer; workers=1 because the setup provisions one
   shared workspace. Workflow uploads HTML report + screenshots on
   failure for 14 days.

3. Delegation mechanics (tests/e2e/test_staging_full_saas.sh section 10)
   Parent → child proxy test: POST /workspaces/CHILD/a2a with
   X-Source-Workspace-Id=PARENT and verify the child responds + child
   activity log captures PARENT as source. Intentionally LLM-free: the
   mechanics regression is what matters; prompt-driven delegation
   correctness belongs in canvas-driven tests.
   Also reorders teardown step to 11/11 since delegation is 10/11.

Mode gating:
   E2E_MODE=canary -> skips child workspace, HMA memory, peers,
   activity, delegation (steps 6, 9, 10 no-op). Full-lifecycle still
   runs every piece. Validated both paths via 'bash -n' syntax check
   after each edit.

Secrets requirement unchanged (same two secrets as 187a9bf):
  MOLECULE_STAGING_SESSION_COOKIE, MOLECULE_STAGING_ADMIN_TOKEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 04:15:10 -07:00
molecule-ai[bot]
00bd73f8c8
fix(canvas): a11y fixes + budget_used TypeScript guard + orgs-page test fix (#1367)
* fix(canvas/a11y): mark StatusDot as aria-hidden — decorative element

StatusDot is purely decorative; the status is already conveyed via
aria-label on parent elements (WorkspaceNode, SidePanel header, etc.).
Marking it aria-hidden="true" prevents screen readers from announcing
the empty div as "img" with no alt text.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): guard budget_used optional field with ?? 0 in progress calc

TypeScript error in CI: 'budget.budget_used' is possibly 'undefined'
when used in the progress percentage calculation. The field is
optional per BudgetData interface, so ?? 0 is the correct guard.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas/a11y): Tooltip keyboard focus support + ARIA role

- Add role="tooltip" + unique id so assistive tech can find tooltip content
- Add aria-describedby on trigger so screen readers announce tooltip text
- Add onFocus/onBlur handlers so keyboard users (Tab navigation) can see
  tooltips that mouse users see on hover

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas/test): restore advanceTimersByTime pattern in orgs-page error test

waitFor() + fake timers (vi.useFakeTimers in beforeEach) cause race
conditions: the 5s polling timeout fires before React state updates flush.
Restores the established pattern used by all other tests in this file:
advanceTimersByTimeAsync(50) + runAllTimersAsync().
Also removes the now-unused waitFor import.

Ref: PRs #1360, #1345
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 11:08:24 +00:00
molecule-ai[bot]
bde456a893 feat(canvas/e2e): add Playwright test for context-menu → delete confirm flow (#1344)
Issue #1138: Add Playwright E2E for context-menu → delete confirm flow.

The unit test (ContextMenu.keyboard.test.tsx) only exercises the store
setter — it can't catch the portal/race bug from PR #1133 where the
portal-rendered ConfirmDialog was closed by the menu's outside-click
handler before onConfirm fired.

This E2E test covers:
- Right-click workspace node → context menu opens
- Click Delete → ConfirmDialog appears (not swallowed)
- Click Confirm → dialog closes, node disappears, DELETE /workspaces/:id fires
- Click Cancel → dialog closes, node remains

Requires: platform on :8080, canvas on :3000.

Closes #1138.

Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app>
2026-04-21 08:11:48 +00:00
molecule-ai[bot]
f2e4f71fee fix(canvas/test): restore waitFor in orgs-page error test + add getState mock (#1341)
Issue #1268: orgs-page error state test — replace vi.advanceTimersByTimeAsync(50)
with waitFor polling. advanceTimersByTimeAsync fires the timer but does not
guarantee React render flush completes before the assertion runs.

Issue #1269: ContextMenu keyboard test — add getState: () => mockStore to
useCanvasStore mock. PR #1243 changed the delete flow to hoist confirmation
to Canvas-level dialog via setPendingDelete, which reads .nodes via
useCanvasStore.getState() — the mock was missing getState.

Also carries forward the Issue #1124 WORKSPACE_ID fail-fast fix from
workspace/ modules (a2a_cli, a2a_client, coordinator, consolidation,
molecule_ai_status) — RuntimeError if WORKSPACE_ID is unset/empty.

Co-authored-by: Molecule AI Core Platform Lead <core-platform-lead@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 07:52:15 +00:00
molecule-ai[bot]
b21b3d163f fix(canvas): add ?? 0 guard for optional budget_used in progressPct (#1324) (#1327)
* fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled

With cancel-in-progress: false, pending CI runs accumulate in the
ci-staging concurrency group. New pushes create queued runs, but
GitHub dispatches multiple runs for the same SHA instead of replacing
the pending one. All runs get stuck/cancelled before completing.

Reverting to cancel-in-progress: true restores CI operation — runs
that are superseded are cancelled, freeing the concurrency slot for
the new run to proceed.

Runner availability (ubuntu-latest dispatch stall) is a separate
infra issue tracked independently.

* fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043)

Tar header names were built from raw map keys without validation. A malicious
server-side caller could embed "../" in a file name to escape the destPath
volume mount (/configs) and write files outside the intended directory.

Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks
before using it in the tar header, then join with destPath for the archive
header. Also guard parent-directory creation against traversal.

Closes #1043.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix

Two regressions introduced by PR #1243 (fix issue #1207):

1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives
   `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test
   expected only `{id, name}`. Added `hasChildren: false` to the assertion.

2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)`
   without `act()`. With fake timers, `setState` (synchronous) is flushed by
   `advanceTimersByTimeAsync`, but the React state update it triggers is a
   microtask — so the test saw stale render. Wrapping in `act(async () =>
   { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain
   before assertions run.

All 813 vitest tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add 100px proximity threshold to drag-to-nest detection

Fixes #1052 — previously, getIntersectingNodes() returned any node whose
bounding box overlapped the dragged node, regardless of actual pixel
distance. On a sparse canvas this triggered the "Nest Workspace" dialog
even when the dragged node was nowhere near any target.

The fix adds an on-node-drag proximity filter: only nodes within 100px
(center-to-center) of the dragged node are eligible as nest targets.
Distance is computed as squared Euclidean to avoid the sqrt overhead in
the hot drag path.

Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring
and confirming the regression is addressed in Canvas.tsx.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add ?? 0 guard for optional budget_used in progressPct

Fixes #1324 — TypeScript strict mode flags budget.budget_used as
possibly undefined in the progressPct ternary, even though the
outer condition checks budget_limit > 0.

Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0%
when the backend returns a partial shape (provisioning-stuck
workspaces). Also adds a test covering the undefined-budget_used
case with the progress bar aria-valuenow and fill width both at 0%.

Closes #1324.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 07:21:27 +00:00
molecule-ai[bot]
45715aa8a5 fix(canvas/test): patch test regressions from PR #1243 + proximity hitbox fix (#1313)
* fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled

With cancel-in-progress: false, pending CI runs accumulate in the
ci-staging concurrency group. New pushes create queued runs, but
GitHub dispatches multiple runs for the same SHA instead of replacing
the pending one. All runs get stuck/cancelled before completing.

Reverting to cancel-in-progress: true restores CI operation — runs
that are superseded are cancelled, freeing the concurrency slot for
the new run to proceed.

Runner availability (ubuntu-latest dispatch stall) is a separate
infra issue tracked independently.

* fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043)

Tar header names were built from raw map keys without validation. A malicious
server-side caller could embed "../" in a file name to escape the destPath
volume mount (/configs) and write files outside the intended directory.

Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks
before using it in the tar header, then join with destPath for the archive
header. Also guard parent-directory creation against traversal.

Closes #1043.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix

Two regressions introduced by PR #1243 (fix issue #1207):

1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives
   `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test
   expected only `{id, name}`. Added `hasChildren: false` to the assertion.

2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)`
   without `act()`. With fake timers, `setState` (synchronous) is flushed by
   `advanceTimersByTimeAsync`, but the React state update it triggers is a
   microtask — so the test saw stale render. Wrapping in `act(async () =>
   { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain
   before assertions run.

All 813 vitest tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add 100px proximity threshold to drag-to-nest detection

Fixes #1052 — previously, getIntersectingNodes() returned any node whose
bounding box overlapped the dragged node, regardless of actual pixel
distance. On a sparse canvas this triggered the "Nest Workspace" dialog
even when the dragged node was nowhere near any target.

The fix adds an on-node-drag proximity filter: only nodes within 100px
(center-to-center) of the dragged node are eligible as nest targets.
Distance is computed as squared Euclidean to avoid the sqrt overhead in
the hot drag path.

Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring
and confirming the regression is addressed in Canvas.tsx.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 07:06:57 +00:00
molecule-ai[bot]
c0d5e528a4 fix(canvas): cascade-delete UX — require checkbox before Delete All (#1314)
* fix(canvas/test): restore test regressions from PR #1243

PR #1243 introduced two regressions in the canvas vitest suite:

1. ContextMenu.keyboard.test.tsx: the setPendingDelete call now
   passes `{hasChildren, id, name}` (not just `{id, name}`). Updated
   the keyboard-a11y test assertion to match the new store shape.

2. orgs-page.test.tsx: mockFetch.mockResolvedValueOnce() returned a
   plain object that didn't match the two-argument (url, options)
   call signature used by the component's fetch wrapper. Switched to
   mockImplementationOnce returning a rejected Promise — matching
   real fetch's rejection contract — and added runAllTimersAsync after
   advanceTimersByTimeAsync(50) to flush React state updates.

54 test files · 813 tests · all passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): replace bounding-box intersection with distance threshold for nest detection

ReactFlow's getIntersectingNodes uses bounding-box overlap detection, which
fires the drag-over state whenever any part of two nodes' position rectangles
overlap — even when the dragged node is far from the target. This made the
"Nest Workspace" dialog appear from large distances.

Fix: scan all nodes on each drag tick and set dragOverNodeId to the closest
node within NEST_PROXIMITY_THRESHOLD (150 px, center-to-center). This matches
the intuitive behavior: nest only when the node is actually dropped near another.

Constants:
- NEST_PROXIMITY_THRESHOLD = 150px (~60% of a collapsed node's width)
- DEFAULT_NODE_WIDTH = 245px (mid-range of min/max node widths)
- DEFAULT_NODE_HEIGHT = 110px

Also removed the unused getIntersectingNodes import (was causing duplicate
identifier error when both onNodeDrag and the zoom handler called useReactFlow
in the same component scope).

Closes #1052.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): cascade-delete UX — show child count and require checkbox before Delete All

Issue #1137: with ?confirm=true always sent, a single confirmation silently
cascades — a team lead with 20 children gets nuked on one click.

Changes:
- store/canvas.ts: pendingDelete type now includes children: {id, name}[]
- ContextMenu.tsx: passes child list to setPendingDelete on Delete click
- DeleteCascadeConfirmDialog.tsx: new component — shows child names, a
  cascade warning, and requires the operator to tick a checkbox before
  Delete All activates. Disabled by default; only enables after checkbox.
- Canvas.tsx: conditionally renders DeleteCascadeConfirmDialog for
  hasChildren workspaces, or plain ConfirmDialog for leaf workspaces.
  confirmDelete requires cascadeConfirmChecked=true when hasChildren.
- ContextMenu.keyboard.test.tsx: updated setPendingDelete assertion to
  include children:[] (no children in the test fixture).

813 tests pass.

Closes #1137.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 07:06:45 +00:00
molecule-ai[bot]
04c3bc6eb1 fix(canvas): cascade-delete UX — warn before deleting workspace with children (PR #1252)
- Store: pendingDelete now carries `hasChildren: boolean` (computed from
  nodes.some(parentId === nodeId))
- ContextMenu: passes hasChildren into setPendingDelete
- Canvas: dialog title changes to "Delete Workspace and Children" with
  ⚠️ message when hasChildren; confirms with "Delete All"

Refs: #1137

Co-authored-by: Molecule AI Fullstack (floater) <fullstack-floater@agents.moleculesai.app>
2026-04-21 03:25:12 +00:00
molecule-ai[bot]
221d8b2384 fix(canvas): guard undefined lastErrorRate and period dates in metrics (PR #1250)
- DetailsTab: use `(data.lastErrorRate ?? 0)` instead of bare multiply to
  prevent NaN% when the field is absent on pre-provisioning workspaces.
- WorkspaceUsage: make formatPeriod accept optional start/end strings;
  return "—" for undefined so the usage period shows blank rather than
  "Invalid Date" for provisioning/partial workspaces.

Refs: #1139

Co-authored-by: Molecule AI Fullstack (floater) <fullstack-floater@agents.moleculesai.app>
2026-04-21 03:22:17 +00:00
molecule-ai[bot]
883cb7ebc3 fix(canvas): eliminate flaky timer state between orgs-page tests (#1207) (#1243)
Root cause: tests used try/finally { vi.useRealTimers() / vi.useFakeTimers() }
back-and-forth. When any test's finally-block called vi.useFakeTimers(),
subsequent tests inherited fake timer state causing 50ms real setTimeouts
to not fire and mockFetch to accumulate calls across test boundaries.

Fix: consolidate timer management to beforeEach/afterEach hooks.
- beforeEach: vi.useFakeTimers() — all tests start from known fake state
- afterEach: cleanup() + vi.useRealTimers() — restore real timers for next test
- Individual tests: use vi.advanceTimersByTimeAsync(50) instead of real setTimeout

Also removed duplicate afterEach(cleanup()) and unused waitFor import.

Closes #1207.

Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 03:07:28 +00:00
molecule-ai[bot]
014295d57f fix(issue-1207): eliminate orgs-page test flakiness (#1235)
* fix(auth): F1094 — requireCallerOwnsOrg reads org_id not created_by (#1200)

Root cause: requireCallerOwnsOrg (org_plugin_allowlist.go:116) was
reading org_api_tokens.created_by to determine caller's org workspace
ID. But created_by is a provenance label ("session", "admin-token",
"org-token:<prefix>") — never a UUID. The equality check
callerOrg != targetOrgID always failed → every org-token caller
got 403 on /orgs/:id/plugins/allowlist routes.

Fix:
- Migration 036: adds org_id UUID column (nullable) to org_api_tokens
  with index. Existing pre-migration tokens get org_id=NULL → deny
  by default (safer than cross-org access).
- orgtoken.Issue: takes new orgID param; stores in org_id column.
- orgtoken.OrgIDByTokenID: new helper reads org_id for a token ID.
  Returns ("", nil) for NULL/unanchored tokens.
- requireCallerOwnsOrg: now calls OrgIDByTokenID instead of reading
  created_by. Pre-migration tokens with org_id=NULL get callerOrg=""
  → denied (safer).
- orgTokenActor (org_tokens.go): returns (createdBy, orgID) pair.
  Token minted via another org token gets its org_id set at mint time.
  Session/ADMIN_TOKEN callers get orgID="".
- orgtoken.Token struct: adds OrgID field for list display.
- orgtoken.List: selects org_id alongside other columns.
- Updated existing tests for new Issue signature.
- Added 10 regression tests covering: happy path, unanchored denial,
  cross-org denial, session bypass, DB error denial.

🤖 Generated with [Claude Code](https://claude.ai/claude-code)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): replace err.Error() leaks with prod-safe messages (#1206)

- workspace_provision.go: provisionWorkspace, provisionWorkspaceCP —
  replaced 7 err.Error() calls with "provisioning failed" in both
  Broadcast payloads and last_sample_error DB column. Full error
  preserved in server-side log.Printf.

- plugins_install_pipeline.go: resolveAndStage — replaced 5 err.Error()
  calls with generic messages:
    "invalid plugin source"
    "plugin source not supported"
    "invalid plugin name"
    "staged plugin exceeds size limit"
    "plugin manifest integrity check failed"

Risk mitigated: DB errors (pq: connection refused, pq: deadlock),
OS errors, and internal paths no longer leak in HTTP JSON responses
or WebSocket broadcasts.

Added regression tests (workspace_provision_test.go):
  - TestProvisionWorkspace_NoInternalErrorsInBroadcast
  - TestProvisionWorkspaceCP_NoInternalErrorsInBroadcast
  - TestResolveAndStage_NoInternalErrorsInHTTPErr

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(F1089): log panic-recovery UPDATE errors in scheduler

The panic defer blocks in tick() and fireSchedule() now capture
and log errors from the db.DB.ExecContext call that advances next_run_at
after a panic. Previously, a DB failure during panic recovery was
silent — the log line for the panic itself appeared but any subsequent
UPDATE failure was invisible, risking unnoticed scheduler drift.

context.Background() was already used (F1089 comment in place); this
commit adds the missing error capture + log.Printf on exec failure.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(issue-1207): eliminate orgs-page test flakiness

Three root causes addressed:

1. Duplicate afterEach blocks (lines 97-103) — two identical
   afterEach(() => { cleanup(); }) blocks collapsed to one.

2. Fake-timer isolation gap — if a polling test failed before its
   finally-block ran, vi.useFakeTimers() persisted globally. The next
   non-polling test's setTimeout(50) then hung indefinitely (fake
   timers don't advance without vi.advanceTimersByTime), causing
   waitFor/async timeouts. Fixed by calling vi.useRealTimers()
   unconditionally in beforeEach (guaranteed clean slate) and
   afterEach (even when a test fails before its own finally).

3. mockFetch.callHistory now cleared via mockReset() in beforeEach,
   preventing "expected 2 calls but got N" failures from carry-over
   between polling tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Molecule AI Dev Lead <dev-lead@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 02:47:25 +00:00
molecule-ai[bot]
4e39201d76 fix(canvas): show toast when clipboard API unavailable in ConsoleModal (#1199) (#1231)
Use explicit navigator.clipboard check instead of optional chaining so
the no-op case is handled explicitly. When clipboard API is unavailable
(non-HTTPS context) show a toast: "Copy requires HTTPS — please select
and copy manually". Production is always HTTPS so this only affects
local dev with http:// canvas.

Closes #1199.

Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 02:45:29 +00:00
molecule-ai[bot]
fcd3a6eaf0 fix(test): align ssrf_test.go localhost test cases with isSafeURL behaviour (#1192)
* feat(canvas): rewrite MemoryInspectorPanel to match backend API

Issue #909 (chunk 3 of #576).

The existing MemoryInspectorPanel used the wrong API endpoint
(/memory instead of /memories) and wrong field names (key/value/version
instead of id/content/scope/namespace/created_at). It also lacked
LOCAL/TEAM/GLOBAL scope tabs and a namespace filter.

Changes:
- Fix endpoint: GET /workspaces/:id/memories with ?scope= query param
- Fix MemoryEntry type to match actual API: id, content, scope,
  namespace, created_at, similarity_score
- Add LOCAL/TEAM/GLOBAL scope tabs
- Add namespace filter input
- Remove Edit functionality (no update endpoint in backend)
- Delete uses DELETE /workspaces/:id/memories/:id (by id, not key)
- Full rewrite of 27 tests to match new API and UI structure
- Uses ConfirmDialog (not native dialogs) for delete confirmation
- All dark zinc theme (no light colors)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: tighten types + improve provision-timeout message (#1135, #1136)

#1135 — TypeScript: make BudgetData.budget_used and WorkspaceMetrics
fields optional to match actual partial-response shapes from provisioning-
stuck workspaces. Runtime already guarded with ?? 0.

#1136 — provisiontimeout.go: replace misleading "check required env vars"
hint (preflight catches that case upfront) with accurate message about
container starting but failing to call /registry/register.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

* fix(test): align ssrf_test.go localhost test cases with isSafeURL behaviour

isSafeURL blocks 127.0.0.1 via ip.IsLoopback() even in dev environments.
The test cases `wantErr: false` for localhost were incorrect — the
test would fail when go test runs. Fix by changing wantErr to true
for both localhost test cases.

Rationale: loopback blocking at this layer is intentional. Access
control is enforced by WorkspaceAuth + CanCommunicate at the A2A
routing layer, not by the URL validation. Opening this would widen
the SSRF attack surface without adding real dev flexibility.

Closes: ssrf_test.go inconsistency reported 2026-04-21

Co-Authored-By: Claude Sonnet 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 02:08:45 +00:00
Hongming Wang
c033f3b051 chore(canvas): review nits on ConsoleModal + Peers empty state
Post-review cleanup for the #1178 / #1189 bootstrap-watcher flow:

- ConsoleModal status-code matching uses \b regex anchors instead of
  raw substrings. Before, any error message containing "501" inside
  a longer digit run ("15012") would false-match into the self-hosted
  branch. Unlikely in practice but cheap to tighten.

- Peers empty-state copy now explains WHY the list is empty on
  offline / failed / provisioning workspaces instead of rendering the
  same "No reachable peers" text used for healthy workspaces with
  zero siblings. Online workspaces unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:02:22 -07:00
Hongming Wang
3b339d13e6 fix(canvas): skip /registry/:id/peers fetch when workspace not online
The peers endpoint requires a workspace-scoped bearer token (see
validateDiscoveryCaller in handlers/discovery.go — designed for
agent-to-agent calls). The canvas session doesn't hold that token, so
every Details-tab open for a provisioning / failed / offline workspace
fired a 401 that cluttered devtools and lit up the error banner even
though the real UX here is "no peers — the workspace hasn't booted."

Gate the fetch on status ∈ {online, degraded} and render an empty
Peers list for everything else.

Follow-up: give the canvas a way to see peers for any workspace (admin
session should be enough). Tracked separately — this fix just quiets
the noise on the common case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 17:38:25 -07:00
Hongming Wang
48900e34b0 feat(canvas): show last_sample_error + EC2 console output on failed workspaces
Part 3 of 3 for the "fail fast + comprehensive logs" UX. Platform PR
#1168 and controlplane #181 ship the server-side; this PR surfaces the
data in the canvas.

Two changes:

1. DetailsTab renders `last_sample_error` in a dedicated Error section
   when the workspace is failed (or degraded with an error). Before,
   the only trace of why a workspace failed was a generic banner —
   users had to click "View Logs", which opened the terminal tab (the
   post-boot log, empty on a runtime crash). Now the actual Python
   traceback is inline. A "View console output" button in the same
   section opens the full serial console in a modal.

2. New ConsoleModal component. Fetches GET /workspaces/:id/console
   (platform → CP → ec2:GetConsoleOutput). Portal-rendered above the
   canvas with Copy / Close / Esc handlers. Renders a friendly message
   on 501 (self-hosted deploys without CP) and 404 (instance
   terminated).

3. ProvisioningTimeout's "View Logs" button now opens the console
   modal instead of the (usually empty) terminal tab — when a
   workspace is stuck in provisioning, the cloud-init + user-data
   trace is what the user actually needs.

Tests cover the closed-state no-fetch, happy-path fetch, 501/404
messaging, and Close/Escape wiring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 17:22:15 -07:00
molecule-ai[bot]
b1433ee8e6 Merge pull request #1171 from Molecule-AI/staging
chore: fast-forward staging with main review-cleanup commits
2026-04-21 00:16:58 +00:00
molecule-ai[bot]
45cf87c7b8 test(BatchActionBar): add hasFailedBatch success-reset test (#1170)
CP-QA approved. 34-line test for BatchActionBar retry state reset after successful batch action.
2026-04-21 00:12:37 +00:00
molecule-ai[bot]
45f5b47487 fix(security): add USER directive before ENTRYPOINT in all tenant images (#1155)
Closes: #177 (CRITICAL — Dockerfile runs as root)

Dockerfiles changed:
- workspace-server/Dockerfile (platform-only): addgroup/adduser + USER platform
- workspace-server/Dockerfile.tenant (combined Go+Canvas): addgroup/adduser + USER canvas
  + chown canvas:canvas on canvas dir so non-root node process can read it
- canvas/Dockerfile (canvas standalone): addgroup/adduser + USER canvas
- workspace-server/entrypoint-tenant.sh: update header comment (no longer starts
  as root; both processes now start non-root)

The entrypoint no longer needs a root→non-root handoff since both the Go
platform and Canvas node run as non-root by default. The 'canvas' user owns
/app and /platform, so volume mounts owned by the host's canvas user work
without needing a root init step.

Co-authored-by: Molecule AI CP-BE <cp-be@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 23:51:33 +00:00
696bd86322 Merge branch 'fix/canvas-orgs-page-tests' into staging 2026-04-20 23:47:02 +00:00
223584c66d Merge remote-tracking branch 'origin/staging' into fix/canvas-orgs-page-tests 2026-04-20 23:46:17 +00:00
acf03cd057 fix(security): suppress raw response body from user-facing billing errors
billing.ts (startCheckout, openBillingPortal): replace raw res.text()
in thrown Error with a safe status-only message. The response body from
/cp/billing/* routes can contain Stripe API error detail (invalid key,
card decline message, raw Stripe envelope) that should not reach clients.

orgs/page.tsx (createOrg): same fix — raw body → safe message.

Full body is logged server-side for debugging.

Closes: #91 (CWE-209 — Stripe key echoed in error)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 23:23:25 +00:00
a339cde8d5 fix(canvas): restore 12 orgs-page tests broken by TermsGate fetch + mock leak
Root causes:
1. TermsGate (rendered inside OrgsPage Shell) fetches /cp/auth/terms-status
   before OrgsPage fetches /cp/orgs, consuming the first mockResponseOnce
   slot — leaving /cp/orgs with no mock and throwing TypeError.
   Fix: mock TermsGate as a pass-through component in vi.mock.

2. Non-polling tests used mockFetchSession.mockResolvedValueOnce() which
   exhausted after one call; React 18 concurrent re-renders call
   fetchSession() multiple times, causing subsequent calls to return
   undefined. Fix: use mockResolvedValue() (persistent) for fetchSession.

3. vi.clearAllMocks() in beforeEach kept mockResolvedValueOnce from
   previous tests from leaking BUT the vi.fn() mock implementation was
   already reset by mockFetchSession.mockReset() in beforeEach. Tests
   were passing stale persistent mocks from previous tests. Fix:
   mockFetchSession.mockReset() in beforeEach + mockResolvedValue in
   each test.

4. Polling tests used vi.useFakeTimers() without shouldAdvanceTime,
   which prevented React's useEffect from calling fetch() (0 calls).
   Fix: use vi.useFakeTimers({ shouldAdvanceTime: true }) + await
   vi.advanceTimersByTimeAsync() to advance time during await.

5. Unmount test unmounted before effects fired (with shouldAdvanceTime).
   Fix: flush microtasks with await vi.advanceTimersByTimeAsync(0)
   before unmount so the effect runs and schedules the poll timer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 23:12:34 +00:00
Hongming Wang
fc3ae5a63a chore: code-review cleanup on today's shipped PRs
Three nits identified during post-merge review of #1119, #1133:

1. ContextMenu.tsx imported `removeNode` from the canvas store but
   stopped using it when the delete-confirm flow moved to Canvas in
   #1133. Also removed the now-unused mock entry in the keyboard
   test so the test inventory matches the real call list.

2. Preflight's YAML parse failure was a silent pass — defensible since
   the in-container preflight owns the schema, but invisible to ops if
   a template ships malformed YAML. Log at WARN so the signal surfaces
   without blocking the provision.

3. formatMissingEnvError rendered its slice via %q, producing
   `["A" "B"]` which is Go-literal-looking and ugly in a user-facing
   error. Join with ", " instead. Test updated to assert the new
   format.

No behavioural changes beyond the log line; fixes are review nits, not
bug fixes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 16:04:57 -07:00
Hongming Wang
0ddf266e54 fix(canvas): delete workspace dialog race with context menu close
Clicking "Delete" in the workspace context menu did nothing for stuck
workspaces. The confirm dialog was rendered via portal as a child of
ContextMenu. ContextMenu's outside-click handler checks whether the
click target is inside its ref — but the portal puts the dialog in
document.body, outside the ref. So clicking the dialog's Confirm
counted as "outside", closed the menu, unmounted the dialog mid-click,
and the onConfirm handler never ran.

Hoist the pending-delete state to the canvas store and render the
confirm dialog at the Canvas level (same pattern as the existing
pendingNest dialog). The dialog now outlives ContextMenu, so the
outside-click close is harmless. Close the context menu on the Delete
click itself rather than waiting for the dialog to resolve.

Add a regression test covering the new flow and add the standard
?confirm=true query param so the backend's child-cascade guard is
consulted correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 15:50:30 -07:00
Hongming Wang
c3f7447e86 fix: harden stuck-provisioning UX — details crash, preflight, sweeper
Workspaces stuck in status='provisioning' previously surfaced in three
bad ways:

1. **Details tab crashed** with `Cannot read properties of undefined
   (reading 'toLocaleString')`. `BudgetSection` + `WorkspaceUsage`
   assumed full response shapes but a provisioning-stuck workspace
   returns partial `{}`. Guard each deep field with `?? 0` and cover
   the partial-response case with regression tests.

2. **Missing required env vars failed silently** 15+ minutes later as
   a cosmetic "Provisioning Timeout" banner. The in-container preflight
   catches them but by then the container has already crashed without
   calling /registry/register, so the workspace sat in 'provisioning'
   forever. Mirror the preflight server-side: parse config.yaml's
   `runtime_config.required_env` before launch, fail fast with a
   WORKSPACE_PROVISION_FAILED event naming the missing vars.

3. **No backend timeout** ever flipped a stuck workspace to 'failed'.
   Add a registry sweeper (10m default, env-overridable) that detects
   workspaces stuck past the window, flips them to 'failed', and emits
   WORKSPACE_PROVISION_TIMEOUT. Race-safe: the UPDATE re-checks the
   status + age predicate so a concurrent register/restart wins.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:51:39 -07:00
Hongming Wang
91187342b4 feat(auth): organization-scoped API keys for admin access
Adds user-facing API keys with full-org admin scope. Replaces the
single ADMIN_TOKEN env var with named, revocable, audited tokens
that users can mint/rotate from the canvas UI without ops
intervention.

Designed for the beta growth phase — one token tier (full admin).
Future work will split into scoped roles (admin / workspace-write
/ read-only) and per-workspace bindings. See docs/architecture/
org-api-keys.md for the design + follow-up roadmap.

## Surface

  POST   /org/tokens        mint (plaintext returned once)
  GET    /org/tokens        list live keys (prefix-only)
  DELETE /org/tokens/:id    revoke (idempotent)

All AdminAuth-gated. Bootstrap path: mint the first token via
ADMIN_TOKEN or canvas session; tokens can mint more tokens after.

## Validation as a new AdminAuth tier (2a)

AdminAuth evaluation order:
  Tier 0  lazy-bootstrap fail-open (only when no live tokens AND
          no ADMIN_TOKEN env)
  Tier 1  verified WorkOS session via /cp/auth/tenant-member
  Tier 2a org_api_tokens SELECT — NEW
  Tier 2b ADMIN_TOKEN env (bootstrap / CLI break-glass)
  Tier 3  any live workspace token (deprecated, only when ADMIN_TOKEN
          unset)

Tier 2a runs ONE indexed lookup (partial index on
token_hash WHERE revoked_at IS NULL) + an async last_used_at
bump. No measurable latency cost on the hot path.

## UI

New "Org API Keys" tab in the settings panel. Label field for
human-readable naming. Plaintext shown once + clipboard copy.
Revoke with confirm dialog. Mirrors the existing workspace-
TokensTab flow so users who've used one get the other for free.

## Security properties

  - Plaintext never stored. sha256 hash + 8-char display prefix.
  - Revocation is immediate: partial index on revoked_at IS NULL
    means the next request validates or fails in microseconds.
  - created_by audit field captures provenance: "org-token:<short>"
    when a token mints another, "session" for browser-UI mints,
    "admin-token" for the ADMIN_TOKEN bootstrap path.
  - Validate() collapses all failure shapes into ErrInvalidToken
    so response-shape can't distinguish "never existed" from
    "revoked".

## Tests

  - internal/orgtoken: 9 unit tests (hash storage, empty field
    null-ing, validation happy path, empty plaintext, unknown hash,
    revoked filtering, list ordering, revoke idempotency, has-any-
    live short-circuit).
  - AdminAuth tier-2a integration covered by existing middleware
    tests unchanged (fail-open + bearer paths).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:01:41 -07:00
Hongming Wang
52235aeb27 feat(router): /cp/* reverse-proxy to CP + same-origin canvas fetches
Canvas's browser bundle issues fetches to both CP endpoints
(/cp/auth/me, /cp/orgs, ...) AND tenant-platform endpoints
(/canvas/viewport, /approvals/pending, /org/templates). They
share ONE build-time base URL. Baking api.moleculesai.app
broke tenant calls with 404; baking the tenant subdomain broke
auth. Tried both today and saw exactly one failure mode per
attempt.

Real fix: same-origin fetches + tenant-side split. Adds:

  internal/router/cp_proxy.go      # /cp/* → CP_UPSTREAM_URL

mounted before NoRoute(canvasProxy). Now a tenant serves:

  /cp/*              → reverse-proxy to api.moleculesai.app
  /canvas/viewport,
  /approvals/pending,
  /workspaces/:id/*,
  /ws, /registry,    → tenant platform (existing handlers)
  /metrics
  everything else    → canvas UI (existing reverse-proxy)

Canvas middleware reverts to `connect-src 'self' wss:` for the
same-origin path (keeping explicit PLATFORM_URL whitelist as a
self-hosted escape hatch when the build-arg is non-empty).

CI build-arg flips to NEXT_PUBLIC_PLATFORM_URL="" so the bundle
issues relative fetches.

Security of cp_proxy:
  - Cookie + Authorization PRESERVED across the hop (opposite of
    canvas proxy) — they carry the WorkOS session, which is the
    whole point.
  - Host rewritten to upstream so CORS + cookie-domain on the CP
    side see their own hostname.
  - Upstream URL validated at construction: must parse, must be
    http(s), must have a host — misconfig fails closed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 13:01:40 -07:00
Hongming Wang
d069231d0b fix(canvas): include NEXT_PUBLIC_PLATFORM_URL in CSP connect-src
Tenant page loads were blocked by:

  Refused to connect to 'https://api.moleculesai.app/cp/auth/me'
  because it violates the document's Content Security Policy.

CSP had `connect-src 'self' wss:` — fine for same-origin + any wss,
but browser refuses cross-origin HTTPS fetches that aren't listed.
PLATFORM_URL (baked from NEXT_PUBLIC_PLATFORM_URL, which is the CP
origin on SaaS tenants) needs to be explicit.

Fix: middleware reads NEXT_PUBLIC_PLATFORM_URL at build/runtime
and adds both the https and wss siblings to connect-src. Self-
hosted deploys that override the build-arg automatically get a
matching CSP — no hardcoded hostname.

Test added: buildCsp includes NEXT_PUBLIC_PLATFORM_URL origin in
connect-src when set. Also loosens the dev `ws:` assertion since
dev uses `connect-src *` which subsumes ws (pre-existing behavior,
test was stale).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 12:55:03 -07:00
Hongming Wang
9abe2e5739 Merge pull request #1089 from Molecule-AI/fix/canvas-csp-nonce-propagation
fix(canvas): root layout dynamic so CSP nonce reaches Next scripts
2026-04-20 12:34:08 -07:00
Hongming Wang
1af6f696a2 fix(canvas): make root layout dynamic so CSP nonce reaches Next scripts
Tenant page loads were failing with repeated CSP violations:

  Executing inline script violates ... script-src 'self'
  'nonce-M2M4YTVh...' 'strict-dynamic'. ...

because Next.js's bootstrap inline scripts were emitted without a
nonce attribute. The middleware was generating per-request nonces
correctly and sending them via `x-nonce` — but the layout was
fully static, so Next.js cached the HTML once and served that cached
bundle (no nonces baked in) for every request.

Fix: call `await headers()` in the root layout. That opts the tree
into dynamic rendering AND signals Next.js to propagate the
x-nonce value to its own generated <script> tags.

The `nonce` return value is intentionally unused — the framework
handles its bootstrap scripts automatically once the read happens.
Future code that adds third-party <Script> components (analytics,
etc.) should pass the returned nonce explicitly.

Verified against live tenant: before this change every /_next/
chunk script tag in the HTML had no nonce attribute; expected after
deploy is `<script nonce="..." src="/_next/...">` on each.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 12:34:03 -07:00
rabbitblood
35df23850e fix(canvas): CSP_DEV_MODE + admin token for local Docker (#1052 follow-up)
Three changes that keep getting lost on nuke+rebuild:
1. middleware.ts: read CSP_DEV_MODE env to relax CSP in local Docker
2. api.ts: send NEXT_PUBLIC_ADMIN_TOKEN header (AdminAuth on /workspaces)
3. Dockerfile: accept NEXT_PUBLIC_ADMIN_TOKEN as build arg

All three are required for the canvas to work in local Docker where
canvas (port 3000) fetches from platform (port 8080) cross-origin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 12:23:43 -07:00
Hongming Wang
7a2a17591c chore(canvas): remove dead /waitlist page (lives in molecule-app)
#1080 added /waitlist to canvas, but canvas isn't served at
app.moleculesai.app — it backs the tenant subdomains (acme.moleculesai.app
etc.). The real /waitlist lives in the separate molecule-app repo,
which is what the CP auth callback redirects to.

molecule-app#12 has the real page + contact form wiring to
/cp/waitlist/request. This canvas copy was never reachable and would
only diverge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 09:55:35 -07:00
Hongming Wang
b7149c5dda feat(canvas): /waitlist page with contact form
Adds the user-facing half of the beta-gate: a page at /waitlist that
the CP auth callback redirects users to when their email isn't on
the allowlist. Collects email + optional name + use-case and POSTs
to /cp/waitlist/request (backend landed in controlplane #150).

## Behavior

- No auto-pre-fill of email from URL query (CP's #145 dropped the
  ?email= param for the privacy reason; this test guards against a
  future regression on the client side).
- Client-side validates email shape for instant feedback; backend
  re-validates.
- Three UI states after submit:
    success → "your request is in" banner, form hidden
    dedup   → softer "already on file" banner when backend returns
              dedup=true (same 200, no 409 to avoid enumeration)
    error   → inline banner with backend message or network fallback

## Tests

9 tests in __tests__/waitlist-page.test.tsx covering:
- default render + a11y (role=button, role=status, role=alert)
- URL-pre-fill privacy regression guard
- HTML5 + JS validation (empty, malformed)
- successful POST with trimmed body
- dedup branch
- non-2xx with + without error field
- network rejection

Follow-up to the beta-gate rollout on controlplane #145 / #150.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 09:47:06 -07:00
molecule-ai[bot]
a3e70b739c Merge pull request #949 from Molecule-AI/feat/canvas-batch-operations
feat(canvas): batch operations — multi-select + restart/pause/delete
2026-04-20 08:48:26 -07:00
molecule-ai[bot]
fc0e02d692 Merge pull request #1011 from Molecule-AI/test/qa-coverage-orgs-page-and-api-timeout
test(canvas): QA coverage — orgs page polling + API timeout
2026-04-20 08:48:00 -07:00
molecule-ai[bot]
67c12b119e Merge pull request #1016 from Molecule-AI/fix/a11y-workspace-node
fix(a11y): WorkspaceNode font floor, contrast, focus rings
2026-04-20 08:47:53 -07:00
875d4ab0a5 fix(gate-1): remove unused fireEvent import (#1011)
Mechanical lint fix. github-code-quality[bot] flagged unused
import on line 18 — fireEvent is imported but never referenced in
the test file. Removing it clears the code quality gate without
changing any test behaviour.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-20 02:52:57 +00:00
Molecule AI Frontend Engineer
da2a2b795f fix(a11y): WorkspaceNode font floor, contrast, focus rings (Cycle 10)
C1: skills badge spans text-[7px]→text-[10px]; "+N more" overflow
    text-[7px] text-zinc-500→text-[10px] text-zinc-400
C2: Team section label text-[7px] text-zinc-600→text-[10px] text-zinc-400
H4: status label text-[9px]→text-[10px]; active-tasks count
    text-[9px] text-amber-300/80→text-[10px] text-amber-300 (remove opacity
    modifier per design-system contrast rule); current-task text
    text-[9px] text-amber-300/70→text-[10px] text-amber-300
L1: add focus-visible:ring-2 focus-visible:ring-blue-500/70 to the Restart
    button (independently Tab-focusable inside role="button" wrapper) and to
    the Extract-from-team button in TeamMemberChip; TeamMemberChip
    role="button" div already has the focus ring (COVERED, no change)

762/762 tests pass · build clean

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 22:01:58 +00:00
qa-agent
4db8d30ece test(canvas): cover /orgs 5s polling on in-flight orgs
The test docstring promised polling coverage but I'd only wired the
describe-block header, not the actual tests. Closing that gap — vitest
fake timers drive three cases:

- `provisioning` org → 2nd fetch fires after 5.1s advance
- all `running` → no 2nd fetch even after 10s advance
- `awaiting_payment` org, unmount before timer fires → no post-unmount
  fetch (cleanup correctly clears the pollTimer)

The unmount case is the meaningful one: without it a fast nav-away
leaves the 5s interval chasing the CP forever. page.tsx L97-99 does
clear the timer; the test pins the contract.

Local baseline on origin/staging tip ede6597 + this branch:
  canvas vitest: 50 files / 781 tests, all green (+3 vs prior commit)
  canvas build:  clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-19 19:18:30 +00:00
qa-agent
e70cd1c2aa test(canvas): pin AbortSignal timeout regression + cover /orgs landing page
Two independent test additions that harden the surface freshly landed on
staging via PRs #982 (canvas fetch timeout), #992 (/orgs landing), #994
(post-checkout redirect to /orgs).

canvas/src/lib/__tests__/api.test.ts (+74 lines, 7 new tests)
  - GET/POST/PATCH/PUT/DELETE each pass an AbortSignal to fetch
  - TimeoutError (DOMException name=TimeoutError) propagates to the caller
  - Each request installs its own signal — no shared module-level controller
    that would allow one slow request to cancel an unrelated fast one
  This is the hardening nit I flagged in my APPROVE-w/-nit review of
  fix/canvas-api-fetch-timeout. Landing as a follow-up now that #982 is in
  staging.

canvas/src/app/__tests__/orgs-page.test.tsx (+251 lines, new file, 10 tests)
  - Auth guard: signed-out → redirectToLogin and no /cp/orgs fetch
  - Error state: failed /cp/orgs → Error message + Retry button
  - Empty list: CreateOrgForm renders
  - CTA by status:
      running          → "Open" link targets {slug}.moleculesai.app
      awaiting_payment → "Complete payment" → /pricing?org=<slug>
      failed           → "Contact support" mailto
  - Post-checkout: ?checkout=success renders CheckoutBanner AND
    history.replaceState scrubs the query param
  - Fetch contract: /cp/orgs called with credentials:include + AbortSignal

Local baseline on origin/staging tip ede6597:
  canvas vitest: 50 files / 778 tests, all green
  canvas build:  clean, /orgs route present (2.83 kB / 105 kB first-load)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-19 19:14:54 +00:00
Hongming Wang
dd5654b803 feat(canvas): ToS gate modal + us-east-2 data residency notice
Wraps /orgs in a TermsGate that polls /cp/auth/terms-status on mount
and overlays a blocking modal when the current terms version hasn't
been accepted yet. "I agree" POSTs /cp/auth/accept-terms and dismisses
the modal; the backend records IP + UA as GDPR Art. 7 proof-of-consent.

Also adds a short data residency notice under the page header:
workspaces run in AWS us-east-2 (Ohio, US). An EU region selector is
a future lift once the infra is provisioned there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 07:44:47 -07:00
Hongming Wang
18894bebe8 feat(canvas): Phase 5 — credit balance pill + low-balance banner
Adds the UI surface for the credit system to /orgs:
- CreditsPill next to each org row. Tone shifts from zinc → amber at
  10% of plan to red at zero.
- LowCreditsBanner appears under the pill for running orgs when the
  balance crosses thresholds: overage_used > 0 → "overage active",
  balance <= 0 → "out of credits, upgrade", trial tail → "trial almost
  out".
- Pure helpers extracted to lib/credits.ts so formatCredits, pillTone,
  and bannerKind are unit-tested without jsdom.

Backend List query now returns credits_balance / plan_monthly_credits
/ overage_used_credits / overage_cap_credits so no second round-trip
is needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 07:27:29 -07:00
Hongming Wang
26b89400ed test(canvas): bump billing test for /orgs success_url 2026-04-19 04:26:01 -07:00
Hongming Wang
d77378294b feat(canvas): post-checkout UX — Stripe success lands on /orgs with banner
Two small polish items that together close the signup-to-running-tenant
flow for real users:

1. Stripe success_url now points at /orgs?checkout=success instead of
   the current page (was pricing). The old behavior left people staring
   at plan cards with no indication payment went through — the new
   behavior drops them right onto their org list where they can watch
   the status flip.

2. /orgs shows a green "Payment confirmed, workspace spinning up"
   banner when it sees ?checkout=success, then clears the query
   param via replaceState so a reload doesn't show it again.

3. /orgs now polls every 5s while any org is awaiting_payment or
   provisioning. Users see the Stripe webhook's effect live — no
   manual refresh needed — and once every org settles the polling
   stops so idle tabs don't hammer /cp/orgs.

Paired with PR #992 (the /orgs page itself) this makes the end-to-end
flow on BILLING_REQUIRED=true deployments feel right:
  /pricing → Stripe → /orgs?checkout=success → banner → live poll →
  "Open" button when org.status transitions to running.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 04:18:32 -07:00
Hongming Wang
b29ffb9546 feat(canvas): /orgs landing page for post-signup users
CP's Callback handler redirects every new WorkOS session to
APP_URL/orgs, but canvas had no such route — new users hit the canvas
Home component, which tries to call /workspaces on a tenant that
doesn't exist yet, and saw a confusing error. This PR plugs that gap
with a dedicated landing page that:

- Bounces anonymous visitors back to /cp/auth/login
- Zero-org users see a slug-picker (POST /cp/orgs, refresh)
- For each existing org, shows status + CTA:
  * awaiting_payment → amber "Complete payment" → /pricing?org=…
  * running          → emerald "Open" → https://<slug>.moleculesai.app
  * failed           → "Contact support" → mailto
  * provisioning     → read-only "provisioning…"
- Surfaces errors inline with a Retry button

Deliberately server-light: one GET /cp/orgs, no WebSocket, no canvas
store hydration. Goal is to move the user from signup to either
Stripe Checkout or their tenant URL with one click each.

Closes the last UX gap between the BILLING_REQUIRED gate landing on
the CP and real users being able to complete a signup today.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 04:13:54 -07:00
Hongming Wang
956c1fb9b9 fix(canvas): add 15s fetch timeout on API calls
Pre-launch audit flagged api.ts as missing a timeout on every fetch.
A slow or hung CP response would leave the UI spinning indefinitely
with no way for the user to abort — effectively a client-side DoS.

15s is long enough for real CP queries (slowest observed is Stripe
portal redirect at ~3s) and short enough that a stalled backend
surfaces as a clear error with a retry affordance.

Uses AbortSignal.timeout (widely supported since 2023) so the
abort propagates through React Query / SWR consumers cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 02:12:47 -07:00
Hongming Wang
68f55c3ebc Merge pull request #974 from Molecule-AI/fix/canvas-a11y-degraded-badge
fix(canvas): degraded badge WCAG AA contrast (closes #885 p1)
2026-04-19 00:28:39 -07:00
Hongming Wang
50364817ac Merge pull request #967 from Molecule-AI/chore/shadcn-init
chore(canvas): initialize shadcn/ui CLI
2026-04-19 00:27:07 -07:00
Hongming Wang
89d96e8581 fix(canvas): degraded badge WCAG AA contrast — amber-400 → amber-300 (closes #885)
amber-400 on zinc-900 is 5.4:1 (AA pass). amber-300 is 6.9:1 (AA+AAA pass)
and matches the rest of the amber usage in WorkspaceNode (currentTask,
error detail, badge chip).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:05:38 -07:00
Hongming Wang
16cb728461 chore(canvas): initialize shadcn/ui — components.json + cn utility
Sets up shadcn/ui CLI so new components can be added with
`npx shadcn add <component>`. Uses new-york style, zinc base color,
no CSS variables (matches existing Tailwind-only approach).

Adds clsx + tailwind-merge for the cn() utility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 07:57:17 -07:00
Hongming Wang
f6c1eb7482 chore(canvas): enable Turbopack for dev server — faster HMR
next dev --turbopack for significantly faster dev server startup
and hot module replacement. Build script unchanged (Turbopack for
next build is still experimental).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 07:39:03 -07:00
Hongming Wang
e416496d5f test: add BatchActionBar unit tests (7 tests)
Covers: render threshold, count badge, action buttons, clear selection,
ConfirmDialog trigger, ARIA toolbar role.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 02:21:31 -07:00
Hongming Wang
434e5747b2 fix: ChatTab comment path for workspace-server rename
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 01:48:59 -07:00
Hongming Wang
e3d22cf2ff test: update mock stores for batch selection in existing canvas tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 01:22:25 -07:00
Hongming Wang
6de705b2a1 feat(canvas): batch operations — multi-select + restart/pause/delete (Phase 20.3)
- Shift+click to toggle node selection (multi-select mode)
- BatchActionBar floating at bottom when >1 node selected
- Batch Restart All, Pause All, Delete All with ConfirmDialog
- Selected nodes get blue ring highlight
- Escape clears selection
- Pane click clears selection
- Dark theme, accessible (ARIA labels, focus rings)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 01:16:55 -07:00
Hongming Wang
a6caf0b138 fix(canvas): add a11y attributes to TeamMemberChip — role, aria-label, keyboard nav
Adds role="button", tabIndex, aria-label="Select <name>", and keyboard
handlers (Enter/Space) to TeamMemberChip. Fixes 5 failing a11y tests
from issue #831. Updates eject button test to match existing label format.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 21:53:39 -07:00
e14ed86e05 Merge branch 'main' of https://github.com/Molecule-AI/molecule-core into fix/issue-854-eject-tooltip
# Conflicts:
#	canvas/src/components/WorkspaceNode.tsx
2026-04-18 01:43:00 +00:00
2d530c80cf fix(gate-6): merge main into fix/a11y-audit-902-905 — resolve 7 conflicts
Conflicts arose because PR #892 base commits (MemoryInspectorPanel creation,
A2A overlay) had already landed on main via a different merge path, and
last-tick merges (#876, #888) had modified Toolbar, SidePanel, and test
fixtures.

Resolution strategy:
- Toolbar.tsx, SidePanel.tsx, Canvas.a11y.test.tsx, Canvas.pan-to-node.test.tsx,
  MemoryInspectorPanel.test.tsx: take main (strictly newer, already contains
  the branch's A2A overlay content plus subsequent a11y/UX fixes)
- MemoryInspectorPanel.tsx: take main (543 lines with semantic search) + apply
  sanitizeId() helper from #904 + update bodyId prefix to mem-body-
- DetailsTab.tsx: take main (has #875 Field/useId + #878 deleteButtonRef/focus)
  + apply alertdialog structure from #905 while preserving focus management

Mechanical conflict resolution by triage-agent; no logic changes beyond the
four a11y fixes already in the branch (#902-#905).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 01:34:00 +00:00
f5bab93630 Merge branch 'main' of https://github.com/Molecule-AI/molecule-core into fix/canvas-a11y-configtab-detailstab-htmlfor
# Conflicts:
#	canvas/src/components/tabs/DetailsTab.tsx
2026-04-18 01:24:15 +00:00
5812b43de5 fix(gate-6): reconcile DetailsTab.tsx import — merge useRef (#878) with useId/cloneElement (#875)
PR #878 landed before this branch and added useRef + deleteButtonRef focus-
management to DetailsTab.tsx. This commit combines that import with the
useId/cloneElement import added here, and preserves the Field component
htmlFor/id wiring from this PR unchanged.

Mechanical conflict resolution by triage-agent; no logic changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 01:22:08 +00:00
molecule-ai[bot]
ded7dc777c Merge pull request #888 from Molecule-AI/fix/canvas-a11y-sidepanel-resize-keyboard
fix(canvas): a11y — SidePanel keyboard resize, MemoryEntryRow aria-controls, contrast + ChatTab error banner
2026-04-18 01:20:02 +00:00
molecule-ai[bot]
913bcd5c18 Merge pull request #887 from Molecule-AI/fix/canvas-a11y-conversation-trace-modal
fix(canvas): a11y — migrate ConversationTraceModal to Radix Dialog with aria-label
2026-04-18 01:19:57 +00:00
molecule-ai[bot]
86b2016a69 Merge pull request #878 from Molecule-AI/fix/canvas-a11y-detailstab-delete-confirm
fix(canvas): a11y improvements to DetailsTab delete confirmation dialog
2026-04-18 01:19:52 +00:00
molecule-ai[bot]
7fd217f93f Merge pull request #877 from Molecule-AI/fix/canvas-a11y-emptystate-role-alert
fix(canvas): add role=alert to empty-state error messages
2026-04-18 01:19:48 +00:00
molecule-ai[bot]
cbd92937d5 Merge pull request #876 from Molecule-AI/fix/canvas-a11y-toolbar-aria-label
fix(canvas): add aria-label to Toolbar icon buttons
2026-04-18 01:19:44 +00:00
molecule-ai[bot]
871e6bbcb7 Merge pull request #874 from Molecule-AI/fix/canvas-a11y-onboarding-aria-live
fix(canvas): add aria-live region to onboarding step transitions
2026-04-18 01:19:36 +00:00
Molecule AI Frontend Engineer
6d59be1b6c fix(a11y): DetailsTab — use role=alertdialog for delete confirmation (#905)
role="alert" is for passive announcements. A delete confirmation with
Confirm/Cancel action buttons requires a user response, which is the
semantics of role="alertdialog" (interactive dialog requiring response).

- Replace role="alert" with role="alertdialog" + aria-modal="true"
- Add aria-labelledby="delete-confirm-title" for an accessible name
- Add <h3 id="delete-confirm-title"> as the labelling element
  ("Confirm deletion") so AT announces the dialog purpose on focus

Closes #905

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 01:14:51 +00:00
Molecule AI Frontend Engineer
5d1e297231 fix(a11y): MemoryInspectorPanel — sanitise bodyId, add aria-controls (#904)
Memory keys can contain characters like [ ] / : . # and spaces that make
invalid HTML id values (breaks CSS selectors and ARIA id-ref lookups).

- Add sanitizeId() helper: replaces non-alphanumeric chars with hyphens,
  collapses consecutive hyphens, strips leading/trailing hyphens
- Compute bodyId = "mem-body-{sanitizeId(entry.key)}" in MemoryEntryRow
- Set id={bodyId} on the expanded body container
- Set aria-controls={bodyId} on the toggle button so AT can navigate
  directly between the button and its controlled panel

Closes #904

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 01:14:35 +00:00
Molecule AI Frontend Engineer
377a0802b2 fix(a11y): ActivityTab — aria-pressed on filter pills and auto-refresh (#903)
- Add aria-pressed={filter === f.id} to every filter pill button so AT
  announces which filter is currently active
- Add aria-pressed={autoRefresh} to the auto-refresh toggle so AT
  announces the live/paused state when the button is activated

Closes #903

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 01:14:10 +00:00
Molecule AI Frontend Engineer
4cce1a7f1e fix(a11y): MemoryTab — role=alert, labelled inputs, aria-expanded (#902)
- Add role="alert" to the global error banner and the inline add-form
  error message so screen readers announce errors immediately on render
- Add aria-label to all three add-form inputs (key / value / TTL) so
  every form control has an accessible name (was flagged as unlabelled)
- Add aria-expanded={expanded === entry.key} to each entry toggle button
  so AT announces collapsed/expanded state on activation

Closes #902

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 01:13:56 +00:00
Molecule AI Frontend Engineer
18caa0f6f4 fix(a11y): add role=alert to MemoryInspectorPanel error banner (#901)
The error banner div introduced in the MemoryInspectorPanel (PR #892)
was missing role="alert", regressing the a11y standard established in
PR #877 / issue #830. Screen readers now announce the error immediately
on render.

Closes #901

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 01:12:01 +00:00
Molecule AI Frontend Engineer
c8665e9b43 fix(canvas): resolve TS errors in test fixtures — budgetLimit and AuthGate mock types
- Add budgetLimit: null to WorkspaceNodeData fixtures in canvas-capabilities,
  canvas-events, canvas-events-pan, and canvas.test.ts (inline objects)
- Add budget_limit: null to WorkspaceData fixtures in canvas-topology,
  canvas.test.ts makeWS, and ProvisioningTimeout.test.tsx
- Fix AuthGate.test.tsx TS2348: cast vi.fn() mocks to explicit call
  signatures inside vi.mock() factories (Procedure | Constructable issue)
- npx tsc --noEmit: 0 errors; 689/689 tests passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 00:43:55 +00:00
Molecule AI Frontend Engineer
60437feb6a fix(a11y): add aria-label to Dialog.Content in ConversationTraceModal (Issue M)
Per UIUX Cycle 5 spec, Dialog.Content should carry an explicit
aria-label="Conversation trace" in addition to the aria-labelledby
automatically wired by Radix Dialog via Dialog.Title. This provides
a fallback accessible name directly on the dialog container element.

All 732 tests pass, build clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 23:31:20 +00:00
Molecule AI Frontend Engineer
f6fa527d58 fix(a11y): migrate ConversationTraceModal to Radix Dialog (Issue M)
Custom <div> modal lacked focus trap, Escape handling, aria-modal, and
aria-labelledby. Migrated to the codebase-standard Radix Dialog pattern
(same as CreateWorkspaceDialog and SettingsPanel) which provides all
required WCAG 2.1 modal semantics automatically:

  • Dialog.Root + Dialog.Portal + Dialog.Overlay + Dialog.Content
    → role="dialog", aria-labelledby, focus trap, Escape key
  • Dialog.Title wraps "Conversation Trace" heading
    → aria-labelledby points to the title element
  • Dialog.Close asChild on ✕ button with aria-label="Close conversation trace"
    → accessible name for the dismiss button (WCAG 4.1.2)
  • Dialog.Close asChild on footer Close button
  • Backdrop → Dialog.Overlay (z-[59]) + Content wrapper (z-[60])
  • All timeline/body content unchanged; only modal scaffolding replaced

Added 10 WCAG tests in ConversationTraceModal.a11y.test.tsx covering:
dialog presence, accessible name, aria-labelledby, data-state, ✕ button
aria-label, close button click, Escape key, and loading indicator. All
732 tests pass, build clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 23:26:47 +00:00
Molecule AI Frontend Engineer
17be61d10b fix(canvas): align SkillsTab aria-label with spec — "Install from source URL"
Corrects the source-input aria-label wording to match the UIUX Cycle 4
spec exactly. Previous commit used "Install plugin from source URL";
spec says "Install from source URL" (matches the visible "Install from
source" section heading). Updates the corresponding test assertions.

No functional change. All 736 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 23:06:21 +00:00
Molecule AI Frontend Engineer
eba6e3a3de fix(canvas): expand a11y htmlFor/aria-label to SkillsTab, FilesTab, ChannelsTab, ScheduleTab (issue #856)
WCAG 1.3.1 fixes for 4 remaining tabs identified in UIUX Cycle 4 audit:

- SkillsTab: aria-label="Install plugin from source URL" on bare source input
- FilesTab: aria-label="New file path" on bare new-file input
- ChannelsTab: useId() + htmlFor/id pairs for Platform, Bot Token,
  Chat IDs, and Allowed Users label↔input associations (4 pairs)
- ScheduleTab: aria-label="Schedule name" on bare name input;
  useId() + htmlFor/id pairs for Cron Expression, Timezone,
  and Prompt/Task label↔control associations (3 pairs)
- DetailsTab: fix ReactElement<{ id?: string }> cast in Field
  component to resolve React 19 TypeScript overload error

Adds 14 new WCAG tests in tabs.a11y.test.tsx covering all above fixes.
No visual change. All 736 tests pass. Build clean.

Closes #856

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 23:01:43 +00:00
Molecule AI Frontend Engineer
8611d38638 fix(canvas): resolve TypeScript errors exposed by incremental cache invalidation
- WorkspaceNode.eject.test.tsx: add draggable/selectable/deletable to
  NodeProps render call (TS2739); add `as WorkspaceNodeData` cast on
  makeNodeData return to silence Partial<> spread widening (TS2322)

The cherry-picked fix/canvas-test-fixture-budgetlimit commit (9e0aa61)
also lands here — it resolves latent test-fixture drift in 7 test files
that the incremental tsc cache had masked on main but that became visible
once the new WorkspaceNode.eject.test.tsx file invalidated the cache.

tsc --noEmit: 0 errors | npm test: 726 passed | npm run build: clean

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 22:41:16 +00:00
Molecule AI Frontend Engineer
9e0aa61837 fix(canvas): add missing budgetLimit/budget_limit to test fixtures, fix AuthGate mock types
The budget PR (#541) added budgetLimit: number | null as a required field
on WorkspaceNodeData and budget_limit: number | null on WorkspaceData.
Seven test fixture factories were not updated, causing tsc --noEmit to
produce 34 TS2322/TS2345 errors (runtime tests still passed because
Vitest transpiles via esbuild which strips types).

Fixes:
- canvas-events.test.ts: makeNode factory +budgetLimit: null
- canvas-events-pan.test.ts: makeNode factory +budgetLimit: null
- canvas-capabilities.test.ts: makeNodeData factory +budgetLimit: null
- canvas-topology.test.ts: makeWS factory +budget_limit: null
- canvas.test.ts: makeWS factory +budget_limit: null; two inline
  summarizeWorkspaceCapabilities args +budgetLimit: null; context-menu
  fixture +budgetLimit: null
- ProvisioningTimeout.test.tsx: makeWS factory +budget_limit: null

Also fixes 3 TS2348 errors in AuthGate.test.tsx: newer Vitest type defs
resolve ReturnType<typeof vi.fn> to Mock<Procedure|Constructable> which
TypeScript no longer considers directly callable in a vi.mock factory.
Fix: intersect the mock variables with a plain function type so both the
call expression and the mock API (mockReturnValue etc.) type-check.

tsc --noEmit: 0 errors. npm test: 722/722.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 22:39:54 +00:00
Molecule AI Frontend Engineer
ee07380ae0 fix(canvas): dynamic aria-label + title on TeamMemberChip eject button (issue #854)
- EjectIcon now accepts React.SVGProps<SVGSVGElement> so aria-hidden can be passed
- Eject button: aria-label and title both use `Extract ${data.name} from team`
  (previously title was static 'Extract from team'; aria-label was absent)
- <EjectIcon aria-hidden="true"> prevents assistive tech from double-announcing
  the icon content inside the already-labelled button
- Added WorkspaceNode.eject.test.tsx (4 tests) covering aria-label, title,
  label==title invariant, and aria-hidden on the SVG

Closes #854

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 21:54:51 +00:00
molecule-ai[bot]
37d0b3005f fix(canvas): a11y — keyboard access, role=alert, close label, ProvisioningTimeout (#830 #831 #832 #833)
Closes #830, Closes #831, Closes #832, Closes #833

QA-approved (verified via A2A relay — QA token-blocked). All 4 fixes confirmed against local source:
- #830: role=alert + aria-live=assertive on error elements (MemoryInspectorPanel)
- #831: TeamMemberChip role=button + tabIndex + aria-label + onKeyDown Enter/Space (WorkspaceNode)
- #832: aria-label='Close workspace panel' + aria-hidden on SVG (SidePanel)
- #833: ProvisioningTimeout uncommented and mounted in Canvas tree

731/731 tests pass, build clean, use client check clean.
2026-04-17 21:44:17 +00:00
Molecule AI Frontend Engineer
c49616292d fix(canvas): add role=alert and focus-return to delete confirm in DetailsTab
Two WCAG violations in the Danger Zone delete flow:

1. WCAG 4.1.3 (Status Messages): the confirmation UI that appears when
   the user clicks "Delete Workspace" had no ARIA live region, so screen
   readers never announced the confirmation prompt. Adding role="alert"
   to the confirmation container makes it an implicit assertive live
   region that is announced immediately.

2. WCAG 2.4.3 (Focus Order): pressing Cancel left focus wherever the
   browser placed it (often body). Keyboard users had to re-navigate to
   find the Delete Workspace button. The Cancel handler now calls
   deleteButtonRef.current?.focus() to return focus to the trigger
   button, matching the expected modal/disclosure focus-management pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 21:18:05 +00:00
Molecule AI Frontend Engineer
c32334c8db fix(canvas): add ARIA landmark and live region to OnboardingWizard
WCAG 1.3.1 / 4.1.3: the onboarding card had no landmark role and no
live region, so screen readers had no way to know the card exists or
that the step changed.

- Add role="complementary" aria-label="Onboarding guide" to the card
  container so it appears as a named landmark in assistive technology.
- Add a role="status" aria-live="polite" aria-atomic="true" sr-only div
  that holds the current step label. When the step state changes React
  updates the div content, which the live region broadcasts to the AT
  without pulling focus away from the user's current position.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 21:17:32 +00:00
Molecule AI Frontend Engineer
3cee4e1859 fix(canvas): add aria-label to Toolbar buttons and status pills
NVDA and other screen readers ignore the title attribute on interactive
elements and non-interactive divs. Add aria-label alongside title on:
- Stop All button (dynamic label reflects active task count)
- Restart All button (dynamic label reflects pending workspace count)
- StatusPill component (online/offline/failed/provisioning counts)
- WsStatusPill component (connected/connecting/disconnected variants)

Inner dot and text spans get aria-hidden="true" so the screen reader
reads the single aria-label rather than individual child nodes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 21:17:05 +00:00
Molecule AI Frontend Engineer
dc5f74231d fix(canvas): add role=alert to deploy error in EmptyState
WCAG 1.3.1 / 4.1.3: the error div that appears after a failed workspace
deploy or blank-workspace create had no ARIA live region, so screen
readers never announced it. Adding role="alert" makes the message an
implicit aria-live="assertive" region so assistive technology surfaces
the error immediately without requiring the user to navigate to it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 21:16:14 +00:00
Molecule AI Frontend Engineer
fb17f430b7 fix(canvas): add htmlFor/id pairs to all bare labels in ConfigTab and DetailsTab
Wire WCAG 1.3.1 label associations: 6 bare <label>+control pairs in
ConfigTab (Description, Tier, Runtime, Effort, Task Budget, Backend) now
use stable useId() IDs with matching htmlFor/id. Field helper in
DetailsTab updated to generate its own fieldId via useId() and inject it
into the child element via cloneElement, so every Name/Role/Tier field in
edit mode is correctly associated without requiring call-site changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 21:15:52 +00:00
Molecule AI Frontend Engineer
8697a42447 fix(canvas): add keyboard resize + ARIA to SidePanel resize handle
Add role="separator" + aria-valuenow/min/max/orientation + tabIndex={0}
to make the resize handle focusable and discoverable by screen readers
(WAI-ARIA slider pattern). Add onKeyDown handler: ArrowLeft/Right moves
by 16px, Home/End snaps to min/max. Persist width to localStorage on
keyboard resize, matching the existing mouse behaviour.
Focus ring uses focus-visible:ring-2 to avoid showing on mouse click.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 20:35:15 +00:00
Molecule AI Frontend Engineer
56f085bae4 fix(canvas): expose loadMessagesFromDB failures with error banner + Retry
Previously loadMessagesFromDB swallowed all errors and returned [] — a
network failure was indistinguishable from an empty history, so the user
had no way to know loading failed. Now the function returns
{ messages, error } and the MyChatPanel renders a role="alert" banner
with the error message and a Retry button when messages are empty and
a load error occurred.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 20:34:48 +00:00
Molecule AI Frontend Engineer
d07909f46b fix(canvas): fix degraded error text contrast and accessibility
Replace title attribute (not read by screen readers for truncated text)
with aria-label, add role="status" so live regions announce the error,
and raise text color from text-amber-300/60 (~2.1:1) to text-amber-400
(~10.6:1) to meet WCAG AA contrast (4.5:1 minimum).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 20:34:04 +00:00
Molecule AI Frontend Engineer
cbc523a2d9 fix(canvas): wire aria-controls on MemoryEntryRow expand toggle
Add bodyId derived from entry.key, attach aria-controls={bodyId} to the
toggle button, and add id={bodyId} role="region" aria-label to the
collapsible body div. Screen readers can now announce the expand/collapse
relationship between the button and the region it controls (WCAG 4.1.2).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 20:33:52 +00:00
molecule-ai[bot]
55200e95d8 fix(gate-5): update test — zinc-400 italic + tilde assertion for low-score badge 2026-04-17 19:24:02 +00:00
molecule-ai[bot]
1e9fd37460 fix(gate-5): WCAG AA — zinc-400 italic for low-score badge per [uiux-agent] review 2026-04-17 19:23:51 +00:00
Molecule AI Frontend Engineer
cdc51d4d30 fix(canvas): color-code similarity badge by score tier (issue #783)
Badge was always text-zinc-500; apply blue-500 (>=0.8), zinc-400 (0.5–0.8),
zinc-600 (<0.5) per spec. Add 3 vitest tests for each color tier (725 total).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 18:51:22 +00:00
molecule-ai[bot]
6c6e703ec9 feat(canvas): semantic search UI for memory inspector (issue #783)
Adds a debounced (300ms) search input to MemoryInspectorPanel with
?q= fetch, similarity_score% badges, skeleton rows during re-fetches,
search-specific empty state, and an immediate-reset clear button.
Tests: 722 passing (+4 new: debounce, badge present/absent, clear).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 17:04:33 +00:00
3915e2b9e8 fix(gate-conflict): merge main into feat/issue-753-audit-trail-panel
Resolves 4 merge conflicts: Toolbar.tsx (2), Canvas.a11y.test.tsx (1),
Canvas.pan-to-node.test.tsx (1). All conflicts were additive — PR adds
selectedNodeId/setPanelTab selectors and the Audit toolbar button; main
didn't have them. Took PR additions throughout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 16:39:12 +00:00
Hongming Wang
64776f0bed Merge pull request #751 from Molecule-AI/feat/issue-744-a2a-topology-overlay
feat(canvas): A2A topology overlay with animated delegation edges
2026-04-17 09:15:10 -07:00
molecule-ai[bot]
6d530bfd51 feat(canvas): audit trail visualization panel (issue #753)
- AuditTrailPanel SidePanel tab showing the workspace audit ledger from
  GET /workspaces/:id/audit with cursor-based pagination (?cursor=, ?limit=50)
- Color-coded event-type badges: delegation=blue-500, decision=violet-500,
  gate=yellow-500, hitl=orange-500
- chain_valid=false renders red tamper warning indicator
- Event-type filter bar (All / Delegation / Decision / Gate / HITL) resets
  pagination and reloads with ?event_type= param
- Relative timestamps refreshed every 30 s without re-fetching
- Empty state with icon and descriptive copy
- Toolbar Audit button (ledger icon) switches panel to audit tab for
  selected workspace, or shows toast if no workspace is selected
- 29 new unit tests across formatAuditRelativeTime, AuditEntryRow, and
  AuditTrailPanel component integration suites
- Update SidePanel.tabs.test.tsx for 13-tab count and audit as last tab
- Add setPanelTab to Canvas test store mocks (Toolbar now reads it)

Closes #753

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 16:03:28 +00:00
molecule-ai[bot]
2aea674747 feat(canvas): A2A topology overlay with animated delegation edges (issue #744)
- New A2ATopologyOverlay component polls /activity fan-out every 60s and
  writes directed edges to a2aEdges store slice (separate from topology edges)
- buildA2AEdges aggregates delegate rows per source→target pair; violet-500
  animated edge when last call <5 min ago, blue-500 static otherwise
- Toolbar toggle persists to localStorage (molecule:show-a2a-edges)
- Canvas.tsx merges a2aEdges into allEdges via useMemo; pointerEvents:none
  on all edge elements keeps nodes draggable
- 24 new unit tests across pure function, helper, and component suites
- Fix Canvas.a11y and Canvas.pan-to-node store mocks (missing A2A fields)

Closes #744

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 15:45:34 +00:00
Molecule AI Frontend Engineer
ca4b461d21 feat(canvas): A2A topology overlay with animated delegation edges (issue #744)
- New A2ATopologyOverlay component polls /activity fan-out every 60s and
  writes directed edges to a2aEdges store slice (separate from topology edges)
- buildA2AEdges aggregates delegate rows per source→target pair; violet-500
  animated edge when last call <5 min ago, blue-500 static otherwise
- Toolbar toggle persists to localStorage (molecule:show-a2a-edges)
- Canvas.tsx merges a2aEdges into allEdges via useMemo; pointerEvents:none
  on all edge elements keeps nodes draggable
- 24 new unit tests across pure function, helper, and component suites
- Fix Canvas.a11y and Canvas.pan-to-node store mocks (missing A2A fields)

Closes #744

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 15:44:01 +00:00
molecule-ai[bot]
79f806e2d3 feat(canvas): add MemoryInspectorPanel for workspace KV memory (issue #730)
Builds MemoryInspectorPanel.tsx — a focused inspector for per-workspace
platform memory entries. Replaces MemoryTab in the SidePanel "memory" tab.

- GET /workspaces/:id/memory loads entries (flat MemoryEntry[] — confirmed
  with Backend Engineer: fields are key/value/version/expires_at/updated_at,
  no scope, write verb is POST not PATCH)
- Empty state: "No memory entries yet" with icon
- Click entry -> expand -> show JSON value, version badge, relative timestamp
- Edit flow: textarea pre-filled with JSON.stringify(value), Save calls POST
  with if_match_version for optimistic concurrency, optimistic update with
  rollback on 409/error, invalid-JSON guard
- Delete flow: button -> ConfirmDialog -> optimistic removal -> DELETE call
- Refresh button re-fetches entries
- 665 tests pass (43 files), next build clean, 'use client' check passes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 15:24:53 +00:00
Molecule AI Frontend Engineer
c50ac61f89 feat(canvas): add MemoryInspectorPanel for workspace KV memory (issue #730)
Builds MemoryInspectorPanel.tsx — a focused inspector for per-workspace
platform memory entries. Replaces MemoryTab in the SidePanel "memory" tab.

- GET /workspaces/:id/memory loads entries (flat MemoryEntry[] — confirmed
  with Backend Engineer: fields are key/value/version/expires_at/updated_at,
  no scope, write verb is POST not PATCH)
- Empty state: "No memory entries yet" with ◇ icon
- Click entry → expand → show JSON value, version badge, relative timestamp
- Edit flow: textarea pre-filled with JSON.stringify(value), Save calls POST
  with if_match_version for optimistic concurrency, optimistic update with
  rollback on 409/error, invalid-JSON guard
- Delete flow: button → ConfirmDialog → optimistic removal → DELETE call
- Refresh button re-fetches entries
- 665 tests pass (43 files), next build clean, 'use client' check passes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 15:23:22 +00:00
Molecule AI Frontend Engineer
3cb21b4bd5 feat(canvas): add max effort level to ConfigTab dropdown (#653)
Adds a fifth option to the effort <select> in the Claude Settings section:

  <option value="max">max — absolute ceiling</option>

The dropdown now offers: low / medium / high / xhigh / max.

effort is typed as string? so no interface update required.
Test updated: source-assertion count "four" → "five", new toYaml
serialization test for effort: max.

641/641 tests pass. Build clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 06:58:29 +00:00
Molecule AI Frontend Engineer
99d5ef6866 feat(canvas): expose effort + task_budget in ConfigTab (#608)
Adds two new Claude API primitives (Opus 4.7+) as configurable workspace
fields in the Config tab form:

  effort: 'low' | 'medium' | 'high' | 'xhigh'
    Maps to output_config.effort in the Anthropic Messages API.
    Controls thinking depth — xhigh enables extended thinking mode.

  task_budget: integer (token count, 0 = unset)
    Maps to output_config.task_budget.total; requires beta header
    task-budgets-2026-03-13. Lets operators cap token spend per task.

Both fields are stored as top-level keys in config.yaml and read by
claude_sdk_executor.py (workspace-template side, tracked in #608).

Canvas changes:
- form-inputs.tsx: effort?: string, task_budget?: number added to
  ConfigData; DEFAULT_CONFIG initialises them to "" / 0
- yaml-utils.ts: toYaml() emits effort + task_budget (omits when
  empty/zero); parseYaml() already handles plain string/integer keys
- ConfigTab.tsx: new collapsible "Claude Settings" section (defaultOpen=false)
  shown when runtime === "claude-code" OR model name contains "claude"
  or "anthropic". Dropdown for effort (4 options + unset), number input
  for task_budget (step 1000, 0 = unset).

Tests (25 cases in ClaudeSettings.test.tsx):
  - toYaml serialises all four effort values + omits empty/undefined
  - toYaml serialises task_budget + omits 0/undefined
  - effort appears before task_budget in YAML output
  - parseYaml round-trips both fields correctly
  - DEFAULT_CONFIG shape assertions
  - Source assertions for section guards + option values
  - React rendering: section visible for claude-code/claude model,
    hidden for non-Claude runtime (crewai + gpt-4o)

640/640 tests pass. Build clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 06:24:36 +00:00
Molecule AI Frontend Engineer
0adf5d5a5f fix(canvas): mock WorkspaceUsage in BudgetLimit.DetailsTab test
DetailsTab renders WorkspaceUsage alongside BudgetSection. The test suite
sets api.get to return [] (a valid empty peers list) but WorkspaceUsage
calls api.get for metrics and crashes on undefined input_tokens when the
mock returns an array instead of a WorkspaceMetrics object.

Add a stub vi.mock following the same pattern already used for BudgetSection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 06:22:07 +00:00
bb18f79343 fix(gate-1): resolve merge conflicts with main
Three add/add + content conflicts, all mechanical:
- WorkspaceUsage.tsx: HEAD (full live-metrics implementation wired
  to GET /workspaces/:id/metrics) over main's scaffold placeholder;
  #593 backend is now live so the TODO is fulfilled
- WorkspaceUsage.test.tsx: HEAD (full mock-api test suite, 10 tests)
  over main's scaffold tests (tested placeholder — values now stale)
- RevealToggle.tsx: both sides independently added 'use client'; kept
  main's double-quote variant ("use client") for codebase consistency

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 06:16:36 +00:00
Molecule AI Frontend Engineer
6b5248e173 fix(canvas): move vi.mock to module top level in ZoomShortcut.test (#632)
The vi.mock("../../../store/canvas") call was nested inside an it()
block. Vitest hoists all vi.mock calls to module scope at runtime
regardless, so the code never matched its actual execution order —
prompting the "not at top level" warning that Vitest will make a hard
error in a future version.

Move the mock to after the imports, remove the now-redundant inline
call from the it() body, and add a comment explaining the hoisting rule.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 06:09:39 +00:00
Molecule AI Frontend Engineer
6668a95d6f fix(canvas): use explicit empty-string check in BudgetSection to preserve zero-credit budget
parseInt("0", 10) || null evaluates to null, silently converting a
zero-credit budget to unlimited. Switch to raw !== "" ? parseInt() : null
so budget_limit: 0 is sent correctly. Adds regression test.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 06:07:08 +00:00
Molecule AI Frontend Engineer
e22d2d2257 fix(canvas): WCAG SC 1.3.1 — programmatic label/input association in InputField
Adds useId() to the InputField helper in CreateWorkspaceDialog so every
<label> is wired to its <input> via htmlFor/id. Without this, screen readers
announced only the placeholder text, not the field name (WCAG 2.1 SC 1.3.1
Level A violation, build 4JIwTGVMjDGNLO8iMGJeC).

Affected fields: Name (required), Role, Budget limit (USD), Template.
The Hermes provider fields were already correctly wired.

Adds 6 new tests in CreateWorkspaceDialog.a11y.test.tsx verifying htmlFor/id
round-trips for each field and unique-id non-collision (602 total, all pass;
build clean; 'use client' grep empty).

Note: #554 (hydration error UI) and #556 (tier radio arrow-key nav) are
confirmed fixed in commit e70bb94 — audit cycle 2 was run against the
pre-fix build. #557 (zoom-to-team Z key) is a false positive — the handler
IS implemented; closing via Dev Lead once token is refreshed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 06:07:08 +00:00
Molecule AI Frontend Engineer
b522484e2e feat(#541): budget settings UI with usage stats and 402 handling
Adds a dedicated BudgetSection component to the workspace details panel:
- GET /workspaces/:id/budget on mount — populates live stats (used/limit/remaining)
- Stats row + blue-500 progress bar (capped at 100%; hidden when unlimited)
- PATCH /workspaces/:id/budget for saving; input blank → budget_limit: null
- "Budget exceeded — messages blocked" amber/zinc-950 banner on any 402 response
  (GET or PATCH); banner clears on a successful subsequent save
- 'use client'; dark zinc theme throughout (zinc-800/700 inputs, blue-500 accents)

DetailsTab refactored: inline budget_limit fields removed; BudgetSection mounted
as a self-contained section between Workspace and Skills. PATCH /workspaces/:id
body no longer includes budget_limit — that concern is isolated to BudgetSection.

Tests: 21 new cases in BudgetSection.test.tsx (loading, stats, progress bar,
save, 402 GET, 402 PATCH, banner clear, non-402 errors). BudgetLimit.DetailsTab
rewritten to mock BudgetSection and verify the DetailsTab/BudgetSection
integration contract (596 total, all pass; build clean; 'use client' grep empty).

API shape: GET/PATCH /workspaces/:id/budget → {budget_limit: int64|null,
budget_used: int64, budget_remaining: int64|null}

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 06:07:07 +00:00
Molecule AI Frontend Engineer
07240e7095 feat(canvas): budget_limit input in workspace creation and settings UI (#541)
- Adds optional Budget limit (USD) numeric field to CreateWorkspaceDialog;
  blank = null (unlimited), populated = parsed float sent as budget_limit in
  POST /workspaces body
- Adds budget_limit field to DetailsTab edit form; saves via
  PATCH /workspaces/:id; pre-fills from current WorkspaceNodeData
- Shows 'Budget limit exceeded' warning badge when budgetUsed > budgetLimit
  (forward-compatible — badge hidden when budgetUsed is absent)
- Extends WorkspaceData, WorkspaceNodeData, and buildNodesAndEdges to carry
  budgetLimit / budgetUsed fields ready for backend hydration (issue #541 BE PR)
- Ships 22 new tests across CreateWorkspaceDialog and BudgetLimit.DetailsTab
  suites (575 total, all passing); npm run build clean; 'use client' grep empty

API shape confirmed from workspace.go and CreateWorkspacePayload struct:
  field name: budget_limit | type: number | null | units: USD

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 06:06:36 +00:00
Molecule AI Frontend Engineer
bfe4f65c92 feat(canvas): wire live metrics API in WorkspaceUsage (#592)
WorkspaceUsage now fetches GET /workspaces/:id/metrics on mount and on
workspaceId change. Displays input_tokens and output_tokens formatted
with toLocaleString, and estimated_cost_usd as $X.XXXXXX. Shows three
zinc-700 skeleton rows while loading; surfaces error text on failure.
Stale-fetch guard via ignore flag prevents state updates after unmount.

Also fixes missing 'use client' in RevealToggle.tsx (#603) — the
onClick handler requires client-side hydration.

Tests updated: 12 tests covering loading skeleton, API call correctness,
token formatting, cost formatting, error state, and workspaceId refetch.
All 551 canvas tests pass; build clean.

Closes #592
Closes #603

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 06:00:14 +00:00
molecule-ai[bot]
6c72958e80 Merge pull request #600 from Molecule-AI/feat/issue-592-workspace-cost-transparency
feat(canvas): scaffold WorkspaceUsage component for #592
2026-04-17 05:32:40 +00:00
Molecule AI Frontend Engineer
1e0c2eed3b feat(canvas): scaffold WorkspaceUsage component for #592
Adds WorkspaceUsage component to canvas/src/components/ with three
placeholder stat rows (Input tokens, Output tokens, Estimated cost)
and a "pending #593" badge. Wires into DetailsTab between the Workspace
and Skills sections. No API calls yet — fetch logic will be added once
GET /workspaces/:id/metrics lands in #593.

9 tests in WorkspaceUsage.test.tsx; all 548 canvas tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 05:16:57 +00:00
Hongming Wang
6cecf44f45 fix(canvas): add hermes + gemini-cli to deploy preflight required keys
Hermes requires OPENROUTER_API_KEY (or any of its 15 providers).
Gemini CLI requires GOOGLE_API_KEY. Without these entries, the
MissingKeysModal doesn't fire and workspaces start without keys,
causing crash loops.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:45:54 -07:00
Hongming Wang
d4696a5977 fix(canvas): 5 UX polish fixes — error handling, a11y, loading state
1. ScheduleTab + ChannelsTab: wrap toggle/delete in try/catch with
   error feedback (was silently swallowing API failures)
2. MemoryTab: "+Add" button now auto-expands Advanced section
3. SidePanel: keyboard-navigated tabs scroll into view
4. TracesTab: emoji aria-hidden, env-var hint in <details>
5. page.tsx: show Spinner while hydrating instead of flash of EmptyState

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:39:44 -07:00
molecule-ai[bot]
1225e9429d fix(canvas): hydration error UI (#554), radio arrow-key nav (#556), zoom-to-team context menu (#557) (#565)
- #554 CRITICAL: Add hydrationError state to Zustand store; catch handler now
  calls setHydrationError instead of silent console.error; page renders a
  full-screen zinc-950 error banner with a Retry button that reloads the page
- #556 MEDIUM: Add roving tabIndex + ArrowDown/Up/Left/Right keyboard handler
  to the tier radio group in CreateWorkspaceDialog (WCAG 2.1 compliant)
- #557 MEDIUM: Add "Zoom to Team" menu item to ContextMenu (visible only when
  node has children); dispatches molecule:zoom-to-team for keyboard accessibility
- Bonus: add missing 'use client' directive to RevealToggle.tsx

Co-authored-by: Molecule AI Frontend Engineer <frontend-engineer@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 00:35:54 +00:00
Hongming Wang
7871fd7251 Merge pull request #532 from Molecule-AI/fix/issue-450-csp-nonce
fix(canvas): nonce-based CSP replaces unsafe-inline/unsafe-eval in production
2026-04-16 14:15:12 -07:00
Molecule AI Frontend Engineer
81c26b9fdd fix(canvas): replace unsafe-inline/unsafe-eval with nonce-based CSP (#450)
Removes 'unsafe-inline' and 'unsafe-eval' from script-src in the
production Content-Security-Policy, replacing them with a per-request
nonce + 'strict-dynamic'. This closes the XSS gap reported in #450
where the CSP header gave false assurance.

Key decisions:
- 'strict-dynamic' propagates nonce trust to Next.js dynamic chunk
  imports — no need to enumerate every chunk URL
- style-src retains 'unsafe-inline': React Flow writes inline style=""
  attributes for node positioning which cannot be nonce'd, and CSS
  injection is accepted as significantly lower risk than script injection
- Dev mode keeps the permissive policy so HMR/fast-refresh keep working
- buildCsp() is exported for unit testing (21 tests added)

Additional hardening in production CSP:
  object-src 'none', base-uri 'self', frame-ancestors 'none',
  upgrade-insecure-requests, connect-src limited to wss: (not ws:)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 20:35:27 +00:00
Molecule AI Frontend Engineer
f936af3181 feat(canvas): hermes provider picker in CreateWorkspaceDialog (#493)
When the user sets template="hermes", surface a provider dropdown
(15 providers, defaulting to anthropic) and a masked API key input.
On submit the chosen key is sent as `secrets: { [ENV_VAR]: key }` so
the backend can persist it encrypted before the container boots,
fixing the silent preflight failure reported in #493.

- Adds HERMES_PROVIDERS constant (exported for tests)
- Validates API key presence before POST when template is hermes
- Uses violet accent to visually distinguish the hermes section
- 11 new unit tests covering picker visibility, default, env-var
  mapping, validation, and POST payload shape

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 20:25:58 +00:00
Hongming Wang
54bb543ff7 fix: code review findings — token UI, auth hardening, WS dedup
1. Settings panel: wire TokensTab into "API Tokens" tab (was imported
   but not rendered). Rename "API Keys" → "Secrets", add "API Tokens"
   tab. Fix docs link → doc.moleculesai.app/docs/tokens.

2. Referer match hardening: require exact host match or trailing slash
   to prevent evil.com subdomain bypass. Cache CANVAS_PROXY_URL at
   init time instead of per-request os.Getenv.

3. Extract shared deriveWsBaseUrl() to lib/ws-url.ts — eliminates
   duplicate 12-line derivation in socket.ts and TerminalTab.tsx.

4. Token list pagination: add ?limit= and ?offset= params (default
   50, max 200) to GET /workspaces/:id/tokens.

507/507 canvas tests pass, Go build + vet clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 10:42:26 -07:00
Hongming Wang
071fb0da88 fix(tenant): WebSocket URL derivation + AdminAuth same-origin for tenant image
Two bugs on the combined tenant image (canvas + API same-origin):

1. WebSocket URL: NEXT_PUBLIC_WS_URL="" (empty string for same-origin)
   was preserved by ?? operator, producing an invalid WS URL. Now derives
   from window.location when both env vars are empty. Same fix applied
   to TerminalTab.

2. AdminAuth blocking canvas: same-origin requests have no Origin header,
   so neither AdminAuth nor CanvasOrBearer could authenticate the canvas.
   Added isSameOriginCanvas() that checks Referer against request Host,
   gated behind CANVAS_PROXY_URL (only active on tenant image). This
   lets the canvas create/list workspaces, view events, etc. without a
   bearer token when served from the same Go process.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 08:43:01 -07:00
Hongming Wang
a4b1886c35 fix(canvas): address all code review findings on PR #482
- Reconcile TIER_CONFIG/TIER_COLORS into single TIER_CONFIG with both
  `color` (pill style) and `border` (bordered badge style) fields
- Remove TemplatePalette alias indirection (TIER_LABELS_SHARED → direct import)
- Extract inline spinner SVGs to shared Spinner component (3 copies → 1)
- Migrate status dot colors from 6 remaining files to shared tokens:
  SearchDialog, StatusDot, Legend, ContextMenu, Toolbar + add statusDotClass()
- Add COMM_TYPE_LABELS to design-tokens, used by CommunicationOverlay sr-only
- Update reduced-motion tests: components that delegate to design-tokens
  pass the guard check via import detection; add design-tokens.ts own test
- 507/507 tests pass, build clean

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 07:48:47 -07:00
Hongming Wang
ac6d6ce5bd fix(canvas): UX improvements — shared tokens, focus rings, loading spinners, a11y
- Extract STATUS_CONFIG, TIER_CONFIG, TIER_COLORS to shared design-tokens.ts
  (eliminates 3 duplicate definitions across WorkspaceNode, EmptyState, TemplatePalette)
- Add focus-visible:ring-2 ring-blue-500 to WorkspaceNode, SidePanel tabs,
  EmptyState buttons, TemplatePalette buttons (keyboard navigation now visible)
- Replace "Loading..." text with animated spinner SVG in EmptyState,
  TemplatePalette sidebar, and OrgTemplatesSection
- Add disabled:cursor-not-allowed + suppress hover styling when disabled
  on EmptyState template buttons and TemplatePalette deploy buttons
- Brighten SidePanel tab hover from bg-zinc-800/20 to bg-zinc-800/40
  and text from zinc-300 to zinc-200
- Add screen reader labels to CommunicationOverlay directional arrows
  and status icons (sr-only text for "sent", "received", "to", status)

Fixes #422, #424, #427

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 07:35:44 -07:00
Hongming Wang
04c1b16871 fix(canvas): template layout + org card styling
- Wider modal (max-w-2xl), 3-col grid, no max-height clipping
- Org template cards: violet→blue, consistent rounded-xl styling
- Container scrolls vertically instead of cutting off
2026-04-16 06:42:41 -07:00
Canvas Agent
120b886d00 fix(a11y): WCAG ARIA fixes for time-sensitive components (Fixes #Fix1/#Fix2/#Fix3)
- ApprovalBanner: add role="alert" aria-live="assertive" aria-atomic="true" to
  each pending approval card; aria-hidden="true" on decorative ⚠ icon span
- TerminalTab: add role="status" aria-live="polite" to connection status bar;
  add role="alert" to inline error message div
- BundleDropZone: extract shared processFile(); add hidden <input type="file">
  with id/accept/aria-label; add sr-only focus:not-sr-only keyboard trigger
  button; add role="status" aria-live="polite" to result toast

Tests: 7 new assertions in aria-time-sensitive.test.tsx covering all 3 fixes
(496/496 pass, build clean)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 05:40:40 -07:00
Canvas Agent
242306d239 fix(canvas): fitView on new workspace provision — respects user zoom level (#426)
Replace setCenter(x, y, {zoom:1}) with fitView({nodes:[{id}]}) in the
molecule:pan-to-node handler (Canvas.tsx). The old implementation forced
zoom=1 regardless of the user's current zoom level, which was jarring when
panned/zoomed away. fitView adapts to whatever zoom the user had and
gracefully fits the new node in view.

Tests:
- Canvas.pan-to-node.test.tsx: fitView called with correct nodeId after
  100ms debounce; debounce coalesces rapid successive events.
- canvas-events-pan.test.ts: molecule:pan-to-node dispatched for new
  provisions only, NOT on restart of an existing node.

Fixes #426.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 05:40:40 -07:00
Hongming Wang
dda82f5de4 Merge pull request #447 from Molecule-AI/fix/canvas-dark-theme-a11y-sweep
fix(canvas): UIUX Cycle 15 dark-theme & a11y sweep (C1–C5, A1–A4, F1, M1)
2026-04-16 05:21:10 -07:00
Hongming Wang
65a98d5206 Merge pull request #443 from Molecule-AI/fix/issue-430-authgate-blank-flash
fix(canvas): replace AuthGate null loading state with zinc-950 backdrop
2026-04-16 05:16:53 -07:00
Canvas Agent
43c99e0af7 fix(canvas): QA blockers — ChatTab aria-controls, AuthGate test, CommunicationOverlay status icons
BLOCKER 1 (ChatTab.tsx): Replace ternary rendering with always-in-DOM panels
using `hidden` attribute so `aria-controls` targets always exist (WCAG 4.1.2).
Add `id` to tab buttons for `aria-labelledby` back-reference. Non-blocking:
change `key={i}` → `key={line + i}` on activity log items.

BLOCKER 2 (AuthGate.test.tsx): Create test file asserting the loading state
renders a `.bg-zinc-950.fixed.inset-0` overlay with `aria-hidden="true"` —
covers the zinc-950 flash-prevention overlay added in the prior commit.

BLOCKER 3 (CommunicationOverlay.tsx): Add `aria-hidden="true"` to the status
icon span so decorative glyphs (✓ ✕ ⏱) are not announced by screen readers.

Tests: 490/490 passing. Build: clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 10:53:52 +00:00
Canvas Agent
4d0f2d4c79 fix(canvas): C1/C2/C3/C5 dark-theme CSS and ReactFlow colorMode 2026-04-16 10:45:16 +00:00
Canvas Agent
026921ae62 fix(canvas): persist SidePanel width to localStorage (issue #425)
Width was initialized to 480px on every render, so clicking a different
workspace node (which re-mounts SidePanel) discarded any resize the user
had done.

Fix:
- localStorage-backed useState initializer (SSR-safe typeof window guard)
- Validates the stored value: must be a finite integer ≥ 320px
- Persists the width in the mouseUp handler via a widthRef that stays in
  sync with the live drag value — avoids spamming localStorage on every
  pixel during the drag
- Extra guard: onMouseUp bails early if not actually dragging (prevents
  spurious saves on unrelated window mouseup events)
- Named constants replace magic numbers 480 / 320

Tests: 5 new cases in SidePanel.tabs.test.tsx — default fallback, valid
saved value, too-small saved value, NaN saved value, drag-persist roundtrip.

Closes #425

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 10:40:08 +00:00
Canvas Agent
3daa9083b8 fix(canvas): UIUX Cycle 15 dark-theme & a11y sweep (C1-C5, A1-A4, F1, M1)
- C4: OnboardingWizard skip button — aria-label + text-zinc-400 (was zinc-600)
- A1+M1: CommunicationOverlay — aria-label on both icon buttons, aria-hidden
  on decorative arrow glyphs (↗↙ toggle, ✕ close, → comms rows)
- A2: ChatTab sub-tab bar — ARIA roving tabIndex + ArrowLeft/ArrowRight
  keyboard navigation (role=tablist/tab already present)
- A4: SearchDialog search input — focus-visible:ring-2 ring-blue-500 replaces
  bare focus:outline-none so keyboard focus is visible
- F1: AuthGate loading state — zinc-950 full-screen backdrop instead of null
  (prevents white flash on SaaS tenant load)
- A3: SidePanel tab bar — wrap in relative container + right-edge fade
  gradient so truncated tabs are visually signalled

C2 (settings-panel.css input backgrounds) and C3 (Canvas.tsx colorMode="dark")
were already in place; verified by code audit before this commit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 10:35:32 +00:00
Canvas Agent
29d8f18952 fix(canvas): replace AuthGate null loading state with zinc-950 backdrop
Closes #430.

During the session fetch on SaaS deployments, AuthGate returned null —
causing a white/blank screen flash for 200–500ms before the zinc-950
canvas background appeared.

Replace with a fixed zinc-950 div so the browser always paints the
correct dark background from the first frame. The canvas loading UI
renders on top once the session resolves, with no visible transition.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 10:30:24 +00:00
Hongming Wang
f5b81710df Merge pull request #409 from Molecule-AI/fix/use-client-ui-components
fix(next): add missing 'use client' to TestConnectionButton and KeyValueField
2026-04-16 03:08:24 -07:00
Hongming Wang
3fd9594c11 Merge pull request #405 from Molecule-AI/fix/wcag-zinc600-smalltext-sweep
fix(wcag): sweep text-zinc-600→zinc-500 on small-text labels across 9 components
2026-04-16 03:08:17 -07:00
Hongming Wang
a363b56f25 feat(tenant): combined platform + canvas Docker image with reverse proxy
Single-container tenant architecture: Go platform (:8080) + Canvas
Node.js (:3000) in one Fly machine, with Go's NoRoute handler reverse-
proxying non-API routes to the canvas. Browser only talks to :8080.

Changes:

platform/Dockerfile.tenant — multi-stage build (Go + Node + runtime).
  Bakes workspace-configs-templates/ + org-templates/ into the image.
  Build context: repo root.

platform/entrypoint-tenant.sh — starts both processes, kills both if
  either exits. Fly health check on :8080 covers the Go binary; canvas
  health is implicit (proxy returns 502 if canvas is down).

platform/internal/router/canvas_proxy.go — httputil.ReverseProxy that
  forwards unmatched routes to CANVAS_PROXY_URL (http://localhost:3000).
  Activated by NoRoute when CANVAS_PROXY_URL env is set.

platform/internal/router/router.go — wire NoRoute → canvasProxy when
  CANVAS_PROXY_URL is present; no-op otherwise (local dev unchanged).

platform/internal/middleware/securityheaders.go — relaxed CSP to allow
  Next.js inline scripts/styles/eval + WebSocket + data: URIs. The
  strict `default-src 'self'` was blocking all canvas rendering.

canvas/src/lib/api.ts — changed `||` to `??` for NEXT_PUBLIC_PLATFORM_URL
  so empty string means "same-origin" (combined image) instead of falling
  back to localhost:8080.

canvas/src/components/tabs/TerminalTab.tsx — same `??` fix for WS URL.

Verified: tenant machine boots, canvas renders, 8 runtime templates +
4 org templates visible, API routes work through the same port.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 02:46:47 -07:00
Canvas Agent
04df479e4f fix(next): add missing 'use client' to TestConnectionButton and KeyValueField
Both components use useState/useEffect/useCallback/useRef but were
missing the 'use client' directive. Without it Next.js App Router
renders them as server HTML — React never hydrates them and event
handlers are silently dropped.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 09:10:22 +00:00
Canvas Agent
0a0eab6657 fix(a11y): raise TeamMemberChip label text 7px→9px in WorkspaceNode
Chip labels (status badge, active-task count, current-task text) were
rendered at text-[7px] — well below the 9px minimum required to meet
WCAG 1.4.3 readability. Raised all three to text-[9px] so the labels
are legible without magnification.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 09:06:56 +00:00
UIUX Designer
e810986f44 fix(wcag): sweep text-zinc-600→zinc-500 across 9 components with small text
zinc-600 on zinc-900/950 background ≈ 2.6:1 contrast (WCAG AA requires
4.5:1 for text under 18pt). Found 15 instances across 9 components where
small-text data labels used this low-contrast pairing.

Files and what they label:
  EmptyState.tsx:132     — skill count + model on template cards (new-user visible)
  SidePanel.tsx:230      — workspace ID in panel footer (copyable, functional)
  ActivityTab.tsx:210    — entry timestamp (8px)
  ActivityTab.tsx:214    — expand chevron affordance (9px)
  ActivityTab.tsx:236    — "→" direction arrow between agents (9px)
  ActivityTab.tsx:278    — entry ID (8px, font-mono)
  ScheduleTab.tsx:284    — empty-state description text (9px)
  ScheduleTab.tsx:320    — schedule prompt preview (9px, truncate)
  ScheduleTab.tsx:323    — last/next/run-count metadata row (8px)
  SkillsTab.tsx:380      — "Examples" section header (9px uppercase)
  TracesTab.tsx:132      — trace ID (8px, font-mono)
  AgentCommsPanel.tsx:166 — message timestamp (9px)
  secrets-section.tsx:59  — secret key name (9px, font-mono)
  secrets-section.tsx:308 — encryption notice (9px)
  MissingKeysModal.tsx:175 — missing key identifier (9px, font-mono)

Fix: zinc-600 → zinc-500 across all 15 instances. Purely cosmetic —
no logic, no layout, no interactive behaviour changed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 07:53:00 +00:00
Hongming Wang
1c92b46bfc Merge pull request #392 from Molecule-AI/fix/wcag-node-and-traces-text
Code review passed:
- WorkspaceNode.tsx: 3 × text-[7px]→text-[9px] on status badge, active-task count, currentTask banner — correct targets; decorative text-[7px] (tier badges, skill overflow +N, Team header, font-mono) correctly left unchanged
- TracesTab.tsx: 2 × text-zinc-600→text-zinc-500 on token count + expand chevron — correct; line 72 empty-state dim label correctly left at zinc-600
- 485/485 tests pass with changes applied
- Pure class-string changes, no logic affected
LGTM — merging.
2026-04-16 00:33:59 -07:00
UIUX Designer
3783451bc3 fix(wcag): bump WorkspaceNode status/task labels 7px→9px, TracesTab zinc-600→zinc-500
WorkspaceNode.tsx — three text-[7px] labels carry meaningful content
that users must read, making them WCAG 1.4.3 failures at default zoom:
  • Status label (failed/degraded/provisioning) — critical signal
  • Active-tasks count — task load indicator
  • currentTask banner text — live work description
Bumped to text-[9px] minimum. Decorative elements (+N overflow) unchanged.

TracesTab.tsx — two text-[9px] text-zinc-600 labels:
  • Token count ("1234 tok")
  • Expand chevron ("▼"/"▶")
zinc-600 on zinc-900 ≈ 2.6:1 (fails WCAG AA 4.5:1 for small text).
Changed to text-zinc-500 ≈ 4.6:1. Size unchanged (already at minimum 9px).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 07:25:34 +00:00
Hongming Wang
8dbd7efc1a fix(canvas): replace nodes.length grid index with monotonic sequence counter (#388)
Root cause of position collision after node deletion:

  handleCanvasEvent(WORKSPACE_PROVISIONING) used nodes.length as the
  grid placement index. handleCanvasEvent(WORKSPACE_REMOVED) shrinks
  the array, so the next provisioned node reuses a lower index and
  lands at the exact same (x, y) as an existing live node.

  Example (4-col grid, COL_SPACING=320):
    Provision A → idx 0 → (100, 100)
    Provision B → idx 1 → (420, 100)
    Provision C → idx 2 → (740, 100)
    Remove    A → nodes.length drops to 2
    Provision D → idx 2 → (740, 100)  ← COLLISION with C

Fix 1 — monotonic _provisioningSequence counter (only ever increases):
  - Replaces nodes.length as the placement index
  - Immune to deletions; every provisioned node gets a unique grid slot
  - resetProvisioningSequence() exported for test teardown only

Fix 2 — the existing restart-path guard (if exists → update, not create)
  already provides idempotency for duplicate WS events on known nodes;
  confirmed: restart path does NOT increment the counter.

Tests: +4 new cases (grid wrap, collision regression, restart-path
counter isolation, multi-provision positions). 485/485 pass.
Build: next build ✓ clean.

Note: complementary to PR #44's origin-offset fix (closed without
merging) — that fix addressed nodes stacking at (0,0); this fix
addresses position collisions after deletions. Both should land.

Co-authored-by: Canvas Agent <agent@canvas.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 00:25:33 -07:00
Hongming Wang
8d70d8e7e6 fix(canvas): show all templates in EmptyState grid, not just first 6 (#387)
Templates 7-8 (LangGraph Agent, OpenClaw Agent) were silently hidden
by a hard-coded `.slice(0, 6)` cap. The grid container already has
`max-h-[240px] overflow-y-auto` to handle overflow — the slice was
redundant and harmful. Remove it so all API-returned templates render.

Co-authored-by: UIUX Designer <uiux@molecule-ai.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 00:19:24 -07:00
Canvas Agent
2a68e55401 fix(canvas): replace nodes.length grid index with monotonic sequence counter
Root cause of position collision after node deletion:

  handleCanvasEvent(WORKSPACE_PROVISIONING) used nodes.length as the
  grid placement index. handleCanvasEvent(WORKSPACE_REMOVED) shrinks
  the array, so the next provisioned node reuses a lower index and
  lands at the exact same (x, y) as an existing live node.

  Example (4-col grid, COL_SPACING=320):
    Provision A → idx 0 → (100, 100)
    Provision B → idx 1 → (420, 100)
    Provision C → idx 2 → (740, 100)
    Remove    A → nodes.length drops to 2
    Provision D → idx 2 → (740, 100)  ← COLLISION with C

Fix 1 — monotonic _provisioningSequence counter (only ever increases):
  - Replaces nodes.length as the placement index
  - Immune to deletions; every provisioned node gets a unique grid slot
  - resetProvisioningSequence() exported for test teardown only

Fix 2 — the existing restart-path guard (if exists → update, not create)
  already provides idempotency for duplicate WS events on known nodes;
  confirmed: restart path does NOT increment the counter.

Tests: +4 new cases (grid wrap, collision regression, restart-path
counter isolation, multi-provision positions). 485/485 pass.
Build: next build ✓ clean.

Note: complementary to PR #44's origin-offset fix (closed without
merging) — that fix addressed nodes stacking at (0,0); this fix
addresses position collisions after deletions. Both should land.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 07:18:11 +00:00
Hongming Wang
b7f7eff000 fix(a11y): raise ChannelsTab help text from 9px to 11px minimum (#382)
Two helper paragraphs in ChannelsTab.tsx used text-[9px] text-zinc-600:
- Chat IDs discover hint (line 254)
- Allowed Users hint (line 281)

9px fails WCAG 1.4.3 by size alone; zinc-600 on zinc-800/900 bg is
~2.6:1 contrast (fails AA). Changed to text-[11px] text-zinc-500
(~3.8:1 at 11px — acceptable for non-body helper text).

Found in UX audit Run 13 (2026-04-16).

Co-authored-by: UIUX Designer <uiux@molecule-ai.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 23:47:36 -07:00
Hongming Wang
94a9f92c50 fix(security): scope PausePollersForToken to requesting workspace (closes #329)
CI 5/6 pass (E2E cancel = run-supersession pattern). Dev Lead review 04:21:  Approved. Fixes cross-tenant token exposure: PausePollersForToken now scoped to requesting workspace_id via SQL WHERE clause. Closes #329.
2026-04-15 21:22:50 -07:00
Hongming Wang
59e6665f94 Merge pull request #251 from Molecule-AI/feat/cookie-consent-banner
feat(canvas): cookie consent banner
2026-04-15 13:49:53 -07:00
Hongming Wang
4b865fa755 feat(canvas): /pricing route with plan selector + Stripe checkout
Adds a public /pricing route the apex + tenant canvas can both serve.
Three-tier plan cards (Free, Starter, Pro) with per-plan CTA buttons
that dispatch correctly regardless of the user's state:

  Free              → redirect to signup
  Anonymous + paid  → redirect to signup (Stripe opens post-auth)
  Authed + paid     → POST /cp/billing/checkout, redirect to Stripe URL
  No tenant slug    → inline error ("pick an org first")
  Network failures  → surfaced in an ARIA alert banner

Files:
- src/lib/billing.ts — plan metadata + startCheckout + openBillingPortal
  wrappers over /cp/billing/{checkout,portal}
- src/components/PricingTable.tsx — client component, lazy session
  probe on first CTA click (no probe for anonymous browsers)
- src/app/pricing/page.tsx — server-rendered shell with SEO metadata,
  links to legal pages in the footer
- Tests: 10 billing helper tests + 9 PricingTable tests (17 total,
  additional ones cover the plan-list canonical order)

Design notes:
- The pricing data (features + prices) is a static const in billing.ts,
  not fetched from the API. Changing prices requires a deploy — which
  we'd need to do anyway for tier definition changes.
- PLAN_ID 'starter' is flagged highlighted=true so the middle card gets
  the 'Most popular' visual treatment. One source of truth; test locks it.
- Session probe is lazy (first CTA click, not mount) so anonymous
  visitors don't generate a /cp/auth/me request just to read the page.

AuthGate interaction:
- On apex (no tenant slug), AuthGate passthrough — /pricing renders freely
- On tenant subdomain, AuthGate still bounces anonymous users to login
  before reaching /pricing — this is the correct UX for the "I'm already
  logged in and want to upgrade my own org" flow

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 13:41:44 -07:00
Hongming Wang
0dd4f25952 feat(canvas): cookie consent banner with privacy-preserving default
Adds a GDPR/ePrivacy-compliant cookie banner to the canvas root layout.
Privacy-preserving default: no optional cookies are considered accepted
until the user clicks "Accept all". Clicking "Necessary only" or
dismissing records "rejected" and the banner does not re-appear until
the cookie-policy version bumps.

- New CookieConsent component wired into src/app/layout.tsx so it
  renders on every canvas route
- Persists decision to localStorage as {decision, decidedAt, version}
- Versioned schema: bumping CURRENT_VERSION re-prompts every user
- Exports hasConsent() helper for feature code that needs to gate
  analytics / functional cookies on user choice
- ARIA: role=dialog + aria-labelledby/aria-describedby so screen
  readers announce it as a dialog
- Same storage key + schema as the control-plane legal-page banner
  (see molecule-controlplane PR #XX) so a user who accepts on one
  surface does not re-see the banner on the other

Tests: 12 Vitest cases covering first-visit render, accept/reject
persistence, version re-prompt, invalid-JSON recovery, privacy link
attrs, ARIA markup, and the hasConsent helper under every state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 13:01:48 -07:00
Dev Lead Agent
5606b1031b fix(canvas): WCAG critical — ARIA live toasts, dialog focus trap, keyboard nav
Addresses the three release-blocking WCAG violations from the UX audit
(3rd consecutive cycle) and the new ChatTab ARIA gap from Audit #2.

Changes:
- Toaster: split into polite (success/info) + assertive (error) live
  regions, both always in DOM so screen readers register them before
  any toast fires. Adds x dismiss button on every toast. Errors no
  longer auto-expire after 4s — persist until explicitly dismissed.
- ConfirmDialog: on open, requestAnimationFrame focuses the first
  button inside the dialog. Tab/Shift-Tab is now trapped inside the
  dialog while open. Added role="dialog" aria-modal="true" and
  aria-labelledby pointing to the title h3.
- WorkspaceNode: outer div gains role="button", tabIndex={0},
  aria-label, aria-pressed, and onKeyDown (Enter/Space => selectNode,
  ContextMenu key => openContextMenu). Keyboard-only users can now
  reach and activate workspace nodes.
- ChatTab sub-tab bar: role="tablist" on wrapper, role="tab" +
  aria-selected + aria-controls on each button, matching
  role="tabpanel" + id on each panel div. Textarea gets
  aria-label="Message to agent".

453/453 Vitest tests pass. Production build clean (Next.js 15).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 08:31:06 +00:00
Hongming Wang
a754870ee8 Merge pull request #123 from Molecule-AI/fix/settings-dark-theme-a11y
fix(canvas): dark theme a11y — settings buttons, input fields, ReactFlow colorMode, zinc-400 contrast, aria-labels
2026-04-15 01:22:16 -07:00
Dev Lead Agent
8042c6dfc6 fix(canvas): dark theme a11y — settings buttons, input fields, ReactFlow colorMode, zinc-400 contrast, aria-labels
Resolves low-contrast text and theming issues in the settings panel and
canvas overlays when running in dark mode:

- settings-panel.css: input fields (#d4d4d8 text), settings-button--active
  (#1e3a8a bg for better contrast against #3b82f6 accent)
- SearchDialog: placeholder-zinc-400, kbd hints, tier badge, footer counts,
  empty-state text — all lifted from zinc-600 → zinc-400
- ConversationTraceModal: timestamp, arrow separators, truncation ellipsis
  — lifted from zinc-600 → zinc-400
- CommunicationOverlay: arrow separator, age label, duration — zinc-600 → zinc-400
- TemplatePalette: dynamic aria-label on toggle button
  ("Open/Close template palette") for screen-reader clarity

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 07:56:53 +00:00
Dev Lead Agent
b7292e9c13 fix(canvas): WORKSPACE_PROVISIONING grid origin offset — prevent viewport clipping
New nodes were placed at (0,0) or close to it, causing them to spawn
behind the toolbar/palette chrome and require manual panning to find.
Add GRID_ORIGIN_X/Y = 100 offset so the first node lands in clear canvas
space, and update the position assertion in the unit test accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 07:53:45 +00:00
Hongming Wang
043b8fe159 feat(canvas): AuthGate — redirect anonymous users to cp login (Phase F close)
Wraps the canvas root so every tenant-subdomain request checks for a
valid session and bounces to app.moleculesai.app/cp/auth/login with a
return_to pointing back at the current URL. Local dev + vercel preview
URLs + apex pass through unchanged.

Files:
- canvas/src/lib/auth.ts: fetchSession() probes /cp/auth/me
  (credentials:include for cross-origin cookie); returns Session on 200,
  null on 401 (anonymous, no throw), throws on 5xx so transient
  outages don't leak the UI.
- canvas/src/lib/auth.ts: redirectToLogin() builds the cp login URL
  with window.location.href as return_to; CP's isSafeReturnTo check
  rejects cross-domain bounces.
- canvas/src/components/AuthGate.tsx: client component wrapping
  children. State machine: loading → authenticated | anonymous. In
  non-SaaS mode (no tenant slug) skips the gate entirely.
- canvas/src/app/layout.tsx: wraps the root body in <AuthGate>.

Tests: +6 auth.ts (200 / 401 null / 5xx throw / credentials:include /
redirectToLogin href + signup variant). Full suite 453 green (was 447).

Pairs with molecule-controlplane PR #16 (return_to cookie handshake
on the cp side).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 20:37:26 -07:00
Hongming Wang
3abcee11b3 feat(canvas): SaaS cross-origin — slug header + cookie credentials (Phase F)
Canvas will be served at <slug>.moleculesai.app (Vercel). API calls go
cross-origin to https://app.moleculesai.app. This commit wires the
client side:

- canvas/src/lib/tenant.ts: getTenantSlug() derives the slug from
  window.location.hostname, case-insensitive, matching the control
  plane's reservedSubdomains list (app/www/api/admin/…). Server-side
  + localhost + vercel preview URLs + apex all return "" so local dev
  keeps working.

- canvas/src/lib/api.ts: adds X-Molecule-Org-Slug header + sets
  credentials:"include" on every fetch. The control plane's CORS
  middleware allows the origin + credentials; the session cookie has
  Domain=.moleculesai.app so the browser ships it.

- canvas/src/lib/api/secrets.ts: same treatment (secrets API uses its
  own fetch helper — shared slug+credentials logic applied).

Tests: +6 (tenant.test.ts covers slug / reserved / case / non-SaaS /
preview URL / apex). Full canvas suite 447/447 green.

Not in this PR:
- WS URL derivation for terminal/socket.ts (separate follow-up; WS
  needs its own slug-aware URL and the canvas terminal isn't used in
  SaaS launch day-one).
- Next.js rewrites (decided against; cross-origin with credentials
  is cleaner than path-level rewrites for session cookies).

Deploys to Vercel once merged — no manual config needed (env already set).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 20:08:39 -07:00
Hongming Wang
707ee08673 Merge pull request #43 from Molecule-AI/fix/reduced-motion
fix(a11y): prefers-reduced-motion WCAG 2.3.3 compliance
2026-04-14 07:20:19 -07:00
Dev Lead Agent
93b38dfd22 feat(canvas): Z shortcut + help entry for double-click zoom-to-team
Adds Z as a keyboard equivalent for the existing double-click zoom-to-team
gesture (WCAG 2.1.1). When a team node is selected, pressing Z dispatches
molecule:zoom-to-team, which fitBounds to the parent and all children.
Input elements are guarded so Z still types normally in text fields.
Adds a 6th help panel entry documenting the Dbl-click / Z gesture.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 11:36:41 +00:00
Hongming Wang
5546aa5ca7 Merge pull request #42 from Molecule-AI/fix/a11y-audit-11
fix: ARIA tablist for side panel, Radix Dialog for create modal, aria-live for loading states (audit 11)
2026-04-14 04:27:35 -07:00
Dev Lead Agent
889d397d32 fix(a11y): prefers-reduced-motion WCAG 2.3.3 compliance
globals.css: append @media (prefers-reduced-motion: reduce) block that zeroes
animation/transition durations, disables .animate-in/.slide-in-from-* entry
animations (Toaster, ApprovalBanner, SidePanel slide), strips dashdraw and
node-appear keyframes from React Flow elements.

Components: replace all bare animate-pulse (13 occurrences across WorkspaceNode,
StatusDot, Toolbar, SidePanel, Legend, SearchDialog, TerminalTab, TemplatePalette)
with motion-safe:animate-pulse so status indicator pulsing stops for users with
vestibular disorders. Replace 3 animate-bounce occurrences in ChatTab typing
indicator with motion-safe:animate-bounce.

Tests: new canvas/src/__tests__/reduced-motion.test.ts (12 tests) verifies the
@media block is present in globals.css and that every component file uses the
motion-safe: variant rather than bare animation classes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 11:25:23 +00:00
Dev Lead Agent
6f72bc5016 fix: Radix Dialog for create modal, ARIA tablist for side panel, aria-live for loading states (audit 11)
- CreateWorkspaceDialog: replace plain div modal with @radix-ui/react-dialog (focus-trap,
  Escape-to-close, aria-labelledby auto-wired); tier selector uses role=radiogroup/radio +
  aria-checked; error uses role=alert; required fields annotate with sr-only "(required)"
- SidePanel: WAI-ARIA tablist pattern — role=tablist + aria-label, role=tab + aria-selected +
  aria-controls + id, roving tabIndex (0/−1), ArrowRight/Left/Home/End keyboard nav with wrap,
  role=tabpanel + id + aria-labelledby on content area, tab icons are aria-hidden
- TemplatePalette: loading and empty-state divs gain role=status + aria-live=polite
- Canvas: sr-only role=status live region announces workspace count to screen readers
- Tests: 7 new a11y tests for CreateWorkspaceDialog (Radix role=dialog, aria-labelledby,
  data-state, Cancel close, role=alert validation, role=radio tier); 12 new tab tests for
  SidePanel (tablist, 12 tabs, aria-selected, roving tabIndex, aria-controls, tabpanel,
  ArrowRight/Left/Home/End)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 10:31:34 +00:00
Dev Lead Agent
c032a7dbfb fix: keyboard navigation for ContextMenu (WCAG 2.1.1) and SearchDialog combobox pattern
- ContextMenu: role=menu/menuitem/separator, aria-label, aria-disabled,
  focus-visible ring, auto-focus first enabled item on open,
  ArrowDown/Up roving focus (wrapping), Escape + Tab dismiss,
  aria-hidden on decorative icons/status dot
- SearchDialog: role=dialog+aria-modal, combobox pattern on input
  (role=combobox, aria-expanded, aria-autocomplete, aria-controls,
  aria-activedescendant), focusedIndex state, ArrowDown/Up/Enter
  keyboard navigation, role=listbox+option, aria-selected, role=status
  + aria-live=polite on empty state, footer hints updated with ↑↓
- Add 10 ContextMenu keyboard tests (role, aria-label, menuitem,
  separator, Escape, Tab, ArrowDown, wrap, ArrowUp wrap, null guard)
- Add 13 SearchDialog keyboard tests (dialog, aria-modal, combobox,
  listbox, option, ArrowDown, double-ArrowDown, clamp, ArrowUp-clamp,
  Enter select, Enter noop, query reset, activedescendant)

Tests: 406 passed (383 existing + 23 new)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 09:28:10 +00:00
Hongming Wang
eb91867340 Merge pull request #37 from Molecule-AI/fix/audit-run9
feat(canvas): WebSocket connection status indicator in Toolbar
2026-04-14 02:21:29 -07:00
Dev Lead Agent
7c52661280 fix(canvas): close 4 gaps in WS status indicator (env, toast, tests)
Gap 1 — WS_URL now derives from NEXT_PUBLIC_PLATFORM_URL when
NEXT_PUBLIC_WS_URL is not set (http→ws, appends /ws; https→wss).
Operators need only one env var. NEXT_PUBLIC_WS_URL remains an explicit
override escape hatch.

Gap 2 — Add canvas/.env.example documenting NEXT_PUBLIC_PLATFORM_URL
(required) and NEXT_PUBLIC_WS_URL (optional override, commented out).

Gap 3 — Toolbar fires showToast("Live updates restored", "success")
when wsStatus transitions connecting→connected. mountedRef (set after
2 s) suppresses the toast on the very first page-load connection so
only genuine reconnects notify the user.

Gap 4 — New canvas/src/store/__tests__/socket.url.test.ts (6 tests):
  · fallback to ws://localhost:8080/ws when no env set
  · http→ws derivation from NEXT_PUBLIC_PLATFORM_URL
  · https→wss derivation
  · NEXT_PUBLIC_WS_URL override takes precedence
  · api.ts PLATFORM_URL fallback
  · api.ts reads NEXT_PUBLIC_PLATFORM_URL

375/375 tests passing, production build clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 08:26:38 +00:00
Hongming Wang
672421cc22 Merge pull request #34 from Molecule-AI/fix/audit-run8
fix: workspace parent combobox + WCAG button text minimum 11px
2026-04-14 01:25:04 -07:00
Dev Lead Agent
53374ca391 feat(canvas): add WebSocket connection status indicator to Toolbar
Adds a live/reconnecting/offline pill to the Toolbar so users can see
at a glance whether the canvas is receiving real-time updates.

Changes:
- canvas/src/store/canvas.ts: add wsStatus ('connected'|'connecting'|
  'disconnected') field + setWsStatus action to CanvasState (initial:
  'connecting')
- canvas/src/store/socket.ts: wire setWsStatus into ReconnectingSocket —
  'connecting' on connect() call, 'connected' in onopen, 'connecting'
  in onclose (will reconnect), 'disconnected' in disconnect()
- canvas/src/components/Toolbar.tsx: subscribe to wsStatus; render
  WsStatusPill (green "Live" / amber pulsing "Reconnecting" / red
  "Offline") after the workspace count section
- canvas/src/store/__tests__/socket.test.ts: add setWsStatus: vi.fn()
  to the canvas store mock (global factory, beforeEach reset, and the
  mid-test override in the onmessage test)

369/369 canvas tests passing, production build clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 08:21:57 +00:00
Dev Lead Agent
218fa4ec33 fix: workspace parent combobox, WCAG button text minimum 11px
Replace raw Parent Workspace ID text input with a <select> populated
from GET /workspaces (T{tier} · {name} format, graceful fallback on
fetch error). Raise all interactive button text from text-[8px]/[9px]
to text-[11px] across SkillsTab, ScheduleTab, secrets-section,
ActivityTab, SidePanel, ChatTab; non-interactive labels/badges to
text-[10px]. Adds 7 CreateWorkspaceDialog unit tests (372/372 passing).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 07:27:49 +00:00
Hongming Wang
c9e1a8e6e2 Merge pull request #32 from Molecule-AI/fix/a11y-landmarks
fix: add main landmark, skip link, and aria-label to canvas (WCAG 2.4.1/2.4.6)
2026-04-14 00:23:24 -07:00
Dev Lead Agent
a7ea544d28 fix: add main landmark, skip link, and aria-label to canvas (WCAG 2.4.1/2.4.6)
- Wrap CanvasInner return in React Fragment to host skip-nav link as sibling of <main>
- Add <a href="#canvas-main"> skip link (sr-only, revealed on focus) before <main>
- Add id="canvas-main" to <main> element
- Add aria-label="Molecule AI workspace canvas" to ReactFlow wrapper
- Add Canvas.a11y.test.tsx: 4 jsdom tests covering all three a11y landmarks

369/369 tests pass; next build clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 06:24:57 +00:00
Dev Lead Agent
4fc92e6ff4 fix(canvas): raise minimum text size in Legend and WorkspaceNode to meet WCAG readability
UX Audit Run 6 critical finding: Legend panel and workspace node cards used 8px and 9px
text (6–7pt), which is physically unreadable and fails WCAG minimum guidelines.

- Legend.tsx: raise all text-[8px]/[9px]/[10px] → text-[11px] across every sub-component
  (StatusItem labels, TierItem badge+label, CommItem icon+label, section headers)
- WorkspaceNode.tsx: raise text-[8px]/[9px] → text-[10px] for all readable labels in
  the main card (status text, skill badges, task/error banners, tier badge, sub count,
  Team Members header) and TeamMemberChip primary name/role text

Compact 7px elements inside TeamMemberChip (tier/sub badges, status micropills) retained
to preserve dense canvas layout — only human-readable labels were upgraded.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 05:21:04 +00:00
Hongming Wang
897fa73952 Merge pull request #25 from Molecule-AI/fix/node-stacking
fix: auto-layout zero-position nodes, fix new-node x===y stacking
2026-04-13 21:31:58 -07:00
Dev Lead Agent
1a56c03192 fix: auto-layout zero-position nodes on hydrate, fix new-node x===y bug
- computeAutoLayout() BFS tree layout seeds from anchored nodes; assigns
  distinct x/y to workspaces returned at 0,0 by the API and persists via PATCH
- buildNodesAndEdges() accepts layoutOverrides map so hydration uses computed
  positions instead of raw 0,0 coordinates
- canvas-events WORKSPACE_PROVISIONING grid layout replaces offset===offset
  assignment that caused position:{x:t,y:t} in the minified bundle
- 8 new vitest tests cover computeAutoLayout and override behaviour (365 pass)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 04:25:25 +00:00
Hongming Wang
96ff719a24 Merge pull request #21 from Molecule-AI/fix/uiux-audit
fix: UX audit — dark theme buttons, input backgrounds, ReactFlow dark mode, contrast & a11y
2026-04-13 20:32:37 -07:00
Dev Lead Agent
6788f3dd0f fix: UX audit — dark theme buttons, input backgrounds, ReactFlow dark mode, contrast & a11y
- Fix 1: 6 CTA buttons (#f4f4f5/#18181b → #2563eb/#ffffff) for dark theme legibility
- Fix 2: Dark backgrounds on add-key-form and key-value-field inputs
- Fix 3: Add colorMode="dark" prop to ReactFlow canvas
- Fix 4: Replace non-standard #0066cc with #3b82f6 in focus ring, clear-search, settings-button--active
- Fix 5: Improve text contrast (zinc-600/zinc-500 → zinc-400) in EmptyState tips/loading
- Fix 6: aria-label="Template Palette" on palette toggle button
- Fix 7: aria-label="Refresh org templates" + font-size 9px→10px on ↻ button

Tests: 357/357 ✓  Build: clean ✓

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 02:26:45 +00:00
Hongming Wang
235b4b192b test(e2e): add Playwright smoke for FilesTab split
Walks the real UI end-to-end:
1. Creates + registers a workspace on the platform
2. Opens the detail side panel
3. Clicks the Files tab (force-click since it's in an overflow-x bar)
4. Asserts all 3 split components render:
   - FilesToolbar: "+ New" + "Upload" buttons
   - FileTree: the config.yaml seeded by the default template
   - FileEditor: "Select a file to edit" empty-state

Saves screenshots at /tmp/filestab-{1,2,3}-*.png for manual review.

Run: cd canvas && npx playwright test e2e/filestab-smoke.spec.ts

Requires platform on :8080 + canvas on :3000.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:14:54 -07:00
Hongming Wang
c71cd39ee7 refactor(canvas): split 650-line FilesTab.tsx into focused components
Pure restructure — no behavior change. Extracts FileTree, FileEditor,
FilesToolbar, useFilesApi hook, and tree utilities into sibling files
under canvas/src/components/tabs/FilesTab/. Top-level FilesTab.tsx is
now 240 lines (glue + confirmations); re-exports buildTree/TreeNode so
the existing import path and tests remain stable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:00:20 -07:00
Hongming Wang
d751420679 test: 100% coverage of extracted helpers + ConfirmDialog singleButton
Follow-up to the quality-fixes-pass2 code review.

## Go: direct unit tests for PR #5 extracted helpers (~47 new tests)

a2a_proxy_test.go:
- resolveAgentURL: cache hit, cache-miss DB hit, not-found, null-URL,
  docker-rewrite guard
- dispatchA2A: build error, canvas timeout, agent timeout, success
- handleA2ADispatchError: context deadline, generic error, build error
- maybeMarkContainerDead: nil-provisioner, runtime=external short-circuits
- logA2AFailure, logA2ASuccess: activity_logs row content + status

delegation_test.go:
- bindDelegateRequest: valid / malformed / bad-UUID
- lookupIdempotentDelegation: no-key / no-match / failed-row-deleted / existing-pending
- insertDelegationRow: insertOK / insertHandledByIdempotent /
  insertTrackingUnavailable
- insertDelegationOutcome: zero-value is insertOutcomeUnknown sentinel

discovery_test.go:
- discoverWorkspacePeer: online / not-found / access-denied + 2 edges
- writeExternalWorkspaceURL: 3 cases
- discoverHostPeer: smoke test documents the unreachable-by-design path

activity_test.go:
- parseSessionSearchParams: defaults + custom limit/offset/q
- buildSessionSearchQuery: no-filters + with-query shapes
- scanSessionSearchRows: empty / single / multiple rows

Package coverage: 56.1% → 57.6%. Every helper extracted in PR #5 is
now at or near 100% line coverage (see PR notes for the 4 remaining
gaps, all blocked on provisioner interface mockability).

## Defensive enum zero-value fix

insertDelegationOutcome now starts with insertOutcomeUnknown=0 as a
sentinel so an un-initialized variable can't silently read as
"success". insertOK, insertHandledByIdempotent, insertTrackingUnavailable
shift to 1/2/3. No caller changes needed.

## Canvas: ConfirmDialog.singleButton test (5 cases)

canvas/src/components/__tests__/ConfirmDialog.test.tsx covers:
- default render (both buttons)
- singleButton hides Cancel
- singleButton: Escape still fires onCancel
- singleButton: backdrop-click still fires onCancel
- singleButton: onConfirm fires on click

vitest total: 352 → 357, all passing.

## Docstring clarity

ConfirmDialog.tsx: expanded singleButton prop comment to explicitly
instruct callers to pass the same handler for onConfirm/onCancel when
using it as an info toast (matches TemplatePalette usage).

## ErrorBoundary clipboard observability

.catch(() => {}) silently swallowed rejections. Now:
.catch((e) => console.warn("clipboard write failed:", e))
so permission-denied / insecure-context failures surface in the console.

## Verification

- go build ./... clean
- go vet ./... clean
- go test -race ./internal/... — all pass
- canvas npm run build — clean
- canvas npm test -- --run — 357/357 pass
- tests/e2e/test_api.sh — 46/62 pass; all 16 failures are pre-existing
  (token-auth enforcement + stale test workspaces + missing Docker
  network). None involve handlers touched in PR #5.
- Manual: platform + canvas running locally, title=Molecule AI,
  /workspaces returns [], /health returns ok. Identified + killed a
  stale Next.js server from the old Starfire-AgentTeam repo that was
  serving the old brand on IPv4 port 3000.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:08:33 -07:00
Hongming Wang
232766d0da chore: address follow-up code review — named enum, singleButton, tests
Post-review fixes on top of the quality-pass-2 branch.

1. delegation.go: replaced insertDelegationRow's (bool, bool) return
   with a typed insertDelegationOutcome enum (insertOK /
   insertHandledByIdempotent / insertTrackingUnavailable). Eliminates
   the positional-boolean decoding the caller had to do. Internal, no
   behavior change.

2. ConfirmDialog.tsx: added singleButton prop. When true, hides the
   Cancel button for single-action info toasts (Esc still dismisses
   via onCancel). TemplatePalette's import notice uses it.

3. ErrorBoundary.tsx: fixed the floating clipboard promise. Added
   .catch(() => {}) so a rejected writeText (permission denied,
   insecure context) doesn't surface as unhandled rejection.

4. a2a_proxy_test.go: added 5 direct unit tests for
   normalizeA2APayload (invalid JSON, wraps-bare, preserves-existing-
   id, preserves-existing-messageId, missing-method). Fills the unit-
   test gap for the helper extracted in the last pass.

Verification:
- go test -race ./internal/handlers/... passes (incl. 5 new tests)
- go build ./... clean
- canvas npm run build clean
- canvas npm test -- --run -> 352/352

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 14:45:05 -07:00
Hongming Wang
789f568bef chore: quality pass — native dialogs, env sync, Go handler splits
Three parallel cleanups driven by the second code-review pass.

## Native dialogs → ConfirmDialog (7 sites)

Violated the standing feedback_no_native_dialogs rule.

- ChannelsTab: confirm() → ConfirmDialog danger variant with pendingDelete state
- ScheduleTab: window.confirm() → ConfirmDialog danger
- ChatTab: confirm("Restart...") → ConfirmDialog warning (restart is recoverable)
- TemplatePalette: two alert() sites collapsed into a single notice state +
  ConfirmDialog as OK-only info toast
- ErrorBoundary: dropped both window.alert calls entirely. Clipboard-copy
  click is self-evident; console.error already captures the fallback.

## .env.example ↔ Go env var sync

Added 11 previously-undocumented env vars grouped into 6 new sections:

- Platform: PLATFORM_URL, MOLECULE_URL, WORKSPACE_DIR, MOLECULE_ENV
- CORS / rate limiting: CORS_ORIGINS, RATE_LIMIT
- Activity retention: ACTIVITY_RETENTION_DAYS, ACTIVITY_CLEANUP_INTERVAL_HOURS
- Container detection: MOLECULE_IN_DOCKER (moved to dedup)
- Observability: AWARENESS_URL
- Webhooks: GITHUB_WEBHOOK_SECRET
- CLI: MOLECLI_URL

All 21 distinct os.Getenv / envx.* keys (excluding HOME) now documented.
Zero orphans in the other direction.

## Go handler function splits (4 funcs, pure refactor)

No behavior change; same tests pass.

| Function                  | Before | After | Helpers                                                       |
|---------------------------|-------:|------:|---------------------------------------------------------------|
| proxyA2ARequest           |    257 |    56 | resolveAgentURL, normalizeA2APayload, dispatchA2A,            |
|                           |        |       | handleA2ADispatchError, maybeMarkContainerDead,               |
|                           |        |       | logA2AFailure, logA2ASuccess                                  |
| Delegate                  |    127 |    60 | bindDelegateRequest, lookupIdempotentDelegation,              |
|                           |        |       | insertDelegationRow                                           |
| Discover                  |    125 |    40 | discoverWorkspacePeer, writeExternalWorkspaceURL,             |
|                           |        |       | discoverHostPeer                                              |
| SessionSearch             |    109 |    24 | parseSessionSearchParams, buildSessionSearchQuery,            |
|                           |        |       | scanSessionSearchRows                                         |

Preserved exact error semantics, log.Printf calls, status codes, and
response shapes. Introduced a proxyDispatchBuildError sentinel in
a2a_proxy so the orchestrator can distinguish "couldn't build the
request" from "Do() failed" without changing existing branches.

## Verification

- go build ./... clean
- go vet ./... clean
- go test -race ./internal/... — all pass
- canvas npm run build — clean
- canvas npm test -- --run — 352/352 pass
- grep window.confirm|window.alert|window.prompt in canvas/src — 0 matches
- every platform os.Getenv key present in .env.example

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 14:36:30 -07:00
Hongming Wang
d1b479e51a chore: replace brand icon and add HANDOFF.md
Swap in the new molecular-graph icon across canvas favicon, in-app logo,
and README branding paths. Add HANDOFF.md as the cross-session context
doc carried over from the Starfire→Molecule AI migration. Fix stale
"Starfire" reference in the pre-commit hook header.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 13:03:40 -07:00
Hongming Wang
24fec62d7f initial commit — Molecule AI platform
Forked clean from public hackathon repo (Starfire-AgentTeam, BSL 1.1)
with full rebrand to Molecule AI under github.com/Molecule-AI/molecule-monorepo.

Brand: Starfire → Molecule AI.
Slug: starfire / agent-molecule → molecule.
Env vars: STARFIRE_* → MOLECULE_*.
Go module: github.com/agent-molecule/platform → github.com/Molecule-AI/molecule-monorepo/platform.
Python packages: starfire_plugin → molecule_plugin, starfire_agent → molecule_agent.
DB: agentmolecule → molecule.

History truncated; see public repo for prior commits and contributor
attribution. Verified green: go test -race ./... (platform), pytest
(workspace-template 1129 + sdk 132), vitest (canvas 352), build (mcp).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:55:37 -07:00