Commit Graph

2671 Commits

Author SHA1 Message Date
molecule-ai[bot]
a9c0cdadfe
docs(devrel): add Tool Trace + Platform Instructions demo (#1844)
PR #1686 introduced two platform-level features:
- Tool Trace: tool_call list in A2A metadata, stored in activity_logs.tool_trace JSONB
- Platform Instructions: admin-configurable instruction text (global/workspace scope),
  injected as first section of every agent's system prompt at startup

Demo covers 5 scenarios: admin creates global instruction, workspace-scoped instruction,
agent fetches resolved instructions at boot, admin lists instructions, and query activity
logs with tool_trace. Includes screencast outline (5 moments, ~90s) and TTS narration script.

Co-authored-by: Molecule AI DevRel Engineer <devrel-engineer@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 19:16:27 +00:00
Hongming Wang
7cd9ad1959
Merge pull request #1802 from Molecule-AI/fix/main-orgtoken-mocks
fix(orgtoken): restore flexible LIMIT regex in TestList_NewestFirst
2026-04-23 12:04:51 -07:00
molecule-ai[bot]
0466dc5f7e
Merge branch 'staging' into fix/main-orgtoken-mocks 2026-04-23 18:59:34 +00:00
Hongming Wang
d6abc1286f
fix(workspace): auto-fill model from template's runtime_config when missing (#1779)
Extends the existing "read runtime from template config.yaml"
preflight to also pre-fill `model` from the template's
runtime_config.model (current format) or top-level `model:` (legacy
format). Without this, any create path that names a template but
doesn't pass an explicit model produced a workspace with empty
model — and hermes-agent's compiled-in Anthropic fallback ran with
whatever key the user did provide, 401'ing at the first A2A call.

Affected paths (all produced broken workspaces before this change):
- TemplatePalette "Deploy" button (POSTs only name + template + tier)
- Direct API / script callers (MCP, CI scripts)
- Anyone copying an existing workspace's template name without model

PR #1714 fixed the canvas CreateWorkspaceDialog's hermes branch —
when the user typed template="hermes" in the dialog, a provider
picker + model auto-fill kicked in. But TemplatePalette and direct
API calls bypassed that dialog entirely, so the trap stayed open.

Fix is backend-side so it catches every caller at once (defense in
depth). The parser is line-based + a minimal state var tracking
whether the current line sits under `runtime_config:` — matches the
existing fragile-but-safe style used for `runtime:` above. Strings
are trimmed of quote wrappers so both `model: x` and `model: "x"`
round-trip.

Explicit model in the payload still wins — we only pre-fill when
payload.Model is empty. Added TestWorkspaceCreate_
CallerModelOverridesTemplateDefault to pin that contract.

## Tests
- TestWorkspaceCreate_TemplateDefaultsMissingRuntimeAndModel — the
  hermes-trap fix: runtime=hermes + model=nousresearch/... inherits
  from template when payload omits both.
- TestWorkspaceCreate_TemplateDefaultsLegacyTopLevelModel — legacy
  top-level `model:` still fills.
- TestWorkspaceCreate_CallerModelOverridesTemplateDefault — explicit
  payload.model NOT overwritten.
- Full suite `go test -race ./...` stays green.

## Complementary work in flight
- PR molecule-core#1772 — fixes the E2E Staging SaaS which had the
  same trap on its own POST body (missing provider prefix).
- Canvas TemplatePalette could still surface a richer per-template
  key picker (deferred; MissingKeysModal already handles keys, and
  the default model now flows from the template config).

Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
2026-04-23 18:58:04 +00:00
Hongming Wang
a5ca587516
Merge pull request #1826 from Molecule-AI/fix/coverage-gate-platform-go-1823
ci(platform-go): add critical-path coverage gate + per-file report (#1823)
2026-04-23 11:46:38 -07:00
molecule-ai[bot]
bbc59fccf8
Merge branch 'staging' into fix/coverage-gate-platform-go-1823 2026-04-23 18:40:23 +00:00
Hongming Wang
f001a4cf5e
fix(registry): heartbeat transitions provisioning→online on first heartbeat (#1784) (#1794)
Workspaces restart with status='provisioning' and never transition to
'online' because the runtime never calls /registry/register after
container start — only the heartbeat loop runs post-boot. The heartbeat
handler had transitions for online→degraded, degraded→online, and
offline→online, but NOT provisioning→online, leaving newly-started
workspaces in a phantom-idle state where the scheduler defers dispatch
and the A2A proxy rejects them even though they're running fine.

Fix: add provisioning→online transition to evaluateStatus(), guarded by
`AND status = 'provisioning'` in the UPDATE WHERE clause so a concurrent
Delete cannot flip 'removed' back to 'online'. Broadcasts WORKSPACE_ONLINE
with recovered_from='provisioning' so dashboard/scheduler reflect reality.

Add TestHeartbeatHandler_ProvisioningToOnline to cover the new path.

Issue: Molecule-AI/molecule-core#1784

Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app>
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
2026-04-23 18:34:10 +00:00
rabbitblood
1a084426da Merge remote-tracking branch 'origin/staging' into fix/coverage-gate-platform-go-1823 2026-04-23 11:26:22 -07:00
Hongming Wang
c23ff848aa
fix(cp-provisioner): look up real EC2 instance_id for Stop + IsRunning (#1738)
Resolves a "Save & Restart cascade" failure on SaaS tenants. Observed
2026-04-22 on hongmingwang workspace a8af9d79 after a Config-tab save:

  03:13:20 workspace deprovision: TerminateInstances
           InvalidInstanceID.Malformed: a8af9d79-... is malformed
  03:13:21 workspace provision: CreateSecurityGroup
           InvalidGroup.Duplicate: workspace-a8af9d79-394 already
           exists for VPC vpc-09f85513b85d7acee

Root cause: CPProvisioner.Stop and IsRunning passed the workspace UUID
as the `instance_id` query param to CP. CP forwarded it to EC2
TerminateInstances, which rejected it (EC2 ids are i-…, not UUIDs).
The failed terminate left the workspace's SG attached → the immediate
re-provision hit InvalidGroup.Duplicate → user saw `provisioning
failed`.

Fix: both methods now call a new `resolveInstanceID` that reads
`workspaces.instance_id` from the tenant DB and passes the real EC2
id downstream. When no row / no instance_id exists, Stop is a no-op
and IsRunning returns (false, nil) so restart cascades can freshly
re-provision.

resolveInstanceID is exposed as a `var` package-level func so tests
can swap it for a pairs-map stub without standing up sqlmock — the
per-table DB scaffolding was a heavier price than the surface
warranted given these tests are about the CP HTTP flow downstream
of the lookup, not the lookup SQL itself.

Adds regression tests:
  - TestStop_EmptyInstanceIDIsNoop: no DB row → no CP call
  - TestIsRunning_UsesDBInstanceID: DB id round-trips to CP
  - TestIsRunning_EmptyInstanceIDReturnsFalse: no instance → false/nil
Updates existing tests to assert the resolved instance_id (i-abc123
variants) instead of the previous buggy workspaceID.

After this lands, user's existing workspaces with stale instance_id
bindings still need a manual cleanup of the orphaned EC2 + SG (done
for a8af9d79 today). Future restarts use the correct id.

Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 18:25:29 +00:00
molecule-ai[bot]
df257c41af
Merge branch 'staging' into fix/main-orgtoken-mocks 2026-04-23 18:24:50 +00:00
rabbitblood
f536768d02 ci: fix regex + add coverage allowlist (14 known 0% critical paths)
First run of the gate found 14 security-critical files at 0% coverage —
exactly the debt the user's audit flagged. Rather than block this PR on
fixing all 14 (scope creep), acknowledge them in .coverage-allowlist.txt
with 30-day expiry + #1823 reference.

Regex bug: `go tool cover -func` emits `file.go:LINE:TAB...` (single colon
after line, no column on some Go versions). My original `:[0-9]+\..*`
required a period after the line number, which never matched, so file
names kept their `:LINE:` suffix. Fixed to `:[0-9][0-9.]*:.*` which
accepts both `:LINE:` and `:LINE.COL:` formats.

Allowlist pattern: paths in `.coverage-allowlist.txt` warn (not fail),
new critical-path files at <10% coverage fail. This makes the gate land
cleanly AND keeps the teeth for regressions.

Allowlisted files (all tracked under #1823, expire 2026-05-23):

  Tight-match critical paths:
    - internal/handlers/a2a_proxy.go
    - internal/handlers/a2a_proxy_helpers.go
    - internal/handlers/registry.go
    - internal/handlers/secrets.go
    - internal/handlers/tokens.go
    - internal/handlers/workspace_provision.go
    - internal/middleware/wsauth_middleware.go

  Looser substring matches (flagged because my CRITICAL_PATHS entries use
  contains-match; follow-up PR to use exact prefix match):
    - internal/channels/registry.go
    - internal/crypto/aes.go
    - internal/registry/*.go (access, healthsweep, hibernation, provisiontimeout)
    - internal/wsauth/tokens.go

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 11:20:36 -07:00
Hongming Wang
925a71887d
fix(workspace): credential helper security hardening (#1797)
Four findings from security audit (internal/security/credential-token-backlog.md):

1. STDERR LEAK — molecule-git-token-helper.sh:146,153 logged ${response}
   on platform errors. The response body MAY contain the token in some
   failure modes (alternate JSON key shape on partial success). Now:
   - capture curl's stderr to a tmp file (not $response) so we can log
     the curl error message without ever interpolating the response body
   - on empty-token branch, log only response size (bytes) for debug
2. CHMOD 600 — already in place at lines 116, 124, 223 (verified, no change)
3. RESPAWN SUPERVISION — entrypoint.sh wrapped daemon launch in a
   while-true bash loop with 30s back-off. Without this, a daemon crash
   silently leaves the workspace stuck on an expired token until the
   container restarts. Logs to /home/agent/.gh-token-refresh.log
   (agent-writable; /var/log is root-owned).
4. JITTER — molecule-gh-token-refresh.sh: added 0..120s random offset to
   each sleep so 39 containers don't synchronize their refresh requests
   against the platform endpoint.

Also:
- Daemon now sends helper output to /dev/null instead of merging stderr,
  belt-and-suspenders against any future helper change that might write
  the token to stdout.
- Daemon log lines include rc=$? on failure for actionable triage.

Inherent risks (org-wide token blast, prompt-injection theft, bearer
in volume, no audit log) tracked in internal/security/credential-token-backlog.md
as separate roadmap items.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
2026-04-23 18:14:55 +00:00
molecule-ai[bot]
5f0bfc1f19
Merge branch 'staging' into fix/main-orgtoken-mocks 2026-04-23 18:12:47 +00:00
rabbitblood
c4bb325267 ci(platform-go): add critical-path coverage gate + per-file report (#1823)
## Problem

External audit flagged critical security-path files at 0% coverage:
  - workspace-server/handlers/tokens.go            0%  (target 90%+)
  - workspace-server/handlers/workspace_provision  0%  (target 75%+)
  - workspace-server/middleware/wsauth            ~48% (target 90%+)

Tests *exist* for these files (tokens_test.go is 200 lines, workspace_
provision_test.go is 1138 lines) — they just don't exercise the critical
branches where auth/provisioning decisions happen. CI's existing coverage
step measured total coverage (floor 25%) but never checked per-file,
so any single file could drop to 0% and CI stayed green.

## Fix — Layer 1 of #1823 (strictly additive)

1. **Per-file coverage report** — advisory step prints every source file
   with its coverage, sorted worst-first. Reviewers see the gap at a
   glance. Does not fail the build.

2. **Critical-path per-file gate** — if any non-test source file in a
   security-sensitive directory (tokens, workspace_provision, a2a_proxy,
   registry, secrets, wsauth, crypto) has coverage ≤10%, CI fails with
   a specific error message pointing at the file + #1823.

3. **Unchanged: total floor stays at 25%** — ratcheting is a separate PR
   so this one has zero risk of breaking existing coverage. Ratchet plan
   lives in COVERAGE_FLOOR.md (monthly schedule through Oct 2026 to reach
   70% total / 70% critical).

## Why this specifically

"Tell devs to write tests" doesn't fix this — the prompts already
require tests ("Write tests for every handler, every query, every edge
case"), and the engineers mostly do. The gap is mechanical: CI generates
coverage.out and throws it away without checking per-file distribution.

This gate makes "no untested security path merges" a property of the CI,
not a property of QA agents who (as of today's incident) can go phantom-
busy for hours.

## Smoke test

Local awk-logic verification with synthetic coverage.out:
  - tokens.go at 2.5% (critical path, ≤10%)           → correctly FAILS
  - noncritical.go at 0.0% (not in critical list)     → correctly PASSES
  - wsauth_middleware.go at 65% (critical, above 10%) → correctly PASSES
  - crypto/kek.go at 85% (critical, above 10%)        → correctly PASSES

Regex bug caught and fixed: go tool cover -func emits
  file.go:LINE.COL:FUNC  PERCENT
The stripper needed :[0-9]+\..* not :[0-9]+:.*

## Follow-up (not in this PR)

- Layer 2 (issue #1823): per-changed-file delta gate via diff-cover,
  enforcing the prompt rule ">80% on changed files"
- Add these two new steps to branch protection required checks
- Canvas (Next.js) equivalent with vitest --coverage + threshold

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 11:12:40 -07:00
Hongming Wang
cfdaefe5bc
docs(blog): Phase 34 — Partner API Keys, Governance, Tool Trace (clean extract) (#1799)
* docs(blog): add Phase 34 blog posts — Partner API Keys, Governance, Tool Trace

- Partner API Keys: partner-gated MCP server access for enterprise
- Platform Instructions Governance: org-scoped AI instruction governance
- Tool Trace Observability: debug/audit AI agent decision trees

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(blog): remove og_image refs from Phase 34 posts — images TBD

OG images are a known gap across many posts in the repo. Removed og_image
lines from all 4 Phase 34 posts to avoid 404s. Social Media Brand to
generate final assets. Also fixed broken link in governance post:
/docs/blog/ai-agent-observability-without-overhead → /blog/...

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Molecule AI Content Marketer <content-marketer@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
2026-04-23 18:02:44 +00:00
Hongming Wang
7d15a02a3d
docs(tutorials): Chrome DevTools MCP quickstart + live agent transcript demo (clean extract) (#1798)
* docs(tutorial): add Chrome DevTools MCP quickstart — 3 runnable demos

- Demo 1: screenshot-based visual regression
- Demo 2: authenticated session scraping with workspace secrets
- Demo 3: automated Lighthouse audit on every PR
- Governance config: plugin allowlisting, token-scoped sessions
- SSRF protection notes and troubleshooting table
- Links to MCP setup guide, org API keys, Chrome DevTools blog post

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(tutorials): add live agent transcript endpoint demo (devrel #521)

---------

Co-authored-by: Molecule AI DevRel Engineer <devrel-engineer@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
2026-04-23 17:57:11 +00:00
molecule-ai[bot]
833fbeaa5c
fix(canvas/a11y): aria-hidden SVGs, MissingKeysModal semantics, session cookie auth (#1744)
1. f675500: aria-hidden="true" on decorative SVG icons in
   DeleteCascadeConfirmDialog warning icon and Toolbar stop/restart
   /search/help icons. All have adjacent aria-label text or parent
   button aria-label — correct.

2. eb87737: session cookie auth fallback for /registry/:id/peers
   SaaS canvas path. verifiedCPSession() checked after bearer token
   in validateDiscoveryCaller, allowing canvas to hit the Peers tab
   via session cookie rather than bearer token. Self-hosted bypass
   logic preserved.

3. 80fedd6: MissingKeysModal dialog semantics — role="dialog",
   aria-modal="true", aria-labelledby="missing-keys-title",
   requestAnimationFrame focus management. Also removes stale
   aria-describedby={undefined} from CreateWorkspaceDialog.

Co-authored-by: Molecule AI App & Docs Lead <app-docs-lead@agents.moleculesai.app>
Co-authored-by: molecule-ai[bot] <molecule-ai[bot]@users.noreply.github.com>
2026-04-23 17:39:38 +00:00
cd1d678cd3 fix(orgtoken): restore flexible regex in TestList_NewestFirst
The PR #1683 fix to TestList used a literal column-name regex that
doesn't match the actual List() query. sqlmock uses regex matching:
- Actual query uses COALESCE(name,'') wrappers
- Literal 'name' doesn't match 'COALESCE(name,'')'
- Also missing WHERE clause and LIMIT

Revert to the flexible pattern used on main (SELECT id, prefix.*)
with explicit LIMIT allowance — proven working on main branch.

TestValidate_HappyPath 3-column fix is kept.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 17:34:30 +00:00
c2dd4db36d fix(orgtoken): sync test mocks with actual query column count
Real Validate() query: SELECT id, prefix, org_id FROM org_api_tokens
Real List() query: SELECT id, prefix, name, org_id, created_by, created_at, last_used_at FROM org_api_tokens

Fixes:
- TestValidate_HappyPath: add org_id to mock row (was 2 cols, query returns 3)
- TestList_NewestFirst: fix column list AND AddRow calls to match List() query
  (7 columns: id, prefix, name, org_id, created_by, created_at, last_used_at)

This resolves the Platform (Go) CI failure blocking all molecule-core PRs.

Ref: pre-existing failure, unrelated to F1085 security fix.
2026-04-23 17:34:30 +00:00
Hongming Wang
6904a8c448
Merge pull request #1791 from Molecule-AI/fix/memory-poisoning-GH1610
fix(security): cross-tenant memory poisoning — GLOBAL scope isolation (GH#1610)
2026-04-23 10:26:02 -07:00
Molecule AI Marketing Lead
e00797ba35 fix(security): prevent cross-tenant memory contamination in commit_memory/recall_memory (GH#1610)
Two critical gaps in a2a_tools.py let any tenant workspace poison org-wide
(GLOBAL) memory and bypass all RBAC enforcement:

1. tool_commit_memory had no RBAC check — any agent could write any scope.
2. tool_commit_memory had no root-workspace enforcement for GLOBAL scope —
   Tenant A could POST scope=GLOBAL and pollute the shared memory store
   that Tenant B's agent reads as trusted context.

Fix adds:
- _ROLE_PERMISSIONS table (mirrors builtin_tools/audit.py) so a2a_tools
  has isolated RBAC logic without depending on memory.py.
- _check_memory_write_permission() / _check_memory_read_permission() helpers:
  evaluate RBAC roles from WorkspaceConfig; fail closed (deny) on errors.
- _is_root_workspace() / _get_workspace_tier(): read WorkspaceConfig.tier
  (0 = root/org, 1+ = tenant) from config.yaml; fall back to
  WORKSPACE_TIER env var.
- tool_commit_memory now (a) checks memory.write RBAC, (b) rejects
  GLOBAL scope for non-root workspaces, (c) embeds workspace_id in the
  POST body so the platform can namespace-isolate and audit cross-workspace
  writes.
- tool_recall_memory now checks memory.read RBAC before any HTTP call,
  and always sends workspace_id as a GET param for platform cross-validation.

Security regression tests added:
- GLOBAL scope denied for non-root (tier>0) workspaces.
- RBAC denial blocks all scope levels (including LOCAL) on write.
- RBAC denial blocks recall entirely.
- workspace_id present in POST body and GET params.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 10:21:34 -07:00
Hongming Wang
6539908f77
Merge pull request #1783 from Molecule-AI/promote/main-to-staging-2026-04-23
chore: promote main → staging (52 commits, 2 conflicts resolved)
2026-04-23 09:55:59 -07:00
Hongming Wang
dc476153c1 Merge remote-tracking branch 'origin/staging' into promote/main-to-staging-2026-04-23
# Conflicts:
#	canvas/src/components/__tests__/ContextMenu.keyboard.test.tsx
2026-04-23 09:50:16 -07:00
molecule-ai[bot]
842a7daf4c
Merge pull request #1777 from Molecule-AI/fix/canvas-mock-staging
fix(canvas): add getState to useCanvasStore mock in ContextMenu test
2026-04-23 16:43:52 +00:00
8f7808642a fix(test): add getState to useCanvasStore mock in ContextMenu keyboard test
PR #1781 introduced useCanvasStore.getState() call in ContextMenu.tsx
(line 169) but the existing Vitest mock for useCanvasStore in the keyboard
test file lacked a getState method, causing:
  TypeError: useCanvasStore.getState is not a function

Fix: attach getState: () => mockStore to the mock using Object.assign
so the static method is available alongside the selector fn.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 16:43:08 +00:00
Hongming Wang
df2cf935d3 fix(handlers): validate path/auth BEFORE docker availability checks
Three traversal / cross-workspace rejection tests on staging were
masked by premature "docker not available" early returns:

1. deleteViaEphemeral — nil-docker check fired BEFORE path validation;
   malicious paths got "docker not available" (wrong code path) instead
   of "path not allowed". Reversed the order + added "path not allowed:"
   prefix to rejection messages.

2. copyFilesToContainer — split the traversal classifier into:
   - absolute path → "unsafe file path in archive"
   - literal "../" prefix → "unsafe file path in archive" (classic)
   - URL-encoded / mid-path traversal → "path escapes destination"
   Added nil-docker guard AFTER validation so legitimate inputs error
   cleanly instead of panicking on nil docker.

3. HandleConnect KI-005 — test used outdated table name
   "workspace_tokens"; ValidateAnyToken uses "workspace_auth_tokens"
   since #1210. Updated the mock. Added best-effort last_used_at
   UPDATE expectation that fires after successful token validation.

Brings the handlers package from 3 failing tests to 0. All 20 Go
packages green on go test -race ./... locally.
2026-04-23 09:31:54 -07:00
Hongming Wang
47dc72c6b3 chore: promote main → staging (52 commits, 2 conflicts resolved)
Brings the staging branch up to date with main's feature-fix stream so
every staging-targeted PR stops tripping on pre-existing rot. Before
this merge, staging had 30+ compile + test failures from fix PRs that
landed on main but never reached staging — primarily #1755's panic-
cascade + schema-drift alignments.

After this merge the handlers package goes from 30+ fails → 2 pre-
existing nil-docker test panics (TestCopyFilesToContainer_CWE22_
RejectsTraversal + TestDeleteViaEphemeral_F1085_RejectsTraversal),
both authored on staging and broken before this promotion. Tracked
separately; not a merge regression.

## Conflicts resolved

1. docs/marketing/campaigns/discord-adapter-announcement/announcement.md
   — deleted on main (9d0d213: "move sensitive strategy + research to
   internal repo"), modified on staging. Deletion wins: marketing
   content moved out of the public monorepo per that commit's intent.
   The content lives in the internal repo.

2. workspace-server/internal/handlers/container_files.go — staging's
   rmTarget version kept. Main's version had `Cmd: []string{"rm",
   "-rf", "/configs/" + filePath}` which concatenates raw filePath
   AFTER the prefix-check on rmTarget, defeating the path-traversal
   guard (a "../etc/passwd" input passes validation but the rm cmd
   then traverses). Staging's `Cmd: []string{"rm", "-rf", rmTarget}`
   uses the validated path. Keeping staging's more-secure variant.

## Includes build unblockers from #1769 / #1782
- terminal.go: malformed handleLocalConnect repaired
- terminal_test.go: missing braces in TestHandleConnect_RoutesToLocal
- workspace_crud.go: unused imports + duplicate strField block
- container_files_test.go: duplicate contains() removed (uses the one
  in workspace_provision_test.go, same package)

## Verification
- go build ./...  clean
- go vet ./...  clean
- go test -race ./... — 18/20 packages green; 2 test panics in
  internal/handlers are pre-existing on staging (documented above)
2026-04-23 08:51:01 -07:00
Hongming Wang
68ee76c6b7 fix(canvas): add getState to useCanvasStore mock in ContextMenu keyboard test
ContextMenu.tsx reads parent-workspace children via
useCanvasStore.getState().nodes.filter(...) — a direct .getState()
call, not the selector-calling form. The existing vi.mock exposed
only the selector form, so rendering crashed with
"TypeError: useCanvasStore.getState is not a function".

Restructure the vi.mock factory to return Object.assign(fn, {
getState: () => mockStore }) so both call shapes resolve. Factory body
builds the function locally because vi.mock hoists above outer-scope
variable declarations and can't reference `mockStore` via closure.

Verified: all 15 tests in the file pass after the change.

Unblocks the Canvas (Next.js) CI check on PR #1743 (staging→main sync).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 01:49:34 -07:00
Hongming Wang
fa5e62b484
Merge pull request #1778 from Molecule-AI/fix/e2e-hermes-slug-staging
fix(e2e/staging-saas): send provider-prefixed model slug for hermes
2026-04-23 01:48:17 -07:00
Hongming Wang
786a8470e5 fix(e2e/staging-saas): send provider-prefixed model slug for hermes
The E2E posts a bare "gpt-4o" as the workspace model. Hermes
template's derive-provider.sh parses the slug PREFIX (before the
slash) to set HERMES_INFERENCE_PROVIDER at install time. With no
prefix, provider falls back to hermes's auto-detect, which picks
the compiled-in Anthropic default. Hermes-agent then tries the
Anthropic API with the OpenAI key the E2E passed in SECRETS_JSON
and returns 401 "Invalid API key" at step 8/11 (A2A call).

Same trap PR #1714 fixed for the canvas Create flow. The E2E
was quietly broken on the same vector — it masked before today
because workspaces never reached "online" (pre-#231 install.sh
hook missing on staging; staging now deploys #231 via CP #236).

Fix: pin MODEL_SLUG="openai/gpt-4o" since the E2E's secret is
always the OpenAI key. Non-hermes runtimes ignore the prefix.

Now that both layers are fixed (install.sh runs AND the slug
steers hermes to OpenAI), the E2E should reach step 11/11.

Evidence from run 24822173171 attempt 2 (post-CP-#236 deploy):
  07:55:25  CP reachable
  07:57:28  Tenant provisioning complete (2:03, canary)
  08:04:56  Workspace 52107c1a online (7:28, install.sh ran!)
  08:05:06  Workspace 34a286df online
  08:05:06  A2A 401 — hermes tried Anthropic with OpenAI key

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 01:43:55 -07:00
Hongming Wang
b4cd78729d
fix(platform-go-ci): align test mocks with schema drift + org_id context contract (#1755)
* fix(platform-go-ci): align test mocks with schema drift + org_id context contract

Reduces Platform (Go) CI failures from 12 to 2 (both remaining are pre-existing
on origin/main and unrelated to this PR's scope).

Schema drift fixes (sqlmock column counts misaligned with current prod Scans):
- `orgtoken/tokens_test.go`: Validate query gained `org_id` column post-migration
  036 — updated 3 TestValidate_* tests from 2-col to 3-col ExpectQuery.
- `handlers/handlers_test.go` + `_additional_test.go`: `scanWorkspaceRow` now
  has 21 cols (`max_concurrent_tasks` inserted between `active_tasks` and
  `last_error_rate`). Updated TestWorkspaceList, TestWorkspaceList_WithData,
  and TestWorkspaceGet_CurrentTask mocks.
- `handlers/handlers_test.go`: activity scan now has 14 cols (`tool_trace`
  between `response_body` and `duration_ms`). Updated 5 TestActivityHandler_*
  tests (List, ListByType, ListEmpty, ListCustomLimit, ListMaxLimit).

Middleware org_id contract (7 failing tests → passing, zero prod callers):
- `middleware/wsauth_middleware.go`: WorkspaceAuth and AdminAuth now set the
  `org_id` context key only when the token has a non-NULL org_id. This lets
  downstream handlers use `c.Get("org_id")` existence to distinguish anchored
  tokens from pre-migration/ADMIN_TOKEN bootstrap tokens. Grep confirmed no
  current prod callers read this key — tests were the sole spec.
- `middleware/wsauth_middleware_test.go` + `_org_id_test.go`: consolidated
  separate primary+secondary ExpectQuery blocks into a single 3-col mock
  per test, and dropped the now-unused `orgTokenOrgIDQuery` constant.

Other:
- `handlers/github_token_test.go`: TestGitHubToken_NoTokenProvider now asserts
  500 + "token refresh failed" (env-based fallback path added in #960/#1101).
  Added missing `strings` import.
- `handlers/handlers_additional_test.go`: TestRegister_ProvisionerURLPreserved
  URL changed from `http://agent:8000` to `http://localhost:8000` — `agent` is
  not DNS-resolvable in CI and is rejected by validateAgentURL's SSRF check;
  `localhost` is name-exempt. The contract under test is provisioner-URL
  precedence, not URL validation.

Methodology (per quality mandate):
- Baselined 12 failing tests on clean origin/main before any edit.
- For each fix: grep'd prod for semantic contract, made minimal edits,
  verified full-suite delta = zero regressions.
- Discovered +5 pre-existing failures previously masked by TestWorkspaceList
  panic (which killed the test binary on origin/main before downstream tests
  ran). 3 of these are in this PR's bug class and were fixed; 2 are unrelated
  (a panicking test with a missing Request and a missing template file) —
  deferred to a follow-up issue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: trigger CI after base retarget to main

* fix(platform-go-ci): stop TestRequireCallerOwnsOrg_NotOrgTokenCaller panic + skip yaml-includes test

Reduces Platform (Go) CI failures from 2 to 1 on this branch.

- `TestRequireCallerOwnsOrg_NotOrgTokenCaller`: the test's comment says
  "set to a non-string type" but the code stored the string "something",
  which passed the `tokenID.(string)` assertion in requireCallerOwnsOrg
  and triggered a DB lookup on a bare gin test context (no Request) →
  nil-deref in c.Request.Context(). Fixed by storing an int (12345), which
  matches the stated intent of exercising the non-string-assertion branch.

- `TestResolveYAMLIncludes_RealMoleculeDev`: the in-tree copy at
  /org-templates/molecule-dev/ is being extracted to the standalone
  Molecule-AI/molecule-ai-org-template-molecule-dev repo. Until that
  extraction lands the in-tree copy is stale (teams/dev.yaml !include's
  core-platform.yaml etc. that don't exist). Skipped with a pointer to
  the extraction so this doesn't rot.

Remaining failure: `TestRequireCallerOwnsOrg_TokenHasMatchingOrgID` panics
with the same root cause (bare gin context + string org_token_id → DB
lookup → nil-deref). Fixing it by adding a Request would unmask ~25 other
pre-existing hidden failures (schema drift, DNS-dependent tests, mock
drift) that were being masked by the earlier panic killing the test
binary. Those belong to a dedicated cleanup PR; the panic-chain triage
is tracked separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(platform-go-ci): eliminate remaining 25 cascade failures + harden auth

Takes Platform (Go) CI from 1 remaining failure (post–first pass) to 0.
Fixing `TestRequireCallerOwnsOrg_NotOrgTokenCaller`'s panic unmasked ~25
pre-existing handler-package failures that were silently hidden because
the panic killed the test binary mid-run. All are now fixed.

## Prod change
`org_plugin_allowlist.go#requireOrgOwnership` now denies unanchored
org-tokens (org_id NULL in DB) instead of treating them as session/admin.
The stated contract in `requireCallerOwnsOrg`'s comment already said
"those callers get callerOrg="" and are denied"; the downstream check
was the gap. Distinguishes the two `callerOrg == ""` paths by reading
`c.Get("org_token_id")` — key present → unanchored token → deny;
absent → session/ADMIN_TOKEN → allow.

## Tests fixed by class

**Request-less test-context panic** (7 tests, `org_plugin_allowlist_test.go`):
added `httptest.NewRequest(...)` to each bare `gin.CreateTestContext` so
the DB path in `requireCallerOwnsOrg` can read `c.Request.Context()`
without nil-deref.

**Workspace scan drift — `max_concurrent_tasks` 21st column** (8 tests):
- `TestWorkspaceGet_Success`, `_FinancialFieldsStripped`, `_SensitiveFieldsStripped`
- `TestWorkspaceBudget_Get_NilLimit`, `_WithLimit` (+ shared `wsColumns`)
- `TestWorkspaceBudget_A2A_UnderLimitPassesThrough`, `_NilLimitPassesThrough`,
  `_DBErrorFailOpen` — each also needed `allowLoopbackForTest(t)` because
  the SSRF guard now blocks `httptest.NewServer`'s 127.0.0.1 URL.

**Org-token INSERT param drift — added `org_id` 5th param** (5 tests,
`org_tokens_test.go`): `TestOrgTokenHandler_Create_*` (4) get a 5th
`nil` `WithArgs` arg; `TestOrgTokenHandler_List_HappyPath` gets `org_id`
as the 4th column in its mock row.

**ReplaceFiles/WriteFile restart-cascade SELECT shape change** (3 tests,
`template_import_test.go` + `templates_test.go`): handler now selects
`name, instance_id, runtime` for the post-write restart cascade — tests
now pin the full 3-column shape instead of just `SELECT name`.

**GitHub webhook forwarding** (2 tests, `webhooks_test.go`): added
`allowLoopbackForTest(t)` — same SSRF-guard / loopback-server mismatch
as the budget A2A tests.

**DNS-dependent sentinel hostname** (2 tests): `TestIsSafeURL/public_*`
+ `TestValidateAgentURL/valid_public_*` used `agent.example.com` which
is NXDOMAIN on most resolvers; switched to `example.com` itself (RFC-2606,
resolves globally via Cloudflare Anycast).

**Register C18 hijack assertion** (`registry_test.go`): attacker URL
was `attacker.example.com` (NXDOMAIN) → `validateAgentURL` rejected
with 400 before the C18 auth gate could fire 401. Switched to
`example.com` so the test actually exercises the C18 gate.

**Plugin install error vocabulary** (`plugins_test.go`): handler now
returns generic "invalid plugin source" instead of leaking the internal
`ParseSource` "empty spec" string to the HTTP surface. Test assertion
updated; "empty spec" still covered at the unit level in `plugins/source_test.go`.

**seedInitialMemories tests tripping redactSecrets** (3 tests,
`workspace_provision_test.go`): content was `strings.Repeat("X", N)`
which matches the BASE64_BLOB redactor (33+ chars of `[A-Za-z0-9+/]`)
and got replaced with `[REDACTED:BASE64_BLOB]` before INSERT, making
the `WithArgs` assertion mismatch. Switched to a space-containing
`"hello world "` pattern that breaks the run. Also fixed an unrelated
pre-existing bug in `TestSeedInitialMemories_Truncation` where
`copy([]byte(largeContent), "X")` was a no-op (strings are immutable
in Go — the copy modified a throwaway slice).

Net: Platform (Go) handlers package is now fully green on `go test -race`.
Unblocks PRs #1738, #1743, and any future handlers-package work that was
inheriting the 12→25 baseline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 07:14:33 +00:00
molecule-ai[bot]
e739b49938
Merge pull request #1760 from Molecule-AI/fix/docs-external-quickstart-clean
docs(guides): add external-workspace quickstart for DevRel
2026-04-23 06:15:57 +00:00
Hongming Wang
e88ce3b88b docs(guides): add 5-minute external-workspace quickstart for DevRel
Existing external-agent-registration.md is 784 lines — great reference
but hostile to first-time devs evaluating Molecule. Add a tight
5-minute quickstart aimed at "make it work today":

- 40-line Python agent with A2A JSON-RPC skeleton
- Cloudflare quick-tunnel for instant public URL (no account)
- Single curl registration
- Common gotchas table (includes the canvas dedup + tunnel rotation
  issues caught in the demo this afternoon)
- Production upgrade path
- Preview of polling mode (Phase N+1 transport)
- 4-step diagnostic checklist at the bottom

The reference doc (external-agent-registration.md) now has a prominent
"in a hurry?" callout pointing at the quickstart, so the discovery
path works either way.

Target audience: a developer who wants to see their code on canvas
inside 5 minutes, not a self-hoster hardening for prod.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 06:13:16 +00:00
Hongming Wang
64e4c7b661
Merge pull request #1725 from Molecule-AI/fix/platform-go-ci-tests
fix(handlers): unblock Platform (Go) CI — sqlmock budget-check + test loopback
2026-04-22 20:03:06 -07:00
Hongming Wang
d5ec0a9d25
Merge pull request #1734 from Molecule-AI/fix/registry-heartbeat-autorecover
fix(registry): auto-recover failed/provisioning workspaces on successful heartbeat
2026-04-22 20:03:03 -07:00
Hongming Wang
3c785bc7f5
Merge pull request #1731 from Molecule-AI/fix/scheduler-sweep-phantom-busy
feat(scheduler): sweepPhantomBusy — clear stuck active_tasks from crashed runs
2026-04-22 20:03:00 -07:00
Hongming Wang
c5d81aa745
Merge pull request #1730 from Molecule-AI/fix/workspace-gh-token-refresh-daemon
feat(workspace): 45-min gh-token refresh daemon + credential helper cache
2026-04-22 20:02:57 -07:00
Hongming Wang
0d820bd869
Merge pull request #1735 from Molecule-AI/chore/extract-1664-small-fixes
chore: extract 3 small fixes from closed #1664
2026-04-22 20:02:54 -07:00
Hongming Wang
7c81b081d2 fix(registry): auto-recover failed/provisioning workspaces on successful heartbeat (extracted from #1664)
When a workspace is marked "failed" or "provisioning" but is actively
sending heartbeats, transition it to "online". Transient boot failures
or mid-setup provisioner crashes otherwise leave workspaces stuck in a
stale terminal state even after they become healthy.

Preserves existing online/degraded/offline transitions; only adds a new
conditional branch for the failed/provisioning case with a guarded
WHERE clause so a concurrent delete cannot flip 'removed' back to
'online'.
2026-04-22 20:00:26 -07:00
Hongming Wang
d4cead5002 chore: extract ContextMenu Zustand fix + a2a_proxy local-docker SSRF bypass + workspace-server Dockerfile GID entrypoint
Three small, non-overlapping fixes extracted from closed PR #1664:

1. canvas/src/components/ContextMenu.tsx — Replace the useMemo-over-nodes
   pattern with a hashed-boolean selector (s.nodes.some(...)) so Zustand's
   useSyncExternalStore snapshot comparison is stable. Resolves React
   error #185 (infinite render loop). Moves the child-node list derivation
   into the delete handler via getState() so the render path no longer
   allocates a fresh array.

2. workspace-server/internal/handlers/a2a_proxy.go — Allow the
   Docker-bridge hostname path (ws-<id>:8000) to skip the SSRF guard in
   local-docker mode. Gated on !saasMode() so SaaS deployments keep the
   full private-IP blocklist (a remote workspace registration can't claim
   a ws-* hostname and reach a sensitive VPC IP).

3. workspace-server/Dockerfile — Add entrypoint.sh that discovers the
   docker.sock GID at boot and adds the platform user to that group, then
   exec's su-exec to drop privileges. Lets the platform container reach
   the host docker socket without running as root.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 20:00:16 -07:00
Hongming Wang
2849a9a939 feat(scheduler): sweepPhantomBusy — clear stuck active_tasks from crashed runs (extracted from #1664)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 19:57:49 -07:00
molecule-ai[bot]
9d076b9c4d
Merge pull request #1684 from Molecule-AI/fix/missing-keys-modal-a11y-v2
fix(canvas/a11y): MissingKeysModal — backdrop aria-hidden, decorative SVGs, form labels
2026-04-23 02:54:46 +00:00
Hongming Wang
2885583d05 feat(workspace): 45-min gh-token refresh daemon + credential helper cache
Extracted from the now-closed PR #1664 (Molecule-AI/molecule-core).

- New scripts/molecule-gh-token-refresh.sh background daemon — every
  45 min (TOKEN_REFRESH_INTERVAL_SEC) calls the credential helper's
  _refresh_gh action to keep both gh CLI auth and the on-disk cache
  fresh through the GitHub App installation token's ~60 min TTL.
- scripts/molecule-git-token-helper.sh rewritten with a ~50 min
  on-disk cache (${CACHE_DIR}/gh_installation_token + _expiry
  companion file), a cache > API > env-var fallback chain, a new
  _refresh_gh action (invoked by the daemon above), a _invalidate_cache
  action, and path references flipped from /workspace/scripts/... to
  /app/scripts/... to match the runtime image layout.
- Dockerfile copies the new refresh daemon and extends mkdir to
  create /home/agent/.molecule-token-cache at build time.
- entrypoint.sh configures the git credential helper for github.com
  while still root (so the global gitconfig is written before the
  gosu handoff), creates + chowns the token cache dir, then as agent
  starts the refresh daemon in the background and does an initial
  gh auth login from GITHUB_TOKEN/GH_TOKEN so gh works before the
  first refresh fires.

Dropped from PR #1664: cosmetic em-dash -> ASCII hyphen rewrites
(charset-normalizer noise) that would conflict with the repo's
existing em-dash convention used elsewhere in workspace/.
2026-04-22 19:52:46 -07:00
molecule-ai[bot]
32555a884a
Merge pull request #1686 from Molecule-AI/feat/tool-trace-v2
feat: tool trace + platform instructions (review-passed)
2026-04-23 02:43:27 +00:00
Hongming Wang
2df644f528 fix(handlers): unblock Platform (Go) CI — sqlmock budget-check + test loopback
Fixes 14 of the 18 failing tests that have been reddening Platform (Go)
CI on main since the 2026-04-18 open-source restructure + 2026-04-21
SSRF-backport. Reduces handlers package failure count 18 → 4
(remaining 4 are unrelated schema/behavior drift — see follow-ups).

Three root causes fixed:

  1. httptest.NewServer binds to 127.0.0.1; isSafeURL rejects loopback.
     Tests that stub workspace URLs via httptest therefore 502'd at
     the SSRF guard before reaching the handler logic they wanted to
     exercise.
     Fix: add `testAllowLoopback` var to ssrf.go + `allowLoopbackForTest(t)`
     helper in handlers_test.go. Only 127.0.0.0/8 and ::1 are relaxed;
     169.254 metadata, RFC-1918, TEST-NET, CGNAT, and link-local
     protections remain active. Flag is paired with t.Cleanup and is
     never touched by production code.

  2. ProxyA2A's checkWorkspaceBudget query (SELECT budget_limit, COALESCE
     (monthly_spend, 0) FROM workspaces WHERE id = $1) was added with the
     restructure but the a2a_proxy_test.go sqlmock expectations never
     caught up, producing "call to Query ... was not expected" on every
     ProxyA2A-exercising test.
     Fix: `expectBudgetCheck(mock, workspaceID)` helper that registers
     an empty-rows expectation (checkWorkspaceBudget fails-open on
     sql.ErrNoRows, so an empty result = "no budget limit"). Added to
     each of the 8 affected TestProxyA2A_* tests in the correct
     position relative to access-control + activity-log expectations.

  3. TestAdminMemories_Import_Success + _RedactsSecretsBeforeDedup
     mocked a 5-arg INSERT when the handler actually issues a 4-arg
     INSERT (workspace_id, content, scope, namespace) unless the
     payload carries a created_at override. Removed the spurious 5th
     AnyArg from both tests; _PreservesCreatedAt is untouched since it
     legitimately uses the 5-arg form.

Also: TestResolveAgentURL_CacheHit and _CacheMissDBHit used bogus
`cached.example` / `dbhit.example` hostnames that fail DNS resolution
inside isSafeURL (which happens BEFORE the loopback check). Swapped to
`127.0.0.1` variants preserving test intent (they never hit the network).

Remaining 4 failures — out of scope for this PR, tracked separately:
  - TestGitHubToken_NoTokenProvider (handler behavior drift — 500 vs 404)
  - TestWorkspaceList + TestWorkspaceList_WithData (Scan arg count —
    workspaces table gained a column, mock not updated)
  - TestRegister_ProvisionerURLPreserved (request body shape drift)

Closes the 4 wrong-target PRs (#1710, #1718, #1719, #1664) that all
tried to silence the symptom by disabling golangci-lint — which has
`continue-on-error: true` in ci.yml and was never the actual blocker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 19:40:06 -07:00
5157f80d19 fix(canvas): add type=button to ApprovalBanner action buttons (bug #1669)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 02:15:52 +00:00
molecule-ai[bot]
16b2e5da29
Merge branch 'main' into feat/tool-trace-v2 2026-04-23 02:09:17 +00:00
Hongming Wang
47e459cdec
Merge pull request #1714 from Molecule-AI/fix/hermes-require-model-at-create
fix(canvas): require hermes model at create (fixes silent Anthropic 401)
2026-04-22 19:02:21 -07:00
Hongming Wang
e08ea7b5ba fix(canvas): require hermes model at create + send to CP (fixes silent Anthropic 401)
Root cause of the hermes 401 "Invalid API key" on SaaS workspaces:

  1. CreateWorkspaceDialog never sent `model` in the /workspaces POST
  2. Tenant/CP plumbed through a valid (provider, API key) but empty MODEL
  3. Workspace install.sh ran with HERMES_DEFAULT_MODEL unset
  4. derive-provider.sh saw no slug → PROVIDER="auto"
  5. Hermes fell back to its compiled-in default (Anthropic via
     OpenAI-compat adapter)
  6. User's MINIMAX_API_KEY was present but irrelevant — hermes tried
     Anthropic with it → 401

Fix:

- Extend HERMES_PROVIDERS with `defaultModel` + `models` (suggestion
  list). Each provider ships with a known-good default so the trap
  is physically impossible to hit with the new form.
- Add a required Model input to the Hermes panel, auto-populated
  from the provider's defaultModel when the provider changes (only
  if the user hasn't typed their own slug yet).
- Datalist surfaces additional model suggestions per provider so
  users can pick a different size (e.g. M2.7-highspeed) without
  typing the whole slug.
- handleCreate validates hermesModel is non-empty, sends as `model`
  in the POST body alongside the secrets block.
- useEffect guard avoids clobbering a user-typed custom slug when
  they toggle providers back and forth.

Existing 19 a11y tests still pass (non-SaaS path unchanged, four-tier
picker still renders, arrow-key nav still wraps).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:59:49 -07:00
Hongming Wang
59e0fd68f2
Merge pull request #1697 from Molecule-AI/docs/move-marketing-strategy-to-internal
docs: move marketing strategy + research to internal repo
2026-04-22 18:48:03 -07:00