Commit Graph

223 Commits

Author SHA1 Message Date
molecule-ai[bot]
0195308b73 feat: pgvector semantic search for agent memory recall (#576)
Rebase of feat/issue-576-pgvector-semantic-memory onto current main,
preserving the #767 security layer (globalMemoryDelimiter + GLOBAL audit
log) that predates this branch.

Changes layered on top of main:
- Migration 031: embedding vector(1536) column + ivfflat cosine-ops index
  (renumbered from 029 — 029/030 were taken by workspace-hibernation and
  audit-events)
- Commit: embed-on-write after INSERT, non-fatal on embedding failure
- Search: semantic cosine-distance path when EmbeddingFunc is wired up;
  falls back to FTS/ILIKE; GLOBAL delimiter wrapping applies on both paths
- EmbeddingFunc injection pattern; WithEmbedding chainable builder

All security invariants preserved:
- globalMemoryDelimiter wrapping on GLOBAL scope in both semantic + FTS
- GLOBAL write audit log (SHA-256 forensic trail) in Commit
- TestRecallMemory_GlobalScope_HasDelimiter passes
- TestMemoriesCommit_Global_AsRoot passes
- 3 new pgvector tests pass

Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
2026-04-17 17:19:45 +00:00
molecule-ai[bot]
255c888ca1 Merge pull request #651 from Molecule-AI/feat/issue-594-audit-ledger
feat: molecule-audit-ledger — HMAC-SHA256 immutable agent event log (#594)
2026-04-17 16:37:01 +00:00
2f7a979ee6 chore(migrations): rename 029_audit_events → 030_audit_events (collision with 029_workspace_hibernation)
PR #724 (workspace hibernation) claimed migration number 029.
Renaming to 030 to resolve the sequence collision before merging #651.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 16:36:52 +00:00
molecule-ai[bot]
8b59a1cb9a Merge pull request #724 from Molecule-AI/feat/issue-711-workspace-hibernation
feat(registry): workspace hibernation — auto-pause idle workspaces
2026-04-17 16:36:27 +00:00
molecule-ai[bot]
c47021c19e Merge pull request #769 from Molecule-AI/fix/issue-767-global-memory-injection
fix(security): GLOBAL memory prompt injection safeguards (#767)
2026-04-17 16:35:35 +00:00
molecule-ai[bot]
ccb9317a49 Merge pull request #766 from Molecule-AI/fix/issue-761-system-caller-header-forge
fix(security): reject X-Workspace-ID system-caller prefix forgery (#761)
2026-04-17 16:35:25 +00:00
molecule-ai[bot]
7e9e105029 fix(security): GLOBAL memory prompt injection safeguards (#767)
Two defenses against GLOBAL-scope agent memory injection attacks:

1. Recall delimiter: Search() wraps every GLOBAL-scope memory value
   with a non-instructable prefix before returning it to MCP clients:
     [MEMORY id=<uuid> scope=GLOBAL from=<workspace_id>]: <value>
   This prevents stored content (e.g. "IGNORE ALL PREVIOUS INSTRUCTIONS")
   from being parsed as instructions in the agent's context window.
   Raw DB content is unchanged — the wrapper is applied on read only.

2. Write audit log: Commit() writes an activity_log entry with
   activity_type='memory_write_global' whenever a GLOBAL memory is
   stored. The entry records a SHA-256 hash of the content (never
   plaintext) alongside memory_id and namespace for forensic replay.
   Audit failure is non-fatal — a logging error must not roll back
   a successful write.

Tests:
- TestRecallMemory_GlobalScope_HasDelimiter — verifies exact delimiter
  format [MEMORY id=... scope=GLOBAL from=...]: <value>
- TestCommitMemory_GlobalScope_AuditLogEntry — verifies activity_logs
  INSERT fires on every GLOBAL write (via mock.ExpectationsWereMet)
- TestMemoriesCommit_Global_AsRoot — updated to expect the audit INSERT

All 16 Go test packages pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 16:26:46 +00:00
molecule-ai[bot]
d76c56e648 fix(security): reject X-Workspace-ID system-caller prefix forgery (#761)
Added an early guard in ProxyA2A() that rejects HTTP requests whose
X-Workspace-ID header passes isSystemCaller() with 403 Forbidden.

Legitimate system callers (webhooks, scheduler, restart_context) call
proxyA2ARequest() directly via ProxyA2ARequest() and never send HTTP
headers with system-caller prefixes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 16:15:47 +00:00
Hongming Wang
865d825993 Merge pull request #743 from Molecule-AI/feat/issue-727-opus-4-7-default
feat: upgrade default workspace model to claude-opus-4-7
2026-04-17 08:47:27 -07:00
Molecule AI QA Engineer
489f8bfb16 test(hibernation): integration tests for workspace hibernation (#711)
Cover the full hibernation feature (PR #724) + scheduler interaction (#722):

handlers/hibernation_test.go (new, 6 tests):
- HibernateWorkspace_OnlineWorkspace_Success — container stop called (nil
  provisioner guard), DB status set to 'hibernated', Redis keys cleared
  (ws:{id}, ws:{id}:url, ws:{id}:internal_url), WORKSPACE_HIBERNATED broadcast
- HibernateWorkspace_NotEligible_NoOp — ErrNoRows → early return, no UPDATE,
  Redis keys untouched
- HibernateWorkspace_DBUpdateFails_NoCrash — UPDATE error → no panic, no broadcast
- HibernateHandler_Online_Returns200 — HTTP POST, online workspace → 200 {"status":"hibernated"}
- HibernateHandler_NotActive_Returns404 — not online/degraded → 404
- HibernateHandler_DBError_Returns500 — DB error → 500

a2a_proxy_test.go (2 new tests):
- ResolveAgentURL_HibernatedWorkspace_Returns503WithWaking — empty Redis + DB
  returns status=hibernated/url="" → 503 + Retry-After:15 + {waking:true,retry_after:15}
- ResolveAgentURL_HibernatedWorkspace_NullURLVariant — same with SQL NULL url

scheduler_test.go (1 new test):
- RepairNullNextRunAt_HibernatedWorkspace_ScheduleRepaired — repair query has
  no workspace status filter; hibernated workspace's schedule still gets
  next_run_at repaired so it fires on wake

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 15:44:41 +00:00
Molecule AI QA Engineer
a0a84b9d22 chore: merge main into test/issue-711-hibernation-integration (gets scheduler #722 fix) 2026-04-17 15:40:56 +00:00
Molecule AI Backend Engineer
ec4309138b feat: upgrade default workspace model to claude-opus-4-7 (#727)
Replace the anthropic:claude-sonnet-4-6 default across config, handlers,
env example, and litellm proxy config. All tests updated to match the new
default; sonnet-4-6 alias kept in litellm_config.yml for pinned workspaces.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 15:30:57 +00:00
Molecule AI QA Engineer
d438ff357a test(security): route-specific #684 regression — three vulnerable admin routes
The BE's tests (AdminTokenSet_*, FailOpen_*) validated the core AdminAuth
contract on /admin/secrets. These table-driven additions pin the same contract
on the three routes explicitly named in the #684 security report, each with
three scenarios: workspace token rejected, correct ADMIN_TOKEN accepted, no
bearer rejected.

Routes covered:
  GET /admin/liveness
  GET /admin/github-installation-token
  GET /approvals/pending

When ADMIN_TOKEN is set (tier 2), ValidateAnyToken is never called — the
env-var comparison short-circuits before any DB lookup. The mock sets only
HasAnyLiveTokenGlobal and nothing else; an extra DB expectation would itself
be a test bug (calling it proves the middleware regressed to tier 3).

All 18 TestAdminAuth_684* tests pass. Full go test ./... is green across all
15 platform packages.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 15:25:41 +00:00
Hongming Wang
f3f5ce32fe Merge pull request #729 from Molecule-AI/fix/issue-684-adminauth-bearer-scope
fix(auth): AdminAuth rejects workspace bearer tokens when ADMIN_TOKEN is set (#684)
2026-04-17 08:17:11 -07:00
Molecule AI Backend Engineer
84584af2e6 fix(a2a): restore delivery_confirmed body-read logic removed by hibernation commit (#689)
The hibernation PR (5c1a9d0) accidentally removed the delivery_confirmed
fix that was introduced for issue #689. When io.ReadAll fails after the
target has already responded with headers (200-399), the message WAS
delivered — stripping delivery_confirmed from the error response caused
callers to treat a successful send as a hard failure.

Restore the full original body-read error block:
- deliveryConfirmed flag (true when status 200-399)
- log line with status/bytes_read context
- logA2ASuccess call when deliveryConfirmed (audit trail accuracy)
- proxyA2AError.Response includes "delivery_confirmed" field so callers
  can distinguish "not delivered" from "delivered, body lost"

The hibernation auto-wake feature (resolveAgentURL status='hibernated'
check) is orthogonal and untouched.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 15:14:25 +00:00
Molecule AI Backend Engineer
7b9bede14b fix(auth): tighten AdminAuth to reject workspace bearer tokens when ADMIN_TOKEN is set (#684)
Blast-radius isolation gap: AdminAuth called ValidateAnyToken which
accepted any live workspace bearer token. A compromised workspace agent
could present its own token to GET /admin/github-installation-token and
steal the platform's GitHub App credential, or hit /approvals/pending to
enumerate cross-workspace approvals.

Fix: introduce a dedicated admin credential tier via ADMIN_TOKEN env var.
When set, AdminAuth verifies the bearer against that secret exclusively
(crypto/subtle constant-time comparison). Workspace tokens are rejected
outright — no DB lookup occurs. When ADMIN_TOKEN is not set the previous
behaviour is preserved as a deprecated backward-compat fallback (tier 3)
so existing deployments without the env var don't break immediately.

Credential tiers (evaluated in order):
  1. Fail-open — no live tokens globally (fresh install / pre-Phase-30)
  2. ADMIN_TOKEN match — env var set, bearer must equal it exactly
  3. Fallback (deprecated) — any valid workspace token (ADMIN_TOKEN unset)

Operators should set ADMIN_TOKEN=<openssl rand -base64 32> to fully close
the blast-radius gap. Tier 3 will be removed in a future release.

Fixes #684.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 15:08:54 +00:00
molecule-ai[bot]
7cdbf8175e fix(scheduler): prevent NULL next_run_at from permanently dropping schedules (#722)
Three bugs caused enabled schedules to silently disappear from the fire query
(which requires next_run_at IS NOT NULL AND next_run_at <= now()):

Bug 1 - fireSchedule() and recordSkipped(): when ComputeNextRun returned an
error, nextRunPtr stayed nil and UPDATE SET next_run_at = $2 wrote NULL.
Fix: change to COALESCE($2, next_run_at) so the existing DB value is preserved
when $2 is NULL, and log the error explicitly.

Bug 2 - org importer (handlers/org.go): nextRun, _ := ComputeNextRun(...)
silently discarded the error. A bad cron expression would pass time.Time{}
(zero value) to the INSERT. Fix: surface the error, log it, and skip the
schedule INSERT via continue.

Bug 3 - no startup repair: schedules already NULL'd by the pre-fix binary
would never recover. Fix: Start() now calls repairNullNextRunAt() once on
boot, recomputing next_run_at for every enabled schedule with a NULL value.

Tests: TestFireSchedule_ComputeNextRunError, TestRecordSkipped_ComputeNextRunError,
TestRepairNullNextRunAt_RepairsRows, TestRepairNullNextRunAt_DBError_NoPanic,
TestOrgImport_ScheduleComputeError (all pass).

Fixes #722

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 13:34:28 +00:00
molecule-ai[bot]
5c1a9d0f24 feat(registry): workspace hibernation — auto-pause idle workspaces (#711)
Implements automatic workspace hibernation for workspaces that have been idle
longer than their configured hibernation_idle_minutes threshold.

Changes:
- migrations/029: Add hibernation_idle_minutes INT DEFAULT NULL column +
  partial index on workspaces table
- registry/hibernation.go: New StartHibernationMonitor goroutine that ticks
  every 2 min and calls hibernateIdleWorkspaces via the HibernateHandler
  callback (same import-cycle-prevention pattern as OfflineHandler)
- registry/hibernation_test.go: 5 unit tests covering handler calls, no-rows,
  DB error, tick behaviour, and context-cancel shutdown
- handlers/workspace_restart.go: New Hibernate() HTTP handler (POST
  /workspaces/:id/hibernate) + HibernateWorkspace(ctx, id) method — stops
  container, sets status='hibernated', clears Redis keys, broadcasts event
- handlers/a2a_proxy.go: Auto-wake in resolveAgentURL — when status='hibernated'
  and URL is empty, triggers async RestartByID and returns 503 + Retry-After: 15
  so callers can retry transparently
- registry/liveness.go: Exclude 'hibernated' workspaces from offline detection
- router.go: Register POST /workspaces/:id/hibernate under wsAuth group
- cmd/server/main.go: Wire hibernation monitor via supervised.RunWithRecover

Closes #711

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 13:27:39 +00:00
molecule-ai[bot]
37524ebe8f Merge pull request #719 from Molecule-AI/fix/issue-697-validate-token-removed-workspace
fix(wsauth): add removed-workspace JOIN to ValidateToken (#697)
2026-04-17 12:50:52 +00:00
Hongming Wang
3842d92ceb Merge pull request #696 from Molecule-AI/fix/issue-682-684-683-auth-token-fixes
fix(security): metrics auth, token revocation hardening, A2A false-negative (#682 #683 #689)
2026-04-17 05:47:08 -07:00
molecule-ai[bot]
523c0b9aa7 fix(wsauth): add removed-workspace JOIN to ValidateToken (#697)
Defense-in-depth: workspace-scoped ValidateToken now rejects tokens
belonging to workspaces with status='removed' at the DB layer, even
when revoked_at IS NULL. Mirrors the same guard added to ValidateAnyToken
in #696. Updated all test mock patterns (workspace_test, a2a_proxy_test,
secrets_test, admin_test_token_test, middleware) to match the new JOIN query.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 12:46:27 +00:00
Molecule AI QA Engineer
5fd25dc0df test(security): regression suite for input validation fixes (#685 #686 #687 #688)
30 test cases covering all four security fixes from PR #701:

  #686 — AdminAuth gate on GET /templates and GET /org/templates:
    - NoAuth returns 401 when tokens are enrolled
    - FreshInstall fails open (bootstraps correctly)

  #687 — UUID path param validation:
    - URL-encoded traversal (..%2f..%2fetc%2fpasswd) → 400
    - Non-UUID strings (not-a-uuid, ws-123, XSS payloads) → 400
    - Valid UUIDs pass through (regression check)

  #688 — Field length limits:
    - name=256, role=1001, model=101 chars → 400
    - Exact-boundary values (255/1000/100) → pass (off-by-one guard)

  #685 — YAML injection via newline/CR:
    - Newline in name, CR in role → 400
    - YAML multi-field injection payload "agent\nrole: injected" → 400

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 12:37:13 +00:00
molecule-ai[bot]
3f5dea791b Merge pull request #701 from Molecule-AI/fix/issue-685-686-687-688-input-validation
fix(security): input validation, route auth, UUID safety (#685 #686 #687 #688)
2026-04-17 12:32:03 +00:00
Molecule AI Backend Engineer
104683694a fix(wsauth): restore ValidateAnyToken removed-workspace JOIN (#682 defense-in-depth), restore ADR-001 blast-radius docs
- ValidateAnyToken: add JOIN on workspaces with AND w.status != 'removed'
  so tokens belonging to deleted workspaces cannot be replayed against
  admin endpoints even before the token row is explicitly revoked.

- tokens_test.go: update ValidateAnyToken regexp patterns to match new
  JOIN query; add TestValidateAnyToken_RemovedWorkspaceRejected.

- wsauth_middleware_test.go: update validateAnyTokenSelectQuery constant
  to match JOIN query; add TestAdminAuth_RemovedWorkspaceToken_Returns401
  to pin the AdminAuth removed-workspace rejection at the middleware layer.

- ADR-001: restore full blast-radius endpoint table (15 affected admin
  routes), explicit risk statement ("full platform takeover"), current
  mitigations, and Phase-H remediation plan (schema, middleware, bootstrap
  flow, migration path). Tracking issue: #710.
2026-04-17 12:25:44 +00:00
molecule-ai[bot]
bdd56b1489 fix(security): rebase #685-688 onto main — preserve wsAuth PATCH, add yamlSpecialChars
- Rebased onto 350288f1 (main HEAD, post-#692 IDOR fix)
- PATCH /workspaces/:id remains under wsAuth group (not open router)
- Added validateWorkspaceID (uuid.Parse check) in Get/Update/Delete
- Added validateWorkspaceFields: rejects \n\r in all fields,
  yamlSpecialChars {}[]|>*&! in name/role only, enforces max lengths
- Template endpoints (GET /templates, GET /org/templates) now require AdminAuth
- Replaced stale in-handler sensitiveUpdateFields gate tests with
  TestWorkspaceUpdate_SensitiveField_AuthEnforcedByMiddleware

Closes #685 #686 #687 #688
2026-04-17 12:13:44 +00:00
molecule-ai[bot]
bbaf406ed1 fix(router): restore admin/schedules/health route; add ADR-001 for #684 2026-04-17 12:03:34 +00:00
molecule-ai[bot]
112c17510c fix(security): revert #684 schema migration, restore /admin/schedules/health, add ADR-001
Required changes from security auditor before PR #696 can merge:

1. REVERT #684 (token_type schema migration):
   - Remove migration 029_token_type.{up,down}.sql
   - Revert wsauth/tokens.go — remove IssueAdminToken, token_type constants,
     restore HasAnyLiveTokenGlobal and ValidateAnyToken to pre-#684 behavior
   - Revert admin_test_token.go to use IssueToken (not IssueAdminToken)
   - Revert associated tests to pre-#684 patterns
   Path B: formal risk acceptance documented in ADR-001.

2. RESTORE /admin/schedules/health route (regression fix):
   - Add platform/internal/handlers/admin_schedules_health.go (from PR #671)
   - Add platform/internal/handlers/admin_schedules_health_test.go (from PR #671)
   - Wire GET /admin/schedules/health via AdminAuth in router.go

3. ADD ADR-001 (platform/docs/adr/ADR-001-admin-token-scope.md):
   - Documents #684 as known risk with Phase-H remediation plan
   - Phase-H tracking issue: Molecule-AI/molecule-core#710
2026-04-17 12:01:12 +00:00
rabbitblood
327cc3ea55 fix(router): remove AdminAuth from test-token — unblocks E2E bootstrap
#612 added AdminAuth to GET /admin/workspaces/:id/test-token, breaking
the chicken-and-egg bootstrap that E2E tests rely on:

1. POST /workspaces creates first workspace (fail-open, no tokens)
2. Provision generates a workspace auth token → inserts into DB
3. AdminAuth now sees a live token → requires auth on ALL routes
4. E2E calls test-token to get its first admin bearer → 401
5. All subsequent E2E calls fail → EVERY open PR CI blocked

The test-token handler already has its own production guard
(TestTokensEnabled returns false when MOLECULE_ENV=prod). That's
sufficient — AdminAuth was defence-in-depth but broke the only
bootstrap path in dev/CI environments.

This has been blocking CI for 6+ cycles, stalling 4 PRs (#650,
#651, #696, #701) and masking as 'flaky E2E Postgres timeout'
until root-cause analysis this cycle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 04:50:14 -07:00
molecule-ai[bot]
643ffc6648 fix(security): add token_type column — workspace tokens rejected by AdminAuth (#684)
Security Auditor confirmed: ValidateAnyToken accepted any live workspace
token, meaning a workspace agent bearer could satisfy AdminAuth and reach
/bundles/import, /events, /org/import, /settings/secrets, etc.

Fix: add token_type TEXT ('workspace' | 'admin') to workspace_auth_tokens.

Migration 029:
- ALTER workspace_id DROP NOT NULL (admin tokens have no workspace scope)
- ADD COLUMN token_type TEXT NOT NULL DEFAULT 'workspace'
- ADD CONSTRAINT token_type_check (IN 'workspace', 'admin')
- ADD CONSTRAINT scope_check (workspace tokens MUST have workspace_id;
  admin tokens MUST have workspace_id = NULL)

Code changes:
- IssueToken: explicitly inserts token_type = 'workspace'
- IssueAdminToken (new): inserts NULL workspace_id + token_type = 'admin'
- ValidateAnyToken: now filters WHERE token_type = 'admin' — workspace
  tokens unconditionally fail
- HasAnyLiveTokenGlobal: counts only admin tokens
- admin_test_token.go: GetTestToken calls IssueAdminToken (#684)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 11:47:31 +00:00
molecule-ai[bot]
108d257833 fix(a2a): surface delivery_confirmed + prevent 503-busy double-delivery (#689)
Two targeted fixes for the A2A false-negative (delivery succeeded but caller
receives A2A_ERROR):

Body-read failure: when Do() succeeds (target sent 2xx headers — delivery
confirmed) but io.ReadAll(resp.Body) fails, proxy now returns
{"delivery_confirmed": true} in the 502 body and logs the activity as
successful. Audit trail records true delivery, not a false failed entry.

isTransientProxyError fix: delegation retry loop now only retries 503s with
{restarting: true} (container died, message NOT delivered). 503 {busy: true}
signals the agent IS processing the delivered message — retrying causes
double-delivery. Fix prevents the double-delivery race.

All 16 packages pass: go test ./...

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 11:26:28 +00:00
molecule-ai[bot]
572b314c4e fix(security): AdminAuth scope, token revocation, metrics auth (#682 #683 #684)
Three Offensive Security findings addressed:

#684 — AdminAuth accepts any workspace bearer token (FALSE POSITIVE).
ValidateAnyToken intentionally accepts any valid workspace token — the
platform's trust model uses workspace credentials as admin credentials.
No code change; documented as by-design in the PR body.

#682 — Deleted-workspace bearer tokens still authenticate (defense-in-depth).
The Delete handler already revokes all tokens (revoked_at = now()), so this
was a false positive. As defense-in-depth we add a JOIN against workspaces in
ValidateAnyToken so that even if revoked_at is not set (transient DB error
between status update and token revocation), the token still fails validation
once workspace.status = 'removed'.
Files: platform/internal/wsauth/tokens.go, tokens_test.go,
       platform/internal/middleware/wsauth_middleware_test.go

#683 — /metrics unauthenticated (REAL).
GET /metrics was on the open router with no auth. The Prometheus endpoint
exposes the full HTTP route-pattern map, request counts by route+status, and
Go runtime memory stats — ops intel that should not reach unauthenticated
callers. Scraper must now present a valid workspace bearer token.
File: platform/internal/router/router.go

All 16 packages pass: go test ./...

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 11:14:15 +00:00
molecule-ai[bot]
1bbda53142 Merge pull request #692 from Molecule-AI/fix/issue-680-681-workspace-auth
fix(security): auth+ownership on PATCH /workspaces/:id (#680 #681)
2026-04-17 11:03:25 +00:00
molecule-ai[bot]
c825e44b50 Merge pull request #659 from Molecule-AI/infra/rebuild-runtime-images-script
infra: add rebuild-runtime-images.sh — patches all 6 adapter images with git credential helper (#658)
2026-04-17 10:59:33 +00:00
molecule-ai[bot]
627946528d fix(security): add auth+ownership to PATCH /workspaces/:id (#680 #681)
ISSUE #680 — IDOR on PATCH /workspaces/🆔
- Route was on the open router with no auth middleware. Any unauthenticated
  caller could rename, change role, or update any workspace field of any
  workspace ID without credentials (zero auth + no ownership check).
- Fix: register under wsAuth (WorkspaceAuth middleware) which (a) requires a
  valid bearer token and (b) validates the token belongs to the target
  workspace, providing auth + ownership in a single check.
- Remove the now-redundant in-handler field-level auth block — the middleware
  is a strictly stronger gate. Dead code gone.
- Remove unused `middleware` import from workspace.go.
- Update tests: two tests that asserted the old in-handler 401 are replaced
  by TestWorkspaceUpdate_SensitiveField_AuthEnforcedByMiddleware (documents
  that auth is now at the router layer); cosmetic-field test renamed.

ISSUE #681 — test-token endpoint auth:
- Confirmed: GET /admin/workspaces/:id/test-token already has
  middleware.AdminAuth(db.DB). No change needed — finding was from older state.

Build: `go build ./...` clean. All 15 test packages pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 10:55:23 +00:00
molecule-ai[bot]
c07793eedf fix(security): cap discord error response body read at 4096 bytes
Unbounded io.ReadAll on the Discord webhook error response body was a LOW
OOM risk: a malicious gateway or misconfigured proxy could return a multi-MB
body and exhaust agent memory. Cap with io.LimitReader(resp.Body, 4096) —
error messages are always short; any extra content is irrelevant noise.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 10:46:09 +00:00
molecule-ai[bot]
4a27866c45 fix(router): restore artifacts routes, remove stray audit route from #618 scope
FIX 1: Cloudflare Artifacts routes (wsAuth POST/GET /artifacts, /fork, /token)
were accidentally dropped when #618 modified router.go. Restored along with the
handler and client packages that were already on main (#595/#641) but missing
from this branch.

FIX 2: Stray `audh := handlers.NewAuditHandler()` / `wsAuth.GET("/audit", ...)` block
was added out-of-scope during #618 work. Removed — #594 (audit-ledger) is a
separate merged PR and its routes live on main independently.

Build: `go build ./...` clean. All 17 test packages pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 10:44:34 +00:00
molecule-ai[bot]
470704416e fix(security): Ed25519 signature verification for Discord webhooks + strip token from error chain
HIGH (#659-1): POST /webhooks/discord had no signature verification, allowing
any attacker to POST forged Discord slash-command payloads. Add Ed25519
verification via verifyDiscordSignature() before adapter.ParseWebhook() is
called. The function reads r.Body, verifies Ed25519(pubKey, timestamp+body,
X-Signature-Ed25519), then restores r.Body with io.NopCloser so ParseWebhook
can still read the payload. The public key is resolved from the first enabled
Discord channel's app_public_key config (plaintext — it is a public key and
not in sensitiveFields) with a fallback to DISCORD_APP_PUBLIC_KEY env var;
no key configured -> 401 (fail-closed). discordPublicKey() is the DB helper.

MEDIUM (#659-2): discord.go SendMessage() wrapped http.Client.Do errors with
%w, propagating the *url.Error which includes the full webhook URL
(https://discord.com/api/webhooks/{id}/{token}) into logs and error responses.
Replace with a static "discord: HTTP request failed" string.

Tests added (11 new):
- TestVerifyDiscordSignature_Valid / _WrongKey / _TamperedBody /
  _MissingTimestamp / _MissingSignature / _InvalidHexSignature /
  _InvalidHexPubKey / _WrongLengthPubKey (real Ed25519 key pairs)
- TestChannelHandler_Webhook_Discord_NoKey_Returns401
- TestChannelHandler_Webhook_Discord_InvalidSig_Returns401
- TestChannelHandler_Webhook_Discord_ValidSig_PingAccepted
- TestDiscordAdapter_SendMessage_ErrorDoesNotLeakToken

go test ./... green.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 10:36:51 +00:00
molecule-ai[bot]
6e4979954b feat(platform): add GET /admin/schedules/health for cross-workspace schedule monitoring (#618)
Operators and audit agents can now detect silent cron failures across all
workspaces with a single AdminAuth-gated request — no per-workspace bearer
tokens required. This closes the proactive detection gap that left issue #85
(cron died silently 10+ hours) undetectable until users noticed missing work.

Changes:
- platform/internal/handlers/admin_schedules_health.go: new AdminSchedulesHealthHandler
  - GET /admin/schedules/health joins workspace_schedules + workspaces (excluding
    removed workspaces), computes status (ok|stale|never_run) and
    stale_threshold_seconds (2 × cron interval via scheduler.ComputeNextRun)
  - computeStaleThreshold() and classifyScheduleStatus() extracted as
    package-level helpers for direct unit testing
- platform/internal/handlers/admin_schedules_health_test.go: 16 tests
  - Unit tests for computeStaleThreshold (5min/hourly/daily crons, invalid expr,
    invalid timezone) and classifyScheduleStatus (never_run/stale/ok/zero-threshold)
  - Integration tests via sqlmock: empty result, never_run classification,
    stale detection, ok status, DB error → 500, multi-workspace response,
    required JSON fields coverage
- platform/internal/router/router.go: register GET /admin/schedules/health
  behind middleware.AdminAuth(db.DB), mirroring the /admin/liveness gate

Closes #618

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 10:28:55 +00:00
rabbitblood
8eaffc49aa fix(migrations): TEXT→UUID in 028_workspace_artifacts — unblocks all E2E CI
Migration 028 declared workspace_id as TEXT with a FK to workspaces(id)
which is UUID. Postgres rejects the FK: 'cannot be implemented' because
the types don't match. Same class of bug as #646 (which fixed 025).

This has been blocking ALL open PRs' E2E API Smoke Test for 5+ cycles
(since 028 was introduced in #641 Cloudflare Artifacts). Every PR CI
run applies all migrations from scratch → hits this → platform exits
with log.Fatalf → /health never responds → 30s timeout → FAIL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 02:48:08 -07:00
Molecule AI Backend Engineer
0e9270feb7 chore: renumber audit-events migration 028 → 029
PR #641 (workspace_artifacts) already claimed 028 on main.
Rename both .up.sql and .down.sql to 029_audit_events.* to avoid
the collision when this branch merges.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 07:31:14 +00:00
molecule-ai[bot]
a2a26c6cce Merge pull request #656 from Molecule-AI/feat/issue-625-discord-adapter-clean
feat(channels): add Discord adapter (#625)
2026-04-17 07:30:39 +00:00
Molecule AI Backend Engineer
3895e02e01 fix(security): address Security Auditor findings on audit-ledger (#651)
- Replace == HMAC comparisons with hmac.compare_digest (Python) and
  hmac.Equal (Go) in ledger.py, verify.py, and audit.go to prevent
  timing oracle attacks (Fixes 1-6)
- Increase PBKDF2 iterations from 100K to 210K in both ledger.py and
  audit.go — must match for cross-language verification (Fix 7)
- Return chain_valid: null when offset > 0 (paginated views cannot
  verify a truncated chain; null means "not computed") (Fix 8)
- Remove module-level AUDIT_LEDGER_SALT attribute from ledger.py; read
  the secret exclusively from os.environ inside _get_hmac_key() so the
  salt is not exposed in the module namespace (Fix 9)
- Update tests: use monkeypatch.setenv/delenv instead of setattr on the
  removed AUDIT_LEDGER_SALT attribute; update testAuditKey helper to
  use 210K iterations; add TestAuditQuery_PaginatedOffsetReturnsNullChainValid
- Fix migration 028: workspace_id column type TEXT → UUID to match
  workspaces.id UUID primary key

All tests pass: 1043 pytest + 0 Go test failures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 07:30:10 +00:00
Molecule AI Backend Engineer
54737d58a2 feat(platform): merge stacked system messages for Hermes/vLLM (#499)
vLLM (and Nous Hermes portal) only accept a single system message.
When the platform builds a messages array from multiple sources
(base system prompt + workspace config + per-session override), the
consecutive system entries at the front cause vLLM to reject or
silently drop all but the first.

Adds mergeSystemMessages() — a stateless pre-flight transform in the
handlers package that collapses the uninterrupted leading run of
{"role":"system"} entries into one, joining their content with "\n\n".
Non-system messages between system messages are not touched; a single
system message is returned as-is (no allocation).

10 unit tests cover: stacked merge, single-unchanged, no-system passthrough,
three-message collapse, interleaved user (trailing system not merged),
only-system-messages, empty slice, nil slice, non-string content, and
assistant-leading passthrough.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 07:19:30 +00:00
1b9be1e289 feat(channels): add Discord adapter (#625)
Implements DiscordAdapter conforming to the ChannelAdapter interface,
using Discord Incoming Webhooks for outbound messages and the Interactions
endpoint for inbound slash commands.

Changes:
- platform/internal/channels/discord.go: DiscordAdapter + splitMessage
  helper (Discord enforces 2000-char limit; long messages are split at
  newline/space boundaries). ParseWebhook handles type-1 PING (returns
  nil so the router layer can respond), type-2 APPLICATION_COMMAND, and
  type-3 MESSAGE_COMPONENT payloads. ValidateConfig rejects non-discord
  webhook URLs (SSRF guard matches Slack pattern).
- platform/internal/channels/discord_test.go: 20 unit tests covering
  Type/DisplayName, ValidateConfig (valid + 5 invalid cases), SendMessage
  error paths, ParseWebhook (PING / slash command / DM user / unknown type /
  invalid JSON), StartPolling, GetAdapter registry lookup, ListAdapters
  inclusion, and splitMessage edge cases.
- platform/internal/channels/registry.go: register "discord" adapter.
- .env.example: document DISCORD_WEBHOOK_URL.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 07:02:50 +00:00
Molecule AI Backend Engineer
fff063bd15 feat: molecule-audit-ledger — HMAC-SHA256 immutable agent event log (#594)
Implements EU AI Act Annex III compliance (Art. 12 record-keeping, Art. 13
transparency) via an append-only HMAC-SHA256-chained agent event log.

Python (workspace-template/molecule_audit/):
- ledger.py: SQLAlchemy 2.0 AuditEvent model + PBKDF2 key derivation +
  append_event() with prev_hmac chain linkage + verify_chain() CLI helper.
- hooks.py: LedgerHooks — on_task_start/on_llm_call/on_tool_call/on_task_end
  pipeline hooks; exception-safe (_safe_append); context manager support.
- verify.py: `python -m molecule_audit.verify --agent-id <id>` CLI;
  exits 0=valid, 1=broken, 2=missing SALT, 3=DB error.
- tests/test_audit_ledger.py: 46 tests covering HMAC determinism, field
  sensitivity, chain verification, LedgerHooks lifecycle, CLI.

Go (platform/):
- migrations/028_audit_events.up.sql: audit_events table with indexes.
- internal/handlers/audit.go: GET /workspaces/:id/audit — parameterized
  queries, inline chain verification (chain_valid: bool|null), PBKDF2
  key cached via sync.Once.
- internal/handlers/audit_test.go: 14 tests — HMAC, chain verify, handler
  query/filter/pagination/cap/error paths.
- internal/router/router.go: wire wsAuth.GET("/audit", audh.Query).
- .env.example: document AUDIT_LEDGER_SALT.
- requirements.txt: add sqlalchemy>=2.0.0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 06:55:36 +00:00
molecule-ai[bot]
ed9106d83a Merge pull request #646 from Molecule-AI/fix/migration-025-fk-type
Merge gate passed. +2/-2 FK type fix: workspace_id TEXT→UUID in 025, org_id TEXT→UUID in 026 — matches workspaces.id (UUID PK). Schema migration — CEO explicit authorization in chat (boot-blocker/urgent). UNSTABLE = known App token scope gap.
2026-04-17 06:46:08 +00:00
molecule-ai[bot]
a6f4678b9e Merge pull request #641 from Molecule-AI/feat/issue-595-cloudflare-artifacts-demo
Merge gate passed (all 7 gates). Cloudflare Artifacts demo integration: 4 routes behind WorkspaceAuth, CF token from env only, import_url HTTPS enforced, CF 5xx errors sanitized, parameterized SQL throughout. Migration 028 uses CREATE TABLE IF NOT EXISTS. Schema migration — CEO explicit authorization in chat (urgent/first-mover). Tip SHA dc89d8f verified. UNSTABLE = known App token scope gap.
2026-04-17 06:43:21 +00:00
Hongming Wang
7dcb36b9eb fix(migrations): TEXT→UUID FK type mismatch blocking all E2E runs
Migrations 025 + 026 declared workspace_id/org_id as TEXT but
workspaces.id is UUID — Postgres rejects the FK constraint, crashing
every E2E run on main since these migrations were merged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 23:40:22 -07:00
Molecule AI Backend Engineer
dc89d8fd7b fix(platform): address security review findings on CF Artifacts (#641)
Four findings from the security audit on PR #641:

FIX 1 (MEDIUM): import_url scheme validation
- Reject non-HTTPS import URLs with 400 before forwarding to CF API.
  Prevents SSRF via http://, git://, ssh://, file:// etc.

FIX 2 (MEDIUM): CF 5xx error leakage
- Add cfErrMessage() helper: returns "upstream service error" for CF 5xx
  responses and non-CF errors, passes through 4xx messages.
- Applied at all four CF-error response sites (Create, Get, Fork, Token).

FIX 3 (LOW): repo name validation
- Add package-level repoNameRE = ^[a-zA-Z0-9][a-zA-Z0-9_-]{0,62}$
- Validate in Create and Fork handlers when caller supplies an explicit name.
  Auto-generated names ("molecule-ws-<id>") are always safe and skip validation.

FIX 4 (LOW): response body size limit in CF client
- Wrap resp.Body with io.LimitReader(1 MB) before json.NewDecoder in do().
  Prevents memory exhaustion from a runaway/malicious CF response.

Tests: 16 new tests covering all four fixes (cfErrMessage 4xx/5xx/non-API,
import_url non-HTTPS cases, invalid repo names in Create and Fork).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 06:39:47 +00:00
Molecule AI Backend Engineer
57ff2eab78 feat(platform): Cloudflare Artifacts demo integration (#595)
Add a minimal but complete integration with the Cloudflare Artifacts API
(private beta Apr 2026, public beta May 2026) — "Git for agents" versioned
workspace-snapshot storage.

## What's included

**`platform/internal/artifacts/client.go`** — typed Go HTTP client for the
CF Artifacts REST API:
- CreateRepo, GetRepo, ForkRepo, ImportRepo, DeleteRepo
- CreateToken, RevokeToken
- CF v4 response-envelope decoding; *APIError with StatusCode + Message

**`platform/internal/handlers/artifacts.go`** — four workspace-scoped
Gin handlers (all behind WorkspaceAuth middleware):
- POST /workspaces/:id/artifacts — attach or import a CF Artifacts repo
- GET  /workspaces/:id/artifacts — get linked repo info (DB + live CF)
- POST /workspaces/:id/artifacts/fork — fork the workspace's repo
- POST /workspaces/:id/artifacts/token — mint a short-lived git credential

**`platform/migrations/028_workspace_artifacts.up.sql`** — `workspace_artifacts`
table: one-to-one link between a workspace and its CF Artifacts repo.
Credentials are never stored; only the credential-stripped remote URL.

**`platform/internal/router/router.go`** — wire the four routes into the
existing wsAuth group.

## Configuration
Two env vars gate the feature (returns 503 when either is absent):
- CF_ARTIFACTS_API_TOKEN — Cloudflare API token with Artifacts write perms
- CF_ARTIFACTS_NAMESPACE — Cloudflare Artifacts namespace name

## Tests
- 10 client-level tests (httptest.Server + CF v4 envelope mocks)
- 14 handler-level tests (sqlmock DB + mock CF server)
- Helper unit tests for stripCredentials, cfErrToHTTP

All 21 packages pass (go test ./...).

Closes #595

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 06:28:58 +00:00