molecule-core

Author	SHA1	Message	Date
Hongming Wang	81d5b658ad	fix(security): gate /channels/discover behind AdminAuth (#250 ) Closes #250 (MEDIUM). POST /channels/discover was on the open router and accepted an arbitrary Telegram bot token, turning it into: 1. A free bot-token validity oracle — attackers can enumerate/probe tokens at zero cost 2. A drive-by deleteWebhook side effect — every call invokes tgbotapi.DeleteWebhookConfig against the target bot, breaking legitimate webhook delivery 3. A rate-limit amplifier — getMe + deleteWebhook + getUpdates per call Fix: one-line addition of middleware.AdminAuth(db.DB) to the route, matching its actual intent (platform-operator admin helper, not a per-workspace route). Pattern mirrors /admin/liveness, /events, and /bundles/export from PR #167. No new test: AdminAuth behavior is covered by wsauth_middleware_test.go; this PR only wires it onto an additional route. The load-bearing code comment references #250 so future reviewers can't revert without an issue citation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 13:11:22 -07:00
Hongming Wang	7a16eb4f70	fix(auth): #168 — CanvasOrBearer middleware for PUT /canvas/viewport only Closes #168 by the route-split path from #194's review. #167 put PUT /canvas/viewport behind strict AdminAuth, breaking canvas drag/zoom persist because the canvas uses session cookies not bearer tokens. New narrow middleware CanvasOrBearer: - Accepts a valid bearer (same contract as AdminAuth) OR - Accepts a request whose Origin exactly matches CORS_ORIGINS - Lazy-bootstrap fail-open preserved for fresh installs Applied ONLY to PUT /canvas/viewport. The softer check is acceptable there because viewport corruption is cosmetic-only — worst case a user refreshes the page. This middleware must NOT be used on routes that leak prompts (#165), create resources (#164), or write files (#190) — see #194 review for why. The other canvas-facing routes mentioned in #168 (Events tab, Bundle Export/Import) remain behind strict AdminAuth pending a proper session-cookie-accepting AdminAuth (#168 follow-up for Phase H). 6 new tests cover: bootstrap fail-open, no-creds 401, canvas origin match, wrong origin 401, empty origin rejected, localhost default. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 11:09:16 -07:00
Hongming Wang	7a164f98d3	fix(security): #190 — gate POST /templates/import behind AdminAuth Closes #190 (HIGH). The route was registered on the root router with no auth middleware, letting any unauthenticated caller write arbitrary files into configsDir via a crafted template. Same vulnerability class as #164 (bundles/import) and path-traversal risk same as #103 (org/import). One-line gate via the existing wsAdmin pattern. Lazy-bootstrap fail-open preserved for fresh installs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 11:00:49 -07:00
Backend Engineer	06e02a310c	fix(router): call SetTrustedProxies(nil) to close IP-spoofing bypass (#179 ) Without this call Gin's default trusts all X-Forwarded-For headers, letting any caller rotate their effective IP and bypass per-IP rate limiting. SetTrustedProxies(nil) forces c.ClientIP() to always return the real TCP RemoteAddr. Adds two regression tests: one documenting the pre-fix bypass, one asserting the spoofed header is ignored after the fix. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 17:32:54 +00:00
Backend Engineer	9122d6aeea	fix(security): gate GET /approvals/pending behind AdminAuth (#180 ) GET /approvals/pending was registered on the open router with no middleware, allowing any unauthenticated caller to enumerate all pending approvals across every workspace on the platform. Fix: add inline middleware.AdminAuth(db.DB) to the route registration, matching the pattern used in PR #167 for bundles, events, and viewport. The three workspace-scoped approvals routes (POST/GET /approvals, POST /approvals/:id/decide) were already correctly behind WorkspaceAuth inside the wsAuth group — no change needed there. Tests: two new regression tests in wsauth_middleware_test.go — TestAdminAuth_Issue180_ApprovalsListing_NoBearer_Returns401 TestAdminAuth_Issue180_ApprovalsListing_FailOpen_NoTokens Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 17:25:09 +00:00
Hongming Wang	186206b33f	fix(security): #164 + #165 + #166 — gate 6 unauth routes behind AdminAuth CRITICAL (#164): POST /bundles/import — anon callers could create arbitrary workspaces with user-supplied system prompts, plugins, and secrets envelopes. Fixed by gating behind AdminAuth (bundleAdmin group). HIGH (#165): GET /bundles/export/:id — anon UUID probe leaked full system prompts, agent_card, plugins, memory for any workspace. GET /events + GET /events/:workspaceId — anon read of the append-only event log leaked org topology, workspace names, card fragments. Both moved into the same bundleAdmin / eventsAdmin groups. MEDIUM (#166): PUT /canvas/viewport — anon callers could reset shared viewport state. Gated via a scoped viewportAdmin group; GET stays open so canvas bootstraps without a bearer. GET /admin/liveness — operational-intel leak (scheduler cadence reveals work pattern). Inline AdminAuth on the single handler. All 6 routes use the same lazy-bootstrap admin auth the rest of the platform uses: zero-token installs fail-open, once any token exists every request must present a valid bearer. Known follow-up: canvas uses session cookies not bearer tokens (same pattern as #138). In multi-tenant production these canvas features — Events tab, Export/Duplicate, viewport persist — will return 401 once a workspace is token-enrolled. Needs cookie-accepting AdminAuth as a follow-up (tracked as option B in #138 triage discussion); a new issue will be filed for that scope. The security gain from closing #164 CRITICAL outweighs the canvas UX regression for tonight. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 09:52:32 -07:00
Hongming Wang	cbf46a837b	fix(auth): #138 — field-level authz on PATCH /workspaces/:id Closes #138. #125 moved PATCH /workspaces/:id into the wsAdmin AdminAuth group to close the #120 unauth vulnerability, but broke canvas drag- reposition and inline rename because canvas uses session cookies not bearer tokens. Multi-tenant deployments with any live token would have seen every canvas PATCH 401. Option A per #138 triage: PATCH goes back on the open router, but WorkspaceHandler.Update now enforces field-level authz: Cosmetic (no bearer required): name, role, x, y, canvas Sensitive (bearer required when any live token exists): tier — resource escalation parent_id — A2A hierarchy manipulation runtime — container image swap workspace_dir — host bind-mount redirection Fail-open bootstrap: HasAnyLiveTokenGlobal = 0 → pass-through (fresh install, pre-Phase-30 upgrade path). Matches the same lazy-bootstrap contract WorkspaceAuth and AdminAuth use elsewhere. 3 new tests cover all three branches of the matrix (cosmetic no-bearer, sensitive no-bearer-rejected, sensitive fail-open). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 09:39:09 -07:00
Hongming Wang	4b6482bea0	fix(security): #151 — register SecurityHeaders middleware Closes #151. The middleware was already implemented + tested (3 passing tests in securityheaders_test.go covering base set, multi-route, and the don't-override-existing contract) but never registered in router.go. One-line wire-up, runs after TenantGuard so rejected requests still get the same headers as accepted ones, and before routes so handlers can still opt out by setting their own header before c.Next() returns. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 03:50:52 -07:00
Dev Lead Agent	590eefb5ae	fix(security): #120 PATCH auth + #113 schedule IDOR — close unauthenticated write vectors Issue #120 (HIGH — immediately exploitable): PATCH /workspaces/:id was registered on the root router with no auth middleware. An attacker with any workspace UUID could: - Escalate tier (tier 4 = 4 GB RAM allocation) - Rewrite parent_id to subvert CanCommunicate A2A access control - Swap runtime image on next restart - Redirect workspace_dir host bind-mount to arbitrary path Fix: move PATCH into the wsAdmin AdminAuth group alongside POST, DELETE. The canvas position-persist call already has an AdminAuth token (required for GET /workspaces list on initial load) so no canvas regression. Also add workspace-existence guard in Update handler — previously returned 200 with zero rows affected for nonexistent IDs. Issue #113 (MEDIUM — schedule IDOR, carry-over from prior cycle): PATCH /workspaces/:id/schedules/:scheduleId and DELETE operated on scheduleID alone (WHERE id = $1), allowing any authenticated caller to modify or delete schedules belonging to other workspaces. Fix: bind workspace_id = c.Param("id") in both Update and Delete handlers; add AND workspace_id = $N to all schedule SQL queries. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 08:01:22 +00:00
Hongming Wang	69ba583508	Merge pull request #106 from Molecule-AI/fix/org-import-path-traversal fix(security): #103 — path-sanitize + admin-gate POST /org/import	2026-04-15 00:26:16 -07:00
Hongming Wang	0c7d84d6ce	Merge pull request #95 from Molecule-AI/fix/supervised-goroutines fix(platform): panic-recovering supervisor for every background goroutine (#92)	2026-04-15 00:26:13 -07:00
Hongming Wang	b95bf36690	Merge pull request #99 from Molecule-AI/fix/auth-middleware-critical fix(security): C1 — auth-gate GET /workspaces + middleware test coverage (C4/C8/C10/C11)	2026-04-15 00:26:10 -07:00
Hongming Wang	a7cb00cc8b	fix(security): #103 — path-sanitize + admin-gate POST /org/import Closes #103 (HIGH). Three attack surfaces on the import endpoint — body.Dir, workspace.Template, workspace.FilesDir — were concatenated via filepath.Join without validation, letting an unauthenticated caller probe arbitrary filesystem paths with "../../../etc". Two layers of defense: 1. resolveInsideRoot() rejects absolute paths and any relative path whose lexically cleaned join escapes the provided root (Abs + HasPrefix + separator guard). 6 tests cover happy path, traversal attempts, absolute path, empty input, prefix-sibling escape, and deep subpath resolution. 2. Route now runs behind middleware.AdminAuth so an unauthenticated attacker can't reach the handler at all once a token exists. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 00:18:09 -07:00
Backend Engineer	1a28ec8ee5	fix(security): C1 — gate GET /workspaces behind AdminAuth; add auth middleware tests Security Auditor confirmed C1 (GET /workspaces) exposes workspace topology without any authentication. The endpoint was intentionally left open for the canvas browser frontend; this PR closes that gap. Router change: - Move GET /workspaces from the bare root router into the wsAdmin AdminAuth group alongside POST /workspaces and DELETE /workspaces/:id. - AdminAuth uses the same fail-open bootstrap contract as all other auth gates: fresh installs (no live tokens) pass through; once any workspace has registered with a token, a valid bearer is required. Status of findings C2–C11 (documented here for audit trail): - C2 POST /workspaces/:id/activity → already in wsAuth group (Cycle 5) - C3 POST /workspaces/:id/delegations/record → already in wsAuth group (Cycle 5) - C4 POST /workspaces/:id/delegations/:id/update → already in wsAuth group (Cycle 5) - C5 GET /workspaces/:id/delegations → already in wsAuth group (Cycle 5) - C7 GET /workspaces/:id/memories → already in wsAuth group (Cycle 5) - C8 POST /workspaces/:id/memories → already in wsAuth group (Cycle 5) - C9 POST /workspaces/:id/delegate → already in wsAuth group (Cycle 5) - C10 GET /admin/secrets → already in adminAuth group (Cycle 7) - C11 POST+DELETE /admin/secrets → already in adminAuth group (Cycle 7) Tests (platform/internal/middleware/wsauth_middleware_test.go — 13 new): WorkspaceAuth: - fail-open when workspace has no tokens (bootstrap path) - C4: no bearer on /delegations/:id/update → 401 - C8: no bearer on /memories POST → 401 - invalid bearer → 401 - cross-workspace token replay → 401 - valid bearer for correct workspace → 200 AdminAuth: - fail-open when no tokens exist globally (fresh install) - C10: no bearer on GET /admin/secrets → 401 - C11: no bearer on POST /admin/secrets → 401 - C11: no bearer on DELETE /admin/secrets/:key → 401 - valid bearer → 200 - invalid bearer → 401 Note: did NOT touch DELETE /admin/secrets in production — no destructive calls to live secrets endpoints were made during this work. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 04:37:14 +00:00
rabbitblood	76a36e8062	fix(platform): panic-recovering supervisor for every background goroutine (#92 ) Yesterday's scheduler-died incident (#85) was one instance of a systemic bug: every long-running goroutine in the platform lacks panic recovery and exposes no liveness signal. In a multi-tenant SaaS deployment, a single tenant's bad data panicking any subsystem takes down the subsystem for every tenant, silently, with all standard health probes still green. That is a scale-of-one sev-1. This PR: 1. Introduces `platform/internal/supervised/` with two primitives: a. RunWithRecover(ctx, name, fn) — runs fn in a recover wrapper. On panic logs the stack + exponential-backoff restart (1s → 2s → 4s → … → 30s cap). On clean return (fn decided to stop) returns. On ctx.Done cancels cleanly. b. Heartbeat(name) + LastTick(name) + Snapshot() + IsHealthy(names, staleThreshold) — shared in-memory liveness registry. Every subsystem calls Heartbeat(name) at the end of each tick so operators can distinguish "goroutine alive and healthy" from "alive but stuck inside a single tick". 2. Wraps every `go X.Start(ctx)` in main.go: - broadcaster.Subscribe (Redis pub/sub relay → WebSocket) - registry.StartLivenessMonitor - registry.StartHealthSweep - scheduler.Start (the one that died yesterday) - channelMgr.Start (Telegram / Slack) 3. Adds `supervised.Heartbeat("scheduler")` inside the scheduler tick loop as the first end-to-end demonstration. Follow-up PRs will add heartbeats to the other four subsystems. 4. Adds `GET /admin/liveness` endpoint returning per-subsystem last_tick_at + seconds_ago. Operators can poll this and alert on any subsystem whose seconds_ago exceeds 2x its cron/tick interval. 5. Unit tests for RunWithRecover (clean return no restart; panic restarts with backoff; ctx cancel stops restart loop) and for the liveness registry. Net new code: ~160 lines + ~100 lines of tests. Refactor of main.go: ~10 line changes. No behavior change on happy path; only lifts what happens on a panic. Closes #92. Supersedes the local recover added to scheduler.go in #90 (kept conceptually, but now via the shared helper).	2026-04-14 20:34:18 -07:00
Hongming Wang	284ef6d33a	feat(platform): TenantGuard middleware — public repo's only SaaS hook Phase 32 foundation. The SaaS control plane (private molecule-controlplane repo) provisions one platform instance per customer org on Fly Machines and sets MOLECULE_ORG_ID=<uuid> on the machine. Its subdomain router forwards requests with X-Molecule-Org-Id=<uuid>. TenantGuard: - When MOLECULE_ORG_ID is set → every non-allowlisted request must carry a matching X-Molecule-Org-Id header. Mismatched/missing header → 404 (not 403 — don't leak tenant existence by letting probers distinguish "wrong org" from "route doesn't exist"). - When unset → passthrough. Self-hosted / dev / CI behavior unchanged. - Allowlist is exact-match, not prefix — /health and /metrics only. No orgs table, no signup, no billing, no Fly provisioning in this repo — all that lives in the private control plane. The public repo's SaaS surface is exactly this one middleware. 6 tests covering: unset-is-passthrough, matching header, mismatched header 404 (with empty body), missing header 404, allowlist bypass, and allowlist-is-exact-match. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 15:20:33 -07:00
Hongming Wang	496dee8e13	feat(platform): GET /admin/workspaces/:id/test-token for E2E (#6 ) Adds a gated admin endpoint that mints a fresh workspace bearer token on demand, eliminating the register-race currently used by test_comprehensive_e2e.sh (PR #5 follow-up). - New handler admin_test_token.go: returns 404 unless MOLECULE_ENV != production or MOLECULE_ENABLE_TEST_TOKENS=1. Hides route existence in prod (404 not 403). - Mints via wsauth.IssueToken; logs at INFO without the token itself. - Verifies workspace exists before minting (missing -> 404, never 500). - Tests cover prod-hidden, enable-flag-overrides-prod, missing workspace, and happy-path + token-validates round trip. - tests/e2e/_lib.sh gains e2e_mint_test_token helper for downstream adoption. - CLAUDE.md updated with route + env vars. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 09:35:26 -07:00
Dev Lead Agent	85be574e4d	fix(security): C18 register ownership check, C20 DELETE auth gate C18 — Workspace URL hijacking (CRITICAL, CONFIRMED LIVE): POST /registry/register now calls requireWorkspaceToken() before persisting anything. If the workspace has any live auth tokens, the caller must supply a valid Bearer token matching that workspace ID. First registration (no tokens yet) passes through — token is issued at end of this function (unchanged bootstrap contract). Mirrors the same pattern already applied to /registry/heartbeat and /registry/update-card. Attacker POC — overwriting Backend Engineer URL to http://attacker.example.com:9999/steal — now returns 401. C20 — Unauthenticated workspace deletion (CRITICAL, CONFIRMED LIVE): DELETE /workspaces/:id moved from bare router into AdminAuth group. Any valid workspace bearer token grants access (same fail-open bootstrap contract as /settings/secrets). Mass-deletion attack chain (C19 list → C20 delete all) requires auth for the DELETE step. POST /workspaces (create) also moved to AdminAuth to prevent unauthenticated workspace creation. C19 (GET /workspaces topology exposure) deferred — canvas browser has no bearer token; fix requires canvas service-token refactor. Tests: 2 new registry tests — C18 bootstrap (no tokens, passes through and issues token), C18 hijack blocked (has tokens, no bearer → 401). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 07:38:53 +00:00
Dev Lead Agent	fec7ac82d3	fix(security): protect global secrets routes with AdminAuth middleware (Cycle 7) Three unauthenticated routes allowed arbitrary read/write/delete of all global platform secrets (API keys, provider credentials) with zero auth: - GET/PUT/POST /settings/secrets - DELETE /settings/secrets/:key - GET/POST/DELETE /admin/secrets (legacy aliases) Fix: new AdminAuth middleware with same lazy-bootstrap contract as WorkspaceAuth — fail-open when no tokens exist (fresh install / pre-Phase-30 upgrade), enforce once any workspace has a live token. Any valid workspace bearer token grants access (platform-wide scope, no workspace binding needed). Changes: wsauth/tokens.go — HasAnyLiveTokenGlobal + ValidateAnyToken functions wsauth/tokens_test.go — 5 new tests covering both new functions middleware/wsauth_middleware.go — AdminAuth middleware router/router.go — global secrets routes now registered under adminAuth group Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 06:33:22 +00:00
Dev Lead Agent	6c78962a33	fix(security): Cycle 5 — auth middleware, injection hardening, skill sandbox Fix A — platform/internal/middleware/wsauth_middleware.go (NEW): WorkspaceAuth() gin middleware enforces per-workspace bearer-token auth on ALL /workspaces/:id/* sub-routes. Same lazy-bootstrap contract as secrets.Values: workspaces with no live token are grandfathered through. Blocks C2, C3, C4, C5, C7, C8, C9, C12, C13 simultaneously. Fix A — platform/internal/router/router.go: Reorganised route registration: bare CRUD (/workspaces, /workspaces/:id) and /a2a remain on root router; all other /workspaces/:id/* sub-routes moved into wsAuth = r.Group("/workspaces/:id", middleware.WorkspaceAuth(db.DB)). CORS AllowHeaders updated to include Authorization so browser/agent callers can send the bearer token cross-origin. Fix B — workspace-template/heartbeat.py: _check_delegations(): validate source_id == self.workspace_id before accepting a delegation result. Attacker-crafted records with a foreign source_id are silently skipped with a WARNING log (injection attempt). trigger_msg no longer embeds raw response_preview text; references delegation_id + status only — removes the prompt-injection vector. Fix C — workspace-template/skill_loader/loader.py: load_skill_tools(): before exec_module(), verify script is within scripts_dir (path traversal guard) and temporarily scrub sensitive env vars (CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_API_KEY, OPENAI_API_KEY, WORKSPACE_AUTH_TOKEN, GITHUB_TOKEN, GH_TOKEN) from os.environ; restore in finally block. Defence-in-depth even if /plugins auth gate is bypassed. Fix D — platform/internal/handlers/socket.go: HandleConnect(): agent connections (X-Workspace-ID present) validated via wsauth.HasAnyLiveToken + wsauth.ValidateToken before WebSocket upgrade. Canvas clients (no X-Workspace-ID) remain unauthenticated. Fix D — workspace-template/events.py: PlatformEventSubscriber._connect(): include platform_auth bearer token in WebSocket upgrade headers alongside X-Workspace-ID. Fix E — workspace-template/executor_helpers.py: recall_memories() and commit_memory() now pass platform_auth bearer token in Authorization header so WorkspaceAuth middleware allows access. Fix F — workspace-template/a2a_client.py: send_a2a_message(): timeout=None → httpx.Timeout(connect=30, read=300, write=30, pool=30). Resolves H2 flagged across 5 consecutive audits. Tests: 149/149 Python tests pass (test_heartbeat + test_events updated to assert new source_id validation behaviour and allow Authorization header). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 04:44:42 +00:00
Hongming Wang	dae07d61fd	chore: structural cleanup — dead dirs, moves, gitignore - Delete empty platform/plugins/ (dead remnant; plugins/ at repo root is the real registry; router.go comment updated) - Gitignore local dev cruft: platform/workspace-configs-templates/, .agents/ (codex/gemini skill cache), backups/ - Untrack .agents/skills/ (keep local, stop tracking) - Move examples/remote-agent/ → sdk/python/examples/remote-agent/ (co-locate with the SDK it exercises); update refs in molecule_agent README + __init__ + PLAN.md + the demo's own README - Move docs/superpowers/plans/ → plugins/superpowers/plans/ (plans were written by the superpowers plugin's writing-plans subskill; belong with the plugin, not under docs) - Add tests/README.md explaining the unit-tests-per-package + root-E2E split so new contributors don't ask - Add docs/README.md explaining why site tooling lives under docs/ rather than a separate docs-site/ (VitePress ergonomics) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 14:06:52 -07:00
Hongming Wang	24fec62d7f	initial commit — Molecule AI platform Forked clean from public hackathon repo (Starfire-AgentTeam, BSL 1.1) with full rebrand to Molecule AI under github.com/Molecule-AI/molecule-monorepo. Brand: Starfire → Molecule AI. Slug: starfire / agent-molecule → molecule. Env vars: STARFIRE_* → MOLECULE_*. Go module: github.com/agent-molecule/platform → github.com/Molecule-AI/molecule-monorepo/platform. Python packages: starfire_plugin → molecule_plugin, starfire_agent → molecule_agent. DB: agentmolecule → molecule. History truncated; see public repo for prior commits and contributor attribution. Verified green: go test -race ./... (platform), pytest (workspace-template 1129 + sdk 132), vitest (canvas 352), build (mcp). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 11:55:37 -07:00

22 Commits