molecule-core

Author	SHA1	Message	Date
molecule-ai[bot]	bcd256946f	Merge pull request #890 from Molecule-AI/test/issue-790-crash-resume-integration test(integration): crash-resume integration tests for Temporal checkpoints (#790)	2026-04-18 00:02:48 +00:00
molecule-ai[bot]	18cb498bca	Merge pull request #840 from Molecule-AI/feat/issue-800-opencode-mcp-bridge feat(platform): opencode MCP bridge — remote A2A tools over HTTP (#800)	2026-04-17 22:15:38 +00:00
molecule-ai[bot]	c5a1318de8	fix(mcp): add TODO(#838) in toolCommitMemory + document X-Workspace-ID trust in toolDelegateTask. Security Auditor pre-merge conditions for PR #840: C5: toolCommitMemory passes content directly to DB insert without secret redaction. Gap is tracked in #838 (platform-wide _redactSecrets pass). Adds inline TODO(#838) comment at the insert site so the gap is visible in-code, not only in the issue tracker. C6: toolDelegateTask sets X-Workspace-ID but no bearer token on the outbound A2A call. The /workspaces/:id/a2a route is intentionally outside WorkspaceAuth (by design in router.go). CanCommunicate is enforced before the request is constructed, and callerID was authenticated by WorkspaceAuth on the MCP bridge entry point. Documents this trust assumption at the call site.	2026-04-17 22:13:55 +00:00
rabbitblood	a6ba22d8ec	fix(slack): tables as monospace blocks + ASCII dividers + strikethrough Tables: Slack has no table syntax. Converter now detects markdown tables and renders them as monospace code blocks with aligned columns. Dividers: replaced unicode em-dash (caused encoding artifacts) with plain ASCII dashes. Strikethrough: ~~text~~ converts to ~text~ (Slack native). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 15:01:46 -07:00
rabbitblood	ea574723df	fix(slack): restore FetchChannelHistory — was lost during branch juggling The function was defined on a feature branch, referenced by manager.go and slack_test.go, but never made it to main after the rebase. This caused go build to fail with 'undefined: FetchChannelHistory', which Docker masked by using a cached binary from the last successful build. That cached binary had neither the mrkdwn blocks nor the Level 3 context injection — explaining why Slack messages showed raw markdown despite the source having the converter. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 14:55:53 -07:00
rabbitblood	e3ada13adf	fix(slack): use blocks API for mrkdwn rendering + restore Level 3 Slack's chat.postMessage renders the text field as plain text when username override is used. Switching to blocks with type=mrkdwn forces rich formatting (bold, links, code, dividers). Also restores FetchWorkspaceChannelContext that was lost in rebase. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 14:47:07 -07:00
rabbitblood	a3579d92b2	fix(slack): restore mrkdwn converter + FetchWorkspaceChannelContext after rebase Both were lost during the PR #844 rebase — the converter was in the source but the binary couldn't compile because FetchWorkspaceChannelContext was missing from manager.go (interface mismatch). Previous deploys silently used the cached old binary without the converter. Also removed unused 'log' import that blocked compilation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 14:38:53 -07:00
rabbitblood	1de7e5788a	fix(slack): convert Markdown to mrkdwn before posting. Agents output standard Markdown (Claude Code default) but Slack uses its own mrkdwn format. Without conversion: **bold** shows as literal **bold**; ### heading shows as literal ###; [text](url) shows as raw markdown link. Converter handles: **bold** → *bold* (Slack bold is single asterisk); ### heading → *heading* (bold text, no headings in Slack); [text](url) → <url|text> (Slack link format); --- → ——— (visual separator); `code` and ```blocks``` pass through unchanged. 6 new tests: bold, heading, link, hr, code block, mixed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 14:26:41 -07:00
rabbitblood	15600b41ae	test(slack): add 12 unit tests for Slack adapter Covers: message splitting (short/long/newline boundary), config validation (bot_token/webhook/missing), FetchChannelHistory edge cases (empty token/channel), adapter type/name. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 14:16:13 -07:00
rabbitblood	847d0b88e8	feat(slack): Level 3 — ambient cross-agent context from Slack channels When a cron fires, the scheduler now fetches the last 10 messages from the workspace's Slack channel via conversations.history and prepends them to the cron prompt as '[Slack channel context — recent team messages]'. This gives each agent ambient awareness of what peers are doing: - Backend sees Frontend posted 'PR #840 ready for review' → can check - Security Auditor sees Backend posted 'new endpoint added' → plans review - PM sees all engineering activity → better synthesis in rollup Implementation: - slack.go: FetchChannelHistory() calls conversations.history, filters bot's own messages, returns last N as SlackHistoryMessage structs - manager.go: FetchWorkspaceChannelContext() looks up the workspace's Slack config, fetches history, formats as readable context block - scheduler.go: ChannelBroadcaster interface extended with FetchWorkspaceChannelContext; fireSchedule injects context before the cron prompt (prepended, not appended, so the agent sees team context BEFORE its task instructions) Best-effort: if Slack API fails or workspace has no channels, the prompt is unchanged. Truncated to 200 chars per message, 10 messages max to keep prompt overhead bounded. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 14:15:51 -07:00
rabbitblood	95d0bc25a3	fix(slack): address code review — 6 critical + improvement fixes Code review findings addressed: Critical: 1. Bot echo loop: add bot_id + subtype='bot_message' check in ParseWebhook to prevent outbound auto-posts from triggering inbound → infinite loop 2. Connection leak: close resp.Body immediately after reading instead of defer inside loop (was holding N connections open for N chunks) 3. Cancelled context: auto-post goroutine now uses context.Background() with 30s timeout instead of inheriting fireCtx (which gets cancelled by deferred cancel() when fireSchedule returns) 4. Slug validation: regex ^[a-zA-Z0-9 _-]+$ rejects path traversal and special chars in [slug] routing Improvements: 5. Shared HTTP client (slackHTTPClient) for connection pooling instead of per-request &http.Client{} 6. Rune-safe truncation in BroadcastToWorkspaceChannels for CJK/emoji 7. Log async HandleInbound errors instead of silently discarding 8. url_verification challenge properly returned (c.JSON with challenge) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 14:15:51 -07:00
rabbitblood	65bc6a8ca5	feat(channels): [slug] routing for inbound Slack messages Humans type [backend] what's #800? in a shared #mol-engineering channel and the message routes specifically to Backend Engineer's workspace. Matching logic (case-insensitive): [pm] → PM [backend] → Backend Engineer [dev-lead] → Dev Lead [security] → Security Auditor (prefix match on 'security-auditor') Unknown slugs return the available agent list for that channel so the user knows what slugs are valid. Messages without a [slug] prefix route to the first matching workspace (backward compat with Level 2). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 14:15:51 -07:00
rabbitblood	3f161a41eb	feat(slack): Level 1 auto-post + Level 2 inbound routing Level 1 — Auto-post cron output to Slack: - scheduler.go: captures A2A response body, extracts agent text via extractResponseSummary(), broadcasts to workspace's configured Slack channels on successful non-empty cron completions - manager.go: adds BroadcastToWorkspaceChannels() — fans out to all enabled channels for a workspace (engineering+firehose for eng agents, research+firehose for research agents, etc.) - main.go: wires scheduler → channel manager via SetChannels() - Truncates output to 500 chars for Slack readability Level 2 — Inbound Slack messages route to workspaces: Already implemented by the existing webhook handler (POST /webhooks/slack) + the ParseWebhook method in slack.go which handles both Events API JSON payloads and slash command form-encoded payloads. Needs Slack App Events API URL configured to: https://<platform-host>/webhooks/slack Also in this commit: - slack.go: dual-mode adapter (bot_token + webhook fallback) - 031 migration: pgvector guard wraps entire DO block Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 14:15:51 -07:00
rabbitblood	735aae6564	feat(slack): upgrade adapter to Bot API with per-agent identity + fix pgvector migration Slack adapter: adds chat.postMessage mode alongside legacy webhooks. When bot_token is configured, uses chat:write.customize for per-agent display name + emoji on every message. Each of the 15 active agents posts with a distinct identity (PM 💼, Backend ⚙️, etc.). 5 channels configured: #mol-engineering — PM, Dev Lead, Frontend, Backend, QA, Security, UIUX, Docs #mol-research — Research Lead, Market Analyst, Tech Researcher, Competitive Intel #mol-ops — DevOps, Triage, Offensive Security #mol-ceo-feed — PM synthesized rollup (CEO-facing) #mol-firehose — all agents (raw feed) Tested live: 5 test messages across 4 channels, all ok=true. pgvector migration: moved ALTER TABLE + CREATE INDEX inside the DO block so the entire migration is skipped when pgvector extension is unavailable (was crashing platform on restart — the guard caught CREATE EXTENSION but execution continued to ALTER TABLE which used the non-existent vector type). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 14:15:51 -07:00
Molecule AI Backend Engineer	29cc845c5f	feat(platform): opencode MCP bridge — remote A2A tools over HTTP (#800) Implements sub-issues #809 (MCPHandler), #810 (tool filtering), #811 (per-token rate limiting), #813 (opencode.json), #814 (docs). Routes (registered under wsAuth — bearer token binds to :id): GET /workspaces/:id/mcp/stream — SSE transport (backwards compat) POST /workspaces/:id/mcp — Streamable HTTP transport (primary) Security conditions from review (all mandatory): C1: WorkspaceAuth middleware rejects requests without valid bearer token C2: MCPRateLimiter (120 req/min/token, SHA-256 keyed) applied on both routes C3: commit_memory/recall_memory with scope=GLOBAL → permission error; send_message_to_user excluded unless MOLECULE_MCP_ALLOW_SEND_MESSAGE=true Tools: list_peers, get_workspace_info, delegate_task, delegate_task_async, check_task_status, send_message_to_user (opt-in), commit_memory, recall_memory. All mirror workspace-template/a2a_mcp_server.py TOOLS list. Also adds: org-templates/molecule-dev/opencode.json, docs/integrations/opencode.md, .env.example entries for MOLECULE_MCP_ALLOW_SEND_MESSAGE and MOLECULE_MCP_URL. Tests: 29 new tests (20 handler + 9 middleware). All passing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 19:25:22 +00:00
Molecule AI QA Engineer	a663c8de81	test(integration): crash-resume integration tests for Temporal checkpoints (#790) Closes #790. Depends on feat/issue-583-1-checkpoint-persistence (PR #788). Platform (Go) — checkpoints_integration_test.go (5 new tests): 1. ThreeStepPersistence: POST task_receive/llm_call/task_complete → GET returns all 3 in step_index DESC order with correct names and payloads. 2. CrashResume_HighestStepIsResumptionPoint: POST steps 0+1 only (crash before step 2) → GET shows step_index=1 as the resume point; task_complete absent. 3. UpsertIdempotency_LatestPayloadWins: POST same (wf_id, step_name) twice with different payloads → List returns only the second payload (ON CONFLICT DO UPDATE). 4. PostCascadeDelete_Returns404: simulate post ON-DELETE-CASCADE state (empty rows) → List returns 404 as expected after workspace deletion. 5. AuthGate_NoToken_Returns401: router-level test with WorkspaceAuth middleware; POST/GET/DELETE all return 401 without a bearer token (no DB calls made). workspace-template — _save_checkpoint + 4 Python tests: - Add async _save_checkpoint() to temporal_workflow.py: POST to the platform checkpoint endpoint after each activity stage; fully non-fatal (try/except inside the function, plus defence-in-depth try/except at every call site). - 4 new pytest cases (test_temporal_workflow.py): - nonfatal_on_http_error: _save_checkpoint raises HTTPStatusError (500) → task_receive_activity still returns {"status":"received"}. - nonfatal_on_network_error: _save_checkpoint raises ConnectError → llm_call_activity still returns success LLMResult. - success_path: _save_checkpoint no-op → activity returns correctly; checkpoint called with correct args. - standalone_http_error_is_swallowed: real _save_checkpoint function swallows HTTP 500 from a mocked httpx.AsyncClient; returns None. All 36 temporal workflow Python tests pass. Go tests: Go binary not in this container; test file verified for syntax and against the sqlmock patterns used throughout the handlers package. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 19:17:29 +00:00
molecule-ai[bot]	2afc09fd0a	fix(scheduler): detect phantom-producing crons — consecutive-empty tracking (closes #795)	2026-04-17 19:06:35 +00:00
molecule-ai[bot]	38377d2f08	feat(platform): Temporal checkpoint DB persistence layer (closes #788)	2026-04-17 19:05:48 +00:00
molecule-ai[bot]	ea59e59838	test(supply-chain): TDD spec for plugin supply-chain hardening (closes #768)	2026-04-17 19:05:14 +00:00
molecule-ai[bot]	38a37eb8c2	fix(security): plugin supply chain hardening — SAFE-T1102 (closes #768)	2026-04-17 19:04:04 +00:00
Molecule AI Backend Engineer	7c4123e6bd	feat(platform): Temporal checkpoint DB persistence layer (#788 ) Adds step-level checkpoint storage so workflows can resume from the last completed step after a crash or restart without replaying prior work. - Migration: `workflow_checkpoints` table — workspace_id (FK + CASCADE), workflow_id, step_name, step_index, completed_at, payload JSONB. UNIQUE(workspace_id, workflow_id, step_name) + covering index on (workspace_id, workflow_id, completed_at DESC). - Handlers (platform/internal/handlers/checkpoints.go): POST /workspaces/:id/checkpoints — upsert via ON CONFLICT DO UPDATE GET /workspaces/:id/checkpoints/:wfid — list steps ordered step_index DESC DELETE /workspaces/:id/checkpoints/:wfid — clear on clean shutdown (404 if none) - Router: all three routes on the wsAuth group (WorkspaceAuth middleware); workspace A's token cannot reach workspace B's checkpoints. - Tests (11 cases, sqlmock + race-safe): upsert-insert, upsert-update, payload forwarding, list-ordered, list-not-found, rows.Err() → 500, delete-success, delete-not-found, callerMismatch 403 on all 3 endpoints. Closes #788. Parent: #583-1. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 18:36:12 +00:00
rabbitblood	d58aab3c91	fix(scheduler): detect phantom-producing crons via consecutive-empty tracking (#795 ) Post-mortem fix: UIUX Designer ran 22 cron fires over 23 hours with every single response being empty or '(no response generated)'. The scheduler reported status=ok because the HTTP call succeeded — nobody caught it until the CEO asked. Changes: - Migration 032: adds consecutive_empty_runs INT to workspace_schedules - scheduler.go: captures response body from ProxyA2ARequest (was _), checks for empty/sentinel markers via isEmptyResponse(), increments consecutive_empty_runs on empty ok responses, resets on non-empty. When consecutive_empty_runs >= 3, sets last_status='stale' with a descriptive error message. The 'stale' status is surfaced via: - GET /admin/schedules/health (merged in #671) - PM's silence detector (companion fix in org-template PR) - Maintenance loop response-body sampling (operator-side fix) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 11:11:05 -07:00
molecule-ai[bot]	3de4d25684	feat: pgvector semantic search for agent memory recall (#576 ) Rebase of feat/issue-576-pgvector-semantic-memory onto current main, preserving the #767 security layer (globalMemoryDelimiter + GLOBAL audit log) that predates this branch. Changes layered on top of main: - Migration 031: embedding vector(1536) column + ivfflat cosine-ops index (renumbered from 029 — 029/030 were taken by workspace-hibernation and audit-events) - Commit: embed-on-write after INSERT, non-fatal on embedding failure - Search: semantic cosine-distance path when EmbeddingFunc is wired up; falls back to FTS/ILIKE; GLOBAL delimiter wrapping applies on both paths - EmbeddingFunc injection pattern; WithEmbedding chainable builder All security invariants preserved: - globalMemoryDelimiter wrapping on GLOBAL scope in both semantic + FTS - GLOBAL write audit log (SHA-256 forensic trail) in Commit - TestRecallMemory_GlobalScope_HasDelimiter passes - TestMemoriesCommit_Global_AsRoot passes - 3 new pgvector tests pass Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>	2026-04-17 17:19:45 +00:00
Molecule AI QA Engineer	1d74168a2a	test(supply-chain): TDD spec for plugin supply-chain hardening (#768 ) Adds platform/internal/plugins/supply_chain_test.go with 8 tests (7 from the spec + 1 end-to-end combo) specifying both security controls. Control 1 — SHA256 content integrity (tests 1-3 + end-to-end): Tests call VerifyManifestIntegrity(stagedDir string) error, which does NOT exist yet → 5 compile errors / build failure until supply_chain.go is written. Once stubbed to nil, SHA256Mismatch test fails at runtime. VerifyManifestIntegrity contract: - manifest.json absent → nil (backward compat) - manifest.json present, no sha256 field → nil (backward compat) - sha256 matches computed stagedDirDigest → nil - sha256 mismatch → error mentioning "sha256" stagedDirDigest algorithm (canonical, test + impl must agree): Walk all files except manifest.json, sorted by rel path, format each as "<rel>\x00<content>", concatenate, SHA256, hex. Control 2 — Pinned-ref enforcement (tests 4-7): Tests call GithubResolver.Fetch with/without "#ref" fragment. Currently returns nil for bare refs → TestPluginInstall_UnpinnedRef_Rejected fails (GitRunner IS called; no "pinned ref" in error message). PLUGIN_ALLOW_UNPINNED=true escape hatch tested by test 7. RED state summary (current): go test ./internal/plugins/... -v -run TestPluginInstall → build failed: 5× undefined: VerifyManifestIntegrity → (with no-op stub) 2 runtime failures: FAIL TestPluginInstall_SHA256Mismatch_AbortsInstall FAIL TestPluginInstall_UnpinnedRef_Rejected Backend Engineer implementation checklist: [ ] Add supply_chain.go in package plugins with VerifyManifestIntegrity [ ] Add pinned-ref gate to GithubResolver.Fetch in github.go [ ] PLUGIN_ALLOW_UNPINNED=true check skips the gate [ ] All 8 tests GREEN before merge Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 16:41:32 +00:00
molecule-ai[bot]	5fa86cfbbd	fix(security): plugin supply chain hardening — SAFE-T1102 (#768 ) Add two defenses against malicious plugins from uncontrolled sources: 1. Pinned-ref enforcement (resolveAndStage): github:// install/download specs without a #<tag/sha> suffix are now rejected with HTTP 422. A mutable default-branch tip could change between audit and install, silently swapping in untrusted code. Override via PLUGIN_ALLOW_UNPINNED=true. 2. SHA-256 content integrity (installRequest.sha256): callers may supply the expected hex SHA-256 of the fetched plugin.yaml. When present, resolveAndStage verifies the digest after staging; a mismatch aborts the install with HTTP 422 and cleans up the staging dir. Updated TestPluginDownload_GithubSchemeStreamsTarball to use a pinned ref (#v1.0.0) so it reflects the new security requirement. Tests: 4 new (TestPluginInstall_SHA256Mismatch_AbortsInstall, TestPluginInstall_SHA256Match_Succeeds, TestPluginInstall_UnpinnedRef_Rejected, TestPluginInstall_PinnedRef_Accepted). All 15 packages green. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 16:37:45 +00:00
molecule-ai[bot]	4e4d21a8ac	Merge pull request #651 from Molecule-AI/feat/issue-594-audit-ledger feat: molecule-audit-ledger — HMAC-SHA256 immutable agent event log (#594)	2026-04-17 16:37:01 +00:00
molecule-ai[bot]	d5cdec261f	Merge pull request #724 from Molecule-AI/feat/issue-711-workspace-hibernation feat(registry): workspace hibernation — auto-pause idle workspaces	2026-04-17 16:36:27 +00:00
molecule-ai[bot]	0c3cdf6216	Merge pull request #769 from Molecule-AI/fix/issue-767-global-memory-injection fix(security): GLOBAL memory prompt injection safeguards (#767)	2026-04-17 16:35:35 +00:00
molecule-ai[bot]	f8927a84bd	Merge pull request #766 from Molecule-AI/fix/issue-761-system-caller-header-forge fix(security): reject X-Workspace-ID system-caller prefix forgery (#761)	2026-04-17 16:35:25 +00:00
molecule-ai[bot]	8d01a2a09c	fix(security): GLOBAL memory prompt injection safeguards (#767 ) Two defenses against GLOBAL-scope agent memory injection attacks: 1. Recall delimiter: Search() wraps every GLOBAL-scope memory value with a non-instructable prefix before returning it to MCP clients: [MEMORY id=<uuid> scope=GLOBAL from=<workspace_id>]: <value> This prevents stored content (e.g. "IGNORE ALL PREVIOUS INSTRUCTIONS") from being parsed as instructions in the agent's context window. Raw DB content is unchanged — the wrapper is applied on read only. 2. Write audit log: Commit() writes an activity_log entry with activity_type='memory_write_global' whenever a GLOBAL memory is stored. The entry records a SHA-256 hash of the content (never plaintext) alongside memory_id and namespace for forensic replay. Audit failure is non-fatal — a logging error must not roll back a successful write. Tests: - TestRecallMemory_GlobalScope_HasDelimiter — verifies exact delimiter format [MEMORY id=... scope=GLOBAL from=...]: <value> - TestCommitMemory_GlobalScope_AuditLogEntry — verifies activity_logs INSERT fires on every GLOBAL write (via mock.ExpectationsWereMet) - TestMemoriesCommit_Global_AsRoot — updated to expect the audit INSERT All 16 Go test packages pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 16:26:46 +00:00
molecule-ai[bot]	19b4dffd65	fix(security): reject X-Workspace-ID system-caller prefix forgery (#761) Added an early guard in ProxyA2A() that rejects HTTP requests whose X-Workspace-ID header passes isSystemCaller() with 403 Forbidden. Legitimate system callers (webhooks, scheduler, restart_context) call proxyA2ARequest() directly via ProxyA2ARequest() and never send HTTP headers with system-caller prefixes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 16:15:47 +00:00
Hongming Wang	f28b3922f9	Merge pull request #743 from Molecule-AI/feat/issue-727-opus-4-7-default feat: upgrade default workspace model to claude-opus-4-7	2026-04-17 08:47:27 -07:00
Molecule AI QA Engineer	10bb7127a7	test(hibernation): integration tests for workspace hibernation (#711 ) Cover the full hibernation feature (PR #724) + scheduler interaction (#722): handlers/hibernation_test.go (new, 6 tests): - HibernateWorkspace_OnlineWorkspace_Success — container stop called (nil provisioner guard), DB status set to 'hibernated', Redis keys cleared (ws:{id}, ws:{id}:url, ws:{id}:internal_url), WORKSPACE_HIBERNATED broadcast - HibernateWorkspace_NotEligible_NoOp — ErrNoRows → early return, no UPDATE, Redis keys untouched - HibernateWorkspace_DBUpdateFails_NoCrash — UPDATE error → no panic, no broadcast - HibernateHandler_Online_Returns200 — HTTP POST, online workspace → 200 {"status":"hibernated"} - HibernateHandler_NotActive_Returns404 — not online/degraded → 404 - HibernateHandler_DBError_Returns500 — DB error → 500 a2a_proxy_test.go (2 new tests): - ResolveAgentURL_HibernatedWorkspace_Returns503WithWaking — empty Redis + DB returns status=hibernated/url="" → 503 + Retry-After:15 + {waking:true,retry_after:15} - ResolveAgentURL_HibernatedWorkspace_NullURLVariant — same with SQL NULL url scheduler_test.go (1 new test): - RepairNullNextRunAt_HibernatedWorkspace_ScheduleRepaired — repair query has no workspace status filter; hibernated workspace's schedule still gets next_run_at repaired so it fires on wake Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 15:44:41 +00:00
Molecule AI QA Engineer	e0581a22b6	chore: merge main into test/issue-711-hibernation-integration (gets scheduler #722 fix)	2026-04-17 15:40:56 +00:00
Molecule AI Backend Engineer	ebfafb9139	feat: upgrade default workspace model to claude-opus-4-7 (#727) Replace the anthropic:claude-sonnet-4-6 default across config, handlers, env example, and litellm proxy config. All tests updated to match the new default; sonnet-4-6 alias kept in litellm_config.yml for pinned workspaces. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 15:30:57 +00:00
Molecule AI QA Engineer	7aeaf3c07c	test(security): route-specific #684 regression — three vulnerable admin routes The BE's tests (AdminTokenSet_, FailOpen_) validated the core AdminAuth contract on /admin/secrets. These table-driven additions pin the same contract on the three routes explicitly named in the #684 security report, each with three scenarios: workspace token rejected, correct ADMIN_TOKEN accepted, no bearer rejected. Routes covered: GET /admin/liveness GET /admin/github-installation-token GET /approvals/pending When ADMIN_TOKEN is set (tier 2), ValidateAnyToken is never called — the env-var comparison short-circuits before any DB lookup. The mock sets only HasAnyLiveTokenGlobal and nothing else; an extra DB expectation would itself be a test bug (calling it proves the middleware regressed to tier 3). All 18 TestAdminAuth_684* tests pass. Full go test ./... is green across all 15 platform packages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 15:25:41 +00:00
Hongming Wang	00ef832e33	Merge pull request #729 from Molecule-AI/fix/issue-684-adminauth-bearer-scope fix(auth): AdminAuth rejects workspace bearer tokens when ADMIN_TOKEN is set (#684)	2026-04-17 08:17:11 -07:00
Molecule AI Backend Engineer	2452700d37	fix(a2a): restore delivery_confirmed body-read logic removed by hibernation commit (#689) The hibernation PR (`7f5f74d`) accidentally removed the delivery_confirmed fix that was introduced for issue #689. When io.ReadAll fails after the target has already responded with headers (200-399), the message WAS delivered — stripping delivery_confirmed from the error response caused callers to treat a successful send as a hard failure. Restore the full original body-read error block: - deliveryConfirmed flag (true when status 200-399) - log line with status/bytes_read context - logA2ASuccess call when deliveryConfirmed (audit trail accuracy) - proxyA2AError.Response includes "delivery_confirmed" field so callers can distinguish "not delivered" from "delivered, body lost" The hibernation auto-wake feature (resolveAgentURL status='hibernated' check) is orthogonal and untouched. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 15:14:25 +00:00
Molecule AI Backend Engineer	6259e69b42	fix(auth): tighten AdminAuth to reject workspace bearer tokens when ADMIN_TOKEN is set (#684 ) Blast-radius isolation gap: AdminAuth called ValidateAnyToken which accepted any live workspace bearer token. A compromised workspace agent could present its own token to GET /admin/github-installation-token and steal the platform's GitHub App credential, or hit /approvals/pending to enumerate cross-workspace approvals. Fix: introduce a dedicated admin credential tier via ADMIN_TOKEN env var. When set, AdminAuth verifies the bearer against that secret exclusively (crypto/subtle constant-time comparison). Workspace tokens are rejected outright — no DB lookup occurs. When ADMIN_TOKEN is not set the previous behaviour is preserved as a deprecated backward-compat fallback (tier 3) so existing deployments without the env var don't break immediately. Credential tiers (evaluated in order): 1. Fail-open — no live tokens globally (fresh install / pre-Phase-30) 2. ADMIN_TOKEN match — env var set, bearer must equal it exactly 3. Fallback (deprecated) — any valid workspace token (ADMIN_TOKEN unset) Operators should set ADMIN_TOKEN=<openssl rand -base64 32> to fully close the blast-radius gap. Tier 3 will be removed in a future release. Fixes #684. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 15:08:54 +00:00
molecule-ai[bot]	b83ddc7dff	fix(scheduler): prevent NULL next_run_at from permanently dropping schedules (#722 ) Three bugs caused enabled schedules to silently disappear from the fire query (which requires next_run_at IS NOT NULL AND next_run_at <= now()): Bug 1 - fireSchedule() and recordSkipped(): when ComputeNextRun returned an error, nextRunPtr stayed nil and UPDATE SET next_run_at = $2 wrote NULL. Fix: change to COALESCE($2, next_run_at) so the existing DB value is preserved when $2 is NULL, and log the error explicitly. Bug 2 - org importer (handlers/org.go): nextRun, _ := ComputeNextRun(...) silently discarded the error. A bad cron expression would pass time.Time{} (zero value) to the INSERT. Fix: surface the error, log it, and skip the schedule INSERT via continue. Bug 3 - no startup repair: schedules already NULL'd by the pre-fix binary would never recover. Fix: Start() now calls repairNullNextRunAt() once on boot, recomputing next_run_at for every enabled schedule with a NULL value. Tests: TestFireSchedule_ComputeNextRunError, TestRecordSkipped_ComputeNextRunError, TestRepairNullNextRunAt_RepairsRows, TestRepairNullNextRunAt_DBError_NoPanic, TestOrgImport_ScheduleComputeError (all pass). Fixes #722 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 13:34:28 +00:00
molecule-ai[bot]	7f5f74d493	feat(registry): workspace hibernation — auto-pause idle workspaces (#711 ) Implements automatic workspace hibernation for workspaces that have been idle longer than their configured hibernation_idle_minutes threshold. Changes: - migrations/029: Add hibernation_idle_minutes INT DEFAULT NULL column + partial index on workspaces table - registry/hibernation.go: New StartHibernationMonitor goroutine that ticks every 2 min and calls hibernateIdleWorkspaces via the HibernateHandler callback (same import-cycle-prevention pattern as OfflineHandler) - registry/hibernation_test.go: 5 unit tests covering handler calls, no-rows, DB error, tick behaviour, and context-cancel shutdown - handlers/workspace_restart.go: New Hibernate() HTTP handler (POST /workspaces/:id/hibernate) + HibernateWorkspace(ctx, id) method — stops container, sets status='hibernated', clears Redis keys, broadcasts event - handlers/a2a_proxy.go: Auto-wake in resolveAgentURL — when status='hibernated' and URL is empty, triggers async RestartByID and returns 503 + Retry-After: 15 so callers can retry transparently - registry/liveness.go: Exclude 'hibernated' workspaces from offline detection - router.go: Register POST /workspaces/:id/hibernate under wsAuth group - cmd/server/main.go: Wire hibernation monitor via supervised.RunWithRecover Closes #711 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 13:27:39 +00:00
molecule-ai[bot]	c53bf6eebd	Merge pull request #719 from Molecule-AI/fix/issue-697-validate-token-removed-workspace fix(wsauth): add removed-workspace JOIN to ValidateToken (#697)	2026-04-17 12:50:52 +00:00
Hongming Wang	87f2b9abb7	Merge pull request #696 from Molecule-AI/fix/issue-682-684-683-auth-token-fixes fix(security): metrics auth, token revocation hardening, A2A false-negative (#682 #683 #689)	2026-04-17 05:47:08 -07:00
molecule-ai[bot]	059644bc37	fix(wsauth): add removed-workspace JOIN to ValidateToken (#697) Defense-in-depth: workspace-scoped ValidateToken now rejects tokens belonging to workspaces with status='removed' at the DB layer, even when revoked_at IS NULL. Mirrors the same guard added to ValidateAnyToken in #696. Updated all test mock patterns (workspace_test, a2a_proxy_test, secrets_test, admin_test_token_test, middleware) to match the new JOIN query. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 12:46:27 +00:00
Molecule AI QA Engineer	5dbac3a5ee	test(security): regression suite for input validation fixes (#685 #686 #687 #688) 30 test cases covering all four security fixes from PR #701: #686 — AdminAuth gate on GET /templates and GET /org/templates: - NoAuth returns 401 when tokens are enrolled - FreshInstall fails open (bootstraps correctly) #687 — UUID path param validation: - URL-encoded traversal (..%2f..%2fetc%2fpasswd) → 400 - Non-UUID strings (not-a-uuid, ws-123, XSS payloads) → 400 - Valid UUIDs pass through (regression check) #688 — Field length limits: - name=256, role=1001, model=101 chars → 400 - Exact-boundary values (255/1000/100) → pass (off-by-one guard) #685 — YAML injection via newline/CR: - Newline in name, CR in role → 400 - YAML multi-field injection payload "agent\nrole: injected" → 400 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 12:37:13 +00:00
molecule-ai[bot]	63212130e3	Merge pull request #701 from Molecule-AI/fix/issue-685-686-687-688-input-validation fix(security): input validation, route auth, UUID safety (#685 #686 #687 #688)	2026-04-17 12:32:03 +00:00
Molecule AI Backend Engineer	993d39a74e	fix(wsauth): restore ValidateAnyToken removed-workspace JOIN (#682 defense-in-depth), restore ADR-001 blast-radius docs - ValidateAnyToken: add JOIN on workspaces with AND w.status != 'removed' so tokens belonging to deleted workspaces cannot be replayed against admin endpoints even before the token row is explicitly revoked. - tokens_test.go: update ValidateAnyToken regexp patterns to match new JOIN query; add TestValidateAnyToken_RemovedWorkspaceRejected. - wsauth_middleware_test.go: update validateAnyTokenSelectQuery constant to match JOIN query; add TestAdminAuth_RemovedWorkspaceToken_Returns401 to pin the AdminAuth removed-workspace rejection at the middleware layer. - ADR-001: restore full blast-radius endpoint table (15 affected admin routes), explicit risk statement ("full platform takeover"), current mitigations, and Phase-H remediation plan (schema, middleware, bootstrap flow, migration path). Tracking issue: #710.	2026-04-17 12:25:44 +00:00
molecule-ai[bot]	f1b2a2f8a6	fix(security): rebase #685-688 onto main — preserve wsAuth PATCH, add yamlSpecialChars - Rebased onto `15a850ea` (main HEAD, post-#692 IDOR fix) - PATCH /workspaces/:id remains under wsAuth group (not open router) - Added validateWorkspaceID (uuid.Parse check) in Get/Update/Delete - Added validateWorkspaceFields: rejects \n\r in all fields, yamlSpecialChars {}[]\|>*&! in name/role only, enforces max lengths - Template endpoints (GET /templates, GET /org/templates) now require AdminAuth - Replaced stale in-handler sensitiveUpdateFields gate tests with TestWorkspaceUpdate_SensitiveField_AuthEnforcedByMiddleware Closes #685 #686 #687 #688	2026-04-17 12:13:44 +00:00
molecule-ai[bot]	96c06b0174	fix(security): revert #684 schema migration, restore /admin/schedules/health, add ADR-001 Required changes from security auditor before PR #696 can merge: 1. REVERT #684 (token_type schema migration): - Remove migration 029_token_type.{up,down}.sql - Revert wsauth/tokens.go — remove IssueAdminToken, token_type constants, restore HasAnyLiveTokenGlobal and ValidateAnyToken to pre-#684 behavior - Revert admin_test_token.go to use IssueToken (not IssueAdminToken) - Revert associated tests to pre-#684 patterns Path B: formal risk acceptance documented in ADR-001. 2. RESTORE /admin/schedules/health route (regression fix): - Add platform/internal/handlers/admin_schedules_health.go (from PR #671) - Add platform/internal/handlers/admin_schedules_health_test.go (from PR #671) - Wire GET /admin/schedules/health via AdminAuth in router.go 3. ADD ADR-001 (platform/docs/adr/ADR-001-admin-token-scope.md): - Documents #684 as known risk with Phase-H remediation plan - Phase-H tracking issue: Molecule-AI/molecule-core#710	2026-04-17 12:01:12 +00:00
rabbitblood	784376f19f	fix(router): remove AdminAuth from test-token — unblocks E2E bootstrap #612 added AdminAuth to GET /admin/workspaces/:id/test-token, breaking the chicken-and-egg bootstrap that E2E tests rely on: 1. POST /workspaces creates first workspace (fail-open, no tokens) 2. Provision generates a workspace auth token → inserts into DB 3. AdminAuth now sees a live token → requires auth on ALL routes 4. E2E calls test-token to get its first admin bearer → 401 5. All subsequent E2E calls fail → EVERY open PR CI blocked The test-token handler already has its own production guard (TestTokensEnabled returns false when MOLECULE_ENV=prod). That's sufficient — AdminAuth was defence-in-depth but broke the only bootstrap path in dev/CI environments. This has been blocking CI for 6+ cycles, stalling 4 PRs (#650, #651, #696, #701) and masking as 'flaky E2E Postgres timeout' until root-cause analysis this cycle. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 04:50:14 -07:00

236 Commits