molecule-ai-workspace-runtime

molecule-ai/molecule-ai-workspace-runtime

Author	SHA1	Message	Date
molecule-ai[bot]	30d96b4e4e	fix(platform_auth): validate WORKSPACE_ID at import time (issue #14 , CWE-20) (#29 ) WORKSPACE_ID was read via os.environ.get("WORKSPACE_ID", "") in multiple builtin_tools modules and used directly in platform API URLs and X-Workspace-ID headers without validation. A crafted ID containing /, .., or # could cause URL path injection. Fix: validate_workspace_id() in platform_auth.py now validates the ID format at module import time using a regex that permits only lowercase alphanumerics and hyphens (matching UUIDs and org-generated IDs). The validated value is exposed as a module-level WORKSPACE_ID constant. builtin_tools/approval.py and builtin_tools/delegation.py now import from platform_auth instead of reading os.environ directly. Failing input raises ValueError with a clear message — workspace fails fast at startup rather than silently accepting malformed IDs in requests. Add 15 regression tests (45/45 passing total). Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Infra-Runtime-BE <infra-runtime-be@molecule.ai>	2026-04-21 00:04:54 +00:00
Hongming Wang	4aa0d9f110	fix(adapter-loader): fall back to any BaseAdapter subclass ADAPTER_MODULE resolution required the imported module to export a class literally named `Adapter`. The claude-code, langgraph, and openclaw adapter-template repos (3 of 4 currently in production) don't ship that alias — they export ClaudeCodeAdapter / LangGraphAdapter / OpenClawAdapter directly. Only hermes has the `Adapter = HermesAdapter` shim at the bottom of adapter.py. Consequence in prod: every fresh claude-code / langgraph / openclaw workspace crashed at runtime startup with "module 'adapter' has no attribute 'Adapter'", even with a2a-sdk correctly pinned <1.0. Provisioning looked successful from CP's side (EC2 ran) but the agent never registered because the process never reached A2A bootstrap. Fix: if `Adapter` is absent from the imported module, scan the module for any attribute that is a proper BaseAdapter subclass (excluding BaseAdapter itself — regression guard in tests). The explicit alias remains the preferred contract; this is purely additive tolerance. Bump to 0.1.4 and publish to PyPI via the existing v* tag trigger. 6 new tests cover: explicit alias, subclass-fallback, non-adapter-noise ignored, empty module → error, missing module → error, re-exported BaseAdapter → not selected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 16:59:12 -07:00
molecule-ai[bot]	2bb0f97085	Merge pull request #26 from Molecule-AI/fix/plugin-setup-env-scrub fix(plugins_registry/builtins): strip API keys from plugin setup.sh env	2026-04-20 23:04:33 +00:00
Molecule AI Infra-Runtime-BE	d6944086fe	fix(plugins_registry/builtins): strip API keys from plugin setup.sh env Issue #19 (CWE-C-312): AgentskillsAdaptor.install() passed the full os.environ to the subprocess running setup.sh, including ANTHROPIC_API_KEY, OPENAI_API_KEY, GITHUB_TOKEN, WORKSPACE_AUTH_TOKEN, etc. A malicious or compromised plugin's setup.sh could exfiltrate them. Fix: _scrubbed_env() builds a copy of os.environ with sensitive keys removed, matching the same _SCRUB_KEYS list used in skill_loader/loader.py so the scrubbing policy is consistent. CONFIGS_DIR is still passed via the extra dict. Non-secret vars (PATH, HOME, etc.) are preserved. Add 6 regression tests (30/30 passing). Co-Authored-By: Infra-Runtime-BE <infra-runtime-be@molecule.ai>	2026-04-20 22:52:13 +00:00
Molecule AI Infra-Runtime-BE	c72fbfc9a4	fix(builtin_tools/audit): fail-secure RBAC — read-only default when config unavailable Fixes #11 (CWE-285): get_workspace_roles() returned ["operator"] (full delegate/approve/memory.write) when workspace config could not be loaded. Changed to ["read-only"] — deny-by-default per Principle of Least Privilege. Add regression tests in tests/test_audit.py. Also includes: - main.py: remove token prefix log (CWE-532) — issue #10/#17 - a2a_mcp_server.py: RBAC gate on sensitive MCP tools (CWE-862) — issue #12 - cli_executor.py: sanitize stderr in error logs (CWE-209) — issue #13 - tests/test_a2a_mcp_server.py: 5 new regression tests for MCP RBAC Co-Authored-By: Infra-Runtime-BE <infra-runtime-be@molecule.ai>	2026-04-20 22:47:38 +00:00
rabbitblood	6cd4d74c5a	test: move sdk stubs to conftest.py (consistent across all test modules)	2026-04-16 11:15:45 -07:00
rabbitblood	a35d128870	test: stub claude_agent_sdk + a2a in session-resume tests CI failed on collect because claude_agent_sdk + a2a aren't test-env deps (they're installed inside the claude-code workspace image). The test file now stubs both via sys.modules so the collector can import claude_sdk_executor without pulling the real SDKs. Tests don't exercise the SDK anyway — only _resolve_resume() glob logic.	2026-04-16 11:13:33 -07:00
rabbitblood	3b56410ad5	fix: gate session resume on file existence (closes #488 ) ## Symptom (cycle 6+ of #488) Workspaces appear `online` (heartbeats fine) but every cron tick fails silently with `No conversation found with session ID: <uuid>` → `ProcessError: exit code 1` → idle loop logs HTTP 200, no actual work happens. Backend Engineer received 5 idle pulses without claiming a single one of the 6 open Hermes issues (#496-500) because the bug prevents `gh issue list` from ever firing. ## Root cause (verified live in ws-20cb8ff8-3e4 today) claude-code stores sessions at `/root/.claude/projects/<cwd-with-/-as-->/<id>.jsonl`. When a workspace container is recreated, `self._session_id` from a prior instance references a file that no longer exists. Passing it as `resume=<id>` to ClaudeAgentOptions crashes the CLI on the very first call. The existing #75 fix only fires AFTER the first ProcessError lands, and per-cycle executor re-instantiation can reload the stale id from elsewhere — restart-with-reset_claude_session was the only working mitigation, hand-fired every cycle. ## Fix New `_resolve_resume()` in ClaudeSDKExecutor: probes a handful of well-known session-file locations (`/root/.claude/projects/*/<id>.jsonl`, `/root/.claude/sessions/<id>.jsonl`, plus the agent-uid variants) via `glob.glob`. If no file matches the in-memory `_session_id`, drops the id (sets to None) AND returns None so `ClaudeAgentOptions.resume` is unset — CLI starts a fresh session. Logged at INFO with `#488` in the message so operators correlate. `_build_options()` now calls `_resolve_resume()` instead of reading `self._session_id` directly. Cheap path when no session set: zero glob calls. Hot path (session set + file exists): one glob call, short-circuits on first match. ## Drive-by fix: stale `from X import` in 4 modules Same regression class as #1 (the runtime release that closed it): - `claude_sdk_executor.py:43`: `from executor_helpers import …` - `cli_executor.py:39-40`: `from config import …`, `from executor_helpers import …` - `main.py:28-30`: `from config import …`, `from heartbeat import …`, `from preflight import …` - `preflight.py:7`: `from config import …` All rewritten to absolute `from molecule_runtime.<module> import …` so they resolve outside of workspace containers (e.g. test environments where `/app` isn't on sys.path). The grep guard in `tests/test_imports.py` already covered `adapters` — extending to all top-level imports would catch this class going forward; not in this PR to keep scope tight. ## Tests 6 new in `tests/test_session_resume_gate.py`: - baseline (no session) → no glob, returns None - file exists → keep id, returns id, single glob (early-exit) - file missing → drop id (clears `_session_id`), returns None - late-pattern match → walks all patterns until hit - log includes session id (operator triage) - log references #488 (debugger discoverability) All 16 tests (10 existing + 6 new) pass. ## Release plan - Bump version 0.1.1 → 0.1.2 (in this commit) - After merge, push v0.1.2 tag → publish.yml auto-publishes to PyPI - Then rebuild workspace template images locally so workspaces pick up the fix (templates pin `>=0.1.0`, will resolve to 0.1.2 on next build) - Then mass-restart workspaces with reset_claude_session=true once to clear any DB-side stale state, and the permanent fix kicks in Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 11:12:03 -07:00
rabbitblood	9cdae9afec	fix: switch top-level `from adapters import` to absolute imports (#1 ) Every modular workspace template repo (claude-code, hermes, langgraph, …) was crashing on boot with: KeyError: "Unknown runtime '<runtime>'. Available: " Root cause: `molecule_runtime/main.py` and four other modules used top-level imports like `from adapters import get_adapter` — a monorepo legacy that resolved when something on sys.path had an `adapters/` package. Standalone template repos COPY only `adapter.py` (singular) to /app and don't ship an `adapters/` package, so this import path went through some side-resolution that left `get_adapter` unable to see the user's adapter. The ADAPTER_MODULE → import → getattr → issubclass chain then silently fell through to the discovery branch and reported "Unknown runtime". Fix is one-line per file: `from adapters` → `from molecule_runtime.adapters` in: - molecule_runtime/main.py:27 - molecule_runtime/a2a_executor.py:44 - molecule_runtime/coordinator.py:20 - molecule_runtime/prompt.py:6 - molecule_runtime/builtin_tools/temporal_workflow.py:417 Tests + CI added so this regression class is caught at PR time, not at runtime in self-hosters' clusters: - tests/test_imports.py: parametrised import smoke for every previously affected module + a grep guard that fails if any future change reintroduces a top-level `from adapters` / `import adapters` line - .github/workflows/ci.yml: runs the smoke on every PR (no CI existed before — the publish workflow only fires on tag push) Closes #1. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 07:53:03 -07:00

9 Commits