molecule-ai-workspace-runtime

molecule-ai/molecule-ai-workspace-runtime

Author	SHA1	Message	Date
Hongming Wang	7fc3537b2f	Merge pull request #42 from Molecule-AI/fix/stderr-capture-a2a-response fix(runtime): capture stderr in A2A error response (closes #66)	2026-04-24 13:25:15 -07:00
Hongming Wang	1759e221e9	Merge pull request #53 from Molecule-AI/chore/bump-0.1.15 chore: bump to 0.1.15 — ship A2A_ERROR observability fix (#51)	2026-04-24 12:03:36 -07:00
rabbitblood	84f3faea8a	chore: bump to 0.1.15 — ship A2A_ERROR observability fix (#51) PR #52 fixed the empty '[A2A_ERROR] ' suffix but didn't bump the version — the fix landed on main without a corresponding PyPI release, so workspace-template rebuilds keep pulling 0.1.14 and the fix never reaches running agents. Bump to 0.1.15 to trigger the publish-on-tag workflow (maintainer pushes v0.1.15 tag after staging→main promotion). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 11:54:13 -07:00
Hongming Wang	0d71ee8345	Merge pull request #52 from Molecule-AI/fix/a2a-error-observability-51 fix(a2a): include exception class + error code in [A2A_ERROR] (#51)	2026-04-24 11:35:42 -07:00
rabbitblood	4940abdc68	fix(tests): remove pytest-asyncio dependency from #51 regression tests CI does not install pytest-asyncio — follow test_shared_runtime.py's _run(coro) helper pattern. Tests still cover the same two paths (bare exception class-name fallback + message passthrough) but no longer require the async pytest plugin. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 11:34:30 -07:00
rabbitblood	6ead3b433e	fix(a2a): include exception class + error code in [A2A_ERROR] (#51) When an exception's str() is empty (bare TimeoutError(), BrokenPipeError(), some httpx transport errors) `f"{_A2A_ERROR_PREFIX}{e}"` produced `"[A2A_ERROR] "` with a trailing space and zero diagnostic context, masking the real cause of peer-delegation failures in activity_logs. Observed on main monorepo: 22+ occurrences in 75 min across 7 leads during the MiniMax M2.7 trial rate-limit episode — zero breadcrumbs to route the debug from. Fix: - Exception branch: fall back to `type(e).__name__` when str(e) is empty - Error branch: include JSON-RPC `error.code` alongside message when present Tests: test_a2a_error_observability.py covers both the bare-exception path (must surface class name) and the message-passthrough path (must preserve existing useful messages). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 11:22:57 -07:00
Hongming Wang	d75a161ee8	fix(ci): sync auto-promote workflow (ff-only, no-gates mode)	2026-04-24 08:35:15 -07:00
Hongming Wang	ae624a1f6a	Merge pull request #50 from Molecule-AI/chore/add-auto-promote-staging chore(ci): add auto-promote-staging workflow	2026-04-24 08:18:43 -07:00
Hongming Wang	f58d12bee2	chore(ci): add auto-promote-staging workflow	2026-04-24 07:43:56 -07:00
Hongming Wang	a80294766c	Merge pull request #49 from Molecule-AI/fix/precommit-skip-rebase fix(precommit): skip during rebase/cherry-pick/merge/revert — unblocks DIRTY PR rebase	2026-04-24 04:35:19 -07:00
rabbitblood	c43df7f947	fix(precommit): skip during rebase/cherry-pick/merge/revert — unblocks DIRTY PR rebase Trace from molecule-core cycle 107 (2026-04-24): 15 staging PRs stuck DIRTY (real merge conflicts) with 0 merges in 1+ hours. Authors couldn't rebase to fix the conflicts because the pre-commit hook (shipped in 0.1.11) refuses ANY commit that includes forbidden paths in the diff — including rebase replays of historical commits that pre-date the gate. Specifically, agents trying to `git rebase staging` on a PR like "docs(marketing): Phase 30 social copy" fail at the first commit replay because that commit added marketing/* files. The fix would require interactive rebase + manual file deletion + commit amend — agents don't do that, so the PR stays DIRTY indefinitely. Detection: check .git for rebase-merge/, rebase-apply/, CHERRY_PICK_HEAD, MERGE_HEAD, or REVERT_HEAD. These state markers exist only during the corresponding git operation. Skip the hook silently when present. The hook still blocks fresh `git commit` (the failure mode it was designed for). It just doesn't try to police what was already in git history. Bumped to 0.1.14. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 04:34:55 -07:00
Hongming Wang	faa5b42aa4	Merge pull request #47 from Molecule-AI/fix/enable-v0-3-compat fix: enable v0_3 compat in JSON-RPC dispatcher	2026-04-24 02:37:20 -07:00
rabbitblood	19f0033222	fix: enable v0_3 compat in JSON-RPC dispatcher — platform sends old method names Sister fix to 0.1.12 (root mounting). After fixing the route mount, every inbound A2A still returned `-32601 Method not found` because the 1.x dispatcher's method table doesn't recognize v0.3-shaped names (`message/send`, `tasks/get`) that the platform's ProxyA2A still sends. Reproduces in the SDK on a minimal handler: create_jsonrpc_routes(h, "/") → "Method not found" create_jsonrpc_routes(h, "/", enable_v0_3_compat=True) → dispatches OK Bumped to 0.1.13. Both 0.1.12 and 0.1.13 are needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 02:37:07 -07:00
Hongming Wang	d22c19ad31	Merge pull request #46 from Molecule-AI/fix/jsonrpc-mount-at-root fix: mount JSON-RPC at root — fixes silent fleet productivity loss	2026-04-24 02:06:58 -07:00
rabbitblood	30ebe9baf3	fix: mount JSON-RPC at root — platform POSTs to /, not /api/v1/jsonrpc/ Baseline restart 2026-04-24: every workspace came up healthy (uvicorn listening, agent-card serving) but produced zero delegations for two maintenance cycles. Tracing revealed platform's ProxyA2A POSTs to `http://ws-<id>:8000/` (no path suffix, see workspace-server/internal/provisioner.InternalURL) while the runtime's JSON-RPC routes were mounted at `/api/v1/jsonrpc/` under the a2a-sdk 1.x API migration. Result was silent — every inbound A2A returned 404 Not Found, the platform logged "Not Found" at INFO level, but no error bubbled up because the SDK's jsonrpc route factory doesn't respond to root when mounted at a subpath. Agents stayed warm, crons fired, but no work flowed. Fix: `create_jsonrpc_routes(handler, "/")` — matches platform expectation and the agent-card self-advertisement (which also shows root as the JSON-RPC URL). Agent-card route keeps its hard-coded `/.well-known/agent-card.json` path so there's no collision. Bumped to 0.1.12. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 02:06:04 -07:00
Hongming Wang	64720c0fc6	Merge pull request #45 from Molecule-AI/feat/precommit-hook-block-internal-paths feat: pre-commit hook to block internal paths in public monorepo (A)	2026-04-24 00:49:42 -07:00
rabbitblood	89739bf848	feat: pre-commit hook to block internal paths in public monorepo (A) Anti-leak proposal item A. Companion to D (decision tree in role prompts, separate PR on org-templates). Why a local pre-commit hook =========================== Agents try to `git add /research/foo.md` despite SHARED_RULES, the .gitignore patterns, and the CI gate. Each leak attempt costs ~5 cycles (PR opens, CI fails, agent retries with workaround) and pollutes git history with reverts. A pre-commit hook converts the failure from "PR opens then fails" → "commit refused immediately, with the recovery command printed in the same error message the agent reads." Agents act on what's in the current response context — putting the redirect command literally in the failure output is the highest-density feedback we can provide. What changes ============ - molecule_runtime/scripts/pre-commit-block-internal-paths.sh — bash hook. Checks `git remote get-url origin`, only enforces in Molecule-AI/molecule-monorepo + molecule-core. In every other repo (internal, plugins, templates, third-party) it's a no-op. When forbidden paths are staged, refuses the commit with the redirect recipe + the alternative public-facing paths + the workflow-edit path for legitimate exceptions. - molecule_runtime/precommit_hook.py — install_pre_commit_hook(): 1. Extracts bundled hook to ~/.molecule-runtime/git-hooks/pre-commit 2. chmod +x 3. Sets core.hooksPath globally — UNLESS already set by an operator (then logs a warning + skips, doesn't clobber) - molecule_runtime/main.py — calls install_pre_commit_hook() at step 0.2, right after install_credential_helper() - pyproject.toml bumped to 0.1.11 Both A and D together close the loop: D ensures the agent knows the right path before writing; A enforces it at the local git boundary if the agent forgets. CI gate remains the third backstop for anything that gets pushed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 00:48:47 -07:00
Hongming Wang	f334872d56	Merge pull request #44 from Molecule-AI/feat/inline-credential-helper feat: ship GitHub credential-helper inline in runtime (fixes #1933 class)	2026-04-24 00:42:32 -07:00
rabbitblood	f1329fe230	feat: ship GitHub credential-helper inline in runtime (fixes #1933 class) Lifts the per-template wiring (Dockerfile COPY + entrypoint.sh git config + nohup daemon launch) into the Python runtime. Templates that depend on molecule-ai-workspace-runtime get the behavior automatically — they no longer need to maintain their own copy of the helper scripts or remember to write the right git config in their entrypoint. Background: - GitHub App installation tokens (ghs_…) expire ~60min after issue - claude-code-default template shipped without wiring → 39 workspaces lost their tokens, three PMs' A2A queues filled with retry-status messages, manual fleet restart required (cycle 62-66 incident) This commit: - Adds molecule_runtime/scripts/{molecule-git-token-helper.sh, molecule-gh-token-refresh.sh} as package data (copies from canonical workspace/scripts/ in molecule-monorepo) - Adds molecule_runtime/credential_helper.py with install_credential_helper() that: 1. Extracts bundled scripts to ~/.molecule-runtime/scripts/ 2. Configures git credential.helper for github.com 3. Creates ~/.molecule-token-cache/ mode 0700 4. Spawns refresh daemon under respawn loop (PID file dedup) 5. Runs initial gh auth login --with-token - Hooks call site early in main.py (step 0.1, before config load) - Fails-soft: each step independently fault-tolerant; missing git/gh binary doesn't block runtime startup Bumped to 0.1.10. Templates can drop their entrypoint.sh credential helper setup once they update the runtime pin (separate PRs per template). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 00:41:32 -07:00
Molecule AI SDK-Dev	19fde6f466	fix(runtime): capture stderr in A2A error response (closes #66) - Lower _PROCESS_ERROR_STDERR_MAX_CHARS to 1024 (was 4096) so A2A responses stay bounded — the full context is already in workspace logs via logger.error/exception. - Add stderr= kwarg to sanitize_agent_error() so callers can surface subprocess stderr verbatim in A2A responses. - In _execute_locked() non-retryable error path, extract the first 1 KB of exc.stderr and pass it to sanitize_agent_error() so the A2A response carries actionable context (rate limit message, auth error, etc.) instead of just a class name. - Add test_executor_helpers.py unit tests for the new stderr= kwarg.	2026-04-24 05:00:51 +00:00
molecule-ai[bot]	d5cf872311	feat: migrate a2a-sdk 1.x (KI-009) (#39) - Replace a2a.utils.new_agent_text_message → a2a.helpers.new_text_message - Replace Part(root=TextPart(...)) → Part(text=...) (flat Part API) - Replace A2AStarletteApplication → Starlette route factories (create_agent_card_routes, create_jsonrpc_routes) - Update conftest stubs: remove a2a.server.apps/a2a.utils, add a2a.server.routes/a2a.helpers/AgentInterface - Add AgentInterface to AgentCard supported_interfaces - Rename snake_case AgentCard fields per 1.x schema Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-24 01:54:33 +00:00
Hongming Wang	1b04da2061	Merge pull request #38 from Molecule-AI/fix/auto-detect-llm-token-type feat(runtime): auto-detect LLM token type, normalise env on boot	2026-04-23 13:53:06 -07:00
Hongming Wang	e562b7a03e	Merge branch 'staging' into fix/auto-detect-llm-token-type	2026-04-23 13:52:25 -07:00
Hongming Wang	3556244725	Merge pull request #40 from Molecule-AI/fix/heartbeat-401-token-refresh-1877 fix(heartbeat): refresh on-disk auth token on 401 + retry once (#1877)	2026-04-23 13:51:42 -07:00
rabbitblood	a78b9f229e	test(1877): convert async tests to sync httpx.Client to unblock CI CI doesn't have pytest-asyncio installed, and the async wrapping was incidental — the production retry pattern (refresh-on-401) is identical in sync and async forms. Switching to httpx.Client + MockTransport keeps the same coverage without the async dep. 6/6 still pass locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 13:35:45 -07:00
rabbitblood	050c2412b3	fix(heartbeat): refresh on-disk auth token on 401 + retry once (#1877) ## Problem Auto-restart rotates the workspace's auth token in two non-atomic steps: 1. Platform issues new token via wsauth.IssueToken 2. Provisioner writes the new token to /configs/.auth_token AFTER ContainerStart returns Between steps 1 and 2, the new container has booted and the runtime has already loaded the OLD cached value of .auth_token (or no value if the file was empty during boot). The runtime's first /registry/heartbeat call sends the stale token, gets 401, but the loop never re-reads the on-disk token — so subsequent heartbeats also send the stale value. Each 401 means the platform never sees the workspace as alive → status stays 'provisioning' → scheduler won't dispatch → workspace looks dead from every angle even though the container is actually running. The existing code comment in workspace_provision.go acknowledges this: "the workspace will get 401 on its first heartbeat and can recover on the next restart." That recovery only worked because workspaces used to crash for unrelated reasons and get restarted. After PR #1861 (provisioner empty-volume auto-recover) removed those crashes, workspaces get stuck in the 401 loop with no exit. ## Fix Two-part runtime-side fix in molecule-ai-workspace-runtime: 1. platform_auth.refresh_from_disk() — new helper that clears the in-memory cache and re-reads /configs/.auth_token. Returns the fresh value (or None if missing). Updates the cache as a side effect. 2. HeartbeatLoop._loop() — on 401 from /registry/heartbeat, calls refresh_from_disk() and retries the request ONCE with the new token. Same pattern in _check_delegations(). Bounded retry budget — if the on-disk token is also stale (bug elsewhere), no infinite loop. ## Tests 6/6 new tests in tests/test_token_refresh_1877.py: - refresh_picks_up_rotated_token — happy path - refresh_returns_none_when_file_missing — defensive - refresh_clears_stale_cache_when_file_disappears - refresh_is_idempotent - 401_retry_pattern_uses_refreshed_token — the production fix path - 401_retry_no_loop_when_disk_token_also_stale — bounded retry budget All pass locally on Python 3.13 + pytest 9. ## Why this fix and not the alternatives - Alternative B (platform writes token before ContainerStart): Right architecturally but invasive — needs provisioner refactor to prep volumes before docker run. - Alternative C (skip rotation on auto-restart): Breaks the multi-instance-safety invariant the existing code calls out (revoke prevents stale tokens from sister deployments). - This fix (A): 3-line core change + helper. Self-healing for any timing edge case, not just the post-restart one. Costs nothing in the happy path (only triggers on 401). ## Version Bumped to 0.1.9. Once published to PyPI + workspace template image rebuilt, deployed workspaces auto-recover from token-rotation races without operator intervention. Closes #1877. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 13:26:36 -07:00
rabbitblood	4bafea58ae	fix(llm_auth): tighten base-URL hostname match + strip whitespace + no token in logs Self-review findings on #38: 1. Token substring leak: the "unknown prefix" warning included the first 12 chars of the token in the log message. Logs get shipped to Langfuse / CloudWatch / slack-firehose — 12 bytes of a secret in a log is still 12 bytes too many. Warning no longer references the token value at all. 2. Base-URL substring match was too loose: `"anthropic.com" not in base` would accept `https://proxy.anthropic.com.evil.example/` as "looks like Anthropic, keep the URL." Replaced with an allowlist of exact hostnames parsed via urllib.parse.urlparse. 3. Whitespace in pasted tokens: operators frequently paste tokens from terminals with a trailing newline. The token would flow through startswith() detection but then fail downstream auth with a confusing "malformed token" error. Strip and persist the cleaned value. 4. Malformed base URL crash guard: if someone sets ANTHROPIC_BASE_URL to something urlparse can't handle, don't crash — fall through to clearing it, which is the safe choice in OAuth mode. Added 5 new tests covering each of the above. 16/16 tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 10:46:07 -07:00
rabbitblood	0a0f11b41f	feat(runtime): auto-detect LLM token type, normalise env on boot Platform stores per-workspace LLM credentials under a single key (ANTHROPIC_AUTH_TOKEN in workspace_secrets). But downstream tools expect different env var names depending on the token type: sk-ant-oat01-* → CLAUDE_CODE_OAUTH_TOKEN (Claude Code OAuth session) sk-ant-api03-* → ANTHROPIC_API_KEY (direct Anthropic API) sk-cp-* → ANTHROPIC_AUTH_TOKEN (proxy: MiniMax, gateways) Without normalisation, an OAuth token under ANTHROPIC_AUTH_TOKEN gets sent as a bearer to api.anthropic.com, which responds: 401 authentication_error: OAuth authentication is currently not supported. This was a platform-wide footgun: anyone rotating LLM keys had to know the exact env var for each token type, AND make sure stale overrides were cleared, AND set ANTHROPIC_BASE_URL correctly for proxies (or NOT set for native Claude). Nothing downstream could help — the SDK just saw the wrong var. Fix: - New molecule_runtime/llm_auth.py — normalise_llm_env() mutates os.environ (or any dict) to the correct shape based on token prefix. Returns a NormalisationResult for logging. - main.py calls it as step 0, before any adapter/executor import. Every adapter (claude-code, langgraph, crewai, autogen, hermes, …) benefits automatically — no per-adapter branching needed. - 11 unit tests covering all prefix paths, edge cases, and the "operator deliberately set CLAUDE_CODE_OAUTH_TOKEN" precedence rule. Operationally: this means operators can keep using one ANTHROPIC_AUTH_TOKEN slot in platform settings and just paste whatever token the agent needs. No env-var-name awareness required. Tested locally: 11/11 new tests pass. 83 other tests unchanged (pre-existing failures on staging are all unrelated: test_workspace_id_validation, test_a2a_mcp_server RBAC, the test_imports.main module-walker — same signature as on staging HEAD before this PR). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 10:41:47 -07:00
molecule-ai[bot]	dcb6edd1a1	fix(shared_runtime): push heartbeat on CLEAR in set_current_task() (#37) Fixes #1372 — phantom busy: canvas showed workspace as active for up to 30s after task completion because set_current_task("") returned early without posting the updated heartbeat. Before: clearing only updated the heartbeat object; the next 30s scheduled heartbeat cycle propagated the clear. Quick tasks would leave a phantom-busy indicator. After: both SET and CLEAR push immediately to /registry/heartbeat. active_tasks=0 on clear, active_tasks=1 on set. Heartbeat object update and HTTP post are now unconditional. Tests: 5 new cases covering SET/CLEAR HTTP body, error resilience, None heartbeat, and missing env vars. Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app>	2026-04-22 17:33:42 +00:00
rabbitblood	1e545ed6ba	chore: bump 0.1.8 — executor_helpers phantom-busy fix confirmed in tree	2026-04-21 07:16:47 -07:00
rabbitblood	5a1990552d	chore: bump 0.1.7 — ensure executor_helpers phantom-busy fix in PyPI build	2026-04-21 07:07:17 -07:00
rabbitblood	59f54560a0	Merge branch 'main' of https://github.com/Molecule-AI/molecule-ai-workspace-runtime into fix/507-mcp-server-path-absolute-imports # Conflicts: # pyproject.toml	2026-04-21 06:37:38 -07:00
rabbitblood	d3235cc564	fix(heartbeat): increment/decrement active_tasks + push on clear (#1372, #1408) Both set_current_task() implementations (shared_runtime.py + executor_helpers.py): - Increment active_tasks on task start, decrement on completion (was binary 0/1) - Push heartbeat immediately on BOTH increment AND decrement - Only clear current_task when active_tasks reaches 0 (preserves description for still-running tasks) Fixes phantom-busy: the old code returned early on clear, leaving active_tasks=1 in the platform DB until the next 30s heartbeat cycle. If a new cron fired before the heartbeat, the workspace appeared permanently busy — required manual DB reset every 30 min. Bump: 0.1.2 → 0.1.3 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-21 06:37:12 -07:00
Hongming Wang	7febb51382	Merge pull request #36 from Molecule-AI/chore/bump-0.1.5 chore: bump to 0.1.5 for X-Molecule-Org-Id header fix	2026-04-20 20:30:54 -07:00
Hongming Wang	742b7d1dfb	chore: bump version to 0.1.5 for org-id-header fix	2026-04-20 20:30:31 -07:00
Hongming Wang	4b0185a57b	Merge pull request #35 from Molecule-AI/feat/send-org-id-header feat(auth): send X-Molecule-Org-Id on every outbound platform call	2026-04-20 20:28:40 -07:00
Hongming Wang	ba5466243b	feat(auth): send X-Molecule-Org-Id on every outbound platform call The SaaS tenant platform's TenantGuard middleware rejects cross-org routing with synthetic 404s unless the request carries X-Molecule-Org-Id matching the tenant's MOLECULE_ORG_ID env var. The runtime never sent it, so every non-allowlisted workspace→platform path (memories, delegations, notify, a2a, update-card, peers...) 404'd. Paired with CP change feat/workspace-export-org-id which injects MOLECULE_ORG_ID into workspace user-data env. auth_headers() now returns both headers — the existing Authorization bearer AND the new X-Molecule-Org-Id — so every caller that already threads auth_headers() through httpx picks it up for free. Self-hosted deployments with MOLECULE_ORG_ID unset keep the old behavior (no header, TenantGuard is a no-op). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 20:28:07 -07:00
molecule-ai[bot]	0e2e1fc2c4	Merge pull request #33 from Molecule-AI/fix/a2a-cli-discover-workspace-id-validation fix(a2a_cli): validate WORKSPACE_ID in discover() before X-Workspace-ID header	2026-04-21 01:53:19 +00:00
Molecule AI Infra-Runtime-BE	d4b9bff5d0	fix(a2a_cli): validate WORKSPACE_ID in discover() before X-Workspace-ID header PR #32 wrapped all platform URL construction sites with get_validated_workspace_id() but missed a2a_cli.discover(), which passed the raw unvalidated WORKSPACE_ID in the X-Workspace-ID header. All other functions (peers, info) had try/except guards added. discover() now calls get_validated_workspace_id() upfront and returns None (printing the error) if validation fails — consistent with the best-effort error handling pattern used elsewhere in the module. Tests: 2 new cases in TestA2aCliDiscoverValidation covering empty and slash-injected WORKSPACE_ID values. Follow-up to: PR #32 (fix/908-add-namespace-param-commit-memory) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 01:35:37 +00:00
molecule-ai[bot]	40c30c068a	Merge pull request #32 from Molecule-AI/fix/908-add-namespace-param-commit-memory fix(CI): set WORKSPACE_ID env var + validation coverage	2026-04-21 01:29:32 +00:00
Molecule AI Infra-SRE	4bfe6222a6	fix(CI): remove conflicting bandit flags from security linter step PR #31 added `-ll --severity-level=high` but these flags conflict: - `-ll` is a shorthand for `--level low` (only show low+ issues) - `--severity-level=high` suppresses everything but high-severity issues The combination causes bandit to exit 2 because `--severity-level` is not allowed alongside `-l/--level`. Use `--severity-level=high` alone. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 00:58:43 +00:00
Molecule AI Infra-SRE	875a8ef952	fix(CI): set WORKSPACE_ID env var for test job PR #29 introduced WORKSPACE_ID validation at module import time (platform_auth.py). The CI environment did not set WORKSPACE_ID, causing 8 failures + 13 errors on every main push. Add a dummy CI-only value so imports succeed without affecting real workspaces. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 00:55:08 +00:00
Molecule AI Infra-SRE	249e5c07eb	fix(builtin_tools/validation): complete WORKSPACE_ID validation in a2a_tools.py Add get_validated_workspace_id() to all 6 remaining unguarded URL positions in molecule_runtime/a2a_tools.py (the MCP tool body implementations): - report_activity(): /workspaces/{id}/activity + heartbeat - tool_delegate_task_async(): /workspaces/{id}/delegate - tool_check_task_status(): /workspaces/{id}/delegations - tool_send_message_to_user(): /workspaces/{id}/notify - tool_commit_memory(): /workspaces/{id}/memories (POST) - tool_recall_memory(): /workspaces/{id}/memories (GET) All 6 functions now use validated ws_id. The last remaining unguarded WORKSPACE_ID use in the entire molecule_runtime package is in builtin_tools/telemetry.py:142 (metric service name — not a URL path, low security risk). 67/67 tests pass.	2026-04-21 00:55:08 +00:00
Molecule AI Infra-SRE	32a7880f4f	test+fix(builtin_tools/validation): add test coverage + fix ".." bypass in regex Tests: 37 new test cases in tests/test_validation.py covering: - Valid ID patterns (6): normal IDs, underscores, dots, max-length (256) - Empty/missing (1): raises with "empty" in message - Invalid chars (10): / \ .. # ? & whitespace - Caching (2): result is cached; raises on repeated bad calls - Error type (1): WorkspaceIdValidationError is a ValueError subclass Fix: regex now uses negative lookahead `(?!.*\.\.)` to reject ".." anywhere in the string (not just at the start). The old pattern `^[A-Za-z0-9_\-.]{1,256}$` matched ".." literally because two dots ARE in the allowed character class. Also adds test cases for embedded ".." (ws..example, ws../etc). Fixes: the ".." bypass was a gap in the original CWE-20 fix.	2026-04-21 00:55:08 +00:00
Molecule AI Infra-SRE	be9c9997c0	fix(builtin_tools/validation): cover remaining WORKSPACE_ID URL usages Extend get_validated_workspace_id() to all remaining unguarded URL positions: - consolidation.py: _consolidate() — validates before GET/POST/DELETE to /workspaces/{id}/memories endpoints. Graceful skip on failure (log + return). - coordinator.py: get_children() — validates before /registry/{id}/peers. Graceful skip (empty list) on failure. - molecule_ai_status.py: set_status() — validates before /registry/heartbeat and /workspaces/{id}/activity. Exits with descriptive error on failure. With these three, every runtime use of WORKSPACE_ID in a URL path is now validated. Remaining WORKSPACE_ID uses are: - JSON body fields (not injection-risky): heartbeat, memory POST bodies - Header values (X-Workspace-ID): lower risk, non-URL-injection	2026-04-21 00:55:08 +00:00
Molecule AI Infra-SRE	42bdf530b5	fix(builtin_tools/validation): extend WORKSPACE_ID validation to top-level modules Fixes remaining unguarded WORKSPACE_ID URL usages identified after the initial builtin_tools/ fix: - a2a_client.py: get_peers() and get_workspace_info() now use get_validated_workspace_id() before URL construction. The raw module-level constant is still used in the discover_peer() header (low risk, not URL path). - a2a_cli.py: peers() and info() CLI commands now validate WORKSPACE_ID before calling the platform API. Commands exit with error code 1 + descriptive message if WORKSPACE_ID is empty or malformed. Follow-up candidates (lower priority, not URL injection risk): - coordinator.py: WORKSPACE_ID in registry peer URL - consolidation.py: WORKSPACE_ID in memory URLs (long-running consolidation job) - molecule_ai_status.py: WORKSPACE_ID in activity log URL	2026-04-21 00:55:08 +00:00
Molecule AI Infra-SRE	d52082839f	fix(builtin_tools): validate WORKSPACE_ID before URL construction Add WORKSPACE_ID format validation before every URL/header use to prevent URL injection (CWE-20 / CWE-88). The validator: - Rejects empty values (fail-fast with clear error) - Rejects path-traversal chars (/ \ ..) and fragment/query chars (# ? &) - Accepts alphanumeric, hyphen, underscore, dot (typical ID formats) - Caches the result after first successful call (zero overhead per call) Validated in: - memory.py: commit_memory, search_memory (both awareness-client + httpx paths) - approval.py: _create_approval_request, _wait_polling - delegation.py: _notify_completion, _record_delegation_on_platform, _update_delegation_on_platform - a2a_tools.py: list_peers, delegate_task Fixes #14.	2026-04-21 00:55:08 +00:00
molecule-ai[bot]	548549d5e9	feat(CI): add bandit security linter (audit rec #2) (#31) Bandit runs on every PR against molecule_runtime/ at high severity. Addresses audit recommendation from issue #9. Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 00:23:17 +00:00
molecule-ai[bot]	30d96b4e4e	fix(platform_auth): validate WORKSPACE_ID at import time (issue #14, CWE-20) (#29) WORKSPACE_ID was read via os.environ.get("WORKSPACE_ID", "") in multiple builtin_tools modules and used directly in platform API URLs and X-Workspace-ID headers without validation. A crafted ID containing /, .., or # could cause URL path injection. Fix: validate_workspace_id() in platform_auth.py now validates the ID format at module import time using a regex that permits only lowercase alphanumerics and hyphens (matching UUIDs and org-generated IDs). The validated value is exposed as a module-level WORKSPACE_ID constant. builtin_tools/approval.py and builtin_tools/delegation.py now import from platform_auth instead of reading os.environ directly. Failing input raises ValueError with a clear message — workspace fails fast at startup rather than silently accepting malformed IDs in requests. Add 15 regression tests (45/45 passing total). Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Infra-Runtime-BE <infra-runtime-be@molecule.ai>	2026-04-21 00:04:54 +00:00
Hongming Wang	953aa2847c	Merge pull request #30 from Molecule-AI/fix/adapter-loader-find-subclass fix(adapter-loader): fall back to any BaseAdapter subclass	2026-04-20 16:59:38 -07:00
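
Commit 32a7880f4f above describes closing a ".." bypass by putting a negative lookahead in front of the original character class. The block below is a rough sketch of that check, not the repository's code: `check_workspace_id` is a hypothetical name, the use of `re.fullmatch` is an assumption (chosen so a trailing newline cannot slip through), and the result caching mentioned in the log is left out. Only the pattern pieces and the `WorkspaceIdValidationError` name come from the commit message.

```python
import re

# Pattern pieces quoted in commit 32a7880f4f: the allowed character class,
# preceded by a negative lookahead that rejects ".." anywhere in the string.
_WORKSPACE_ID_RE = re.compile(r"(?!.*\.\.)[A-Za-z0-9_\-.]{1,256}")


class WorkspaceIdValidationError(ValueError):
    """Raised when a workspace ID is empty or malformed."""


def check_workspace_id(raw: str) -> str:
    """Hypothetical helper: return raw unchanged if it is safe, else raise."""
    if not raw:
        raise WorkspaceIdValidationError("WORKSPACE_ID is empty")
    # fullmatch (an assumption) requires the whole string to fit the pattern.
    if _WORKSPACE_ID_RE.fullmatch(raw) is None:
        raise WorkspaceIdValidationError(f"WORKSPACE_ID is malformed: {raw!r}")
    return raw
```

Without the lookahead, an ID such as `ws..example` would pass, because a dot is itself in the allowed character class.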

78 Commits
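
Commit 6ead3b433e in the log describes two changes to the `[A2A_ERROR]` text: fall back to the exception class name when `str(e)` is empty, and include the JSON-RPC `error.code` when one is present. A minimal sketch of both branches follows, under stated assumptions: the function names are hypothetical, and the `(code N)` suffix is an illustrative format that the commit message does not specify. Only `_A2A_ERROR_PREFIX` and the `type(e).__name__` fallback are taken from the log.

```python
from typing import Any, Mapping

_A2A_ERROR_PREFIX = "[A2A_ERROR] "


def format_exception_error(exc: BaseException) -> str:
    """Exception branch: use the class name when str(exc) is empty."""
    detail = str(exc) or type(exc).__name__
    return f"{_A2A_ERROR_PREFIX}{detail}"


def format_rpc_error(error: Mapping[str, Any]) -> str:
    """Error branch: append the JSON-RPC error code when present.

    The "(code N)" layout is an assumption made for this sketch.
    """
    message = error.get("message", "")
    code = error.get("code")
    if code is None:
        return f"{_A2A_ERROR_PREFIX}{message}"
    return f"{_A2A_ERROR_PREFIX}{message} (code {code})"
```

A bare `TimeoutError()` has an empty string form, so the first function yields the class name after the prefix rather than the bare prefix with a trailing space.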