molecule-core

Author	SHA1	Message	Date
Hongming Wang	c636022d2f	fix(runtime): auto-fallback CONFIGS_DIR for non-container hosts (closes #2458) The runtime persists per-workspace state (`.auth_token`, `.platform_inbound_secret`, `.mcp_inbox_cursor`) under `/configs` — the workspace-EC2 mount path. Inside a container that's writable, agent-owned. Outside a container, `/configs` either doesn't exist or isn't writable by an unprivileged user. The default broke the external-runtime path (`pip install molecule-ai-workspace-runtime` + `molecule-mcp` on a Mac/Linux laptop). First heartbeat tries to persist `.platform_inbound_secret` and crashes: [Errno 30] Read-only file system: '/configs' The heartbeat thread logs and dies. Workspace flips offline within a minute. Operator sees no actionable error. Adds workspace/configs_dir.py — single resolution point with a tiered fallback: 1. CONFIGS_DIR env var, if set — explicit operator override (preserves existing tests + custom deployments verbatim). 2. /configs — if it exists AND is writable. In-container default; unchanged behavior for every prod workspace. 3. ~/.molecule-workspace — created with mode 0700 so per-file 0600 perms aren't undermined by a world-readable parent. Migrates the four readers (platform_auth, platform_inbound_auth, mcp_cli, inbox) to call configs_dir.resolve() instead of inlining `Path(os.environ.get("CONFIGS_DIR", "/configs"))`. Existing tests that assert the old `/configs`-as-default contract updated to assert the new contract: when CONFIGS_DIR is unset, path resolves to a writable location — `/configs` if present, fallback otherwise. Tests skip the fallback branch on hosts that DO have a writable `/configs` (CI containers). Verified the original repro is fixed: with no CONFIGS_DIR set on macOS, configs_dir.resolve() returns ~/.molecule-workspace, the dir exists, and writes succeed. Test suite: 1454 passed, 3 skipped, 2 xfailed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 13:07:55 -07:00
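The tiered fallback described in c636022d2f can be sketched roughly as follows. This is a minimal illustration, not the contents of workspace/configs_dir.py: the function name `resolve_configs_dir` and its `env`, `container_dir`, and `home` parameters are hypothetical, added so the logic can be exercised without touching the real `/configs` or home directory. Returning the override without creating it is also an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_configs_dir(
    env: Optional[Mapping[str, str]] = None,
    container_dir: Path = Path("/configs"),
    home: Optional[Path] = None,
) -> Path:
    """Pick a state directory using the three tiers from the commit message."""
    env = os.environ if env is None else env

    # Tier 1: explicit operator override wins unconditionally.
    override = env.get("CONFIGS_DIR")
    if override:
        return Path(override)

    # Tier 2: the in-container default, but only if it exists AND is writable.
    if container_dir.is_dir() and os.access(container_dir, os.W_OK):
        return container_dir

    # Tier 3: per-user fallback with a private (0700) parent directory.
    fallback = (home or Path.home()) / ".molecule-workspace"
    fallback.mkdir(mode=0o700, parents=True, exist_ok=True)
    # mkdir's mode is filtered by the umask and ignored for an existing
    # directory, so set the permissions explicitly as well.
    fallback.chmod(0o700)
    return fallback
```

Keeping the lookup in one function means every reader gets the same answer, which is the point of the "single resolution point" the commit introduces.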
Hongming Wang	74c5e0d7a8	fix(workspace-runtime): add Origin header so SaaS edge WAF accepts MCP tool calls Discovered while smoke-testing the molecule-mcp external-runtime path against a live tenant (hongmingwang.moleculesai.app). Every tool call that hit /workspaces/* or /registry/*/peers returned 404 — but /registry/register and /registry/heartbeat returned 200. Diagnosis: the tenant's edge WAF requires a same-origin header. Without it, unhandled paths get silently rewritten to the canvas Next.js app, which has no /workspaces or /registry/:id/peers route and returns an empty 404. The molecule-mcp-claude-channel plugin already sets this header (server.ts:271-276); the workspace runtime never did because in-container PLATFORM_URLs (Docker network) aren't behind the WAF. Fix: extend platform_auth.auth_headers() to include Origin: ${PLATFORM_URL} whenever PLATFORM_URL is set. Inside-container behavior is unchanged (the WAF is path-irrelevant for the internal hostnames). External-runtime calls now thread the WAF correctly. Verification (live, against a freshly-registered external workspace): pre-fix: get_workspace_info → "not found", list_peers → 404 post-fix: get_workspace_info → full workspace JSON, list_peers → "Claude Code Agent (ID: 97ac32e9..., status: online)" This is the kind of bug unit tests can never catch — caught only by running the wheel against the real tenant. Memory: feedback_always_run_e2e.md. Test coverage: 4 new tests in test_platform_auth.py — Origin alone when no token + Origin + Authorization both, no-PLATFORM_URL falls through to original empty-dict behavior, env-token path with Origin. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 15:30:15 -07:00
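A rough sketch of the header logic from 74c5e0d7a8, under stated assumptions: the real `platform_auth.auth_headers()` obtains the token itself, whereas this version takes `token` and `env` as hypothetical parameters to stay self-contained, and the `Bearer` scheme for the Authorization header is assumed rather than confirmed by the commit.

```python
import os
from typing import Dict, Mapping, Optional


def auth_headers(
    token: Optional[str],
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Build request headers; add Origin whenever PLATFORM_URL is set."""
    env = os.environ if env is None else env
    headers: Dict[str, str] = {}

    if token:
        # Assumption: bearer-token scheme; the commit does not state the format.
        headers["Authorization"] = f"Bearer {token}"

    platform_url = env.get("PLATFORM_URL", "").strip()
    if platform_url:
        # Same-origin header so the edge WAF does not reroute the request.
        headers["Origin"] = platform_url

    return headers
```

With neither a token nor `PLATFORM_URL`, the function returns an empty dict, matching the "original empty-dict behavior" the commit's tests preserve.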
Hongming Wang	169e284d57	feat(workspace-runtime): expose universal MCP server to runtime=external operators Ship the baseline universal MCP path that any external runtime (Claude Code, hermes, codex, anything that speaks MCP stdio) can use, before optimizing per-runtime channels. Today the workspace MCP server only spins up inside the container; external operators have no way to call the 8 platform tools (delegate_task, list_peers, send_message_to_user, commit_memory, etc.) from outside. Three additive changes: 1. `platform_auth.get_token()` env-var fallback — adds `MOLECULE_WORKSPACE_TOKEN` as a fallback when no `${CONFIGS_DIR}/.auth_token` file exists. File-first preserves in-container behavior unchanged. External operators (no /configs volume) now have a way to supply the token without faking the filesystem layout. 2. `molecule-mcp` console script — adds a new entry point in the published `molecule-ai-workspace-runtime` PyPI wheel. Operators run `pip install molecule-ai-workspace-runtime`, set 3 env vars (WORKSPACE_ID, PLATFORM_URL, MOLECULE_WORKSPACE_TOKEN), and register the binary in their agent's MCP config. `mcp_cli.main` is a thin validator wrapper — it checks env BEFORE importing the heavy `a2a_mcp_server` module so a misconfigured first-run gets a friendly 3-line error instead of a 20-line module-level RuntimeError traceback. 3. Wheel smoke gate — extends `scripts/wheel_smoke.py` to assert `cli_main` and `mcp_cli.main` are importable. Same regression class as the 0.1.16 main_sync incident: a silent rename or unrewritten import here would break every external operator on the next wheel publish (memory: feedback_runtime_publish_pipeline_gates.md). Test coverage: - `tests/test_platform_auth.py` — 8 new tests for the env-var fallback: file-priority, env-fallback, whitespace handling, cache, header construction, empty-env-as-unset. 
- `tests/test_mcp_cli.py` — 8 new tests for the validator: each required var separately, file-or-env satisfies token requirement, whitespace-only env treated as missing, help mentions canvas Tokens tab. - Full `workspace/tests/` suite green: 1346 passed, 1 skipped. - Local end-to-end: built wheel, installed in venv, ran `molecule-mcp` with no env → friendly error; with env → MCP server starts. Why now / why this shape: user redirect was "support the baseline first so all runtimes can use, then optimize". A claude-only MCP channel leaves hermes/codex/third-party operators broken on runtime=external. This PR ships the runtime-agnostic baseline; per-runtime polish (claude-channel push delivery, hermes-native bindings) is a follow-up PR. PR #2412 fixed the partner bug where canvas Restart silently revoked the operator's token — the two together unblock the external-runtime story end-to-end. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 15:20:19 -07:00
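The file-first token lookup and the pre-import environment check from 169e284d57 might look something like the sketch below. The function names `read_token` and `missing_settings`, and their explicit `configs_dir` and `env` parameters, are hypothetical. Caching is omitted, and treating an empty token file the same as a missing one is an assumption.

```python
from pathlib import Path
from typing import List, Mapping, Optional


def read_token(configs_dir: Path, env: Mapping[str, str]) -> Optional[str]:
    """Return the workspace token: the file wins, the env var is the fallback."""
    token_file = configs_dir / ".auth_token"
    if token_file.is_file():
        token = token_file.read_text().strip()
        if token:
            return token

    # Whitespace-only or empty values count as unset.
    token = env.get("MOLECULE_WORKSPACE_TOKEN", "").strip()
    return token or None


def missing_settings(configs_dir: Path, env: Mapping[str, str]) -> List[str]:
    """List required settings that are absent, so a CLI can fail early."""
    missing = [
        name
        for name in ("WORKSPACE_ID", "PLATFORM_URL")
        if not env.get(name, "").strip()
    ]
    # The token requirement is satisfied by either the file or the env var.
    if read_token(configs_dir, env) is None:
        missing.append("MOLECULE_WORKSPACE_TOKEN")
    return missing
```

A wrapper can call `missing_settings` before importing the heavier server module and print a short message naming the missing variables instead of surfacing a traceback.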
Hongming Wang	65b531acf6	fix(workspace): tag self-originated A2A POSTs with X-Workspace-ID Workspace runtime fired four classes of A2A request to the platform without the X-Workspace-ID header that identifies the source workspace: heartbeat self-messages, initial_prompt, idle-loop fires, and peer-to-peer A2A from runtime tools. The platform's a2a_receive logger keys source_id off that header — without it, every such row was written with source_id=NULL, which the canvas's My Chat tab filters as ?source=canvas (i.e. "user typed this") and rendered the internal triggers as if the human user had sent them. The "Delegation results are ready..." heartbeat trigger was visible to end users in the chat history; delegate_task A2A calls between agents were misclassified the same way. Centralise the header construction in a new platform_auth helper self_source_headers(workspace_id) that returns auth_headers() PLUS {X-Workspace-ID: <id>}. Apply it to: - heartbeat.py self-message (refactored from inline header dict) - main.py initial_prompt POST - main.py idle_prompt POST - a2a_client.py send_a2a_message (peer A2A from runtime) - builtin_tools/a2a_tools.py delegate_task (was missing ALL headers) Tests: - test_heartbeat.py asserts the X-Workspace-ID header is set on the self-message POST. - test_a2a_tools_module.py asserts the same on delegate_task POSTs; FakeClient.post mocks updated to accept the headers kwarg. Production effect lands the moment workspace containers are rebuilt with this code; existing rows in activity_logs keep their NULL source_id (legacy data). The canvas-side filter (#follow-up) covers the historical-rows case until backfill. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 19:54:43 -07:00
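As a small illustration of the helper described in 65b531acf6: the commit gives the signature as `self_source_headers(workspace_id)`; the extra `auth_headers` parameter here is a hypothetical stand-in for the real `platform_auth.auth_headers()` so the sketch runs on its own.

```python
from typing import Callable, Dict


def self_source_headers(
    workspace_id: str,
    auth_headers: Callable[[], Dict[str, str]],
) -> Dict[str, str]:
    """Return the auth headers plus the header that names the source workspace."""
    # Copy first so the caller's dict is never mutated.
    headers = dict(auth_headers())
    headers["X-Workspace-ID"] = workspace_id
    return headers
```

Routing every self-originated POST through one helper makes it hard for a new call site to forget the header, which is how the NULL `source_id` rows arose.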
Molecule AI Core-BE	b5e2142c46	fix(#1877): close token-rotation race on restart — Option A+Option B combined Platform side (Option B): - provisioner.go: add WriteAuthTokenToVolume() — writes .auth_token to the Docker named volume BEFORE ContainerStart using a throwaway alpine container, eliminating the race window where a restarted container could read a stale token before WriteFilesToContainer writes the new one. - workspace_provision.go: call WriteAuthTokenToVolume() in issueAndInjectToken as a best-effort pre-write before the container starts. Runtime side (Option A): - heartbeat.py: on HTTPStatusError 401 from /registry/heartbeat, call refresh_cache() to force re-read of /configs/.auth_token from disk, then retry the heartbeat once. Fall through to normal failure tracking if the retry also fails. - platform_auth.py: add refresh_cache() which discards the in-process _cached_token and calls get_token() to re-read from disk. Together these eliminate the >1 consecutive 401 window described in issue #1877. Pre-write (B) is the primary fix; runtime retry (A) is the self-healing fallback for any residual race. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-23 17:47:18 -07:00
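The runtime-side retry (Option A) in b5e2142c46 can be sketched as below. This is an assumption-laden outline, not heartbeat.py: the real code reacts to an HTTPStatusError from its HTTP client, while this version models the request as a hypothetical `post` callable that returns a status code, with `build_headers` and `refresh_cache` injected as callables.

```python
from typing import Callable, Dict


def heartbeat_with_refresh(
    post: Callable[[Dict[str, str]], int],
    build_headers: Callable[[], Dict[str, str]],
    refresh_cache: Callable[[], None],
) -> int:
    """Send one heartbeat; on a 401, re-read the token and retry exactly once."""
    status = post(build_headers())
    if status == 401:
        # The cached token may be stale after a restart: drop it, then rebuild
        # the headers so the retry carries whatever is now on disk.
        refresh_cache()
        status = post(build_headers())
    # Any remaining failure is left to the caller's normal failure tracking.
    return status
```

Limiting the retry to a single attempt keeps a genuinely revoked token from turning into a retry loop.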
Hongming Wang	479a027e4b	chore: open-source restructure — rename dirs, remove internal files, scrub secrets Renames: - platform/ → workspace-server/ (Go module path stays as "platform" for external dep compat — will update after plugin module republish) - workspace-template/ → workspace/ Removed (moved to separate repos or deleted): - PLAN.md — internal roadmap (move to private project board) - HANDOFF.md, AGENTS.md — one-time internal session docs - .claude/ — gitignored entirely (local agent config) - infra/cloudflare-worker/ → Molecule-AI/molecule-tenant-proxy - org-templates/molecule-dev/ → standalone template repo - .mcp-eval/ → molecule-mcp-server repo - test-results/ — ephemeral, gitignored Security scrubbing: - Cloudflare account/zone/KV IDs → placeholders - Real EC2 IPs → <EC2_IP> in all docs - CF token prefix, Neon project ID, Fly app names → redacted - Langfuse dev credentials → parameterized - Personal runner username/machine name → generic Community files: - CONTRIBUTING.md — build, test, branch conventions - CODE_OF_CONDUCT.md — Contributor Covenant 2.1 All Dockerfiles, CI workflows, docker-compose, railway.toml, render.yaml, README, CLAUDE.md updated for new directory names. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-18 00:24:44 -07:00

6 Commits