Commit Graph

35 Commits

Author SHA1 Message Date
Hongming Wang
e342d0c5a7 fix(build): register a2a_response in TOP_LEVEL_MODULES
The drift gate caught the new SSOT parser module — without registration
the wheel ships it un-rewritten and runtime imports fail. Same pattern
as inbox_uploads, a2a_tools_delegation, a2a_tools_rbac registrations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 17:34:05 -07:00
Hongming Wang
f01f374072 feat(mcp): add molecule-mcp doctor onboarding diagnostic
Closes #2934 item 6 — the deferred follow-up from Ryan's onboarding-
friction report. Quote: "this single command would have saved me
30 of the 45 minutes."

When push delivery fails or the install half-works, the operator
today has no signal — they hand-grep the Claude Code binary or
chase the `from versions: none` red herring. Doctor renders six
checks in one screen with concrete next-step suggestions:

  1. Python version    >=3.11? (wheel's pin)
  2. Wheel install     molecule-ai-workspace-runtime importable +
                        version surfaced
  3. PATH for binary   `molecule-mcp` resolves on PATH; if not,
                        prints the resolved user-site bin dir to
                        add (or recommends pipx)
  4. Env vars          PLATFORM_URL + WORKSPACE_ID + token (env or
                        *_FILE or .auth_token)
  5. Platform reach    GET ${PLATFORM_URL}/healthz returns 2xx
  6. Registry register POST /registry/register with the resolved
                        token returns 2xx — end-to-end auth check

Each line: `[OK|WARN|FAIL] <label>: <status>` plus a `next:` hint
when not OK. ANSI colors auto-disable on non-TTY / NO_COLOR.

Exit code: 0 on all-OK or only-WARN, 1 on any FAIL — scriptable
from CI install-checks.

## Files

`workspace/mcp_doctor.py`  (new) — six check functions + `run()`
                                   entry point. Uses urllib (stdlib)
                                   so doctor works even on a partial
                                   install where `requests` is missing.

`workspace/mcp_cli.py`             Subcommand dispatch:
                                     molecule-mcp doctor   → mcp_doctor.run()
                                     molecule-mcp --help   → usage banner
                                     molecule-mcp          → server (unchanged)

`workspace/tests/test_mcp_doctor.py`  (new) — 10 tests covering each
                                       check's pass/fail/skip path
                                       plus the end-to-end exit-code
                                       contract on a stripped env.

`scripts/build_runtime_package.py`    Adds `mcp_doctor` to
                                       TOP_LEVEL_MODULES so the
                                       wheel ships the new module.

## Out of scope (deferred follow-ups)
- Claude Code-specific checks (parse ~/.claude.json, verify each
  MCP entry is plugin-sourced + dev-channels flag set). That's a
  separate Claude-Code-shaped doctor; lives in the channel plugin.
- Automated remediation. Doctor is diagnostic — tells the operator
  what's wrong + how to fix it, doesn't apply changes.

## Verification
  - python -m pytest tests/test_mcp_doctor.py -v   → 10/10 PASS
  - python -m pytest tests/test_mcp_cli*.py        → 67/67 PASS
    (existing CLI suite still green; subcommand dispatch added
    before env-validation, doesn't disturb the server-boot path)
  - manual: `molecule-mcp doctor` on a stripped env renders 4 FAIL
    + 2 WARN + exit code 1, with each `next:` hint actionable

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 15:44:36 -07:00
Hongming Wang
3b7ed9cf53
Merge pull request #2946 from Molecule-AI/fix/onboarding-followup-2934
mcp: surface specific TOKEN_FILE errors + link follow-ups (#2934)
2026-05-05 22:19:21 +00:00
Hongming Wang
da9061c131 mcp: surface specific TOKEN_FILE errors + link follow-ups (#2934)
Self-review of #2935 turned up two real defects:

1. Stale README issue references — the build_runtime_package.py
   README template said "(issue #2934 follow-up)" twice, but the
   marketplace-plugin and `doctor` items now have dedicated tracking
   issues. Updated to point at #2936 and #2937 respectively.

2. Silent fallthrough on broken MOLECULE_WORKSPACE_TOKEN_FILE — when
   an operator EXPLICITLY pointed TOKEN_FILE at a path that didn't
   exist / wasn't readable / was blank / contained internal whitespace,
   the resolver silently returned the generic "set one of these three
   vars" error. That's exactly the silent failure mode #2934 flagged
   ("a new user has no chance"). Refactor `_read_token_from_file_env`
   to return `(token, error)`; surface the SPECIFIC failure when the
   operator's intent was clearly the file path. Skip the CONFIGS_DIR
   fallback in that case so the operator's config bug isn't masked
   by a different source happening to work.

Adds 2 renames + 2 new tests in test_mcp_cli_split.py:
  - test_missing_file_returns_specific_error (asserts "does not exist")
  - test_empty_file_returns_specific_error (asserts "is empty")
  - test_multi_line_file_rejected (asserts "internal whitespace")
  - test_token_file_error_skips_configs_dir_fallback (asserts a valid
    CONFIGS_DIR/.auth_token does NOT silently rescue a broken
    TOKEN_FILE)

All 81 mcp_cli + mcp_cli_multi_workspace + mcp_cli_split tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 15:07:15 -07:00
Hongming Wang
c4807a930d
Merge pull request #2940 from Molecule-AI/refactor/a2a-tools-inbox-extract-rfc2873-iter4e
refactor(workspace): extract inbox tools from a2a_tools.py (RFC #2873 iter 4e)
2026-05-05 21:58:32 +00:00
Hongming Wang
475da5b64c refactor(workspace): extract inbox tools from a2a_tools.py (RFC #2873 iter 4e)
Continues the OSS-shape refactor. After iters 4a-4d (rbac, delegation,
memory, messaging) the only behavior left in ``a2a_tools.py`` was
``report_activity`` plus three thin inbox-tool wrappers and the
``_enrich_inbound_for_agent`` helper. This iter extracts the inbox
slice to ``a2a_tools_inbox.py`` so the kitchen-sink module shrinks
from 280 LOC to ~165 LOC of imports + report_activity + back-compat
re-export blocks.

Extracted symbols:
  - ``_INBOX_NOT_ENABLED_MSG`` (sentinel)
  - ``_enrich_inbound_for_agent`` (poll-path peer enrichment helper)
  - ``tool_inbox_peek``
  - ``tool_inbox_pop``
  - ``tool_wait_for_message``

Re-exports (`from a2a_tools_inbox import …`) preserve the public
``a2a_tools.tool_inbox_*`` surface so existing tests + call sites
continue to resolve unchanged.

New tests in test_a2a_tools_inbox_split.py:
  1. **Drift gate (5)** — every previously-public symbol on a2a_tools
     is the EXACT same object as a2a_tools_inbox.foo (`is`, not `==`),
     catches a future "wrap with logging" refactor that silently loses
     existing test coverage.
  2. **Import contract (1)** — a2a_tools_inbox does NOT eagerly import
     a2a_tools at module load. Pins the layered architecture: the
     extracted slice depends on ``inbox`` + a lazy ``a2a_client``
     import, never on the kitchen-sink that re-exports it.
  3. **_enrich_inbound_for_agent branches (5)** — peer_id-empty
     (canvas_user) returns dict unchanged; missing peer_id key same;
     a2a_client unavailable (test harness, partial install) degrades
     gracefully with a bare envelope; registry hit populates
     peer_name + peer_role + agent_card_url; registry miss still
     surfaces agent_card_url (constructable from peer_id alone).

The full timeout-clamp / validation / JSON-shape behavior matrix for
the three wrappers stays in test_a2a_tools_inbox_wrappers.py — those
tests pass identically against both the alias and the underlying impl.

Wiring updates:
  - ``scripts/build_runtime_package.py``: add ``a2a_tools_inbox`` to
    ``TOP_LEVEL_MODULES`` so it ships in the runtime wheel and the
    drift gate doesn't fail the next publish.
  - ``.github/workflows/ci.yml``: add ``a2a_tools_inbox.py`` to
    ``CRITICAL_FILES`` so the 75% MCP/inbox/auth per-file floor
    applies — this is now where the inbox-delivery code actually
    lives.
2026-05-05 14:28:58 -07:00
Hongming Wang
1ad107cc15
Merge pull request #2935 from Molecule-AI/fix/onboarding-friction-2934
fix(onboarding): address Claude Code MCP onboarding friction (#2934)
2026-05-05 21:25:57 +00:00
Hongming Wang
01deeb36cf fix(onboarding): address Claude Code MCP onboarding friction (#2934)
Ryan's bug report (#2934) walked through ~45 min of debugging a stock
external-runtime install. This PR fixes the four items he flagged that
have a small surface, and stubs out the larger ones for follow-up.

Fixed in this PR
================

#1 — Python floor disclosure (README in publish bundle)
  Add an explicit "Requires Python ≥3.11" section that calls out the
  cryptic "Could not find a version that satisfies the requirement"
  failure mode; recommend `pipx install` over `pip install` so the
  binary lands on PATH automatically; show the explicit `pip install
  --user` alternative with the PATH caveat.

#3 — MOLECULE_WORKSPACE_TOKEN_FILE support (mcp_workspace_resolver.py)
  Add a third resolution step between the inline env var and the
  in-container CONFIGS_DIR fallback. Operators can write the bearer to
  a 0600 file (e.g. ~/.config/molecule/token) and point
  MOLECULE_WORKSPACE_TOKEN_FILE at it, keeping the secret out of
  ~/.zsh_history and out of plaintext in MCP-host configs like
  ~/.claude.json. Inline TOKEN still wins on conflict so rotation flows
  are predictable. README documents the safer option as the
  recommended path. 6 new tests pin every leg (file resolves, inline
  wins, missing/empty file falls through, blank env unset-equivalent,
  help text advertises it).

#4 — Push delivery 3-condition gating (README in publish bundle)
  Document that real-time push on Claude Code requires (a) the server
  to declare experimental.claude/channel (we do), (b) the server to be
  marketplace-plugin-sourced (operators must scaffold their own until
  the official marketplace lands — see #2934 follow-up), and (c) the
  --dangerously-load-development-channels flag on the claude
  invocation. Until any of the three is in place, delivery silently
  falls back to poll mode with no diagnostic. The README now says all
  of this explicitly so a new operator doesn't grep the binary for
  channel_enable to figure it out.

#8 — serverInfo.name mismatch (a2a_mcp_server.py)
  The server reported `serverInfo.name = "a2a-delegation"` while
  operators register it as `molecule` (the name in `claude mcp add
  molecule …`). Harmless on tool routing today but matters for any
  future Claude Code allowlist that gates push by hardcoded server
  name. Renamed to "molecule" with an inline comment explaining the
  invariant.

Deferred (separate issues to track)
===================================

#2 — covered transitively by #1's pipx recommendation; no separate fix.
#5 — `moleculesai/claude-code-plugin` marketplace repo (substantial new
     repo work; the README references it as a documented follow-up).
#6 — `molecule-mcp doctor` subcommand (substantial new CLI surface;
     mentioned in the README's push-vs-poll section as the planned
     diagnostic for silent push fallback).
#7 — `--dangerously-load-development-channels` rename — not in our
     control; that's Claude Code's flag.

Tests
=====
164/164 mcp_cli + a2a_mcp_server tests pass locally
(WORKSPACE_ID=00000000-0000-0000-0000-000000000001 pytest …) including
6 new TestTokenFileEnv cases. Wheel builds successfully via
scripts/build_runtime_package.py with the new README markers verified
in the output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 14:19:09 -07:00
Hongming Wang
0d0840d9d9
Merge branch 'staging' into refactor/a2a-tools-messaging-extract-rfc2873-iter4d 2026-05-05 13:41:55 -07:00
Hongming Wang
23d3f057d3
Merge pull request #2890 from Molecule-AI/refactor/a2a-tools-memory-extract-rfc2873-iter4c
refactor(workspace): extract memory tools from a2a_tools.py (RFC #2873 iter 4c)
2026-05-05 20:31:45 +00:00
Hongming Wang
6470e5f41b
Merge pull request #2887 from Molecule-AI/refactor/a2a-tools-delegation-extract-rfc2873-iter4b
refactor(workspace): extract delegation handlers from a2a_tools.py (RFC #2873 iter 4b)
2026-05-05 17:40:40 +00:00
Hongming Wang
abba16beb4
Merge pull request #2883 from Molecule-AI/refactor/a2a-tools-rbac-extract-rfc2873-iter4a
refactor(workspace): extract RBAC helpers from a2a_tools.py (RFC #2873 iter 4a)
2026-05-05 16:59:36 +00:00
Hongming Wang
9c752e0673
Merge pull request #2879 from Molecule-AI/refactor/mcp-cli-split-rfc2873-iter3
refactor(workspace): split mcp_cli.py into focused modules (RFC #2873 iter 3)
2026-05-05 16:58:05 +00:00
Hongming Wang
3e0d2e650a refactor(workspace): extract messaging tools from a2a_tools.py to a2a_tools_messaging.py (RFC #2873 iter 4d)
Fourth slice of the a2a_tools.py split (stacked on iter 4c). Owns the
four human-and-peer messaging MCP tools + the chat-upload helper:

  * _upload_chat_files — stage local paths to /chat/uploads
  * tool_send_message_to_user — push canvas-chat via /notify
  * tool_list_peers — discover peers across registered workspaces
  * tool_get_workspace_info — JSON-encode workspace info
  * tool_chat_history — fetch prior conversation rows with a peer

a2a_tools.py shrinks from 508 → 213 LOC (−295). The remaining 213
is just report_activity + back-compat re-exports. Inbox tools
(tool_inbox_peek/pop/wait_for_message) deferred to iter 4e.

Layered architecture: messaging depends on a2a_tools_rbac (iter 4a),
a2a_client, platform_auth — NOT on kitchen-sink a2a_tools. An
import-contract test pins this so future refactors that add
`from a2a_tools import …` fail in CI.

Tests:
  * 28 patch sites in TestToolSendMessageToUser + TestToolListPeers +
    TestToolGetWorkspaceInfo + TestChatHistory retargeted from
    `a2a_tools.{httpx, get_peers_*, get_workspace_info,
    _upload_chat_files, _peer_*, list_registered_workspaces}` to
    `a2a_tools_messaging.…` because the call sites moved.
  * test_a2a_tools_messaging.py adds 7 new tests:
    - 5 alias drift gates
    - 2 import-contract tests (no top-level a2a_tools dep + a2a_tools
      surfaces every messaging symbol)

137 tests total in the a2a_tools suite, all green.

Refs RFC #2873.
2026-05-05 09:50:47 -07:00
Hongming Wang
210a26d31a refactor(workspace): extract memory tools from a2a_tools.py to a2a_tools_memory.py (RFC #2873 iter 4c)
Third slice of the a2a_tools.py split (stacked on iter 4b). Owns the
two persistent-memory MCP tools:

  * tool_commit_memory — write to /workspaces/:id/memories with RBAC
    + GLOBAL-scope tier-zero enforcement
  * tool_recall_memory — search /workspaces/:id/memories with RBAC

a2a_tools.py shrinks from 609 → 508 LOC (−101). Both handlers depend
ONLY on a2a_tools_rbac (iter 4a), a2a_client, and the platform's
/memories endpoint — no entanglement with delegation or messaging.

Side-effects of the layered architecture: a2a_tools_memory's import
contract is "depends on a2a_tools_rbac, never on a2a_tools" — the
kitchen-sink module is for back-compat re-exports only. A test pins
this so a future refactor that re-introduces `from a2a_tools import …`
fails in CI.

Tests:
  * 49 patch sites in TestToolCommitMemory + TestToolRecallMemory
    retargeted from `a2a_tools.{_check_memory_*, _is_root_workspace,
    httpx.AsyncClient}` to `a2a_tools_memory.…` because the call sites
    moved.
  * test_a2a_tools_memory.py adds 4 new tests (alias drift gate +
    import-contract + a2a_tools-side re-export).

117 tests total (77 impl + 28 rbac + 8 delegation + 4 memory), all green.

Refs RFC #2873.
2026-05-05 09:50:39 -07:00
Hongming Wang
2227a14b1e fix(build): add a2a_tools_delegation to TOP_LEVEL_MODULES drift gate
Iter 4b's new module needs the rewrite-list entry. Stacked on iter 4a
which already added a2a_tools_rbac.

Refs RFC #2873 iter 4b.
2026-05-05 05:01:04 -07:00
Hongming Wang
17aec22f9b fix(build): add a2a_tools_rbac to TOP_LEVEL_MODULES drift gate
Iter 4a's new module needs to be in the rewrite list so the wheel
ships its imports prefixed correctly. Caught by 'PR-built wheel +
import smoke'.

Refs RFC #2873 iter 4a.
2026-05-05 05:00:47 -07:00
Hongming Wang
8388144098 fix(build): add iter-3 mcp_* modules to TOP_LEVEL_MODULES drift gate
The iter-3 split created mcp_heartbeat / mcp_inbox_pollers /
mcp_workspace_resolver but the wheel build's drift-gate check at
scripts/build_runtime_package.py:TOP_LEVEL_MODULES wasn't updated.
Without this fix the wheel ships those modules un-rewritten, so
their imports of platform_auth / configs_dir / etc. break at
runtime. Caught by the 'PR-built wheel + import smoke' check.

Refs RFC #2873 iter 3.
2026-05-05 05:00:29 -07:00
Hongming Wang
86015412eb build(runtime): register inbox_uploads in TOP_LEVEL_MODULES
The drift gate in build_runtime_package.py rejects any workspace/*.py
module not listed in TOP_LEVEL_MODULES — it would ship un-rewritten
and break wheel imports. Add inbox_uploads (introduced in this PR)
to the list.
2026-05-05 04:41:07 -07:00
Hongming Wang
a8850bac55
Merge pull request #2778 from Molecule-AI/fix/redact-secrets-1777932233
fix(runtime): redact secret-shaped tokens from JSON-RPC error.data
2026-05-04 22:13:29 +00:00
Hongming Wang
28f22609d9 fix(runtime): redact secret-shaped tokens from JSON-RPC error.data
PR #2756 piped adapter.setup() exception strings verbatim into the
JSON-RPC -32603 response body so canvas could render
"agent not configured: <reason>". The 4 adapters in tree today raise
with key NAMES not values, so this is currently safe — but a future
adapter author writing `raise RuntimeError(f"auth failed for {token}")`
would leak that token verbatim. Issue #2760 flagged the risk; this PR
closes it.

workspace/secret_redactor.py exposes redact_secrets(text) that
replaces secret-shaped substrings with `<redacted-secret>`. Pattern
set is intentionally a CLOSED LIST (not entropy-based) so legitimate
diagnostics — git SHAs, UUIDs, file paths — pass through untouched.

Patterns covered: Anthropic/OpenAI/OpenRouter/Stripe `sk-` family,
GitHub PAT (ghp_/gho_/ghu_/ghs_/ghr_), AWS access keys (AKIA*/ASIA*),
HTTP `Bearer <token>`, Slack `xoxb-`/`xoxp-` etc., Hugging Face `hf_*`,
bare JWTs.

Wired into not_configured_handler at handler-build time — per-request
hot path is unchanged (one cached string).

Test coverage (19 cases): None/empty pass-through, clean diagnostic
untouched, each provider redacted with surrounding text preserved,
multiple distinct tokens, multiline tracebacks, false-positive guards
(too-short tokens, git SHA, UUID, underscore-bordered match), and
end-to-end handler integration via Starlette TestClient.

Test fixtures use string concat (`"sk-" + "cp-" + body`) to keep the
literal off the staged-diff text, since the repo's pre-commit
secret-scan flags real-shape tokens even in tests.

`secret_redactor` registered in TOP_LEVEL_MODULES (drift gate).

Closes #2760
Pairs with: PR #2756, PR #2775

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:07:53 -07:00
Hongming Wang
4f4b6c4f90 test(runtime): pin PR #2756's card-vs-setup decoupling with build_routes helper
PR #2756's contract — card route always mounted regardless of
adapter.setup() outcome — lived inline in main.py's `# pragma: no cover`
boot sequence. A future refactor that re-coupled the two would have
silently bypassed PR #2756 and shipped the original "stuck booting
forever" UX again, with no pytest catching it.

This change extracts route assembly into workspace/boot_routes.py's
build_routes(card, executor, adapter_error) and pins the contract with
6 integration tests using Starlette's TestClient:

- test_card_route_serves_200_when_adapter_ready: happy path
- test_card_route_serves_200_when_adapter_failed: misconfigured boot,
  card still 200, skill stubs survive
- test_jsonrpc_returns_503_when_no_executor: full -32603 envelope with
  the adapter_error in error.data
- test_jsonrpc_returns_503_with_generic_when_no_error_string: fallback
  reason for the rare case main.py reaches this branch without one
- test_card_route_does_not_depend_on_executor: direct PR #2756
  regression guard — both branches MUST mount the card route
- test_executor_present_does_not_mount_not_configured_handler: sanity
  that a healthy workspace doesn't return -32603 to every request

Conftest stubs extended with a2a.server.routes / request_handlers
classes so the tests work under the existing a2a-mock infra (pattern
matches the AgentCard/AgentSkill stubs added for PR #2765).

main.py now calls build_routes; the inline if/else is gone. Same
production behaviour, cleaner shape, regression-proof.

Heavy a2a-sdk imports inside build_routes() are lazy (deferred to the
executor-only branch) so tests that only exercise the not-configured
path don't pull DefaultRequestHandler / InMemoryTaskStore.

card_helpers + boot_routes registered in TOP_LEVEL_MODULES (build
drift gate would have caught the missing entry on the wheel-publish
smoke).

All 18 related tests pass (test_boot_routes.py: 6, test_card_helpers.py:
6, test_not_configured_handler.py: 6).

Closes #2761
Pairs with: PR #2756 (decouple agent-card from setup),
            PR #2765 (defensive isolation of enrichment + transcript)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:59:56 -07:00
Hongming Wang
63ac99788b fix(runtime): isolate card-skill enrichment + transcript handler from adapter shape mismatch
PR #2756 added a try/except around adapter.setup() so a missing LLM key
doesn't crash the workspace boot. Two paths that now run AFTER setup
succeeds were not similarly isolated, leaving small but real coupling
risks for future adapter authors.

1. **Skill metadata enrichment swap (main.py:248-259).** When
   adapter.setup() returns, main.py reads adapter.loaded_skills and
   replaces the static stubs in agent_card.skills with rich metadata
   (description, tags, examples). The list comprehension assumes each
   element exposes .metadata.{id,name,description,tags,examples}. A
   future adapter that returns a non-canonical shape would raise
   AttributeError, propagate to the outer except, capture as
   adapter_error, and silently degrade an OK boot to the
   not-configured state — even though setup() actually succeeded.

   Extract to card_helpers.enrich_card_skills(card, loaded_skills) →
   bool. Helper swallows enrichment failures, logs the cause, returns
   False, leaves the static stubs in place. setup() success path
   continues unchanged. 6 unit tests cover: None input, empty list,
   canonical happy path, missing .metadata attr, partial .metadata
   (missing one canonical field), atomic-failure-no-partial-swap.

2. **/transcript handler (main.py:513).** Calls await
   adapter.transcript_lines(...) without try/except. BaseAdapter's
   default returns {"supported": false} so today's 4 adapters never
   trigger this — but a future adapter override that assumes setup()
   ran would surface as a 500 from Starlette's default error handler
   instead of a useful 503 with the exception class + message.
   Inline try/except returns 503 with the reason, matching the
   not-configured JSON-RPC handler's pattern.

Both changes match the architectural principle the PR #2756 chain
established: availability (workspace reachable) is decoupled from
configuration / adapter behavior. Operators see useful errors instead
of silent degradation; future adapter authors can't accidentally
break tenant readiness with a shape mismatch.

Adds:
- workspace/card_helpers.py (~50 lines, 100% covered)
- workspace/tests/test_card_helpers.py (6 tests)
- AgentCard/AgentSkill/AgentCapabilities/AgentInterface stubs to
  workspace/tests/conftest.py so future card-related tests work
  under the existing a2a-mock infrastructure
- card_helpers in TOP_LEVEL_MODULES (drift gate would have caught it)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:15:27 -07:00
Hongming Wang
d1122f8d28 fix(build): register not_configured_handler in TOP_LEVEL_MODULES
The wheel-build drift gate caught the new module added in this PR —
without registering it, the published wheel would ship `import
not_configured_handler` un-rewritten, which would `ModuleNotFoundError`
at runtime under `molecule_runtime.main`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 10:24:02 -07:00
Hongming Wang
9753d58539 fix(build): register event_log in TOP_LEVEL_MODULES
The wheel-build drift gate caught it correctly: any new top-level
module under workspace/ must be listed in TOP_LEVEL_MODULES so its
`from event_log import …` statements get rewritten to
`from molecule_runtime.event_log import …` at package time.

Without this entry, the published wheel ships event_log.py un-rewritten
and crashes at runtime with ModuleNotFoundError on first heartbeat.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 00:19:30 -07:00
Hongming Wang
b8fdbd9fab fix(runtime): register configs_dir in TOP_LEVEL_MODULES + drop alias
Wheel-build smoke gate detected `configs_dir` missing from
scripts/build_runtime_package.py:TOP_LEVEL_MODULES. Without it the
build would ship `import configs_dir` un-rewritten and every
external-runtime install would die on `ModuleNotFoundError` at first
import.

Two callers used `import configs_dir as _configs_dir` to belt-and-
suspenders against an imagined name collision, but the rewriter
rejects `import X as Y` because the rewrite would produce
`import molecule_runtime.X as X as Y` (invalid syntax). No actual
collision exists (only docstring/comment references). Switched to
plain `import configs_dir`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:13:57 -07:00
Hongming Wang
aacaba024c feat(wheel-smoke): exercise executor.execute() to catch lazy imports (#2275)
The existing wheel-publish smoke (`wheel_smoke.py`) only IMPORTS
`molecule_runtime.main` at module scope. Lazy imports buried inside
`async def execute(...)` bodies (e.g. `from a2a.types import FilePart`)
NEVER evaluate at static-import time — they crash at first message
delivery in production.

The 2026-04-2x v0→v1 a2a-sdk migration shipped 5 such regressions in
templates that all looked fine at module-load smoke. This change adds
`smoke_mode.py` plus a `MOLECULE_SMOKE_MODE=1` short-circuit in
`main.py`: after `adapter.create_executor(...)`, the boot path invokes
`executor.execute(stub_ctx, stub_queue)` once with a 5s timeout
(`MOLECULE_SMOKE_TIMEOUT_SECS`). Healthy import tree → execution
proceeds far enough to hit a network boundary and times out (exit 0).
Broken lazy import → `ImportError` / `ModuleNotFoundError` from inside
the executor body (exit 1). Other downstream errors (auth, validation)
pass — those are caught by adapter-level tests, not this gate.

Stub `(RequestContext, EventQueue)` is built from the real a2a-sdk so
SendMessageRequest/RequestContext constructor changes also surface as
import-tree failures (the regression class also includes "SDK
refactored mid-publish"). The stub-build itself is wrapped — if it
raises, that's a smoke fail too.

Phase 2 (separate PR, molecule-ci) wires this into
publish-template-image.yml so the publish gate runs the boot smoke
against every template image before pushing the tag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:21:18 -07:00
Hongming Wang
0acdf3bb56 fix(wheel): import inbox without alias to dodge rewriter collision
PR #2433 (notifications/claude/channel) shipped 'import inbox as
_inbox_module' inside a2a_mcp_server.py:main(). The build script's
import rewriter expands plain 'import inbox' to
'import molecule_runtime.inbox as inbox', so the original source
became 'import molecule_runtime.inbox as inbox as _inbox_module',
which is invalid Python.

Caught at the publish-runtime + PR-built-wheel-smoke gate (the
SyntaxError trace is in run 25200422679). The wheel didn't ship to
PyPI because publish-runtime's smoke-import step refused to install
it, but staging is currently sitting on a broken-build commit until
this fix-forward lands.

Changes:
- a2a_mcp_server.py: lift `import inbox` to top of file (rewriter
  produces clean `import molecule_runtime.inbox as inbox`), call
  inbox.set_notification_callback directly in main()
- build_runtime_package.py: rewrite_imports() now raises ValueError
  when it sees 'import X as Y' for any X in the workspace allowlist,
  instead of silently producing a syntax-error wheel. Operator gets
  a clear actionable error at build time pointing at the offending
  line + suggested rewrites ('from X import …' or plain 'import X').

The build-time gate (this PR's rewriter check) catches the regression
class earlier than the smoke-time gate (PR #2433's failure). Adding
'PR-built wheel + import smoke' to staging branch protection's
required checks is filed separately so this class doesn't merge again.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 20:21:54 -07:00
Hongming Wang
b47d4ceb00 feat(workspace-runtime): add inbox polling for standalone molecule-mcp path
The universal MCP server (a2a_mcp_server.py) was outbound-only — agents
in standalone runtimes (Claude Code, hermes, codex, etc.) could
delegate, list peers, and write memories, but never observed the
canvas-user or peer-agent messages addressed to them. This blocked
"constantly responding" loops without forcing operators back onto a
runtime-specific channel plugin.

This PR closes the inbound gap with a poller-fed in-memory queue and
three new MCP tools:

  - wait_for_message(timeout_secs?) — block until next message arrives
  - inbox_peek(limit?)              — list pending messages (non-destructive)
  - inbox_pop(activity_id)          — drop a handled message

A daemon thread polls /workspaces/:id/activity?type=a2a_receive every
5s, fills the queue from the cursor (since_id), and persists the cursor
to ${CONFIGS_DIR}/.mcp_inbox_cursor so a restart doesn't replay backlog.
On 410 (cursor pruned) we fall back to since_secs=600 for a bounded
recovery window. Activity-row → InboxMessage extraction mirrors the
molecule-mcp-claude-channel plugin's extractText (envelope shapes #1-3
+ summary fallback).

mcp_cli.main starts the poller alongside the existing register +
heartbeat threads. In-container runtimes (which have push delivery via
canvas WebSocket) skip activation, so inbox tools return an
informational "(inbox not enabled)" message instead of double-delivery.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 16:32:48 -07:00
Hongming Wang
169e284d57 feat(workspace-runtime): expose universal MCP server to runtime=external operators
Ship the baseline universal MCP path that any external runtime (Claude
Code, hermes, codex, anything that speaks MCP stdio) can use, before
optimizing per-runtime channels. Today the workspace MCP server only
spins up inside the container; external operators have no way to call
the 8 platform tools (delegate_task, list_peers, send_message_to_user,
commit_memory, etc.) from outside.

Three additive changes:

1. **`platform_auth.get_token()` env-var fallback** — adds
   `MOLECULE_WORKSPACE_TOKEN` as a fallback when no
   `${CONFIGS_DIR}/.auth_token` file exists. File-first preserves
   in-container behavior unchanged. External operators (no /configs
   volume) now have a way to supply the token without faking the
   filesystem layout.

2. **`molecule-mcp` console script** — adds a new entry point in the
   published `molecule-ai-workspace-runtime` PyPI wheel. Operators run
   `pip install molecule-ai-workspace-runtime`, set 3 env vars
   (WORKSPACE_ID, PLATFORM_URL, MOLECULE_WORKSPACE_TOKEN), and register
   the binary in their agent's MCP config. `mcp_cli.main` is a thin
   validator wrapper — it checks env BEFORE importing the heavy
   `a2a_mcp_server` module so a misconfigured first-run gets a friendly
   3-line error instead of a 20-line module-level RuntimeError
   traceback.

3. **Wheel smoke gate** — extends `scripts/wheel_smoke.py` to assert
   `cli_main` and `mcp_cli.main` are importable. Same regression class
   as the 0.1.16 main_sync incident: a silent rename or unrewritten
   import here would break every external operator on the next wheel
   publish (memory: feedback_runtime_publish_pipeline_gates.md).

Test coverage:
- `tests/test_platform_auth.py` — 8 new tests for the env-var fallback:
  file-priority, env-fallback, whitespace handling, cache, header
  construction, empty-env-as-unset.
- `tests/test_mcp_cli.py` — 8 new tests for the validator: each
  required var separately, file-or-env satisfies token requirement,
  whitespace-only env treated as missing, help mentions canvas Tokens
  tab.
- Full `workspace/tests/` suite green: 1346 passed, 1 skipped.
- Local end-to-end: built wheel, installed in venv, ran `molecule-mcp`
  with no env → friendly error; with env → MCP server starts.

Why now / why this shape: user redirect was "support the baseline
first so all runtimes can use, then optimize". A claude-only MCP
channel leaves hermes/codex/third-party operators broken on
runtime=external. This PR ships the runtime-agnostic baseline; per-
runtime polish (claude-channel push delivery, hermes-native
bindings) is a follow-up PR. PR #2412 fixed the partner bug where
canvas Restart silently revoked the operator's token — the two
together unblock the external-runtime story end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 15:20:19 -07:00
Hongming Wang
e955597a98 feat(chat_files): rewrite Download as HTTP-forward (RFC #2312, PR-D)
Mirrors PR-C's Upload migration: replaces the docker-cp tar-stream
extraction with a streaming HTTP GET to the workspace's own
/internal/file/read endpoint. Closes the SaaS gap for downloads —
without this PR, GET /workspaces/:id/chat/download still returns 503
on Railway-hosted SaaS even after A+B+C+F land.

Stacks: PR-A #2313 → PR-B #2314 → PR-C #2315 → PR-F #2319 → this PR.

Why a single broad /internal/file/read instead of /internal/chat/download:

  Today's chat_files.go::Download already accepts paths under any of the
  four allowed roots {/configs, /workspace, /home, /plugins} — it's not
  strictly chat. Future PRs (template export, etc.) will reuse this
  endpoint via the same forward pattern; reusing avoids three near-
  identical handlers (one per domain) with duplicated path-safety logic.

Path safety is duplicated on platform + workspace sides — defence in
depth via two parallel checks, not "trust the workspace."

Changes:
  * workspace/internal_file_read.py — Starlette handler. Validates path
    (must be absolute, under allowed roots, no traversal, canonicalises
    cleanly). lstat (not stat) so a symlink at the path doesn't redirect
    the read. Streams via FileResponse (no buffering). Mirrors Go's
    contentDispositionAttachment for Content-Disposition header.
  * workspace/main.py — registers GET /internal/file/read alongside the
    POST /internal/chat/uploads/ingest from PR-B.
  * scripts/build_runtime_package.py — adds internal_file_read to
    TOP_LEVEL_MODULES so the publish-runtime cascade rewrites its
    imports correctly. Also includes the PR-B additions
    (internal_chat_uploads, platform_inbound_auth) since this branch
    was rooted before PR-B's drift-gate fix; merge-clean alphabetic
    additions.
  * workspace-server/internal/handlers/chat_files.go — Download
    rewritten as streaming HTTP GET forward. Resolves workspace URL +
    platform_inbound_secret (same shape as Upload), builds GET request
    with path query param, propagates response headers (Content-Type /
    Content-Length / Content-Disposition) + body. Drops archive/tar
    + mime imports (no longer needed). Drops Docker-exec branch entirely
    — Download is now uniform across self-hosted Docker and SaaS EC2.
  * workspace-server/internal/handlers/chat_files_test.go — replaces
    TestChatDownload_DockerUnavailable (stale post-rewrite) with 4
    new tests:
      - TestChatDownload_WorkspaceNotInDB → 404 on missing row
      - TestChatDownload_NoInboundSecret → 503 on NULL column
        (with RFC #2312 detail in body)
      - TestChatDownload_ForwardsToWorkspace_HappyPath → forward shape
        (auth header, GET method, /internal/file/read path) + headers
        propagated + body byte-for-byte
      - TestChatDownload_404FromWorkspacePropagated → 404 from
        workspace propagates (NOT remapped to 500)
    Existing TestChatDownload_InvalidPath path-safety tests preserved.
  * workspace/tests/test_internal_file_read.py — 21 tests covering
    _validate_path matrix (absolute, allowed roots, traversal, double-
    slash, exact-match-on-root), 401 on missing/wrong/no-secret-file
    bearer, 400 on missing path/outside-root/traversal, 404 on missing
    file, happy-path streaming with correct Content-Type +
    Content-Disposition, special-char escaping in Content-Disposition,
    symlink-redirect-rejection (lstat-not-stat protection).

Test results:
  * go test ./internal/handlers/ ./internal/wsauth/ — green
  * pytest workspace/tests/ — 1292 passed (was 1272 before PR-D)

Refs #2312 (parent RFC), #2308 (chat upload+download 503 incident).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:19:02 -07:00
Hongming Wang
f323def18f chore(build): include platform_tools in runtime wheel SUBPACKAGES
The PR-built wheel + import smoke gate refused the platform_tools
package because it's a new subdirectory under workspace/ that wasn't
in scripts/build_runtime_package.py:SUBPACKAGES. The drift gate (which
exists for exactly this reason) caught it cleanly:

  error: SUBPACKAGES drifted from workspace/ subdirectories:
    in workspace/ but NOT in SUBPACKAGES (will ship un-rewritten or
    be excluded): ['platform_tools']

Adding platform_tools to SUBPACKAGES wires the package into the
runtime wheel + applies the canonical
  from platform_tools.<x> -> from molecule_runtime.platform_tools.<x>
import-rewrite step that every other subpackage uses.

Verified locally: scripts/build_runtime_package.py succeeds, the
rewritten a2a_mcp_server.py reads
  from molecule_runtime.platform_tools.registry import TOOLS
which matches the package layout in the wheel.
2026-04-28 17:19:00 -07:00
Hongming Wang
6e732ab714 fix(build): ship lib/ subpackage + extend drift gate to SUBPACKAGES
Two compounding bugs that bit hermes (and any other workspace that
reaches main.py:142):

1. workspace/lib/ was in EXCLUDE_DIRS so the published wheel didn't
   contain the directory at all. main.py imports `from lib.pre_stop
   import read_snapshot` (and `build_snapshot`, `write_snapshot`) so
   every workspace startup that reaches the snapshot path crashed
   with `ModuleNotFoundError: No module named 'lib'`.

2. Even if lib/ had shipped, `lib` wasn't in SUBPACKAGES so the
   import-rewriter would have left the bare `from lib.pre_stop`
   unqualified — it would still fail because the package would only
   be reachable as `molecule_runtime.lib`.

Fix: move `lib` from EXCLUDE_DIRS to SUBPACKAGES (one entry each).

Drift gate extension: the existing gate I added in #2163 only
asserted TOP_LEVEL_MODULES against workspace/*.py. This change adds
the symmetric assertion for SUBPACKAGES against workspace/<dir>/
(filtered by EXCLUDE_DIRS + presence of __init__.py). Catches both:
- Subpackage added to workspace/ but missed in SUBPACKAGES
- Subpackage missing from workspace/ but lingering in SUBPACKAGES
- Subpackage wrongly in EXCLUDE_DIRS while also referenced by
  rewritten imports (the lib case)

Tested locally: build of 0.1.99 now ships lib/ and main.py contains
`from molecule_runtime.lib.pre_stop import ...` correctly rewritten.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:03:46 -07:00
Hongming Wang
c68dc1877f fix(release): drift-gate TOP_LEVEL_MODULES + smoke-import main in publish
Two compounding bugs surfaced when 0.1.16 hit production today:

1. scripts/build_runtime_package.py had a hand-curated TOP_LEVEL_MODULES
   set listing every workspace/*.py that should get its bare imports
   rewritten to `molecule_runtime.X`. The set silently went stale:
   - Missing: transcript_auth (added since #87 phase 1c), runtime_wedge,
     watcher → unrewritten imports shipped, every workspace startup
     died with ModuleNotFoundError.
   - Stale: claude_sdk_executor, cli_executor (both removed in #87),
     hermes_executor (never existed) → harmless but misleading.

2. publish-runtime.yml's wheel-smoke step asserted on stable invariants
   (BaseAdapter, AdapterConfig, a2a_client error sentinel) but never
   imported main. So even though main.py held the broken bare
   `from transcript_auth import ...`, the smoke check passed.

Fixes:

- Build script now derives the on-disk module set from workspace/*.py
  and asserts it matches TOP_LEVEL_MODULES exactly. Drift in either
  direction fails the build with a specific diff message instead of
  shipping a broken wheel. Closed-list typo guard preserved (we still
  edit the set explicitly when a module is added/removed) — the gate
  just makes drift impossible to ignore.

- TOP_LEVEL_MODULES updated to current reality: drop the 3 stale,
  add the 3 missing.

- publish-runtime.yml wheel-smoke now `import molecule_runtime.main`
  before the invariant asserts. main is the entry point and
  transitively imports every module — any bare-import bug surfaces
  as ModuleNotFoundError before PyPI accepts the upload.

Tested locally: `python3 scripts/build_runtime_package.py
--version 0.1.99 --out /tmp/build-test` succeeds, and
/tmp/build-test/molecule_runtime/main.py contains the rewritten
`from molecule_runtime.transcript_auth import ...`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 03:19:17 -07:00
Hongming Wang
0de67cd379 feat(platform/admin): /admin/workspace-images/refresh + Docker SDK + GHCR auth
The production-side end of the runtime CD chain. Operators (or the post-
publish CI workflow) hit this after a runtime release to pull the latest
workspace-template-* images from GHCR and recreate any running ws-* containers
so they adopt the new image. Without this, freshly-published runtime sat in
the registry but containers kept the old image until naturally cycled.

Implementation notes:
- Uses Docker SDK ImagePull rather than shelling out to docker CLI — the
  alpine platform container has no docker CLI installed.
- ghcrAuthHeader() reads GHCR_USER + GHCR_TOKEN env, builds the base64-
  encoded JSON payload Docker engine expects in PullOptions.RegistryAuth.
  Both empty → public/cached images only; both set → private GHCR pulls.
- Container matching uses ContainerInspect (NOT ContainerList) because
  ContainerList returns the resolved digest in .Image, not the human tag.
  Inspect surfaces .Config.Image which is what we need.
- Provisioner.DefaultImagePlatform() exported so admin handler picks the
  same Apple-Silicon-needs-amd64 platform as the provisioner — single
  source of truth for the multi-arch override.

Local-dev companion: scripts/refresh-workspace-images.sh runs on the
host and inherits the host's docker keychain auth — alternate path for
when GHCR_USER/TOKEN aren't set in the platform env.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-04-26 10:17:21 -07:00