Commit Graph

179 Commits

Author SHA1 Message Date
Hongming Wang
fd4b4e0723 test: pin null-required_env tolerance + drop unused MINIMAX env clear
Two self-review nits on the prior commit:
- Add test_per_model_required_env_null_treated_as_empty_no_auth — pins
  parser tolerance for YAML 'required_env:' (deserializes to None). The
  'or []' fallback handles it, but the behavior wasn't asserted, and a
  template author who writes 'required_env:' with no value (common YAML
  mistake) needs the no-auth path, not a confusing TypeError.
- Drop the MINIMAX_API_KEY delenv from the explicit-empty test — there's
  no MINIMAX in any required_env list of that scenario, so the cleanup
  was dead noise.

78/78 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 21:56:40 -07:00
Hongming Wang
3e5955f04f fix(runtime): explicit empty per-model required_env means "no auth"
Two follow-ups from the independent review of #2538.

preflight.py
============
Today: `if per_model_env: required_env = list(per_model_env)` falls
through on `[]`, so a template entry that says "this model needs no
auth" (`required_env: []` — Ollama, llamafile, self-hosted OpenAI-
compat, anything where the SDK doesn't surface a key) is silently
overridden by the top-level fallback list. The template author cannot
express a zero-auth model without lying about its env requirements.

Fix: key off `"required_env" in entry` (key presence, not truthiness).
Missing key still falls back to top-level — that path is unchanged
and preserves "many templates list name/description per model without
enumerating env vars when auth is identical across the family". Empty
list now wins outright. Comment updated to call out the distinction.

test_preflight.py
=================
Renamed `test_per_model_match_with_no_required_env_falls_back_to_top_level`
to `…_no_required_env_KEY_…` and tightened its docstring to reflect
that it's the missing-KEY case only. Added new
`test_per_model_explicit_empty_required_env_means_no_auth` to pin the
new explicit-empty semantic.

test_config.py
==============
New `test_runtime_config_model_env_wins_over_explicit_yaml`. Pins the
intentional precedence inversion shipped in #2538 with both
MODEL_PROVIDER and runtime_config.model in YAML set — MODEL_PROVIDER
wins. Without this pin a future refactor could quietly restore the
old YAML-wins order and re-introduce Bug B.

77/77 targeted tests pass locally.

Closes #250 (review follow-up). Builds on merged #2538.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 21:51:01 -07:00
Hongming Wang
97ebd1910a fix(runtime): canvas-picked model wins universally + per-model required_env
Two surgical edits to the molecule-runtime workspace package that fix
Bug B (canvas-picked model silently dropped for templated workspaces)
and Bug D (preflight rejects valid auth for non-default models),
universally for every adapter.

Bug B — canvas-picked model dropped (config.py)
================================================
Before: load_config resolved runtime_config.model as
  runtime_raw.get("model") or model
which means a template's `runtime_config.model: sonnet` always wins
over the canvas-picked MODEL_PROVIDER env var. Surfaced 2026-05-02
during MiniMax E2E — picking MiniMax-M2.7 in canvas, server plumbed
MODEL_PROVIDER=MiniMax-M2.7 correctly, but the workspace booted with
sonnet because the template's verbatim config.yaml won.

After:
  os.environ.get("MODEL_PROVIDER") or runtime_raw.get("model") or model

Centralising in load_config means EVERY adapter (claude-code, hermes,
codex, langgraph, future ones) gets canvas-picked-model passthrough
for free — no per-adapter env-reading code required.

Bug D — preflight per-model required_env (preflight.py)
========================================================
Before: preflight read the top-level required_env list, which
declares the auth needed by the *default* model. A template like
claude-code-default declares CLAUDE_CODE_OAUTH_TOKEN at the top
level. When a user picked MiniMax instead and only set
MINIMAX_API_KEY, preflight rejected the workspace with
"missing CLAUDE_CODE_OAUTH_TOKEN" and the workspace crash-looped
despite the user having satisfied the picked model's actual auth.

After: when runtime_config.models[] declares per-entry required_env,
preflight matches the picked model id (case-insensitive) and uses
that entry's required_env outright instead of the top-level list.
REPLACE semantics, not union — different models have *different*
auth paths (OAuth vs API key vs third-party provider key); unioning
would re-introduce the very crash-loop this fix closes.

Surface enabling both fixes (config.py)
========================================
RuntimeConfig now carries `models: list[dict]` so the canvas Model
dropdown source flows through to preflight without forcing the
parser schema to grow. Malformed entries are silently dropped to
match the rest of the lenient parser.

Tests
=====
- workspace/tests/test_preflight.py: 9 new tests covering the
  per-model lookup (case-insensitive, REPLACE not union, fallback
  to top-level when no models[] or no match, multi-entry, malformed
  entries dropped, etc.)
- workspace/tests/test_config.py: existing 48 pass; field
  initialisation already covered by parser tests.
- All 75 targeted tests pass locally; CI runs the full suite
  including coverage gate.

Closes part of #246. Sibling PR opens against
molecule-ai-workspace-template-claude-code for per-template
defensive fixes + boot debug logging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 21:36:24 -07:00
Hongming Wang
9c03b1084f
Merge pull request #2524 from Molecule-AI/dependabot/pip/workspace/opentelemetry-api-gte-1.41.1
chore(deps)(deps): update opentelemetry-api requirement from >=1.24.0 to >=1.41.1 in /workspace
2026-05-03 01:25:34 +00:00
Hongming Wang
476dbc83a3
Merge pull request #2530 from Molecule-AI/dependabot/pip/workspace/opentelemetry-exporter-otlp-proto-http-gte-1.41.1
chore(deps)(deps): update opentelemetry-exporter-otlp-proto-http requirement from >=1.24.0 to >=1.41.1 in /workspace
2026-05-03 01:25:31 +00:00
Hongming Wang
8dc07b46dd
Merge pull request #2526 from Molecule-AI/dependabot/pip/workspace/python-multipart-gte-0.0.27
chore(deps)(deps): update python-multipart requirement from >=0.0.18 to >=0.0.27 in /workspace
2026-05-03 01:25:25 +00:00
dependabot[bot]
dfc1f6d455
chore(deps)(deps): update pyyaml requirement in /workspace
Updates the requirements on [pyyaml](https://github.com/yaml/pyyaml) to permit the latest version.
- [Release notes](https://github.com/yaml/pyyaml/releases)
- [Changelog](https://github.com/yaml/pyyaml/blob/6.0.3/CHANGES)
- [Commits](https://github.com/yaml/pyyaml/compare/6.0...6.0.3)

---
updated-dependencies:
- dependency-name: pyyaml
  dependency-version: 6.0.3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-05-02 19:23:25 +00:00
dependabot[bot]
0e0550c640
chore(deps)(deps): update opentelemetry-exporter-otlp-proto-http requirement
Updates the requirements on [opentelemetry-exporter-otlp-proto-http](https://github.com/open-telemetry/opentelemetry-python) to permit the latest version.
- [Release notes](https://github.com/open-telemetry/opentelemetry-python/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-python/blob/v1.41.1/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-python/compare/v1.24.0...v1.41.1)

---
updated-dependencies:
- dependency-name: opentelemetry-exporter-otlp-proto-http
  dependency-version: 1.41.1
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-05-02 19:23:21 +00:00
dependabot[bot]
1d99b3b8ae
chore(deps)(deps): update python-multipart requirement in /workspace
Updates the requirements on [python-multipart](https://github.com/Kludex/python-multipart) to permit the latest version.
- [Release notes](https://github.com/Kludex/python-multipart/releases)
- [Changelog](https://github.com/Kludex/python-multipart/blob/main/CHANGELOG.md)
- [Commits](https://github.com/Kludex/python-multipart/compare/0.0.18...0.0.27)

---
updated-dependencies:
- dependency-name: python-multipart
  dependency-version: 0.0.27
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-05-02 19:23:15 +00:00
dependabot[bot]
8072f00b2f
chore(deps)(deps): update opentelemetry-api requirement in /workspace
Updates the requirements on [opentelemetry-api](https://github.com/open-telemetry/opentelemetry-python) to permit the latest version.
- [Release notes](https://github.com/open-telemetry/opentelemetry-python/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-python/blob/v1.41.1/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-python/compare/v1.24.0...v1.41.1)

---
updated-dependencies:
- dependency-name: opentelemetry-api
  dependency-version: 1.41.1
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-05-02 19:23:11 +00:00
Hongming Wang
fc33cf1131 docs(a2a): correct misleading v1-tolerance comments
Follow-up to PR #2509/#2510. The defensive v1-detection branches in
extract_attached_files (Python) and extractFilesFromTask (TypeScript)
were merged with comments claiming they fix a "v0→v1 silent-drop"
bug that surfaced as the 2026-05-01 hongming "no text content"
incident. Live test disproved that hypothesis: a2a-sdk's JSON-RPC
layer validates inbound requests against the v0 Pydantic union, so
v1 shapes are rejected at the request boundary — the v1 detection
branch is unreachable on the JSON-RPC ingress path. The actual root
cause of the hongming incident was the missing /workspace chown
fixed by CP PR #381 + test #382.

Update the comments to honestly describe these branches as
defensive future-proofing (kept against an eventual SDK schema
migration or in-process callers that construct Parts directly from
protobuf), not as fixes for an observed bug. Also trims
ChatTab.tsx's outbound-shape comment block from ~21 lines to a
3-line pointer to the SDK union.

Comment-only change. No behavior change. 86 workspace tests + 91
canvas tests still pass.
2026-05-02 02:33:00 -07:00
Hongming Wang
02a8841402 fix(a2a): send v1 file Part shape; tolerate v1 server-side
Image-only chats surface "Error: message contained no text content"
because canvas posts v0 `{kind:"file", file:{uri,name,mimeType}}` shapes
that the workspace runtime's a2a-sdk v1 protobuf parser silently drops:
v1 `Part` has fields `[text, raw, url, data, metadata, filename,
media_type]` and `ignore_unknown_fields=True` discards `kind`+`file`,
producing a fully-empty Part. With no text and no extracted file
attachments, the executor's "no text content" guard fires.

Three coordinated changes close the gap:

1. canvas/ChatTab.tsx — outbound file parts now carry the v1 flat
   shape `{url, filename, mediaType}` so the v1 protobuf parser
   populates Part fields instead of dropping them.
2. workspace/executor_helpers.py — extract_attached_files learns the
   v1 detection branch (non-empty `part.url` + `filename` +
   `media_type`) alongside the existing v0 RootModel and flat-file
   shapes. Defends every runtime that mounts the OSS wheel against
   the same drop, including any pre-fix client still on the wire.
3. canvas/message-parser.ts — extractFilesFromTask tolerates the v1
   shape on incoming agent responses too, so file chips render in
   chat history regardless of which Part shape the runtime emits.

Test pins:
- workspace/tests/test_executor_helpers.py:
  + v1 protobuf shape extraction
  + empty-Part defense (v0→v1 silent-drop fall-through returns [])
- canvas message-parser test:
  + v1 protobuf flat parts
  + filename fallback to URL basename for v1
2026-05-02 00:58:05 -07:00
Hongming Wang
6f0e914521
Merge pull request #2479 from Molecule-AI/fix/molecule-mcp-non-pipe-stdout
fix(mcp): friendly fail-fast when stdio isn't pipe-compatible
2026-05-02 02:20:51 +00:00
Hongming Wang
f6a48d593e test: standardise on from a2a_mcp_server import ... in TestStdioPipeAssertion
github-code-quality bot flagged 4 instances of `import a2a_mcp_server` in
the new TestStdioPipeAssertion class — every other test in the file uses
the `from a2a_mcp_server import ...` per-test pattern, so this is a real
inconsistency.

Switching the new tests to match. No behavior change; resolves the
4 unresolved review threads blocking the merge queue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 19:17:55 -07:00
Hongming Wang
1181699482
Merge pull request #2481 from Molecule-AI/fix/channel-peer-id-trust-boundary
fix(channel): validate peer_id at envelope build — close path-traversal foothold
2026-05-02 01:46:49 +00:00
Hongming Wang
0b979aed78 fix(channel): validate peer_id at envelope build — close path-traversal foothold
Two trust-boundary leaks surfaced in code review of the channel-envelope
enrichment work:

1. _agent_card_url_for(peer_id) interpolated raw input into
   ${PLATFORM_URL}/registry/discover/<peer_id> with no UUID guard. An
   upstream row with peer_id=`../../foo` produced an agent-visible URL
   pointing at a sibling registry path. Same trust-boundary rationale
   discover_peer's docstring already calls out: "never interpolate
   path-traversal characters into the URL". Now gated by _validate_peer_id;
   returns "" on validation failure.

2. _build_channel_notification echoed raw peer_id back into
   meta["peer_id"], which on the push path renders inside the agent's
   <channel peer_id="..." kind="..."> XML-attribute context. Attacker
   bytes (control chars, embedded quotes) would land in agent-rendered
   text wired into the next conversation turn. Now canonicalised through
   _validate_peer_id before any meta write; on validation failure we
   set "" rather than reflecting the raw bytes.

Defense-in-depth — both layers gate independently. Mutation-verified by
stashing both prod-side files and confirming both regression tests fail.

Tests:
- test_envelope_enrichment_invalid_peer_id_skips_lookup: updated to
  pin the safe behavior (peer_id="" + agent_card_url absent), not the
  prior leak shape.
- test_envelope_enrichment_strips_path_traversal_peer_id: NEW. Hard
  regression for peer_id="../../foo" — pins both the URL-builder and
  the meta echo against this specific exploit shape.
- Two existing tests updated to use UUID-shape placeholders instead
  of "ws-peer-uuid" / "peer-ws-uuid" since those non-UUIDs now correctly
  get stripped by the validator.

Resolves the Required-grade finding from the multi-axis review on PR #2471.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 18:43:49 -07:00
Hongming Wang
88b156a3bc
Merge pull request #2480 from Molecule-AI/chore/runtime-wedge-dedup-fixture
chore(tests): drop redundant local _reset fixture from test_runtime_wedge
2026-05-02 01:33:31 +00:00
Hongming Wang
8838f99ed3 chore(tests): drop redundant local _reset fixture from test_runtime_wedge
PR #2475 promoted runtime_wedge reset to an autouse conftest fixture in
workspace/tests/conftest.py covering every test in this directory. The
local @pytest.fixture(autouse=True) _reset in test_runtime_wedge.py
became dead-but-harmless (idempotent reset is idempotent — both fixtures
ran on every test, double-resetting). Remove the local copy so future
maintainers don't have to keep two definitions in sync.

Caught during a deeper /code-review-and-quality pass on the #2475
follow-ups — the original PR landed the conftest fixture but missed
the dedup of the now-redundant in-file fixture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 18:31:21 -07:00
Hongming Wang
9bbf32b526
Merge pull request #2471 from Molecule-AI/feat/channel-envelope-enrichment
feat(a2a-mcp): enrich channel envelope with peer name/role/agent_card_url
2026-05-02 01:31:15 +00:00
Hongming Wang
885eff2350 test: drop unused _OTHER_PEER constant
github-code-quality bot flagged it as an unused module-level global —
correctly. The earlier draft of the negative-cache test was going to
exercise two distinct peer IDs hitting the registry concurrently, but
the test was simplified to a single-peer flow before merge and the
constant lost its consumer.

Resolves the only blocking review thread on PR #2471.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 18:28:24 -07:00
Hongming Wang
82beb98fff
Merge pull request #2474 from Molecule-AI/feat/chat-history-mcp-tool
feat(a2a-mcp): add chat_history tool for prior turns with a peer
2026-05-02 01:27:38 +00:00
Hongming Wang
afc01d6995 fix(mcp): friendly fail-fast when stdio isn't pipe-compatible
When molecule-mcp is launched with stdin or stdout redirected to a
regular file (molecule-mcp > out.txt, ad-hoc CI smoke-tests, local
debugging), asyncio.connect_read_pipe / connect_write_pipe later raise
ValueError: Pipe transport is only for pipes, sockets and character
devices — surfaced to the operator as a confusing traceback with no
hint about what to do.

Add _assert_stdio_is_pipe_compatible() to detect the same constraint
synchronously before the event loop starts, exit cleanly with code 2,
and print a stderr message that names:
  - which stream failed (stdin vs stdout)
  - the asyncio transport requirement
  - the two common causes (>file, <file) and a working alternative
    (molecule-mcp 2>&1 | tee out.txt)

Wired into cli_main() (the synchronous wrapper around asyncio.run(main()))
so wheel-smoke + the production launch path both go through the guard
without changing the async stdio loop body. Closed/stale-fd case also
handled — os.fstat OSError exits 2 with the same guidance instead of
escaping.

Tests: 4 new in TestStdioPipeAssertion — pipe-pair happy path,
regular-file stdout (the bug condition), regular-file stdin (symmetric
case), and closed-fd. Mutation-verified — all 4 fail without the prod
helper. 37/37 in test_a2a_mcp_server.py.

Closes Molecule-AI/molecule-ai-workspace-runtime#61.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 18:26:24 -07:00
Hongming Wang
e6eda38318 fix(a2a-client): negative-cache registry failures in enrich_peer_metadata
Self-review on PR #2471: failure outcomes (4xx/5xx/non-JSON/network
exception) weren't writing to _peer_metadata, so a peer with a flaky
or missing registry record re-fired the 2s-bounded GET on EVERY
push. The cache became a no-op for the exact failure scenarios it
most needs to defend against, and the poller thread stalled 2s per
push for that peer until the registry came back.

Cache the failure outcome as `(now, None)` so the TTL window
suppresses re-fetch. Two new tests pin the behaviour for both
HTTP failures (5xx) and transport exceptions (httpx.ConnectError).

Type signature widens to `dict | None` on the value tuple's second
slot to match the new sentinel; readers already handle `None` as
"no enrichment available" — that's the documented graceful-degrade
contract — so no caller change needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 18:16:35 -07:00
Hongming Wang
1282b6f3d6 docs(a2a-tools): drop stale comment — before_ts is now server-supported
Self-review on PR #2474 + #2476: the comment said we don't forward
before_ts, but the code below does. Misleading after #2476 added
the server-side filter. Replace with a one-liner that just states
the forward-and-validate contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 18:13:51 -07:00
Hongming Wang
46bc63e373 chore(smoke): runtime_wedge follow-ups from PR #2473 review
Three review nits from PR #2473:

1. Narrow `_check_runtime_wedge` import catch to (ImportError,
   ModuleNotFoundError). The bare `except Exception:` would have
   masked an `AttributeError`/`TypeError` from a runtime_wedge API
   rename — silently degrading the smoke gate to "no wedge info" with
   no log line. The `runtime_wedge_signature.json` snapshot test
   (task #169) carries the API-drift load instead.

2. Drop the unreachable `or "<unspecified>"` fallback. `wedge_reason()`
   only returns "" when not wedged, but the call is guarded by
   `is_wedged()` being True and `mark_wedged` requires a non-None
   reason. The defensive arm couldn't fire.

3. Promote `reset_runtime_wedge` from a per-file fixture in
   test_smoke_mode.py to an autouse fixture in
   workspace/tests/conftest.py. Heartbeat tests or future adapter
   tests that call `mark_wedged` without cleanup would otherwise leak
   a sticky wedge into smoke tests later in the same pytest process —
   smoke tests would fail-via-leak instead of asserting their actual
   contract. Two-sided reset survives early test failures.

Also: `test_check_runtime_wedge_returns_none_when_module_missing`
now `monkeypatch.delitem(sys.modules, "runtime_wedge")` before
patching `__import__`, so the test re-exercises the import path
instead of resolving from the module cache (the test was passing
today by luck — it would still pass even if the catch arm were
deleted, because the cached module's `is_wedged` returned False).

Tests: 28 still pass in test_smoke_mode.py, 57 across smoke + wedge +
heartbeat. Regression-injection-checked: catch tightening doesn't
regress the existing wedge tests.
2026-05-01 18:01:51 -07:00
Hongming Wang
09e99a09c6 feat(a2a-mcp): add chat_history tool for prior turns with a peer
When a peer_agent push lands and the agent needs context from prior
turns with that workspace ("what task did this peer assign me last
hour?", "what did I tell them?"), the only options today are
re-deriving from memory (lossy) or scrolling activity_logs in the
canvas (no agent-facing tool). Surface the platform's existing
audit log directly via a new MCP tool so agents can read both sides
of an A2A conversation in chronological order.

Implementation:
- a2a_tools.py: new tool_chat_history(peer_id, limit=20, before_ts="")
  hits /workspaces/<self>/activity?peer_id=X&limit=N (the new server
  filter from molecule-core#2472). Reverses the DESC response into
  chronological order so the agent reads top-down. Graceful error
  envelope on validation/network/non-200 — never crashes the MCP
  server, agent can branch on Error: prefix.
- platform_tools/registry.py: ToolSpec wired into the A2A section so
  the rendered system-prompt block automatically includes it. Same
  pattern as the existing inbox_peek/inbox_pop/wait_for_message.
- a2a_mcp_server.py: dispatch in handle_tool_call.
- executor_helpers.py: _CLI_A2A_COMMAND_KEYWORDS gets a None entry
  (CLI runtimes don't expose chat history today; flip to a keyword
  when a2a_cli grows a `history` subcommand).
- snapshots/a2a_instructions_mcp.txt regenerated.

Tests: 10 new branches in TestChatHistory (validation / param
forwarding / limit cap / before_ts pass-through / DESC→chronological
reorder / 400 verbatim / 500 generic / network exc / non-list resp).
Mutation-verified: reverting a2a_tools.py fails 10/10. Full test
suite remains green at 1516 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:54:23 -07:00
Hongming Wang
b39dc62de6
Merge pull request #2473 from Molecule-AI/feat/universal-turn-smoke-runtime-wedge
feat(smoke): consult runtime_wedge after execute() to catch SDK init wedges
2026-05-02 00:52:31 +00:00
Hongming Wang
103ac09aeb docs(a2a-mcp): list new envelope attrs in initialize instructions
The agent learns about <channel> tag attributes ONLY from the
instructions string returned by initialize. Without this update the
wheel ships peer_name / peer_role / agent_card_url on the wire but
no agent ever uses them — they get printed inline in the push tag,
the agent doesn't know they're there, and the UX gain from the
enrichment is lost.

Update _build_channel_instructions to:
- List the new attrs in the <channel> tag template under PUSH PATH
- Add per-attribute semantics (when present, what to do with them,
  what \"absent\" means — graceful-degrade vs bug)
- Point at the discover endpoint for agent_card_url so the agent
  treats it as a follow-on URL not the body of the message

Tests: structural pin asserting all three attr names appear in the
instructions AND the per-field semantics phrases (\"registry
resolved\", \"discover endpoint\") so a future copy-edit that
shortens the prose can't silently drop the agent guidance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:49:40 -07:00
Hongming Wang
59f0a449bd feat(smoke): consult runtime_wedge after execute() to catch SDK init wedges
Timeout-as-PASS in run_executor_smoke missed the PR-25-class
regression: claude-agent-sdk takes 60s to time out on a malformed
argv, our outer wait_for fires at 5s default and reports "imports
healthy, hit a network boundary." A broken image then ships to GHCR.

Universal fix uses the existing runtime_wedge module (already
documented as the cross-cutting wedge holder, already read by
heartbeat). Adapters opt-in by calling runtime_wedge.mark_wedged()
from their executor's wedge catch arm; the smoke now consults
runtime_wedge.is_wedged() at the end of every result path and
upgrades a provisional PASS to FAIL when the flag is set. Non-opt-in
adapters keep working as before — the check is additive.

CI uses MOLECULE_SMOKE_TIMEOUT_SECS=90 to outlast the SDK's 60s
initialize() handshake so the wedge marks before our outer wait_for
fires. Module + helper docstrings call out the calibration so a
future contributor doesn't lower it without thinking through what
that wins back vs. what it loses.

Tests: 7 new cases pinning the wedge-aware paths — mark+raise (PR-25
shape), mark+block (still-running execute that wait_for cuts short),
clean+clean (additive contract), import-resilience (fail-open when
runtime_wedge unimportable). Regression-injection-checked: silencing
the new check fails both wedge-shape tests at unit-test time.
2026-05-01 17:46:43 -07:00
Hongming Wang
0fec3d6fe4 fix(test): anchor envelope-enrichment TTL test to monotonic baseline
Setting fetched_at = 0.0 assumed wall-clock semantics, but
time.monotonic() returns process uptime — when this test ran
early in the pytest run, current was <300s and the entry was
treated as fresh, silently skipping the re-fetch the assertion
expects. Anchor to time.monotonic() - TTL - 60 so the entry is
unambiguously past the freshness window regardless of when
in the run the test fires.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:45:05 -07:00
Hongming Wang
050aa33fc1 feat(a2a-mcp): enrich channel envelope with peer name/role/agent_card_url
The bare envelope only carried `peer_id` for peer_agent inbound, so a
receiving agent had to round-trip to /registry to find out who's
talking. Surface the sender's display name, role, and an agent-card
URL alongside the routing fields so the agent can render
"ops-agent (sre): ping" in one shot without an extra lookup.

a2a_client.py:
- Add _peer_metadata cache `dict[peer_id → (fetched_at, record)]`
- Add enrich_peer_metadata(peer_id) — sync, hits cache or registry
  with a tight 2s timeout, returns None on validation/network/non-200
  so callers can degrade gracefully
- TTL = 5 min so a busy multi-peer chat doesn't hit registry on every
  push, but role/name renames propagate within a session
- Add _agent_card_url_for(peer_id) — deterministic from peer_id alone

a2a_mcp_server.py:
- _build_channel_notification calls enrich_peer_metadata when peer_id
  is non-empty; meta carries peer_name + peer_role + agent_card_url
  alongside the existing routing fields
- agent_card_url surfaces unconditionally (constructable from peer_id);
  peer_name/role only when registry lookup succeeds — never blocks the
  push on a registry stall

Tests: 6 new branches (canvas_user no enrichment / cache hit no GET /
cache miss fetches once / registry-fail graceful degrade / TTL expiry
re-fetches / invalid peer_id skips lookup). Mutation-verified: 6/6
fail without prod code, 39/39 pass with.

Tracks the broader RFC at #2469 (workspace-server activity_type rename
to break the echo loop). Independent of PR #2470 — this is the
metadata-enrichment half of the same UX improvement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:40:09 -07:00
Hongming Wang
2d8c45989a fix(inbox): skip self-notify rows in poller to break echo loop
The workspace-server's `/notify` handler writes the agent's own
send_message_to_user POSTs to activity_logs as activity_type=
'a2a_receive', method='notify', source_id=NULL so the canvas
chat-history loader can restore those bubbles after a page reload.
The activity API exposes the row to /workspaces/:id/activity?
type=a2a_receive, so the inbox poller picks it up and pushes the
agent's own outbound back as an inbound `← molecule: Agent
message: ...` — confirmed live 2026-05-01.

Add `_is_self_notify_row` predicate matched on (method='notify' AND
no source_id) and call it from `_poll_once` before enqueue. The
predicate combines BOTH discriminators so a future caller using
method='notify' with a real peer_id still passes through. Cursor
advances past skipped rows so we don't re-poll the same self-notify
on every iteration.

Belt-and-braces: long-term fix lives in workspace-server (rename
the misclassified activity_type to 'agent_outbound' — RFC at
#2469). This guard stays regardless because it only excludes rows
we never want.

Tests: 7 new — predicate true/false matrix + integrated _poll_once
behavior (skip, cursor advance, notification suppression).
Mutation-verified: reverting inbox.py to the prior shape fails 7/7;
applied state passes 48/48.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:35:49 -07:00
Hongming Wang
f96bb9f860 docs(mcp): tagged server:NAME form in dev-channels reference
Claude Code 2.1.x's --dangerously-load-development-channels takes an
allowlist of tagged entries (`server:<name>` or
`plugin:<name>@<marketplace>`), not a bare switch. The instructions
field's push-only-mode message and the inline comment in
`_poll_timeout_secs` both referenced the old bare form. Update both
so an agent or operator reading them lands on the right invocation —
matched against the docs change in [molecule-docs PR #110](https://github.com/Molecule-AI/docs/pull/110).

No behavior change (string-only edits in instructions text + comment).
33/33 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:11:03 -07:00
Hongming Wang
f80e054a95
Merge pull request #2466 from Molecule-AI/feat/universal-push-via-instructions
feat(mcp): universal inbound delivery — instructions-driven polling + optional push
2026-05-01 23:13:05 +00:00
Hongming Wang
c61a6ff9bd chore(mcp): drop unused module-level _CHANNEL_INSTRUCTIONS
The frozen copy was a self-justification — the comment claimed "tests +
tooling rely on import-time identity" but no test or tooling code path
actually references the binding. _build_initialize_result() calls
_build_channel_instructions() fresh per call so env changes take effect,
which is the documented runtime contract.

github-code-quality flagged it; resolving the unused-variable thread so
the staging branch protection's all-conversations-resolved gate clears.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:11:09 -07:00
Hongming Wang
dbd086c7ad test(mcp): comment empty except in bridge test cleanup
Address github-code-quality review on PR #2465: explain why the
OSError swallow in pipe teardown is intentional (best-effort
cleanup of a possibly-already-closed fd).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:07:33 -07:00
Hongming Wang
ea206043d8 feat(mcp): universal inbound delivery — instructions-driven polling + optional push
Why this exists
---------------
Live evidence on 2026-05-01 caught a regression latent in #46's
"push-feel inbound" closure: standard `claude` launches without
`--dangerously-load-development-channels` silently drop our
`notifications/claude/channel` emissions, so canvas/peer messages sat
in the wheel inbox and never reached the agent loop until manual
`inbox_peek`. The flag is research-preview-only; non-Claude-Code MCP
clients (Cursor, Cline, OpenCode, hermes-agent, codex) never receive
the notification at all because the method namespace is Claude-
specific. Push-only delivery shipped as the universal contract is
not actually universal.

What this changes
-----------------
Adds a poll path that works on every spec-compliant MCP client. The
`initialize` `instructions` field — read by every client and surfaced
to the agent's system prompt automatically — now tells the agent to
call `wait_for_message(timeout_secs=N)` at the start of every turn.
Push remains as the strictly-better delivery for hosts that opt in
(Claude Code with the dev flag or a future allowlist entry), but is
no longer load-bearing.

Both paths converge on the same `inbox_pop` ack so duplicate-delivery
on a push+poll race is impossible: whoever surfaces the message to
the agent first pops it, the other side returns empty.

Operator knob
-------------
`MOLECULE_MCP_POLL_TIMEOUT_SECS` controls per-turn poll blocking
(default 2s). 0 disables polling for push-only Claude Code with the
dev flag. Above 60 clamps to 60 — protects against an accidental
five-minute stall per turn. Resolved fresh on every `initialize` so
a relaunch with new env is enough; no wheel rebuild required.

Tests
-----
- structural pins on the new instructions: `wait_for_message` +
  `timeout_secs` named, both PUSH PATH / POLL PATH labels present
- env-resolution: default fallback, garbage fallback, negative
  fallback, 60s clamp
- operator override: `MOLECULE_MCP_POLL_TIMEOUT_SECS=7` reaches the
  agent's instructions string
- timeout=0 toggles to push-only-mode messaging (no
  wait_for_message call asked of the agent)
- existing pins on push path, reply tools, prompt-injection defense,
  meta attributes — all preserved

Successor to #46. Closure milestone for this PR (per
feedback_close_on_user_visible_not_merge.md): launched `claude`
against the published wheel, sent a canvas message, observed the
agent surfaces the message inline at the start of its next turn
without me running `inbox_peek` — verified live before declaring done.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:32:57 -07:00
Hongming Wang
a3a496bced test(mcp): pin inbox→stdout bridge end-to-end with three failure-mode tests
Closes the dynamic-coverage gap on the `notifications/claude/channel`
push-UX bridge — until now we had static pins on the wire shape
(_build_channel_notification) and the initialize handshake, but the
threading + asyncio + stdout chain that ships notifications to the
host was never exercised under realistic conditions.

The three failure modes anticipated in #2444 §2 are each now pinned:

  test_inbox_bridge_emits_channel_notification_to_writer
    Drives a fake inbox event from a daemon thread, asserts the
    notification lands on a real os.pipe-backed asyncio writer with
    the correct JSON-RPC envelope. Catches: bridge wired up
    incorrectly (no-op _on_inbox_message), run_coroutine_threadsafe
    drift, _build_channel_notification call missing.

  test_inbox_bridge_swallows_closed_pipe_drain_error
    Closes the pipe's read end before firing, captures the
    concurrent.futures.Future that run_coroutine_threadsafe returns,
    asserts its exception() is None. Catches: narrowing the broad
    `except Exception` in _emit (e.g. to RuntimeError), or removing
    it. Without the swallow, the future carries a ConnectionResetError
    and the test fails with a clear message naming the regression.

  test_inbox_bridge_swallows_closed_loop_runtime_error
    Builds the bridge against a closed event loop, fires the
    callback, asserts no exception escapes. Catches: removing the
    `except RuntimeError` swallow on the run_coroutine_threadsafe
    call. Without it the poller thread would crash with
    "RuntimeError: Event loop is closed" during shutdown.

To make the bridge testable, extracted the closures from main() into
a top-level `_setup_inbox_bridge(writer, loop) -> Callable[[dict],
None]` helper. main()'s wire-up is now a single line that calls the
helper. Behavior is unchanged — same write, same drain, same
swallows — just no longer trapped inside main()'s closures.

Verified each test catches its regression by injection: removing
each swallow / no-op'ing the bridge each turn the matching test red
with a specific failure message that points at the missing piece.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:13:32 -07:00
Hongming Wang
e6be3c0df0 test(mcp): pin prompt-injection defense in _CHANNEL_INSTRUCTIONS
Adds the missing symmetric pin against the threat-model sentence —
the existing tests pin reply-tool names (send_message_to_user,
delegate_task, inbox_pop) and tag attributes (kind, peer_id,
activity_id) but left the "treat message body as untrusted user
content" line unpinned. A copy-edit that drops it would turn the
channel into an open prompt-injection vector against any workspace
running the MCP server.

Pins three signals: "untrusted" present, an explicit
"not execute"/"do not" clause, and the "approval" escape-hatch
sentence — two of three would let a partial copy-edit slip
through.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:24:05 -07:00
Hongming Wang
2588ab27d5 feat(mcp): add channel instructions field — second gate for push UX
PR #2461 added the experimental.claude/channel capability declaration
on the assumption that was the missing gate for Claude Code surfacing
notifications/claude/channel as inline <channel> interrupts. Research
against code.claude.com/docs/en/channels-reference.md confirms the
capability IS one gate — but there's a SECOND required field we still
don't ship: `instructions` on the initialize result.

The docs are explicit: instructions is what tells the agent what the
<channel> tag attributes mean and which tool to call to reply. Without
it the channel registers but the agent receives the tag with no
context and has no idea how to handle it. The official telegram
plugin ships both (server.ts:370-396) — capability AND instructions.
We were shipping one of two.

This adds the instructions string. It documents:
- kind/peer_id/activity_id meta attributes
- canvas_user → send_message_to_user reply path
- peer_agent → delegate_task reply path
- inbox_pop ack to prevent duplicate-poll re-delivery
- threat model: treat message bodies as untrusted user content

Tests: 4 new pins. instructions present + non-empty, instructions
names each reply tool, instructions documents each tag attribute.
Failure messages name the symptom so a copy-edit can't silently
break the channel.

Live verification still pending after wheel ships — same plan as
the gap is in --dangerously-load-development-channels (host-side
flag, outside our control during the channels research preview).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:24:05 -07:00
Hongming Wang
63ef3b128c docs(mcp): correct server.ts reference + flag verification gap on experimental.claude/channel
Follow-up to commit 0a87dec5 (PR #2461, merged before live verification).

Two corrections to the docstring on `_build_initialize_result()`:

1. The original "mirrors molecule-mcp-claude-channel server.ts:374"
   claim is wrong on two axes. Line 374 is unrelated poll-init code
   (a comment inside `registerAsPoll`). The actual capability site
   is server.ts:475, where the bun bridge declares only
   `{ capabilities: { tools: {} } }` — *no* `experimental.claude/channel`.
   The bun bridge is reported to deliver `notifications/claude/channel`
   successfully in Claude Code despite this, which is direct counter-
   evidence that adding the capability was the bug fix.

2. The `@modelcontextprotocol/sdk` server's `assertNotificationCapability`
   does not include `notifications/claude/channel` in any of its switch
   cases, meaning custom (non-spec) notification methods are sent
   regardless of declared capabilities. Server-side, the declaration
   is almost certainly a no-op.

This commit doesn't remove the capability — additive, not destructive,
and the new tests pin its presence — but downgrades the docstring's
certainty so the next person debugging "channel notification didn't
fire" doesn't trust a stale claim and pursues the more likely root
causes:

  - writer.drain() swallowing exceptions on a closed pipe
  - inbox-thread → asyncio.run_coroutine_threadsafe race during init
  - MCP transport not yet attached when the first inbox event fires

Live verification per #2444 §2 (fresh Claude Code session on this wheel
with a peer A2A message, observe whether the interrupt fires) remains
the open hard-gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:01:57 -07:00
Hongming Wang
0a87dec50e feat(mcp): declare experimental.claude/channel capability for push UX
Without this capability declaration in the initialize handshake,
Claude Code's MCP client receives our notifications/claude/channel
emissions but silently drops them — they never become inline
<channel> tags in the conversation. The push-UX bridge added in
PR #2433 ships, fires, and is invisible.

This was anticipated as a failure mode in #2444 §2 ("Notification
arrives but Claude Code doesn't surface it — host doesn't recognize
the method"), and confirmed live in this session: a canvas chat
"hi" landed in the inbox queue (inbox_peek returned it) but never
woke the agent until inbox_peek was called by hand.

The contract matches molecule-mcp-claude-channel/server.ts:374
where the bun bridge declares the same experimental flag.

Refactor: extracted _build_initialize_result() so the handshake
shape is unit-testable. Pure function, no behavioral change beyond
adding the experimental capability to the result.

Tests: 3 new pins on the initialize result (capability presence,
tools-still-there, protocolVersion stable). Closes the live-
verification gap §2 of #2444.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:45:06 -07:00
Hongming Wang
b8fdbd9fab fix(runtime): register configs_dir in TOP_LEVEL_MODULES + drop alias
Wheel-build smoke gate detected `configs_dir` missing from
scripts/build_runtime_package.py:TOP_LEVEL_MODULES. Without it the
build would ship `import configs_dir` un-rewritten and every
external-runtime install would die on `ModuleNotFoundError` at first
import.

Two callers used `import configs_dir as _configs_dir` to belt-and-
suspenders against an imagined name collision, but the rewriter
rejects `import X as Y` because the rewrite would produce
`import molecule_runtime.X as X as Y` (invalid syntax). No actual
collision exists (only docstring/comment references). Switched to
plain `import configs_dir`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:13:57 -07:00
Hongming Wang
c636022d2f fix(runtime): auto-fallback CONFIGS_DIR for non-container hosts (closes #2458)
The runtime persists per-workspace state (`.auth_token`,
`.platform_inbound_secret`, `.mcp_inbox_cursor`) under `/configs` —
the workspace-EC2 mount path. Inside a container that's writable,
agent-owned. Outside a container, `/configs` either doesn't exist or
isn't writable by an unprivileged user.

The default broke the external-runtime path (`pip install
molecule-ai-workspace-runtime` + `molecule-mcp` on a Mac/Linux
laptop). First heartbeat tries to persist `.platform_inbound_secret`
and crashes:

    [Errno 30] Read-only file system: '/configs'

The heartbeat thread logs and dies. Workspace flips offline within
a minute. Operator sees no actionable error.

Adds workspace/configs_dir.py — single resolution point with a tiered
fallback:

  1. CONFIGS_DIR env var, if set — explicit operator override
     (preserves existing tests + custom deployments verbatim).
  2. /configs — if it exists AND is writable. In-container default;
     unchanged behavior for every prod workspace.
  3. ~/.molecule-workspace — created with mode 0700 so per-file 0600
     perms aren't undermined by a world-readable parent.

Migrates the four readers (platform_auth, platform_inbound_auth,
mcp_cli, inbox) to call configs_dir.resolve() instead of
inlining `Path(os.environ.get("CONFIGS_DIR", "/configs"))`.

Existing tests that assert the old `/configs`-as-default contract
updated to assert the new contract: when CONFIGS_DIR is unset, path
resolves to a writable location — `/configs` if present, fallback
otherwise. Tests skip the fallback branch on hosts that DO have a
writable `/configs` (CI containers).

Verified the original repro is fixed: with no CONFIGS_DIR set on
macOS, configs_dir.resolve() returns ~/.molecule-workspace, the dir
exists, and writes succeed.

Test suite: 1454 passed, 3 skipped, 2 xfailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:07:55 -07:00
Hongming Wang
2e8892ebc4 fix(workspace): surface errno + path on chat-upload mkdir failure
Production incident on hongming.moleculesai.app 2026-05-01T18:30Z —
fresh-tenant signup chat upload returned 500 with the body
{"error":"failed to prepare uploads dir"}. Diagnosis required SSM
access to the workspace stderr to recover errno + actual path.

The root-cause fix lives in claude-code template entrypoint
(molecule-ai-workspace-template-claude-code#23 — pre-create the
.molecule subtree as root before gosu drops to agent). This change
is the diagnostic improvement: when mkdir fails for any reason in
the future (EACCES, ENOSPC, EROFS, etc.), the response carries
the errno + offending path so the operator inspecting browser
devtools sees the real cause without needing SSM.

Backwards compatible — top-level "error" key is unchanged so
existing canvas / external alert rules continue to match. New
fields are additive: path, errno, detail.

Test pins the diagnostic shape so a future struct refactor can't
silently drop these fields.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 11:47:53 -07:00
Hongming Wang
c06c4c0f56
Merge pull request #2450 from Molecule-AI/feat/observability-config-schema
feat(config): observability block schema (#119 PR-1 of 4)
2026-05-01 05:20:11 +00:00
Hongming Wang
645c1862c4 feat(a2a-client): surface 410 Gone as 'removed' error so callers can re-onboard (#2429)
Follow-up A to PR #2449 — that PR taught the platform to return 410
Gone for status='removed' workspaces; this PR teaches get_workspace_info
to consume that signal.

Before: every non-200 collapsed into {"error": "not found"}, which
made the 2026-04-30 incident impossible to diagnose — the operator
KNEW the workspace_id existed (they'd just registered it), but the
runtime kept reporting "not found" for a deleted-but-not-purged row.

After: 410 produces a distinct {"error": "removed", "id", "removed_at",
"hint"} dict so callers (heartbeat-loop, channel bridge, dashboard
tools) can surface "your workspace was deleted, re-onboard" instead
of "not found". Falls back to a default hint if the platform body
isn't parseable so the actionable signal doesn't depend on body
shape parity.

Two new tests:
  - TestGetWorkspaceInfo.test_410_returns_removed_with_hint
  - TestGetWorkspaceInfo.test_410_with_unparseable_body_falls_back_to_default_hint

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 22:08:08 -07:00
Hongming Wang
59902bce83 feat(config): add observability block schema (#119 PR-1 of 4)
Hermes-style declarative block grouping cadence + verbosity knobs into
one place. Schema-only in this PR — wiring into heartbeat.py and main.py
lands in PR-3 of the #119 stack.

Two fields with live consumers waiting:
- heartbeat_interval_seconds (default 30, clamped to [5, 300])
  → heartbeat.py:134 currently has hard-coded HEARTBEAT_INTERVAL = 30
- log_level (default "INFO", uppercased at parse)
  → main.py:465 currently has hard-coded log_level="info"

Clamp band [5, 300] is intentional: sub-5s flooded the platform during
IR-2026-03-11; >5min lets crashed workspaces look healthy long enough
to mask failure. Coerce at parse so adapters and heartbeat.py can read
the value without re-validating.

Tests pin defaults, explicit YAML override, partial override, and
parametrized clamp behavior (10 cases including garbage strings + None).

Part of: task #119 (adopt hermes-style architecture)
Stack:  PR-1 schema → PR-2 event_log → PR-3 wire consumers → PR-4 skill compat
2026-04-30 21:58:45 -07:00
Hongming Wang
661eec2659 chore(smoke-mode): harden module-load + drop dead except clause
Two follow-ups from the #2275 Phase 1 self-review:

1. `_SMOKE_TIMEOUT_SECS = float(os.environ.get(...))` was evaluated at
   module load. main.py imports smoke_mode unconditionally — before
   the is_smoke_mode() check — so a malformed
   MOLECULE_SMOKE_TIMEOUT_SECS env value would SystemExit every
   workspace boot, not just smoke runs. Wrapped in try/except with a
   5.0 fallback. Probability of a typo'd env var hitting production
   is low (it's a CI-only knob), but the footgun is removed entirely.
   Regression test reloads the module under a malformed env value.

2. `_real_a2a_sdk_available()` caught (ImportError, AttributeError).
   `from X import Y` raises ImportError when Y is missing on X — never
   AttributeError. Dropped the unreachable branch.

No behavior change for the happy path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:31:08 -07:00
Hongming Wang
aacaba024c feat(wheel-smoke): exercise executor.execute() to catch lazy imports (#2275)
The existing wheel-publish smoke (`wheel_smoke.py`) only IMPORTS
`molecule_runtime.main` at module scope. Lazy imports buried inside
`async def execute(...)` bodies (e.g. `from a2a.types import FilePart`)
NEVER evaluate at static-import time — they crash at first message
delivery in production.

The 2026-04-2x v0→v1 a2a-sdk migration shipped 5 such regressions in
templates that all looked fine at module-load smoke. This change adds
`smoke_mode.py` plus a `MOLECULE_SMOKE_MODE=1` short-circuit in
`main.py`: after `adapter.create_executor(...)`, the boot path invokes
`executor.execute(stub_ctx, stub_queue)` once with a 5s timeout
(`MOLECULE_SMOKE_TIMEOUT_SECS`). Healthy import tree → execution
proceeds far enough to hit a network boundary and times out (exit 0).
Broken lazy import → `ImportError` / `ModuleNotFoundError` from inside
the executor body (exit 1). Other downstream errors (auth, validation)
pass — those are caught by adapter-level tests, not this gate.

Stub `(RequestContext, EventQueue)` is built from the real a2a-sdk so
SendMessageRequest/RequestContext constructor changes also surface as
import-tree failures (the regression class also includes "SDK
refactored mid-publish"). The stub-build itself is wrapped — if it
raises, that's a smoke fail too.

Phase 2 (separate PR, molecule-ci) wires this into
publish-template-image.yml so the publish gate runs the boot smoke
against every template image before pushing the tag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:21:18 -07:00