Commit Graph

9 Commits

Author SHA1 Message Date
Hongming Wang
02e4520cf3 chore(executor): runtime_wedge mirror follow-ups from PR #29 review
Two review nits:

1. Narrow the import-arm catch in _mark_sdk_wedged and
   _clear_sdk_wedge_on_success to (ImportError, ModuleNotFoundError).
   The bare `except Exception:` would have masked an AttributeError /
   TypeError from a runtime_wedge API rename — silently degrading the
   mirror to "no-op" and making heartbeat + the smoke gate (#131)
   blind to claude-code wedges. The structural snapshot test in
   molecule-core (task #169) catches the rename at PR-time. Older
   runtimes that don't ship runtime_wedge at all still hit ImportError
   and silently no-op — the local sticky flag still gates is_wedged()
   inside this module so internal callers keep working.

2. Add mirror-CALL-failure injection tests. The recorder used by the
   original tests never raised, so the inner try around
   _mark_runtime_wedged(reason) (and the symmetric clear) wasn't
   pinned. New tests inject a recorder whose mark/clear raise on call,
   then assert: (a) the call attempt was recorded, (b) the local
   sticky flag stayed correct, (c) the failure was logged at ERROR.
   Pins both the contract ("mirror is best-effort, local is source of
   truth") AND the operator-visible signal (an ERROR log line is the
   only way to see a silent mirror regression).

Regression-injection-checked: removing the call-side try arm makes
both new tests fail with clear messages. Tests: 7 in
test_runtime_wedge_mirror.py, 45 across the whole tests/ tree.
2026-05-01 18:04:24 -07:00
Hongming Wang
b2561aa825 feat(executor): mirror SDK wedge into molecule_runtime.runtime_wedge
The local _sdk_wedged_reason flag was only observed inside this module
— heartbeat reads runtime_wedge.is_wedged() (universal cross-cutting
holder) and so does the new boot-smoke gate from molecule-core PR
#2473 / task #131. Without the mirror, a wedged claude-code workspace
stayed green-dot on the canvas while every chat hung, AND the
publish-image gate could not catch PR-25-class init wedges before
the broken image shipped to GHCR.

_mark_sdk_wedged now mirrors into runtime_wedge.mark_wedged, and
_clear_sdk_wedge_on_success mirrors into runtime_wedge.clear_wedge.
Both are best-effort — older runtimes that don't ship runtime_wedge
silently no-op the mirror, so a template pinned to an older runtime
still boots. Mirror exceptions are logged but don't suppress the
local sticky flag, so internal callers (retry loop, cancel handler)
see consistent state regardless of the universal-side outcome.
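
The mark/clear semantics described above could be modeled roughly as
follows. This is a sketch under assumptions: `SdkWedgeState` and
`FakeRuntimeWedge` are hypothetical names, the mirror is passed in as
an object rather than imported, and `mirror=None` stands in for an
older runtime that does not ship runtime_wedge.

```python
class FakeRuntimeWedge:
    """Stand-in for the universal holder (an assumption, not the real API)."""

    def __init__(self):
        self.reason = None
        self.calls = []

    def mark_wedged(self, reason):
        self.calls.append(("mark", reason))
        if self.reason is None:
            self.reason = reason

    def clear_wedge(self):
        self.calls.append(("clear",))
        self.reason = None

    def is_wedged(self):
        return self.reason is not None


class SdkWedgeState:
    """Local sticky flag that mirrors into an optional universal holder."""

    def __init__(self, mirror=None):
        self._sdk_wedged_reason = None
        self._mirror = mirror

    def mark(self, reason):
        if self._sdk_wedged_reason is not None:
            return  # first call wins
        self._sdk_wedged_reason = reason
        if self._mirror is not None:
            self._mirror.mark_wedged(reason)

    def clear_on_success(self):
        if self._sdk_wedged_reason is None:
            return  # no-op when not wedged
        self._sdk_wedged_reason = None
        if self._mirror is not None:
            self._mirror.clear_wedge()

    def is_wedged(self):
        return self._sdk_wedged_reason is not None


holder = FakeRuntimeWedge()
state = SdkWedgeState(mirror=holder)

state.clear_on_success()          # not wedged yet: nothing is mirrored
state.mark("initialize timeout")
state.mark("a later reason")      # ignored, the first reason sticks
wedged_seen_by_heartbeat = holder.is_wedged()
state.clear_on_success()

legacy = SdkWedgeState(mirror=None)  # older runtime: local flag only
legacy.mark("initialize timeout")
```

After this sequence the holder has seen exactly one mark and one clear,
and the `legacy` state is wedged locally even though nothing was
mirrored.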

Tests cover: mark mirrors with reason, first-call-wins propagates,
clear mirrors, no-op when not wedged, ImportError-resilience.
Regression-injection-checked: silencing the mirror branch fails the
mark+first-wins tests at unit-test time with a clear message naming
the missing runtime_wedge call.
2026-05-01 17:52:24 -07:00
Hongming Wang
9eb7d7b6cd fix(executor): pass tagged server:molecule to dev-channels flag
Claude Code 2.1.x changed the flag's signature to take an *allowlist* of
tagged entries — `server:<name>` for manually-configured MCP servers,
`plugin:<name>@<marketplace>` for plugin channels. PR #25's
`{flag: None}` rendered as a bare `--<flag>` with no value, so the CLI
rejected it with `argument missing` and the SDK timed out at
`initialize`, surfacing upstream as `Control request timeout: initialize`
(caught live on workspace dd40faf8 on 2026-05-01; 100% of A2A turns
wedged).

Pass `server:molecule` so the SDK forwards
`--dangerously-load-development-channels server:molecule`. Live-verified
end-to-end: A2A returns coherent replies AND the host claude session
renders inbound canvas messages as `<channel source="molecule" ...>`
tags inline (push UX without inbox poll).

Tests: replace the unconditional `None` pin with a tagged-form pin
that asserts the exact `server:molecule` value, plus a defense-in-depth
test that pins the invariants (non-None, non-empty, contains tag
colon) so any regression to the bare-switch shape fails at unit-test
time instead of surfacing as a live SDK initialize wedge. 38/38 pass.
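
To illustrate why a `None` value produces a bare switch, here is a
simplified stand-in for flag rendering together with the three
invariants named above. `render_extra_args` and `check_channel_value`
are hypothetical helpers written for this sketch; the SDK's actual
rendering code is not reproduced here.

```python
FLAG = "dangerously-load-development-channels"


def render_extra_args(extra_args):
    """Simplified stand-in: turn a flag mapping into argv tokens."""
    argv = []
    for flag, value in extra_args.items():
        if value is None:
            argv.append(f"--{flag}")                 # bare switch, no value
        else:
            argv.extend([f"--{flag}", str(value)])   # flag followed by value
    return argv


def check_channel_value(value):
    """Defense-in-depth invariants: non-None, non-empty, contains a colon."""
    assert value is not None, "value must not be None (bare-switch shape)"
    assert value != "", "value must not be empty"
    assert ":" in value, "value must be tagged, e.g. server:<name>"


bare = render_extra_args({FLAG: None})
tagged = render_extra_args({FLAG: "server:molecule"})
check_channel_value("server:molecule")
```

Here `bare` holds only the switch, while `tagged` holds the switch
followed by `server:molecule`.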

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:15:49 -07:00
Hongming Wang
874029fca0 Revert "Merge pull request #25 from Molecule-AI/feat/forward-dev-channels-flag"
This reverts commit 4d5e85f3a0, reversing
changes made to b70aa1846b.
2026-05-01 17:02:55 -07:00
Hongming Wang
e14f33a670 feat(executor): forward --dangerously-load-development-channels to claude CLI
The wheel-side push UX gates (capability + instructions, molecule-core
PR #2463) only matter if the host claude CLI is willing to register a
non-allowlisted experimental channel. During the channels research
preview the CLI requires --dangerously-load-development-channels to
bypass its allowlist; without it, every notifications/claude/channel
fired by the inbox bridge arrives at the host and is silently dropped.

claude-agent-sdk forwards arbitrary CLI flags to the spawned subprocess
via ClaudeAgentOptions.extra_args (claude_agent_sdk/_internal/transport/
subprocess_cli.py:340). Wire the flag in unconditionally — the flag is
harmless on builds that already allowlist the capability and required
on builds during the research preview, so there is no version skew to
guard. Remove the line once channels graduate to the default allowlist.

Test pins the wiring with a stubbed ClaudeAgentOptions recorder; runs
in CI without claude_agent_sdk / a2a / molecule_runtime installed via
the same _ensure_module/_ensure_attr pattern as the existing adapter
prevalidate test, but tolerates real packages being present locally.

Verified by injection: removing the extra_args line makes the test
fail with a message naming the missing flag and citing the SDK file
that consumes it.
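
A rough sketch of the stubbed-recorder approach, under assumptions:
the body of `_ensure_module` below is a guess at the pattern rather
than the test's actual helper, and `build_options` and
`RecordingOptions` are hypothetical names. The value passed is the
tagged form from the later fix (9eb7d7b6cd); this pin only checks that
the flag key is wired.

```python
import importlib
import sys
import types

FLAG = "dangerously-load-development-channels"


def _ensure_module(name):
    """Use the real package if importable, else register an empty stub."""
    try:
        return importlib.import_module(name)
    except ImportError:
        module = types.ModuleType(name)
        sys.modules[name] = module
        return module


class RecordingOptions:
    """Stub that records the keyword arguments it was constructed with."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


def build_options():
    # Hypothetical stand-in for the executor code path under test.
    from claude_agent_sdk import ClaudeAgentOptions
    return ClaudeAgentOptions(extra_args={FLAG: "server:molecule"})


sdk = _ensure_module("claude_agent_sdk")
original = getattr(sdk, "ClaudeAgentOptions", None)
sdk.ClaudeAgentOptions = RecordingOptions  # swap in the recorder
try:
    options = build_options()
finally:
    # Restore whatever was there so a real install is left untouched.
    if original is None:
        del sdk.ClaudeAgentOptions
    else:
        sdk.ClaudeAgentOptions = original

assert FLAG in options.kwargs["extra_args"], (
    f"extra_args is missing --{FLAG}"
)
```

Removing the `extra_args` argument from `build_options` makes the
final lookup fail, which mirrors the injection check described above.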

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:26:58 -07:00
Hongming Wang
1a84de8a61 fix(a2a-v1): rewrite FilePart emit using v1 protobuf Part struct
a2a-sdk v1.0.2 replaced the v0 Pydantic discriminated-union types
(Part(root=TextPart(...))/Part(root=FilePart(file=FileWithUri(...))))
with a single protobuf Part struct that has optional `text`, `url`,
`raw`, `data`, `filename`, `media_type` fields. The classes
FilePart, TextPart, FileWithUri don't exist in v1 — import fails:

    File "claude_sdk_executor.py", line 592
        from a2a.types import FilePart, FileWithUri, Message, Part, Role, TextPart
    ImportError: cannot import name 'FilePart' from 'a2a.types'

Production impact: every claude-code workspace (Design Director, UX
Researcher, all coordinators in molecule-core teams) crashes on
result delivery whenever the response includes a /workspace/* file
reference. The A2A delegation loop is broken at the result-delivery
step. Workspaces can receive tasks but can't ship results back.

Fix:

  - Drop FilePart/TextPart/FileWithUri imports (don't exist in v1).
  - `Part(root=TextPart(text=t))` → `Part(text=t)`.
  - `Part(root=FilePart(file=FileWithUri(uri=u, name=n, mimeType=m)))` →
    `Part(url=u, filename=n, media_type=m)`.
  - `messageId=...` → `message_id=...` (snake_case in protobuf).
  - `Role.agent` → `Role.ROLE_AGENT` (v1 enum).
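
The field mapping in the list above can be sketched over plain
dictionaries, so it runs without a2a-sdk installed. The helper names
are hypothetical, and the role is kept as the string `"ROLE_AGENT"` as
a placeholder for the v1 enum; the part dictionaries are meant to be
read as the keyword arguments shown in the `Part(...)` calls above.

```python
def text_part_kwargs(text):
    """v0 Part(root=TextPart(text=t)) becomes Part(text=t)."""
    return {"text": text}


def file_part_kwargs(uri, name, mime_type):
    """v0 FileWithUri(uri, name, mimeType) maps to url/filename/media_type."""
    return {"url": uri, "filename": name, "media_type": mime_type}


def message_kwargs(message_id, parts):
    """messageId becomes message_id; Role.agent becomes ROLE_AGENT."""
    return {"message_id": message_id, "role": "ROLE_AGENT", "parts": parts}


message = message_kwargs(
    "03ff9367",
    [
        text_part_kwargs("hello"),
        file_part_kwargs("workspace:foo.txt", "foo.txt", "text/plain"),
    ],
)
```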

Verified by constructing the exact shape against v1.0.2 in the
running claude-code template image:

  Message:
    message_id: 03ff9367
    role: ROLE_AGENT
    parts count: 2
    text part: hello
    file part: workspace:foo.txt foo.txt text/plain

Refs: molecule-core memory `reference_a2a_sdk_v0_to_v1_migration`
documents the Pydantic→protobuf shift; this is the fifth migration
finding today (after the new_agent_text_message rename in
crewai/openclaw/autogen/gemini-cli).

Test plan:

  - [x] `python3 -m py_compile claude_sdk_executor.py` clean.
  - [x] Runtime construction smoke verified against the live v1.0.2
        a2a-sdk in the claude-code template image.
  - [ ] End-to-end: provision a claude-code workspace, send a task
        whose response references a /workspace/* file, confirm
        result lands without ImportError.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 00:46:47 -07:00
Hongming Wang
2b0b0d9fcd fix: migrate claude_sdk_executor to a2a-sdk 1.x (new_text_message)
Same a2a-sdk 1.x rename already shipped in hermes/executor.py and
workspace/a2a_executor.py: a2a-sdk dropped `new_agent_text_message`
in favor of `new_text_message` (role=Role.agent default preserves
behavior). Three call sites in this file.

Symptom: every claude-code workspace died at create_executor →
ImportError: cannot import name 'new_agent_text_message' from
'a2a.helpers'. Why this slipped past every prior fix:

The boot smoke gate only does `import adapter`. adapter.py imports
ClaudeSDKExecutor lazily INSIDE create_executor() (line 106),
which means claude_sdk_executor.py is never loaded at module
import time. The lazy-load pattern hid the bug from CI.

molecule-ci PR #8 (lint + import-every-app-py smoke) catches this
class of bug going forward: the new smoke loop iterates every /app/*.py
including claude_sdk_executor.py, forcing module-level imports to
resolve.
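
A minimal sketch of an import-every-file smoke loop, assuming a
hypothetical `smoke_import_all` helper; the actual molecule-ci script
is not shown in this log. The demo writes two throwaway files to a
temp directory: an adapter with a lazy import inside a function, and
an executor with a broken module-level import of a deliberately
nonexistent module.

```python
import importlib.util
import pathlib
import tempfile


def smoke_import_all(app_dir):
    """Execute every *.py in app_dir and collect import-time failures."""
    failures = {}
    for path in sorted(pathlib.Path(app_dir).glob("*.py")):
        spec = importlib.util.spec_from_file_location(f"smoke_{path.stem}", path)
        module = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(module)  # runs module-level imports
        except Exception as exc:
            failures[path.name] = f"{type(exc).__name__}: {exc}"
    return failures


with tempfile.TemporaryDirectory() as app_dir:
    app = pathlib.Path(app_dir)
    # Lazy import inside a function: importing adapter.py alone succeeds.
    (app / "adapter.py").write_text(
        "def create_executor():\n"
        "    from claude_sdk_executor import ClaudeSDKExecutor\n"
        "    return ClaudeSDKExecutor()\n"
    )
    # Broken module-level import: only visible once this file is loaded.
    (app / "claude_sdk_executor.py").write_text(
        "from nonexistent_a2a_helpers_demo import new_agent_text_message\n"
    )
    failures = smoke_import_all(app)
```

A gate that only imports `adapter.py` passes on this layout, while the
loop reports `claude_sdk_executor.py` as failing.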

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 05:55:19 -07:00
Hongming Wang
e7dea39df2 fix: qualify all bare imports of runtime modules
Five `from <runtime_module> import` statements in adapter.py +
claude_sdk_executor.py were never qualified when the template was
extracted to its own repo (#87). They worked when the runtime was
bundled into workspace/ where bare imports resolved against
sibling files; in the template repo they explode at startup with
ModuleNotFoundError as soon as Python reaches the import.

Caught by manual provision after pipeline-3 wire-real E2E. The
plugins import was the first one tripped because it sits in
adapter.setup() — earlier bare imports inside claude_sdk_executor.py
are deferred until the executor is constructed.

Pattern: any `from <X> import Y` where X is a workspace/ module ->
`from molecule_runtime.X import Y`. Fixes:
- adapter.py:97          plugins
- claude_sdk_executor.py executor_helpers, heartbeat, a2a_client, platform_auth
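
The pattern can be sketched as a small regex rewrite. This is an
illustration only: `qualify_imports` is a hypothetical helper, not the
build script's rewriter, and the imported names in the sample
(`beat`, `load`) are made up.

```python
import re

# Runtime modules named in the fix list above.
RUNTIME_MODULES = {
    "plugins", "executor_helpers", "heartbeat", "a2a_client", "platform_auth",
}

# Matches `from <single_name> import ` at the start of a (possibly indented) line.
_BARE_FROM = re.compile(r"^([ \t]*)from[ \t]+([A-Za-z_]\w*)[ \t]+import[ \t]+", re.MULTILINE)


def qualify_imports(source):
    """Prefix bare imports of runtime modules with molecule_runtime."""
    def repl(match):
        indent, module = match.group(1), match.group(2)
        if module in RUNTIME_MODULES:
            return f"{indent}from molecule_runtime.{module} import "
        return match.group(0)  # unrelated import: leave untouched
    return _BARE_FROM.sub(repl, source)


sample = (
    "from heartbeat import beat\n"
    "from a2a.types import Part\n"
    "    from plugins import load\n"
    "import os\n"
)
rewritten = qualify_imports(sample)
```

Dotted imports such as `from a2a.types import Part` do not match the
pattern, so already-qualified lines are left alone and the rewrite can
be applied more than once.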

Same class of bug as the runtime's TOP_LEVEL_MODULES drift but
inverted — instead of forgetting to rewrite imports IN the wheel,
the template authors forgot to qualify imports IN the template
code (the build script's rewriter only runs on workspace/ -> wheel).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 05:20:24 -07:00
Hongming Wang
fab7c6a929 feat(template): own claude_sdk_executor locally (universal-runtime refactor)
First half of molecule-core task #87 — move adapter-specific code out
of the universal molecule-runtime package into the template that
actually consumes it.

Adds:
  - claude_sdk_executor.py (757 LOC) — copied verbatim from
    molecule-core/workspace/claude_sdk_executor.py @ commit 186f25c2.
    The adapter at adapter.py:59 already does
    `from claude_sdk_executor import ClaudeSDKExecutor` — once this
    file lands at /app/, Python's import order picks the local copy
    over the same-named module that older molecule-runtime versions
    ship under site-packages.
  - Dockerfile: COPY claude_sdk_executor.py . alongside adapter.py.

Purely additive at this stage: molecule-runtime still ships the
file too, so any image built from this template just has two copies
on disk (local /app shadows the site-packages one). No behavior
change.
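
The shadowing can be demonstrated with two throwaway directories. The
sketch assumes the local directory sits ahead of site-packages on
`sys.path`, which is how Python treats the directory of the script
being run; the module name `shadow_demo_executor` and the directory
names are made up for the demo.

```python
import importlib
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    app = pathlib.Path(tmp, "app")
    site = pathlib.Path(tmp, "site-packages")
    # Same-named module in both places, tagged with where it came from.
    for directory, origin in ((app, "template"), (site, "runtime")):
        directory.mkdir()
        (directory / "shadow_demo_executor.py").write_text(
            f"ORIGIN = {origin!r}\n"
        )

    sys.path[:0] = [str(app), str(site)]  # local dir searched first
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("shadow_demo_executor")
        picked = module.ORIGIN
    finally:
        # Undo the demo's changes to interpreter state.
        del sys.path[:2]
        sys.modules.pop("shadow_demo_executor", None)
```

With this ordering `picked` is `"template"`: the first match on
`sys.path` wins and the second copy is never loaded.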

Sequencing (the molecule-core PR follows AFTER this image rebuilds):
  1. THIS PR: template gets local copy, image rebuilds with it
     (safe because no removal yet)
  2. molecule-core PR — drop workspace/claude_sdk_executor.py, bump
     molecule-ai-workspace-runtime PyPI version. Templates that
     haven't pulled the new runtime version still work because their
     local copy is unchanged.
  3. (later) Bump requirements.txt pin in this template once the
     new runtime version is on PyPI, so future builds explicitly
     install the slimmed runtime.

Why local-copy-first:
  - Reverse order (drop from runtime first, then add to template)
    creates a window where any template image build pulling the
    latest runtime would fail to import claude_sdk_executor.
  - This order has zero downtime: every intermediate state is valid.

Validates the capability primitives shipped in molecule-core PRs
#2137-#2144 — once this template image rebuilds and the molecule-
core deletion lands, the claude-code workspace is the FIRST adapter
to live entirely outside molecule-runtime, with native_session +
idle_timeout_override declared via capabilities() (PR #12 here).

Source: molecule-core/workspace/claude_sdk_executor.py @ 186f25c2
(commit hash pinned for traceability of any future divergence).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 23:58:05 -07:00