forked from molecule-ai/molecule-core
76c604fb4f · 126 Commits

5ad41f63ce
Merge pull request #2438 from Molecule-AI/fix/runtime-config-model-fallback-clean
fix(config): runtime_config.model falls back to top-level model

0070d0bd59
fix(config): runtime_config.model falls back to top-level model

External feedback (2026-04-30): "Provisioner doesn't read model from config.yaml and doesn't set MODEL env var. Without MODEL, the adapter defaults to sonnet and bypasses the mimo routing." Confirmed accurate for SaaS workspaces.

Trace: claude-code-default/adapter.py reads `runtime_config.model or "sonnet"` (hermes reads HERMES_DEFAULT_MODEL via install.sh, which IS plumbed). For claude-code there's nothing — workspace/config.py loaded `runtime_config.model` only from YAML, ignoring the MODEL_PROVIDER env. The CP user-data script regenerates /configs/config.yaml at every boot with only the `name`, `runtime`, and `a2a` keys (intentionally minimal so it doesn't carry stale state) — so any user-set runtime_config.model is wiped on every restart, and the adapter falls back to "sonnet" even when the user picked Opus in the canvas Config tab.

Fix: when YAML omits runtime_config.model, fall back to the top-level resolved `model`, which already honors the MODEL_PROVIDER env override. A one-line fix in workspace/config.py. Now MODEL_PROVIDER → top-level model → runtime_config.model → the adapter sees the user's selection. Sticky across CP-driven restarts; the canvas Save+Restart loop works as intended for every runtime, not just hermes.

Tests:
- test_runtime_config_model_falls_back_to_top_level — top-level set, runtime_config empty → fallback wins
- test_runtime_config_model_yaml_wins_over_top_level — YAML explicit → fallback skipped (precedence)
- test_runtime_config_model_picks_up_env_via_top_level — full canvas Save+Restart simulation: env → top-level → runtime_config.model

Negative-control verified: removing the `or model` flips both fallback tests red with the expected "" vs expected-model mismatch; restoring flips them green. The yaml-wins test passes either way (correctly, because precedence is preserved).

Replaces closed PR #2435 — that PR's commit was on a contaminated branch and accidentally captured unrelated WIP changes (build script + a2a_mcp_server refactor) instead of this fix. Self-review caught it and closed the PR. This branch is clean off main + diff verified before push.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
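
A minimal sketch of the precedence chain described above. The `resolve_models` helper and the flat YAML layout are assumptions for illustration; the shipped change is one line in workspace/config.py:

```python
import os

import yaml

def resolve_models(config_path: str) -> dict:
    """Hypothetical loader showing the fallback order:
    MODEL_PROVIDER env -> top-level model -> runtime_config.model."""
    with open(config_path) as f:
        raw = yaml.safe_load(f) or {}

    # The top-level model already honors the MODEL_PROVIDER env override.
    model = os.environ.get("MODEL_PROVIDER") or raw.get("model") or ""

    runtime_config = dict(raw.get("runtime_config") or {})
    # The one-line fix: when YAML omits runtime_config.model, fall back
    # to the resolved top-level model instead of the adapter's default.
    runtime_config["model"] = runtime_config.get("model") or model

    return {"model": model, "runtime_config": runtime_config}
```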

0acdf3bb56
fix(wheel): import inbox without alias to dodge rewriter collision

PR #2433 (notifications/claude/channel) shipped 'import inbox as _inbox_module' inside a2a_mcp_server.py:main(). The build script's import rewriter expands plain 'import inbox' to 'import molecule_runtime.inbox as inbox', so the original source became 'import molecule_runtime.inbox as inbox as _inbox_module', which is invalid Python. Caught at the publish-runtime + PR-built-wheel-smoke gate (the SyntaxError trace is in run 25200422679). The wheel didn't ship to PyPI because publish-runtime's smoke-import step refused to install it, but staging is currently sitting on a broken-build commit until this fix-forward lands.

Changes:
- a2a_mcp_server.py: lift `import inbox` to the top of the file (the rewriter produces a clean `import molecule_runtime.inbox as inbox`), call inbox.set_notification_callback directly in main().
- build_runtime_package.py: rewrite_imports() now raises ValueError when it sees 'import X as Y' for any X in the workspace allowlist, instead of silently producing a syntax-error wheel. The operator gets a clear, actionable error at build time pointing at the offending line plus suggested rewrites ('from X import …' or plain 'import X').

The build-time gate (this PR's rewriter check) catches the regression class earlier than the smoke-time gate (PR #2433's failure). Adding 'PR-built wheel + import smoke' to staging branch protection's required checks is filed separately so this class doesn't merge again.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
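
An illustrative version of the build-time guard. The allowlist contents and the function shape are assumptions; the real check lives in scripts/build_runtime_package.py:

```python
import re

# Hypothetical allowlist; the real script derives it from workspace/.
WORKSPACE_MODULES = {"inbox", "a2a_client", "platform_auth"}

_ALIAS_IMPORT = re.compile(r"^\s*import\s+(\w+)\s+as\s+\w+")
_PLAIN_IMPORT = re.compile(
    r"^(\s*)import (" + "|".join(sorted(WORKSPACE_MODULES)) + r")\b(?!\s+as)"
)

def rewrite_imports(source: str, filename: str) -> str:
    out = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        m = _ALIAS_IMPORT.match(line)
        if m and m.group(1) in WORKSPACE_MODULES:
            # Rewriting 'import X as Y' would yield 'import pkg.X as X as Y',
            # invalid Python, so fail the build with a pointer instead.
            raise ValueError(
                f"{filename}:{lineno}: cannot rewrite 'import {m.group(1)} as ...'. "
                f"Use 'from {m.group(1)} import ...' or plain 'import {m.group(1)}'."
            )
        out.append(_PLAIN_IMPORT.sub(r"\1import molecule_runtime.\2 as \2", line))
    return "\n".join(out)
```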

0a3ec53f34
feat(mcp): notifications/claude/channel for push-feel inbox UX

Adds a notification seam to the universal molecule-mcp wheel so push-notification-capable MCP hosts (Claude Code today; any compliant client tomorrow) get inbound A2A messages as conversation interrupts instead of having to poll wait_for_message / inbox_peek.

Wire-up:
- inbox.py: module-level _NOTIFICATION_CALLBACK + set_notification_callback(). Fires from InboxState.record() AFTER lock release, with the same dict shape inbox_peek returns. Best-effort — a raising callback never prevents the message from landing in the queue.
- a2a_mcp_server.py: _build_channel_notification() pure helper + bridge wiring in main() that schedules notifications via asyncio.run_coroutine_threadsafe (the poller is a daemon thread, the MCP loop is asyncio).
- The method name 'notifications/claude/channel' matches the contract documented in molecule-mcp-claude-channel/server.ts:509.
- wheel_smoke.py: pin set_notification_callback as a published name, same regression class as the 0.1.16 main_sync incident.

Pollers (wait_for_message / inbox_peek) keep working unchanged for runtimes without notification support.

Tests: 6 new in test_inbox.py (callback fires once on record, dedupe short-circuits before fire, raising cb doesn't break inbox, set/clear semantics), 5 new in test_a2a_mcp_server.py (method name pin, content mapping, meta routing, no-id JSON-RPC notification spec, missing-field tolerance). All 59 combined tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
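
A minimal sketch of the callback seam (names follow the commit; the queue internals and locking are simplified assumptions):

```python
import threading
from typing import Callable, Optional

_NOTIFICATION_CALLBACK: Optional[Callable[[dict], None]] = None

def set_notification_callback(cb: Optional[Callable[[dict], None]]) -> None:
    global _NOTIFICATION_CALLBACK
    _NOTIFICATION_CALLBACK = cb

class InboxState:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._messages: list[dict] = []

    def record(self, message: dict) -> None:
        with self._lock:
            self._messages.append(message)  # the message lands first, under the lock
        # Fire AFTER lock release: a slow or raising callback can neither
        # block the poller thread nor prevent the enqueue.
        cb = _NOTIFICATION_CALLBACK
        if cb is not None:
            try:
                cb(message)
            except Exception:
                pass  # best-effort; notification failure never breaks the inbox
```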

c4bb803329
feat(mcp_cli): agent_card from env vars (capability discovery)

External molecule-mcp runtimes register with hardcoded agent_card.name
= molecule-mcp-{id[:8]} and skills=[]. That made every external
workspace look identical on the canvas and gave peer agents calling
list_peers no signal beyond name — they had to guess capabilities.
Three new env vars let the operator declare identity + capabilities
without code changes:
* MOLECULE_AGENT_NAME — display name on canvas (default unchanged)
* MOLECULE_AGENT_DESCRIPTION — one-line description (default empty)
* MOLECULE_AGENT_SKILLS — comma-separated skill names
Comma-separated skills get expanded to {"name": "..."} objects — the
minimum shape that satisfies both shared_runtime.summarize_peers
(reads s["name"]) AND canvas SkillsTab.tsx (id falls back to name).
Strict-superset behaviour: when no env vars are set, agent_card
matches the previous hardcoded value exactly. No regression for
operators who haven't migrated.
Why this matters end-to-end:
* Canvas Skills tab now shows each declared skill as a chip
* Peer agents calling list_peers see {name, skills} per peer and
can route delegations to the right specialist
* Same applies to the canvas Details tab + workspace card hover
Tests cover: defaults match prior behaviour; name override; CSV →
skill objects; whitespace stripping + empty entries dropped;
description omitted when unset (keeps wire payload minimal);
whitespace-only name falls back to default; end-to-end through
_platform_register's payload.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
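
A sketch of the env-driven card construction. The env var names and the `{"name": ...}` skill shape come from the commit; the helper itself is hypothetical:

```python
import os

def _agent_card_from_env(workspace_id: str) -> dict:
    """Build the agent_card from operator-supplied env vars, falling back
    to the previous hardcoded values when nothing is set."""
    default_name = f"molecule-mcp-{workspace_id[:8]}"
    name = os.environ.get("MOLECULE_AGENT_NAME", "").strip() or default_name
    description = os.environ.get("MOLECULE_AGENT_DESCRIPTION", "").strip()
    skills_csv = os.environ.get("MOLECULE_AGENT_SKILLS", "")

    # CSV -> [{"name": ...}]: the minimum shape both summarize_peers
    # (reads s["name"]) and the canvas SkillsTab accept. Whitespace is
    # stripped and empty entries dropped.
    skills = [{"name": s.strip()} for s in skills_csv.split(",") if s.strip()]

    card = {"name": name, "skills": skills}
    if description:  # omitted when unset to keep the wire payload minimal
        card["description"] = description
    return card
```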

210f6e066a
Merge pull request #2424 from Molecule-AI/fix/in-container-heartbeat-persists-inbound-secret
fix(workspace): in-container heartbeat persists platform_inbound_secret

d887ce8e96
fix(mcp_cli): escalate consecutive heartbeat 401s with re-onboard guidance

The universal molecule-mcp wheel runs in a daemon thread, posting
/registry/heartbeat every 20s. When the workspace gets deleted
server-side (DELETE /workspaces/:id), the platform revokes all tokens
for that workspace. Previous behaviour: heartbeat would 401 forever,
log at WARNING per tick, no actionable signal anywhere.
Failure mode hit on hongmingwang tenant 2026-04-30: workspace
a1771dba was deleted at some prior time, the channel-bridge .env
still pointed at it, MCP tools 401-ed silently with the operator
having no idea why. The register-time path at mcp_cli.py:104-111
already does loud + actionable for 401 (sys.exit(3) with regenerate-
from-canvas-Tokens text) — extend the same pattern to the heartbeat.
Behaviour:
* count < 3: WARNING per tick (could be transient blip)
* count == 3: ERROR with re-onboard instructions, names the dead
workspace_id, points at the canvas Tokens tab
* count > 3 and every 20 ticks (~7 min): re-log ERROR so a session
that started after the first ERROR still catches it
5xx and other non-auth HTTP errors do NOT increment the auth-failure
counter — that would mislead the operator (e.g. a server blip would
trigger "token revoked" when the token is fine).
Tests cover: single 401 stays at WARNING; 3 consecutive 401s escalate
to ERROR with the right keywords; 403 treated identically; recovery
via 200 resets the counter; 5xx never triggers the auth path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
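
An illustrative version of the escalation policy. Thresholds follow the commit text; the class wrapper and log wording are assumptions:

```python
import logging

log = logging.getLogger("molecule_mcp.heartbeat")

_ESCALATE_AT = 3
_RELOG_EVERY = 20  # ticks (~7 min at the 20s heartbeat interval)

class AuthFailureTracker:
    def __init__(self, workspace_id: str) -> None:
        self.workspace_id = workspace_id
        self.count = 0

    def observe(self, status_code: int) -> None:
        if status_code in (401, 403):
            self.count += 1
            self._log()
        elif 200 <= status_code < 300:
            self.count = 0  # recovery resets the escalation
        # 5xx and other non-auth errors neither increment nor reset; a
        # server blip must not masquerade as a revoked token.

    def _log(self) -> None:
        if self.count < _ESCALATE_AT:
            log.warning("heartbeat auth failure #%d for workspace %s "
                        "(may be transient)", self.count, self.workspace_id)
        elif (self.count - _ESCALATE_AT) % _RELOG_EVERY == 0:
            # Fires at the 3rd failure, then every 20 ticks so a session
            # that started after the first ERROR still catches it.
            log.error("heartbeat auth failure #%d for workspace %s; token "
                      "likely revoked (workspace deleted?). Regenerate a "
                      "token from the canvas Tokens tab and re-onboard.",
                      self.count, self.workspace_id)
        else:
            log.warning("heartbeat auth failure #%d for workspace %s",
                        self.count, self.workspace_id)
```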

98845c8f42
fix(workspace): in-container heartbeat persists platform_inbound_secret

Follow-up to PR #2421. The standalone wrapper (mcp_cli.py) got heartbeat-time secret persistence in #2421, but the in-container heartbeat (workspace/heartbeat.py) was missed — and that's the path every workspace EC2 actually runs. Result: the hongmingwang Claude Code agent stayed 401-forever on chat upload after this morning's deploy because the workspace's runtime never picked up the lazy-healed secret.

The in-container _loop now captures the heartbeat response and calls the same _persist_inbound_secret_from_heartbeat helper used by the standalone path, on both the first POST and the 401-retry POST. Defensive on every error (non-JSON, non-dict, empty, save failure) — the liveness contract trumps secret persistence.

Tests pin: happy path, absent secret, empty string, non-JSON body, non-dict body, save_inbound_secret OSError, end-to-end loop.

c733454a56
Merge pull request #2421 from Molecule-AI/fix/heartbeat-delivers-inbound-secret
fix(workspace): deliver platform_inbound_secret on every heartbeat

993f8c494e
refactor(workspace-runtime): send_a2a_message takes peer_id, validates UUID

Two cleanups stacked on PR #2418:

1. Refactor `send_a2a_message(target_url, msg)` → `send_a2a_message(peer_id, msg)`. After #2418 every caller passes `${PLATFORM_URL}/workspaces/{peer_id}/a2a` — the function's parameter pretended to accept arbitrary URLs but in practice only one shape is meaningful. Owning URL construction inside the function makes the contract honest and centralises the peer-id validation introduced below.

2. Add a `_validate_peer_id` UUID-shape check at the trust boundary. `discover_peer` and `send_a2a_message` are the entry points where agent-controlled strings flow into URL paths; rejecting non-UUID input at this layer eliminates the URL-interpolation class of bug (`workspace_id="../admin"` etc.) regardless of how the rest of the codebase interpolates ids elsewhere. Auth was already gating malicious access — this is consistency + clear failure over silent platform 4xx.

In-container tests cover positive UUIDs, malformed input (`"ws-abc"`, `"../admin"`, empty), and the contract that `tool_delegate_task` hands the peer_id to `send_a2a_message` without building URLs itself.

Live-verified: external delegation 8dad3e29 → 97ac32e9 returned "refactor verified" from Claude Code Agent through the refactored code; `_validate_peer_id` rejects `"ws-abc"` and `"../admin"` and accepts canonical UUIDs.

Stacked on PR #2418 (proxy-routing fix). Will rebase onto staging once #2418 merges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
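
A sketch of the trust-boundary check and the URL ownership it enables. Function names match the commit; the bodies (and the omitted auth headers) are assumptions:

```python
import os
import uuid

import httpx

PLATFORM_URL = os.environ.get("PLATFORM_URL", "http://localhost:8080")

def _validate_peer_id(peer_id: str) -> str:
    # uuid.UUID is lenient (it accepts braces and urn: prefixes), so
    # requiring the round-trip to match pins the canonical dashed form.
    try:
        parsed = uuid.UUID(peer_id)
    except (TypeError, ValueError):
        raise ValueError(f"peer_id must be a canonical UUID, got {peer_id!r}")
    if str(parsed) != peer_id.lower():
        raise ValueError(f"peer_id must be a canonical UUID, got {peer_id!r}")
    return peer_id

def send_a2a_message(peer_id: str, msg: dict) -> httpx.Response:
    # URL construction lives here: callers can no longer interpolate
    # agent-controlled strings into arbitrary URL shapes.
    peer_id = _validate_peer_id(peer_id)
    url = f"{PLATFORM_URL}/workspaces/{peer_id}/a2a"
    return httpx.post(url, json=msg, timeout=30.0)
```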

a5c5139e3a
fix(workspace): deliver platform_inbound_secret on every heartbeat

Heartbeat now echoes the workspace's platform_inbound_secret on every
beat (mirroring /registry/register), and the molecule-mcp client
persists it to /configs/.platform_inbound_secret on receipt.
Symptom (2026-04-30, hongmingwang tenant): chat upload returned 503
"workspace will pick it up on its next heartbeat" and then 401 on
retry — permanent until workspace restart. The 503 message was a lie:
heartbeat used to discard the platform_inbound_secret entirely; only
register delivered it, and register fires once at startup.
Server (Go):
- Heartbeat handler reuses readOrLazyHealInboundSecret (the same
helper chat_files + register use), so heartbeat-time recovery
covers the rotate / mid-life NULL-column case the existing
register-time heal can't reach.
- Failure is non-fatal: liveness contract trumps secret delivery,
chat_files retries lazy-heal on its own next request.
Client (Python):
- _persist_inbound_secret_from_heartbeat parses the heartbeat 200
response and persists via platform_inbound_auth.save_inbound_secret.
- All exceptions swallowed — heartbeat liveness > secret persistence;
next tick (≤20s) retries.
Tests:
- Server: pin secret-present, lazy-heal-mint-on-NULL, and heal-
failure-omits-field branches.
- Client: pin persist-on-200, skip-on-empty, skip-on-non-dict-body,
skip-on-401, swallow-save-OSError.
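
A sketch of the client-side persist path. The helper name comes from the commit; the response shape is assumed, and the save function is injected here to keep the sketch self-contained (in the runtime it is platform_inbound_auth.save_inbound_secret):

```python
import logging

log = logging.getLogger("workspace.heartbeat")

def _persist_inbound_secret_from_heartbeat(response, save_inbound_secret) -> None:
    """Best-effort: parse the heartbeat 200 body and persist the secret.

    Every failure is swallowed; heartbeat liveness outranks secret
    persistence, and the next tick (<=20s) retries anyway.
    """
    try:
        if response.status_code != 200:
            return
        body = response.json()                 # may raise on a non-JSON body
        if not isinstance(body, dict):
            return
        secret = body.get("platform_inbound_secret")
        if not secret:
            return
        save_inbound_secret(secret)            # atomic 0600 write (see PR-F further down)
    except Exception:
        log.debug("heartbeat secret persistence skipped", exc_info=True)
```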

aefb44aff2
fix(workspace-runtime): route delegate_task through platform A2A proxy

tool_delegate_task was POSTing directly to peer["url"], which is the Docker-internal hostname (e.g. http://ws-X-Y:8000) for in-container peers. External callers — the standalone molecule-mcp wrapper running on an operator's laptop — get [Errno 8] nodename nor servname on every single delegation, breaking the universal-MCP path's last "ride the same code as in-container" claim.

The platform's /workspaces/:peer-id/a2a proxy endpoint already handles internal forwarding for in-container peers AND is the only path external runtimes can use. Unify on it: in-container callers pay one extra HTTP hop on the same Docker bridge (microseconds); external callers get a working delegation path for the first time. discover_peer is still called for access-control + online-status detection — only the routing target changes.

Verified live on 2026-04-30 against workspace 8dad3e29 (external mac runtime) → 97ac32e9 (Claude Code Agent in-container): direct POST returned ConnectError, proxy POST returned "acknowledged from claude code agent" as requested.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

d061642cfc
test(inbox): bind side-effecting pop() before assert

CodeQL flagged the bare `assert state.pop(...) is None` — under `python -O` asserts are stripped, which would skip the call entirely and the test would silently pass without exercising the code. Bind the result first so the call always runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
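
The pattern in miniature (the fake state object is illustrative, not the real inbox class):

```python
class _FakeInboxState:
    """Stand-in for the real inbox state; pop() returns None on unknown ids."""
    def __init__(self) -> None:
        self._items: dict[str, dict] = {}

    def pop(self, activity_id: str):
        return self._items.pop(activity_id, None)

def test_pop_unknown_id_returns_none():
    state = _FakeInboxState()
    # Anti-pattern: `assert state.pop("x") is None` disappears entirely
    # under `python -O`. Binding first guarantees the side effect runs.
    result = state.pop("unknown-activity")
    assert result is None
```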

b47d4ceb00
feat(workspace-runtime): add inbox polling for standalone molecule-mcp path

The universal MCP server (a2a_mcp_server.py) was outbound-only — agents
in standalone runtimes (Claude Code, hermes, codex, etc.) could
delegate, list peers, and write memories, but never observed the
canvas-user or peer-agent messages addressed to them. This blocked
"constantly responding" loops without forcing operators back onto a
runtime-specific channel plugin.
This PR closes the inbound gap with a poller-fed in-memory queue and
three new MCP tools:
- wait_for_message(timeout_secs?) — block until next message arrives
- inbox_peek(limit?) — list pending messages (non-destructive)
- inbox_pop(activity_id) — drop a handled message
A daemon thread polls /workspaces/:id/activity?type=a2a_receive every
5s, fills the queue from the cursor (since_id), and persists the cursor
to ${CONFIGS_DIR}/.mcp_inbox_cursor so a restart doesn't replay backlog.
On 410 (cursor pruned) we fall back to since_secs=600 for a bounded
recovery window. Activity-row → InboxMessage extraction mirrors the
molecule-mcp-claude-channel plugin's extractText (envelope shapes #1-3
+ summary fallback).
mcp_cli.main starts the poller alongside the existing register +
heartbeat threads. In-container runtimes (which have push delivery via
canvas WebSocket) skip activation, so inbox tools return an
informational "(inbox not enabled)" message instead of double-delivery.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
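
A sketch of the cursor-persisting poll step. The endpoint, query params, cursor file name, and 410 fallback follow the commit text; everything else is illustrative:

```python
import os
from pathlib import Path

import httpx

CONFIGS_DIR = Path(os.environ.get("CONFIGS_DIR", "/configs"))
CURSOR_FILE = CONFIGS_DIR / ".mcp_inbox_cursor"

def _load_cursor() -> str | None:
    try:
        return CURSOR_FILE.read_text().strip() or None
    except OSError:
        return None

def poll_once(workspace_id: str, base_url: str, enqueue) -> None:
    params = {"type": "a2a_receive"}
    cursor = _load_cursor()
    if cursor:
        params["since_id"] = cursor
    url = f"{base_url}/workspaces/{workspace_id}/activity"
    resp = httpx.get(url, params=params)
    if resp.status_code == 410:
        # Cursor pruned server-side: bounded recovery window, not a full replay.
        params.pop("since_id", None)
        params["since_secs"] = 600
        resp = httpx.get(url, params=params)
    resp.raise_for_status()
    rows = resp.json()
    for row in rows:
        enqueue(row)                                  # fill the in-memory queue
    if rows:
        CURSOR_FILE.write_text(str(rows[-1]["id"]))   # restart-safe cursor
```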

b54ceb799f
fix: address 5-axis review findings on PR #2413

Critical:
- ExternalConnectModal.tsx: filledUniversalMcp substitution searched for WORKSPACE_AUTH_TOKEN but the snippet's placeholder is now MOLECULE_WORKSPACE_TOKEN (changed in the previous polish commit

427300f3a4
feat: make molecule-mcp standalone (built-in register + heartbeat) + recover awaiting_agent on heartbeat

Two paired fixes that together let an external operator run a single
process (molecule-mcp) and see their workspace come up online in the
canvas — the bug surfaced live when status stuck at "awaiting_agent /
OFFLINE" despite an active MCP server.
Platform side (workspace-server/internal/handlers/registry.go):
Heartbeat handler already auto-recovers offline → online and
provisioning → online, but NOT awaiting_agent → online. Healthsweep
flips stale-heartbeat external workspaces TO awaiting_agent, and
with no recovery path the workspace stays "OFFLINE — Restart" in the
canvas forever. Add the symmetric branch: if currentStatus ==
"awaiting_agent" and a heartbeat arrives, flip to online + broadcast
WORKSPACE_ONLINE. Mirrors the existing offline/provisioning patterns
exactly. Test: TestHeartbeatHandler_AwaitingAgentToOnline asserts
the SQL UPDATE fires with the awaiting_agent guard clause.
Wheel side (workspace/mcp_cli.py):
molecule-mcp was outbound-only — operators had to run a separate
SDK process to register + heartbeat. Now mcp_cli.main():
1. Calls /registry/register at startup (idempotent upsert flips
status awaiting_agent → online via the existing register path).
2. Spawns a daemon thread that POSTs /registry/heartbeat every
20s. 20s is comfortably under the healthsweep stale window so
a single missed beat doesn't cause status churn.
3. Runs the MCP stdio loop in the foreground.
Both calls set Origin: ${PLATFORM_URL} so the SaaS edge WAF accepts
them. Threaded heartbeat (not asyncio) chosen because it doesn't
need to share an event loop with the MCP stdio server — daemon=True
cleanly dies when the operator's runtime exits.
MOLECULE_MCP_DISABLE_HEARTBEAT=1 escape hatch lets in-container
callers (which have heartbeat.py running already) reuse the entry
point without double-heartbeating. Default is enabled.
End-to-end verification (live, against
hongmingwang.moleculesai.app, workspace 8dad3e29-...):
pre-fix: status=awaiting_agent → canvas shows OFFLINE forever
post-fix: ran `molecule-mcp` for 5s standalone → canvas state:
status=online runtime=external agent=molecule-mcp-8dad3e29
Test coverage: 7 new mcp_cli tests (register-at-startup, heartbeat-
thread-spawned, disable-env-skips-both, env-and-file token resolution,
register payload shape, heartbeat endpoint + headers); 1 new platform
test (awaiting_agent → online recovery). Full workspace + handlers
suites green: 1355 Python, full Go handlers passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
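
A sketch of the wheel-side entry point. Endpoint paths and env names come from the commit; payload shapes and the helper bodies are assumptions:

```python
import os
import threading
import time

import httpx

PLATFORM_URL = os.environ["PLATFORM_URL"]
WORKSPACE_ID = os.environ["WORKSPACE_ID"]
HEARTBEAT_SECS = 20  # comfortably under the healthsweep stale window

def _headers() -> dict:
    token = os.environ.get("MOLECULE_WORKSPACE_TOKEN", "")
    return {
        "Authorization": f"Bearer {token}",
        "Origin": PLATFORM_URL,  # the SaaS edge WAF requires a same-origin header
    }

def _heartbeat_loop() -> None:
    while True:
        try:
            httpx.post(f"{PLATFORM_URL}/registry/heartbeat",
                       json={"workspace_id": WORKSPACE_ID},
                       headers=_headers(), timeout=10.0)
        except httpx.HTTPError:
            pass  # the next beat retries; a single miss doesn't churn status
        time.sleep(HEARTBEAT_SECS)

def main() -> None:
    # Idempotent upsert: flips status awaiting_agent -> online server-side.
    httpx.post(f"{PLATFORM_URL}/registry/register",
               json={"workspace_id": WORKSPACE_ID}, headers=_headers())
    if os.environ.get("MOLECULE_MCP_DISABLE_HEARTBEAT") != "1":
        # daemon=True: the thread dies cleanly when the operator's runtime exits.
        threading.Thread(target=_heartbeat_loop, daemon=True).start()
    # ... run the MCP stdio server in the foreground ...
```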

74c5e0d7a8
fix(workspace-runtime): add Origin header so SaaS edge WAF accepts MCP tool calls

Discovered while smoke-testing the molecule-mcp external-runtime path
against a live tenant (hongmingwang.moleculesai.app). Every tool call
that hit /workspaces/* or /registry/*/peers returned 404 — but
/registry/register and /registry/heartbeat returned 200. Diagnosis:
the tenant's edge WAF requires a same-origin header. Without it,
unhandled paths get silently rewritten to the canvas Next.js app,
which has no /workspaces or /registry/:id/peers route and returns an
empty 404. The molecule-mcp-claude-channel plugin already sets this
header (server.ts:271-276); the workspace runtime never did because
in-container PLATFORM_URLs (Docker network) aren't behind the WAF.
Fix: extend platform_auth.auth_headers() to include
Origin: ${PLATFORM_URL} whenever PLATFORM_URL is set. Inside-container
behavior is unchanged (the WAF is path-irrelevant for the internal
hostnames). External-runtime calls now thread the WAF correctly.
Verification (live, against a freshly-registered external workspace):
pre-fix: get_workspace_info → "not found", list_peers → 404
post-fix: get_workspace_info → full workspace JSON,
list_peers → "Claude Code Agent (ID: 97ac32e9..., status: online)"
This is the kind of bug unit tests can never catch — caught only by
running the wheel against the real tenant. Memory:
feedback_always_run_e2e.md.
Test coverage: 4 new tests in test_platform_auth.py — Origin alone
when no token + Origin + Authorization both, no-PLATFORM_URL falls
through to original empty-dict behavior, env-token path with Origin.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
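
A sketch of the extended helper. The function name is from the commit; the body is an assumption:

```python
import os

def auth_headers(token: str | None = None) -> dict:
    """Bearer auth plus Origin whenever PLATFORM_URL is set."""
    headers: dict[str, str] = {}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    platform_url = os.environ.get("PLATFORM_URL", "").strip()
    if platform_url:
        # In-container PLATFORM_URLs (Docker hostnames) aren't behind the
        # WAF, so the extra header is harmless there and required outside.
        headers["Origin"] = platform_url
    return headers
```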

169e284d57
feat(workspace-runtime): expose universal MCP server to runtime=external operators

Ship the baseline universal MCP path that any external runtime (Claude
Code, hermes, codex, anything that speaks MCP stdio) can use, before
optimizing per-runtime channels. Today the workspace MCP server only
spins up inside the container; external operators have no way to call
the 8 platform tools (delegate_task, list_peers, send_message_to_user,
commit_memory, etc.) from outside.
Three additive changes:
1. **`platform_auth.get_token()` env-var fallback** — adds
`MOLECULE_WORKSPACE_TOKEN` as a fallback when no
`${CONFIGS_DIR}/.auth_token` file exists. File-first preserves
in-container behavior unchanged. External operators (no /configs
volume) now have a way to supply the token without faking the
filesystem layout.
2. **`molecule-mcp` console script** — adds a new entry point in the
published `molecule-ai-workspace-runtime` PyPI wheel. Operators run
`pip install molecule-ai-workspace-runtime`, set 3 env vars
(WORKSPACE_ID, PLATFORM_URL, MOLECULE_WORKSPACE_TOKEN), and register
the binary in their agent's MCP config. `mcp_cli.main` is a thin
validator wrapper — it checks env BEFORE importing the heavy
`a2a_mcp_server` module so a misconfigured first-run gets a friendly
3-line error instead of a 20-line module-level RuntimeError
traceback.
3. **Wheel smoke gate** — extends `scripts/wheel_smoke.py` to assert
`cli_main` and `mcp_cli.main` are importable. Same regression class
as the 0.1.16 main_sync incident: a silent rename or unrewritten
import here would break every external operator on the next wheel
publish (memory: feedback_runtime_publish_pipeline_gates.md).
Test coverage:
- `tests/test_platform_auth.py` — 8 new tests for the env-var fallback:
file-priority, env-fallback, whitespace handling, cache, header
construction, empty-env-as-unset.
- `tests/test_mcp_cli.py` — 8 new tests for the validator: each
required var separately, file-or-env satisfies token requirement,
whitespace-only env treated as missing, help mentions canvas Tokens
tab.
- Full `workspace/tests/` suite green: 1346 passed, 1 skipped.
- Local end-to-end: built wheel, installed in venv, ran `molecule-mcp`
with no env → friendly error; with env → MCP server starts.
Why now / why this shape: user redirect was "support the baseline
first so all runtimes can use, then optimize". A claude-only MCP
channel leaves hermes/codex/third-party operators broken on
runtime=external. This PR ships the runtime-agnostic baseline; per-
runtime polish (claude-channel push delivery, hermes-native
bindings) is a follow-up PR. PR #2412 fixed the partner bug where
canvas Restart silently revoked the operator's token — the two
together unblock the external-runtime story end-to-end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
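
A sketch of the file-first, env-fallback token resolution (order per the commit; the real platform_auth.get_token also caches, which is omitted here):

```python
import os
from pathlib import Path

CONFIGS_DIR = Path(os.environ.get("CONFIGS_DIR", "/configs"))

def get_token() -> str | None:
    token_file = CONFIGS_DIR / ".auth_token"
    try:
        file_token = token_file.read_text().strip()
        if file_token:
            return file_token  # in-container behavior unchanged
    except OSError:
        pass
    # External operators (no /configs volume) supply the token via env.
    env_token = os.environ.get("MOLECULE_WORKSPACE_TOKEN", "").strip()
    return env_token or None  # whitespace-only env treated as unset
```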

3b34dfefbc
feat(workspace): surface peer-discovery failure reason instead of "may be isolated"

Closes #2397. Today, every empty-peer condition (true empty, 401/403, 404, 5xx, network) collapses to a single message: "No peers available (this workspace may be isolated)". The user has no way to tell whether they need to provision more workspaces (true isolation), restart the workspace (auth), re-register (404), page on-call (5xx), or check the network (timeout) — five different operator actions, one ambiguous string.

Wire:
- new helper get_peers_with_diagnostic() in a2a_client.py returns (peers, error_summary). error_summary is None on 200; a short actionable string on every other branch.
- get_peers() now shims through it so non-tool callers (system-prompt formatters) keep the bare-list contract.
- tool_list_peers() switches to the diagnostic helper and surfaces the actual reason. The "may be isolated" string is removed; true empty now reads "no peers in the platform registry."

Tests:
- TestGetPeersWithDiagnostic: 200, 200-empty, 401, 403, 404, 5xx, network exception, 200-but-non-list-body, and the bare-list-shim regression guard.
- TestToolListPeers: each diagnostic branch surfaces its reason + an explicit assertion that "may be isolated" is gone.

Coverage 91.53% (floor 86%). 122 a2a tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
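
A sketch of the diagnostic split. Function names follow the commit; the branch wording is illustrative, not the shipped strings:

```python
import httpx

def get_peers_with_diagnostic(client: httpx.Client, url: str):
    """Return (peers, error_summary); error_summary is None on a healthy 200."""
    try:
        resp = client.get(url)
    except httpx.HTTPError as exc:
        return [], f"network error reaching the platform: {exc!r}"
    if resp.status_code in (401, 403):
        return [], "auth rejected; restart the workspace to re-register its token"
    if resp.status_code == 404:
        return [], "workspace not found in the registry; re-register"
    if resp.status_code >= 500:
        return [], f"platform error {resp.status_code}; page on-call if it persists"
    body = resp.json()
    if not isinstance(body, list):
        return [], "unexpected non-list response from the registry"
    return body, None

def get_peers(client: httpx.Client, url: str) -> list:
    peers, _ = get_peers_with_diagnostic(client, url)  # bare-list shim
    return peers

def tool_list_peers(client: httpx.Client, url: str) -> str:
    peers, error = get_peers_with_diagnostic(client, url)
    if error:
        return f"Could not list peers: {error}"
    if not peers:
        return "No peers in the platform registry."
    return "\n".join(p.get("name", "<unnamed>") for p in peers)
```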

899a2231d6
test(platform_auth): module-functions signature snapshot drift gate

Pin the 5 public functions adapters and the runtime hot-path import through `from platform_auth import`:
- `auth_headers` — every outbound httpx call merges this in
- `self_source_headers` — A2A peer + self-message header builder
- `get_token` — main.py reads on boot to decide register-vs-resume
- `save_token` — main.py persists the platform-issued token
- `refresh_cache` — the 401-retry path drops the in-process cache (#1877)

A grep across workspace/ shows 14+ runtime modules import these: main.py, heartbeat.py, a2a_client.py, a2a_tools.py, consolidation.py, events.py, executor_helpers.py (3 sites), molecule_ai_status.py, builtin_tools/memory.py (3 sites), builtin_tools/temporal_workflow.py (2 sites). Renaming any of the five (e.g. `auth_headers` → `bearer_headers`) makes every one of those imports raise ImportError at workspace boot — the failure surface is deep in heartbeat init, nowhere near the rename site.

Same drift class as the BaseAdapter signature snapshot (#2378, #2380), the skill_loader gate (#2381), and the runtime_wedge gate (#2383). Reuses the `_signature_snapshot.py` helpers shipped in #2381.

Defense-in-depth: `test_snapshot_has_required_functions` asserts the five names are still present, so removing one even with a synchronized snapshot edit forces an explicit edit here with a justification. `clear_cache` is intentionally NOT in the snapshot — it's a test-only helper. Production code MUST NOT depend on it.

Verified red on deliberate rename: `auth_headers` → `bearer_headers` produces a clean diff of the missing function in the failure message. Restored before commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

70176e6c8f
test(runtime_wedge): module-functions signature snapshot drift gate

BaseAdapter's docstring tells adapter authors:

> `runtime_wedge.mark_wedged()` / `clear_wedge()` — flip the workspace to `degraded` + auto-recover when your SDK hits a non-recoverable error class. Import directly from `runtime_wedge`; the heartbeat forwards the state to the platform automatically.

That's a contract — adapter templates depend on the four module-level functions (`is_wedged`, `wedge_reason`, `mark_wedged`, `clear_wedge`) being importable by those exact names with those exact signatures. Renaming any silently breaks every adapter that calls them: the import resolves the module fine, and the AttributeError only surfaces when the adapter actually hits its first SDK error — long after the rename merges. Same drift class as #2378 / #2380 / #2381 (BaseAdapter, skill_loader), applied to the module-level function surface.

Changes:
- tests/_signature_snapshot.py gains build_module_functions_record. Walks a module's public top-level functions, optionally filtered to a specific name list (used here — runtime_wedge has internal helpers like reset_for_test that intentionally aren't part of the contract). Skips re-exports via a __module__ check so a `from foo import bar` doesn't pollute the snapshot.
- tests/test_runtime_wedge_signature.py snapshots the four contract functions, plus a defense-in-depth required-functions test that catches removal even when source + snapshot are updated together.

Verified: deliberately renaming `mark_wedged` → `mark_wedged_RENAMED` trips the gate with the full snapshot diff in the failure message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

e336688278
test: extract shared signature-snapshot helpers + skill_loader gate

Two changes in one PR (tightly coupled — the second wouldn't make
sense without the first):
1. Hoist the inspect-based snapshot helpers out of
test_adapter_base_signature.py into tests/_signature_snapshot.py
so future surfaces don't copy-paste introspection logic.
- build_class_signature_record(cls): walks public methods,
unwraps static/class/abstract methods, returns a stable
{class, methods: [...]} dict.
- build_dataclass_record(cls): walks dataclass fields via
dataclasses.fields(), returns {name, frozen, fields: [...]}.
- compare_against_snapshot(actual, path): writes-on-first-run +
diff-on-drift, with both expected and actual JSON in failure
message.
test_adapter_base_signature.py is rewritten to use the helpers;
the existing snapshot file is byte-identical (no behavior change).
2. New gate: tests/test_skill_loader_signature.py covers the
public dataclasses exported from skill_loader/loader.py:
- SkillMetadata: every adapter pattern-matches on .runtime for
skill-compat filtering. Renaming this field would silently
break per-adapter skill loading — the loader still returns
objects, but adapters' `if "*" in skill.metadata.runtime`
raises AttributeError at workspace boot.
- LoadedSkill: returned in SetupResult.loaded_skills.
Includes test_snapshot_has_required_skill_metadata_fields
defense-in-depth: ensures the runtime / id / name / description
fields stay even if both source and snapshot are updated together.
Verified: deliberately renaming SkillMetadata.runtime trips the
gate with full snapshot diff in the failure message.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
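
A condensed sketch of the helper pair described above. The real versions also unwrap static/class/abstract methods and handle dataclasses; this shows only the core write-on-first-run + diff-on-drift shape:

```python
import inspect
import json
from pathlib import Path

def build_class_signature_record(cls: type) -> dict:
    """Stable record of a class's public method surface."""
    methods = []
    for name in sorted(vars(cls)):
        if name.startswith("_"):
            continue
        member = getattr(cls, name)  # resolves descriptors to callables
        if callable(member):
            func = inspect.unwrap(member)
            methods.append({"name": name,
                            "signature": str(inspect.signature(func))})
    return {"class": cls.__qualname__, "methods": methods}

def compare_against_snapshot(actual: dict, path: Path) -> None:
    """First run writes the snapshot; later runs fail loudly on any drift."""
    rendered = json.dumps(actual, indent=2, sort_keys=True)
    if not path.exists():
        path.write_text(rendered)
        return
    expected = path.read_text()
    assert rendered == expected, (
        f"signature drift detected.\nexpected:\n{expected}\nactual:\n{rendered}"
    )
```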

12e39c7311
test(adapter_base): extend signature snapshot to public dataclasses (#2364 item 2 followup)

Follows up #2378. The BaseAdapter snapshot covers method signatures, but `adapter_base.py` also exports three public dataclasses that form the call/return contract between the platform and every adapter:
- SetupResult — returned by adapter._common_setup()
- AdapterConfig — passed into adapter setup hooks
- RuntimeCapabilities — returned by adapter.capabilities(); drives platform-side dispatch routing (#117)

Renaming a RuntimeCapabilities flag silently disables every adapter's capability declaration (the platform fallback runs) without an AttributeError to surface the breakage. That's exactly the drift class the snapshot pattern is meant to catch.

Changes:
- _build_dataclass_snapshot walks SetupResult, AdapterConfig, RuntimeCapabilities via dataclasses.fields(), capturing field name + type annotation + has_default per field, plus the @dataclass(frozen=...) flag.
- _build_full_snapshot composes method + dataclass records into one stable JSON snapshot.
- test_snapshot_has_required_dataclass_fields — defense-in-depth test parallel to test_snapshot_has_required_methods. Catches field removal even when both source AND snapshot are updated together. The required field set is intentionally short (the flags that drive platform dispatch + the adapter-level config knobs).

Verified: deliberately renaming `provides_native_heartbeat` → `provides_native_heartbeat_RENAMED` trips test_base_adapter_signature_matches_snapshot with a full diff in the failure message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

8488a188c2
test(adapter_base): signature snapshot — drift gate for adapter public surface (#2364 item 2)

Every workspace template (langgraph, claude-code, hermes, etc.) subclasses BaseAdapter. Renaming, removing, or re-typing a method on the base class silently breaks templates: the override stops being recognized as an override; the old method-name's caller silently invokes the default no-op; the new method-name is unimplemented in templates that haven't migrated. The recent #87 universal-runtime + #1957 recordResource refactors both renamed/added methods. Without a frozen snapshot, the next rename ships quietly and surfaces only when a template's CI catches the AttributeError days later — long after the merge window for an easy revert.

This snapshot pins BaseAdapter's public method surface against a checked-in JSON file. Same-shape pattern as PR #2363's A2A protocol-compat replay gate, applied to a Python public-API surface instead of JSON message shapes. Both close drift classes by snapshotting the structural surface that consumers depend on.

Two tests:
1. test_base_adapter_signature_matches_snapshot — full introspection diff against tests/snapshots/adapter_base_signature.json. Drift = test failure with both expected + actual JSON in the message so the reviewer sees what changed.
2. test_snapshot_has_required_methods — defense-in-depth: even if both the source AND the snapshot are updated together (intentional API removal), this catches removal of the short list of methods that EVERY template depends on (name, display_name, description, capabilities, memory_filename). Removing one of these requires an explicit edit to the `required` set with a justification.

Verified the gate fires red on a deliberate rename (memory_filename → memory_filename_RENAMED) — the failure message shows the full snapshot diff including parameter shapes and return annotations.

Updating the snapshot is the explicit acknowledgment that a template-affecting API change is intentional. The reviewer of the introducing PR sees the snapshot diff and decides whether template repos need coordinated updates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

4299475746
feat(prompt): Platform Capabilities preamble at top of system prompt

Closes #2332 item 1 (workspace awareness — agents don't surface platform-native tools up front). The dogfooding session surfaced that agents weren't using A2A delegation, persistent memory, or send_message_to_user. The tools were registered AND documented in the system prompt — but only in sections #8 (Inter-Agent Communication) and #9 (Hierarchical Memory), which agents read AFTER they've already started reasoning about a plan from earlier sections.

This adds a tight inventory at section #1.5 (immediately after Platform Instructions, before role-specific prompt files) — every tool name + its short description in a bulleted block. The detailed when_to_use docs in sections #8/#9 stay; this preamble is the elevator pitch ("you have these"), the later sections are the manual ("here's when and how").

Generated from `platform_tools.registry` ToolSpecs — every tool's `name` + `short` flow through automatically, no manual sync. A new `get_capabilities_preamble(mcp: bool)` helper in executor_helpers mirrors the existing get_a2a_instructions / get_hma_instructions pattern. CLI-runtime agents (mcp=False) get an empty preamble — they see _A2A_INSTRUCTIONS_CLI's hand-written subcommand vocabulary further down, and the registry's MCP tool names would conflict.

Tests:
- test_capabilities_preamble_appears_in_mcp_prompt: header present
- test_capabilities_preamble_lists_every_registry_tool: every a2a + memory tool from the registry shows up (drift catches at test time — adding a new tool to the registry surfaces here automatically)
- test_capabilities_preamble_precedes_prompt_files: ordering invariant (toolkit before role docs)
- test_capabilities_preamble_skipped_for_cli_runtime: empty when mcp=False

All 40 prompt + platform_tools tests pass.

e955597a98
feat(chat_files): rewrite Download as HTTP-forward (RFC #2312, PR-D)

Mirrors PR-C's Upload migration: replaces the docker-cp tar-stream extraction with a streaming HTTP GET to the workspace's own /internal/file/read endpoint. Closes the SaaS gap for downloads — without this PR, GET /workspaces/:id/chat/download still returns 503 on Railway-hosted SaaS even after A+B+C+F land. Stacks: PR-A #2313 → PR-B #2314 → PR-C #2315 → PR-F #2319 → this PR.

Why a single broad /internal/file/read instead of /internal/chat/download: today's chat_files.go::Download already accepts paths under any of the four allowed roots {/configs, /workspace, /home, /plugins} — it's not strictly chat. Future PRs (template export, etc.) will reuse this endpoint via the same forward pattern; reusing avoids three near-identical handlers (one per domain) with duplicated path-safety logic. Path safety is duplicated on the platform + workspace sides — defence in depth via two parallel checks, not "trust the workspace."

Changes:
* workspace/internal_file_read.py — Starlette handler. Validates the path (must be absolute, under allowed roots, no traversal, canonicalises cleanly). lstat (not stat) so a symlink at the path doesn't redirect the read. Streams via FileResponse (no buffering). Mirrors Go's contentDispositionAttachment for the Content-Disposition header.
* workspace/main.py — registers GET /internal/file/read alongside the POST /internal/chat/uploads/ingest from PR-B.
* scripts/build_runtime_package.py — adds internal_file_read to TOP_LEVEL_MODULES so the publish-runtime cascade rewrites its imports correctly. Also includes the PR-B additions (internal_chat_uploads, platform_inbound_auth) since this branch was rooted before PR-B's drift-gate fix; merge-clean alphabetic additions.
* workspace-server/internal/handlers/chat_files.go — Download rewritten as a streaming HTTP GET forward. Resolves the workspace URL + platform_inbound_secret (same shape as Upload), builds the GET request with a path query param, propagates response headers (Content-Type / Content-Length / Content-Disposition) + body. Drops the archive/tar + mime imports (no longer needed). Drops the Docker-exec branch entirely — Download is now uniform across self-hosted Docker and SaaS EC2.
* workspace-server/internal/handlers/chat_files_test.go — replaces TestChatDownload_DockerUnavailable (stale post-rewrite) with 4 new tests:
  - TestChatDownload_WorkspaceNotInDB → 404 on missing row
  - TestChatDownload_NoInboundSecret → 503 on NULL column (with RFC #2312 detail in body)
  - TestChatDownload_ForwardsToWorkspace_HappyPath → forward shape (auth header, GET method, /internal/file/read path) + headers propagated + body byte-for-byte
  - TestChatDownload_404FromWorkspacePropagated → 404 from workspace propagates (NOT remapped to 500)
  Existing TestChatDownload_InvalidPath path-safety tests preserved.
* workspace/tests/test_internal_file_read.py — 21 tests covering the _validate_path matrix (absolute, allowed roots, traversal, double-slash, exact-match-on-root), 401 on missing/wrong/no-secret-file bearer, 400 on missing path/outside-root/traversal, 404 on missing file, happy-path streaming with correct Content-Type + Content-Disposition, special-char escaping in Content-Disposition, symlink-redirect-rejection (lstat-not-stat protection).

Test results:
* go test ./internal/handlers/ ./internal/wsauth/ — green
* pytest workspace/tests/ — 1292 passed (was 1272 before PR-D)

Refs #2312 (parent RFC), #2308 (chat upload+download 503 incident).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
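
A sketch of the workspace-side path gate. The allowed roots and the lstat choice come from the commit; the validation details (and the remaining open() TOCTOU, which the real handler may close with O_NOFOLLOW) are assumptions:

```python
import os
import stat

ALLOWED_ROOTS = ("/configs", "/workspace", "/home", "/plugins")

def _validate_path(raw: str) -> str:
    if not raw or not raw.startswith("/"):
        raise ValueError("path must be absolute")
    if os.path.normpath(raw) != raw:
        # Catches "..", "//", and trailing-slash tricks: the path must
        # already be canonical, not merely normalise to something canonical.
        raise ValueError("path must canonicalise cleanly")
    if not any(raw == root or raw.startswith(root + "/")
               for root in ALLOWED_ROOTS):
        raise ValueError("path outside allowed roots")
    return raw

def open_for_read(path: str):
    validated = _validate_path(path)
    # lstat, not stat: a symlink sitting at the validated path must not
    # silently redirect the read somewhere else.
    st = os.lstat(validated)
    if not stat.S_ISREG(st.st_mode):
        raise FileNotFoundError(validated)
    return open(validated, "rb")
```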

055e447355
feat(saas): deliver platform_inbound_secret via /registry/register (RFC #2312, PR-F)

Closes the SaaS-side gap that PR-A acknowledged but didn't fix: SaaS workspaces have no persistent /configs volume, so the platform_inbound_secret that PR-A's provisioner wrote at workspace creation never reaches the runtime. Without this, even after the entire RFC #2312 stack lands, SaaS chat upload would 401 (the workspace fails closed when /configs/.platform_inbound_secret is missing).

Solution: return the secret in the /registry/register response body on every register call. The runtime extracts it and persists it to /configs/.platform_inbound_secret at mode 0600. Idempotent — Docker-mode workspaces also receive it and overwrite the value the provisioner already wrote (same value until rotation).

Why on every register, not just first-register:
* SaaS containers can be restarted (deploys, drains, EBS detach/re-attach) — /configs is rebuilt empty on each fresh start.
* The auth_token is "issue once" because re-issuing rotates and invalidates the previous one. The inbound secret has no rotation flow yet (#2318), so re-sending the same value is harmless.
* Eliminates the bootstrap window where a restarted SaaS workspace has no inbound secret on disk and would 401 every platform call.

Changes:
* workspace-server/internal/handlers/registry.go — the Register handler reads workspaces.platform_inbound_secret via wsauth.ReadPlatformInboundSecret and includes it in the response body. Legacy workspaces (NULL column) get a successful registration with the field omitted.
* workspace-server/internal/handlers/registry_test.go — two new tests:
  - TestRegister_ReturnsPlatformInboundSecret_RFC2312_PRF: secret present in DB → secret in response, alongside auth_token.
  - TestRegister_NoInboundSecret_OmitsField: NULL column → field omitted, registration still 200.
* workspace/platform_inbound_auth.py — adds save_inbound_secret(secret). Atomic write via tmp + os.replace, mode 0600 from os.open(O_CREAT, 0o600) so a concurrent reader never sees a 0644 default. Resets the in-process cache after write so the next get_inbound_secret() returns the freshly-written value (rotation-safe when it lands).
* workspace/main.py — the register-response handler extracts platform_inbound_secret alongside auth_token and persists it via save_inbound_secret. Mirrors the existing save_token pattern.
* workspace/tests/test_platform_inbound_auth.py — 6 new tests for save_inbound_secret: writes file, mode 0600, overwrite-existing, cache invalidation after save, empty-input no-op, parent-dir creation for fresh installs.

Test results:
* go test ./internal/handlers/ ./internal/wsauth/ — all green
* pytest workspace/tests/ — 1272 passed (was 1266 before this PR)

Refs #2312 (parent RFC), #2308 (chat upload 503 incident). Stacks: PR-A #2313 → PR-B #2314 → PR-C #2315 → this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
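
A sketch of the atomic 0600 write. The tmp + os.replace and O_CREAT-with-0600 choices come from the commit; the function body itself is illustrative (the real helper also resets the in-process secret cache after the write):

```python
import os
from pathlib import Path

CONFIGS_DIR = Path(os.environ.get("CONFIGS_DIR", "/configs"))
SECRET_PATH = CONFIGS_DIR / ".platform_inbound_secret"

def save_inbound_secret(secret: str) -> None:
    if not secret:
        return  # empty input is a no-op
    SECRET_PATH.parent.mkdir(parents=True, exist_ok=True)
    tmp = SECRET_PATH.with_suffix(".tmp")
    # O_CREAT with 0o600 means the file is never observable at the
    # default 0644, even between creation and a later chmod.
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(secret)
            f.flush()
            os.fsync(f.fileno())
    except BaseException:
        os.unlink(tmp)
        raise
    # Atomic on POSIX: readers see either the old file or the new one.
    os.replace(tmp, SECRET_PATH)
```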

d1de330152
feat(workspace): /internal/chat/uploads/ingest endpoint (RFC #2312, PR-B)

Stacked on PR-A (#2313). The platform-side rewrite that actually calls this endpoint lands in PR-C; this PR adds the workspace-side consumer + hardening so PR-C is a small Go-only diff.

What this adds:
* platform_inbound_auth.py — auth gate mirroring transcript_auth.py. Reads /configs/.platform_inbound_secret (delivered by the PR-A provisioner). Fails closed when the file is missing/empty/unreadable. Constant-time compare via hmac.compare_digest.
* internal_chat_uploads.py — POST /internal/chat/uploads/ingest. Multipart parse → sanitize each filename → write to /workspace/.molecule/chat-uploads/<random>-<name> with O_CREAT|O_EXCL|O_NOFOLLOW. Same response shape (uri/name/mimeType/size + the workspace: URI scheme) as the legacy Go handler — canvas / agent code that resolves "workspace:..." paths keeps working.
* Wired into workspace/main.py via starlette_app.add_route alongside the existing /transcript route.
* python-multipart>=0.0.18 added to requirements.txt (Starlette's Request.form() needs it; ≥0.0.18 closes CVE-2024-53981).

Test coverage (36 tests, all green; full workspace suite 1266 passed):
* test_platform_inbound_auth.py — 14 tests: happy path, fail-closed on missing file, empty file, whitespace-only file, missing/case-wrong/empty Bearer prefix, in-process cache, default CONFIGS_DIR fallback, end-to-end file → authorized.
* test_internal_chat_uploads.py — 22 tests: sanitize_filename matrix (incl. ../ traversal, CJK chars, length truncation), 401 on missing/wrong/no-secret-file bearer, single + batch upload happy paths, unique random prefix on duplicate names, mimetype guess fallback, 400 on missing files field, 413 on per-file + total-body oversize, symlink-at-target refusal (with a sentinel-content-unchanged assertion).

Why this is safe to ship before PR-C:
* No platform-side caller yet → no behavior change visible to users.
* Auth fails closed; nothing on the network can hit a write path until the platform forwards with the matching bearer.
* The workspace's existing routes (/health, /transcript, /handle/*) are unchanged.

Refs #2312 (parent RFC), #2308 (chat upload 503 incident).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
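
A sketch of the two hardening pieces, the fail-closed constant-time auth gate and the exclusive no-symlink write. The flag choices come from the commit; function shapes are assumptions, and `name` is assumed already sanitized:

```python
import hmac
import os
import secrets
from pathlib import Path

SECRET_FILE = Path(os.environ.get("CONFIGS_DIR", "/configs")) / ".platform_inbound_secret"
UPLOAD_DIR = Path("/workspace/.molecule/chat-uploads")

def is_authorized(authorization_header: str) -> bool:
    """Fail closed: no secret file, empty secret, or bad prefix all deny."""
    try:
        expected = SECRET_FILE.read_text().strip()
    except OSError:
        return False
    if not expected or not authorization_header.startswith("Bearer "):
        return False
    presented = authorization_header[len("Bearer "):]
    return hmac.compare_digest(presented, expected)

def write_upload(name: str, data: bytes) -> Path:
    UPLOAD_DIR.mkdir(parents=True, exist_ok=True)
    target = UPLOAD_DIR / f"{secrets.token_hex(8)}-{name}"
    # O_EXCL: refuse to clobber an existing file; O_NOFOLLOW: refuse a
    # symlink planted at the target path.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_EXCL | os.O_NOFOLLOW, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return target
```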

59d65ba557
fix: resolve git/gh from PATH instead of hardcoded /usr/local/bin

Closes #2289. Some workspace template images ship `/usr/local/bin/{git,gh}` wrappers that bake `GH_TOKEN` into argv handling (preferred — auto-PR creation authenticates without explicit token plumbing); other templates have a plain `/usr/bin/git` installed via apt with no wrapper. The hardcoded `_GIT = "/usr/local/bin/git"` crashed every auto-push attempt on the latter image class:

    FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/bin/git'
      File "/app/molecule_runtime/executor_helpers.py", line 524, in _auto_push_and_pr_sync
        subprocess.run(['/usr/local/bin/git', 'rev-parse', '--is-inside-work-tree'], ...)

`shutil.which("git")` walks PATH in order — it finds the `/usr/local/bin/` wrapper first when it exists, and falls back to `/usr/bin/git` otherwise. GH_TOKEN injection still wins on wrapper-equipped images; auto-push no longer crashes on bare-apt images.

Verified locally: `shutil.which("git")` resolves to `/usr/bin/git` on the bug-reporter's image; `shutil.which("gh")` resolves to the homebrew path on dev. Both paths exist + are executable on the respective hosts.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
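
The resolution pattern in brief (the `_resolve_tool` wrapper and its error text are illustrative):

```python
import shutil
import subprocess

def _resolve_tool(name: str) -> str:
    """PATH-order resolution: prefers a /usr/local/bin wrapper when present,
    falls back to the bare apt binary, and fails loudly when neither exists."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(f"{name} not found on PATH; install it or adjust PATH")
    return path

_GIT = _resolve_tool("git")

def in_work_tree() -> bool:
    result = subprocess.run(
        [_GIT, "rev-parse", "--is-inside-work-tree"],
        capture_output=True, text=True,
    )
    return result.returncode == 0 and result.stdout.strip() == "true"
```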

a57382e918
feat(runtime): add new_response_message helper for adapter A2A responses

Surfaced via cross-template review of the a2a-sdk v0→v1 migration:
every adapter executor (claude-code, gemini-cli, crewai, openclaw,
autogen) builds A2A response Messages independently using
`new_text_message(text)` from the SDK, which omits `task_id` and
`context_id`. The runtime's own canonical pattern in
`workspace/a2a_executor.py:466-475` correctly threads both:
Message(
message_id=uuid.uuid4().hex,
role=Role.ROLE_AGENT,
parts=_parts,
task_id=task_id, # ← canonical
context_id=context_id, # ← canonical
)
Adapters skipping these correlation fields means the platform's a2a
proxy can't reliably tie the response back to the originating task.
This is a divergence from canonical, not necessarily a strict bug
(task_id may be optional with a default) — but it's enough of a
correlation/observability gap that the canonical pattern bothers to
thread it.
Add `new_response_message(context, text, files=None)` to
executor_helpers.py — single home for response Message construction.
Templates can migrate from `new_text_message(text)` to this helper
in stacked PRs once the runtime publishes to PyPI.
The helper:
- Reads `context.task_id`/`context.context_id` from the inbound
RequestContext, falling back to fresh UUIDs (RequestContextBuilder
always sets them in production; fallback is for unit tests).
- Sets `role=Role.ROLE_AGENT` (the v1 enum value).
- Builds text Parts via `Part(text=...)` and file Parts via
`Part(url="workspace:<path>", filename=..., media_type=...)`.
- Returns a v1 protobuf Message ready for
`event_queue.enqueue_event(...)`.
Why "files=None" with the workspace: URI scheme as the file Part
shape: matches the canonical pattern in a2a_executor.py exactly so
the platform's chat-attachment download path (executor_helpers.py
`resolve_attachment_uri`) interprets responses uniformly across all
adapters.
Tests (5, all pass with --no-cov against the live runtime image):
- test_new_response_message_text_only
- test_new_response_message_with_files
- test_new_response_message_files_only_no_text
- test_new_response_message_falls_back_when_context_ids_unset
- test_new_response_message_handles_missing_attrs
The conftest's a2a stubs needed an extension for Message + Role +
Part with kwargs preservation. Strictly additive — no existing tests
affected. (The 19 pre-existing failures in test_executor_helpers.py
are unrelated debt from the commit_memory/recall_memory rewrite,
visible on staging baseline before this change.)
Per-template migration is the follow-up: claude-code, gemini-cli,
crewai, openclaw, autogen all call `new_text_message(text)` today;
each gets a per-repo PR replacing it with
`new_response_message(context, text)`. This PR ships the helper
first so the templates have something to import.
Refs: PR #2266/#2267 (restart-race), claude-code #15 (FilePart fix),
gemini-cli #10/crewai #8/openclaw #9/autogen #8 (rename PRs).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
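
A shape sketch of the helper. The field names, Role.ROLE_AGENT, and the Part shapes are taken from the commit's own description of the a2a-sdk v1 surface; the import path and the file-dict keys are assumptions, not SDK documentation:

```python
import uuid

# Import path assumed; the commit only names the types (Message, Part, Role).
from a2a.types import Message, Part, Role

def new_response_message(context, text: str, files=None) -> Message:
    """Single home for adapter response Message construction."""
    parts = []
    if text:
        parts.append(Part(text=text))
    for f in files or []:
        # workspace: URI scheme matches the canonical a2a_executor.py pattern
        parts.append(Part(url=f"workspace:{f['path']}",
                          filename=f.get("name"),
                          media_type=f.get("media_type")))
    return Message(
        message_id=uuid.uuid4().hex,
        role=Role.ROLE_AGENT,
        parts=parts,
        # Thread the correlation ids from the inbound RequestContext;
        # fall back to fresh UUIDs for unit-test contexts that omit them.
        task_id=getattr(context, "task_id", None) or uuid.uuid4().hex,
        context_id=getattr(context, "context_id", None) or uuid.uuid4().hex,
    )
```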

a18d116606
Merge pull request #2261 from Molecule-AI/fix/harness-cleanup-failed-event
harness: SaaS routing + provider-agnostic config for RFC #2251 measurement

00e4766046
docs: registry pattern + harness scripts READMEs

Two docs covering load-bearing patterns from today's work that weren't previously discoverable:

1. workspace/platform_tools/README.md — explains the ToolSpec single-source-of-truth pattern (#2240), the CLI-block alignment gap that hand-maintained generation can't close (#2258), the snapshot golden files + LF-pinning (#2260), and the add/rename/remove playbook. The next reader who lands in workspace/platform_tools/ now has the design rationale + the safe-edit procedure colocated with the code.

2. scripts/README.md — disambiguates the three measure-coordinator-task-bounds.sh files that now exist across two repos:
   - scripts/measure-coordinator-task-bounds.sh (canonical OSS, this repo)
   - scripts/measure-coordinator-task-bounds-runner.sh (Hermes/MiniMax variant, this repo)
   - scripts/measure-coordinator-task-bounds.sh (production-shape, in molecule-controlplane)
   Cross-references reference_harness_pair_pattern (auto-memory) for the cross-repo design rationale. Documents the common safety pattern (cleanup trap, DRY_RUN, non-target guard, cleanup_*_failed events) and the heartbeat-trace caveat.

Refs: #2240, #2254, #2257, #2258, #2259, #2260; molecule-controlplane#321.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

9bc3d6e352
Potential fix for pull request finding 'Unused global variable'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

ddf6720498
chore(registry): snapshot tests + CLI-block alignment for #2240

Two follow-ups from the #2240 code review:

1. Snapshot tests for the rendered tool-instruction blocks. The structural tests added in #2240 guarantee tool NAMES are present; these new tests pin the SHAPE — bullet ordering, heading style, footer placement — so a future contributor who reorders fields in `_render_section` or rewrites a `when_to_use` paragraph sees the diff in CI rather than shipping a silently-different system prompt. Golden files live under workspace/tests/snapshots/.

2. CLI-block alignment test + corrected source-of-truth comment. `_A2A_INSTRUCTIONS_CLI` is a separate hand-maintained surface for ollama and other non-MCP runtimes — the registry can't auto-generate it because the CLI subprocess interface uses different command shapes (`peers` vs `list_peers`, etc.). A new `_CLI_A2A_COMMAND_KEYWORDS` mapping declares the registry-tool → CLI-keyword correspondence (or explicit `None` for tools not exposed via subprocess). Two tests enforce coverage:
   - every a2a tool in the registry is keyed in the mapping
   - every non-None subcommand keyword literally appears in `_A2A_INSTRUCTIONS_CLI`

   Caught one real gap: `send_message_to_user` is in the registry but has no CLI subcommand. Mapped to `None` with an explanatory comment.

The "no other source of truth" claim in registry.py's docstring was wrong post-#2240 (the CLI block survived) — corrected to describe the two surfaces explicitly and point at the alignment tests as the gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

5fe52b08e7
feat(harness): coordinator phase-boundary instrumentation for RFC #2251

Adds structured `rfc2251_phase=...` log lines at the deterministic phase
boundaries inside route_task_to_team and check_task_status, so an
operator running scripts/measure-coordinator-task-bounds.sh against
staging can correlate the harness's external timing trace with what
phase the coordinator was in at any given second.
The harness already exists in staging and measures end-to-end response
time + heartbeat trace. What it CAN'T do without this PR is answer
"the coordinator response took 7 minutes — was it stuck delegating, or
stuck polling children, or stuck synthesizing after all children
returned?" The phase logs answer that question.
Phases instrumented (deterministic Python boundaries, no agent prompt
involvement):
route_start → enter route_task_to_team
children_fetched → after get_children() returns
routing_decided → after build_team_routing_payload
delegate_invoked → just before delegate_task_async.ainvoke
delegate_returned → after delegate_task_async returns
check_status → every check_task_status poll (per-poll)
route_returning_decision_only → fall-through path
Each line includes elapsed_ms from route_start so per-phase durations
are extractable via:
grep rfc2251_phase= <container.log> \
| awk '{...}' to compute deltas between consecutive phases
The synthesis phase (after all children return, before agent emits
final A2A response) is NOT instrumented here because it's
agent-driven (no deterministic Python boundary). The harness operator
infers synthesis_secs = total_response_secs − max(check_status_ts).
This is reproduction-harness scaffolding; it adds zero behavior. Strip
the rfc2251_phase log lines when V1.0 ships and the phase data lands
in the structured heartbeat payload instead.
Refs:
- RFC: molecule-core#2251
- Harness: scripts/measure-coordinator-task-bounds.sh (shipped earlier)
- V1.0 gate: this is deliverable #2 of the four pre-V1.0 gates

f323def18f
chore(build): include platform_tools in runtime wheel SUBPACKAGES

The PR-built wheel + import smoke gate refused the platform_tools
package because it's a new subdirectory under workspace/ that wasn't
in scripts/build_runtime_package.py:SUBPACKAGES. The drift gate (which
exists for exactly this reason) caught it cleanly:
error: SUBPACKAGES drifted from workspace/ subdirectories:
in workspace/ but NOT in SUBPACKAGES (will ship un-rewritten or
be excluded): ['platform_tools']
Adding platform_tools to SUBPACKAGES wires the package into the
runtime wheel + applies the canonical
from platform_tools.<x> -> from molecule_runtime.platform_tools.<x>
import-rewrite step that every other subpackage uses.
Verified locally: scripts/build_runtime_package.py succeeds, the
rewritten a2a_mcp_server.py reads
from molecule_runtime.platform_tools.registry import TOOLS
which matches the package layout in the wheel.

e9a59cda3b
feat(platform): single-source-of-truth tool registry — adapters consume, no drift

Establishes workspace/platform_tools/registry.py as THE place tool
naming and docs live. Every consumer reads from it; nothing duplicates
the source. Closes the architectural gap behind the doc/tool drift
discussion 2026-04-28 — adding hundreds of future runtime SDK adapters
should not require touching tool names anywhere except the registry.
What the registry owns
ToolSpec dataclass with: name, short (one-line description), when_to_use
(multi-paragraph agent-facing usage guidance), input_schema (JSON Schema),
impl (the actual coroutine in a2a_tools.py), section ('a2a' | 'memory').
TOOLS list with 8 entries — delegate_task, delegate_task_async,
check_task_status, list_peers, get_workspace_info, send_message_to_user,
commit_memory, recall_memory.
What now reads from the registry
- workspace/a2a_mcp_server.py
The hardcoded TOOLS list (167 lines of hand-maintained dicts) is
gone. Replaced with a 6-line list comprehension over the registry.
MCP description = spec.short. inputSchema = spec.input_schema.
- workspace/executor_helpers.py
get_a2a_instructions(mcp=True) and get_hma_instructions() now
GENERATE the agent-facing system-prompt text from the registry.
Heading + per-tool bullet (spec.short) + per-tool when_to_use +
a section-specific footer. No more hand-maintained instruction
blocks that drift from reality.
- workspace/builtin_tools/delegation.py
Renamed delegate_to_workspace -> delegate_task_async to match
registry. check_delegation_status -> check_task_status. Added
sync delegate_task @tool wrapping a2a_tools.tool_delegate_task
(was missing for LangChain runtimes — CP review Issue 3).
- workspace/builtin_tools/memory.py
Renamed search_memory -> recall_memory to match registry.
- workspace/adapter_base.py, workspace/main.py
Bundle all 7 core tools (was 6) into all_tools / base_tools.
- workspace/coordinator.py, shared_runtime.py, policies/routing.py
Updated system-prompt-text references to use the registry names.
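A hedged sketch of the a2a_mcp_server.py consumption pattern (the
three dict keys are named in this commit; everything else is
illustrative):
    from molecule_runtime.platform_tools.registry import TOOLS

    MCP_TOOLS = [
        {
            "name": spec.name,
            "description": spec.short,        # MCP description = spec.short
            "inputSchema": spec.input_schema, # schema comes from the registry
        }
        for spec in TOOLS
    ]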
Structural alignment tests
workspace/tests/test_platform_tools.py — 9 tests pin every
registry-to-adapter mapping:
- registry names are unique
- a2a + memory partition is complete (no orphans)
- by_name lookup works
- MCP server registers exactly the registry's tool set
- MCP description equals registry.short for every tool
- MCP inputSchema equals registry.input_schema for every tool
- get_a2a_instructions text contains every a2a tool name
- get_hma_instructions text contains every memory tool name
- pre-rename names (delegate_to_workspace, search_memory,
check_delegation_status) cannot leak back
Adding a future tool means adding one ToolSpec; the test failure
list tells the author exactly which adapter to update.
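One of those pins, as an illustrative sketch (the assertion body is
an assumption; only the test's intent comes from this commit):
    from molecule_runtime.platform_tools.registry import TOOLS

    def test_registry_names_are_unique():
        names = [spec.name for spec in TOOLS]
        assert len(names) == len(set(names))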
Adapter pattern for future SDK support
When (e.g.) AutoGen or Pydantic AI gets adapters, the only work
needed for tool surfacing is "wrap registry.TOOLS in your SDK's
tool format." Names, descriptions, schemas, impl come from the
registry — adapter author writes zero strings.
Why this needed to ship now
PR #2237 (already in staging) injected MCP-world docs as the
default system-prompt content. Without the registry, those docs
said "delegate_task" while LangChain runtimes only had
"delegate_to_workspace" — workers see docs for tools that don't
exist (CP review Issue 1+3). PR #2239 was a tactical rename;
this PR is the structural fix that prevents the same class of
drift from recurring as new adapters ship.
PR #2239 was closed in favor of this — same renames, plus the
registry, plus structural tests. Single coherent change.
Tests: 1232 pass, 2 xfailed (pre-existing). 9 new in
test_platform_tools.py; 4 alignment tests in test_prompt.py from
#2237 still pass; original test_executor_helpers tests adapted to
the registry-driven world.
Refs: CP review Issues 1, 2, 3, 5; project memory
project_runtime_native_pluggable.md (platform owns A2A);
project memory feedback_doc_tool_alignment.md (this is the structural
fix for the tactical lesson).
|
||
|
|
3f99fede5a
|
Merge pull request #2237 from Molecule-AI/fix/inject-a2a-hma-tool-instructions
fix(prompt): inject A2A and HMA tool instructions into system prompt |
||
|
|
448709f4b4 |
fix(prompt): inject A2A and HMA tool instructions into system prompt
Workers were registering platform tools (delegate_task,
delegate_task_async, list_peers, check_task_status,
send_message_to_user, commit_memory, recall_memory), but the
build_system_prompt assembly never included documentation for any of
them. The instruction-text functions get_a2a_instructions() and
get_hma_instructions() exist in executor_helpers.py and have unit
tests, but they were not called from any production code path —
workers received system-prompt.md content only and saw the tools as
bare names with no usage guidance.
Symptom: agents called commit_memory and delegate_task without
knowing they were platform tools. Calls worked when the agent
guessed the API correctly and failed silently when it didn't.
Fix: build_system_prompt() now appends both instruction sets between
the Skills section and the Peers section. The placement is
intentional — A2A docs explain how to call delegate_task; the peer
list is the data that delegate_task operates over, so the docs
precede the peer table.
New parameter `a2a_mcp: bool = True` lets adapters opt into the CLI
subprocess variant of the A2A instructions for runtimes without MCP
support (ollama, custom CLI runtimes). The default True covers the
MCP-capable majority (claude-code, hermes, langchain, crewai);
adapter callers don't need to change unless they specifically need
CLI mode.
Tests: 4 new regression tests in test_prompt.py pin
- A2A MCP variant injection (default)
- A2A CLI variant injection (a2a_mcp=False, with MCP-only fields
  absent)
- HMA instruction injection
- A2A docs precede peer list ordering
Full suite green: 1223 passed, 2 xfailed. |
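A hedged sketch of the resulting assembly (parameter names other
than a2a_mcp are hypothetical; the two instruction functions are the
ones named above):
    def build_system_prompt(base_md: str, skills: str, peers: str,
                            a2a_mcp: bool = True) -> str:
        # Skills first, then the tool docs, then the peer table the
        # A2A docs operate over; the ordering is pinned by a test
        sections = [
            base_md,
            skills,
            get_a2a_instructions(mcp=a2a_mcp),
            get_hma_instructions(),
            peers,
        ]
        return "\n\n".join(s for s in sections if s)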
||
|
|
0cdbc2c4f6 |
chore(deps): batch dep bumps — 11 safe upgrades from 2026-04-28 dependabot wave
Consolidates 11 of the 17 open Dependabot PRs (#2215, #2217,
#2219-#2225, #2227, #2229) into one PR. Every entry is a patch /
minor / floor bump where the impact surface is small and CI carries
the proof. Same pattern as the 2026-04-15 batch.
Go (workspace-server/go.mod + go.sum, regenerated via `go mod tidy`):
- golang.org/x/crypto 0.49.0 → 0.50.0 (#2225)
- github.com/golang-jwt/jwt/v5 5.2.2 → 5.3.1 (#2222)
- github.com/gin-contrib/cors 1.7.2 → 1.7.7 (#2220)
- github.com/docker/go-connections 0.6.0 → 0.7.0 (#2223)
- github.com/redis/go-redis/v9 9.7.3 → 9.19.0 (#2217)
Python floor bumps (workspace/requirements.txt; current pip-resolved
versions don't change unless they happen to be below the new floor):
- httpx >=0.27 → >=0.28.1 (#2221)
- uvicorn >=0.30 → >=0.46 (#2229)
- temporalio >=1.7 → >=1.26 (#2227)
- websockets >=12 → >=16 (#2224)
- opentelemetry-sdk >=1.24 → >=1.41.1 (#2219)
GitHub Actions (SHA-pinned per existing convention):
- dorny/paths-filter@d1c1ffe (v3) → @fbd0ab8 (v4.0.1) (#2215)
REMOVED from this batch (lockfile platform mismatch):
- #2231 @types/node ^22 → ^25.6 (npm install on macOS strips
  Linux-only @emnapi/* entries from package-lock.json that CI's
  `npm ci` then refuses; needs a Linux-side install to land cleanly)
- #2230 jsdom ^25 → ^29.1 (same)
NOT included in this batch (deferred to per-PR human review):
- #2228 github/codeql-action v3 → v4 (CodeQL CLI alignment risk)
- #2218 actions/setup-node v4 → v6 (default Node version drift)
- #2216 actions/upload-artifact v4 → v7 (3 major versions)
- #2214 actions/setup-python v5 → v6 (action major)
NOT merged (CI failing on dependabot's own PR):
- #2233 next 15 → 16
- #2232 tailwindcss 3 → 4
- #2226 typescript 5 → 6
Verified:
- workspace-server: `go mod tidy && go build ./... && go test ./...`
  — green
- workspace requirements.txt: floor bumps only |
||
|
|
96acbd719b |
test: update test_peer_capabilities_format for fallback behavior
The previous assertion `'Silent Agent' not in result` was pinning the buggy behavior — peers without an agent_card were silently dropped from the prompt. With the fallback to DB name+role those peers are correctly visible. Flip the assertion so the test pins the new (correct) rendering and would catch a regression to the silent-drop behavior. |
||
|
|
8ff0748ab9 |
fix(workspace): keep peers visible in coordinator prompt when agent_card is null
Bug: a Design Director coordinator with 6 freshly-created worker
peers rendered an empty `## Your Peers` section in its system prompt
— the hosting registry endpoint correctly returned all 6 peers, but
`summarize_peer_cards()` silently dropped every entry whose
`agent_card` column was null (the default until A2A discovery has
run end-to-end against the worker). The coordinator then refused to
delegate any task because "no peers exist".
Fix: fall back to the registry row's `name` and `role` columns when
`agent_card` is missing, malformed, or wrong-typed, instead of
skipping the peer. The registry endpoint
(`workspace-server/internal/handlers/discovery.go:queryPeerMaps`)
has always returned both fields — they were just being thrown away
on the consumer side. `build_peer_section()` now renders `Role: …`
when the agent_card-derived skill list is empty, so the
coordinator's prompt still has something concrete to delegate
against.
Also hoists `import json` out of the per-peer loop body to module
level (it was previously imported once per iteration).
Tests: new `test_shared_runtime_peer_summary.py` pins all four
fallback cases (null / malformed string / wrong type / null + no DB
name) plus the agent-card-present happy path and the mixed-list case
the coordinator actually consumes. This is the first peer-summary
test coverage `shared_runtime.py` has had — no prior tests existed.
Refs: 2026-04-27 Design Director discovery report from infra team. |
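The fallback shape, as a hedged sketch (helper name and return shape
are hypothetical; the real summarize_peer_cards() handles full
agent-card parsing):
    def summarize_peer_card(row: dict) -> dict:
        card = row.get("agent_card")
        if not isinstance(card, dict):  # null / malformed string / wrong type
            return {"name": row.get("name") or "unknown",
                    "role": row.get("role") or "", "skills": []}
        return {"name": card.get("name") or row.get("name") or "unknown",
                "role": row.get("role") or "",
                "skills": card.get("skills") or []}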
||
|
|
3eb599bbb6 |
fix(workspace): use SDK constant for agent-card readiness probe
The initial-prompt readiness probe in workspace/main.py hardcoded the
pre-1.x well-known path. After the a2a-sdk 1.x bump the SDK started
mounting the agent card at the new canonical path (the value of
`a2a.utils.constants.AGENT_CARD_WELL_KNOWN_PATH`), so the probe
returned 404 every attempt and silently fell through to "server not
ready after 30s, skipping". Net effect: every workspace silently
dropped its `initial_prompt` from config.yaml — the agent never sent
the kickoff self-message, and users hit a fresh chat with no context.
Reported by an external user as "/.well-known/agent.json 404 — the
a2a-sdk agent card route was not being mounted at the expected path".
The route IS mounted; the probe was looking in the wrong place.
Fix imports `AGENT_CARD_WELL_KNOWN_PATH` from `a2a.utils.constants`
and uses it directly in the probe URL — the SDK constant is now the
single source of truth, so any future rename travels through
automatically.
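A hedged sketch of the probe loop (only AGENT_CARD_WELL_KNOWN_PATH
and the 30s window come from this commit; the helper name and retry
cadence are illustrative):
    import time
    import httpx
    from a2a.utils.constants import AGENT_CARD_WELL_KNOWN_PATH

    def wait_for_agent_card(base_url: str, timeout_s: float = 30.0) -> bool:
        deadline = time.monotonic() + timeout_s
        while time.monotonic() < deadline:
            try:
                r = httpx.get(f"{base_url}{AGENT_CARD_WELL_KNOWN_PATH}",
                              timeout=2.0)
                if r.status_code == 200:
                    return True
            except httpx.HTTPError:
                pass  # server not up yet; keep polling
            time.sleep(1.0)
        return False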
Adds two static regression tests pinning the invariant:
1. No hardcoded `/.well-known/agent.json` literal anywhere in
main.py.
2. The probe URL fstring interpolates AGENT_CARD_WELL_KNOWN_PATH
(catches a "fix" that imports the constant for show but reverts
to a literal in the actual GET).
Verified manually inside ghcr.io/molecule-ai/workspace-template-langgraph
that AGENT_CARD_WELL_KNOWN_PATH == '/.well-known/agent-card.json' and
that `create_agent_card_routes(card)` mounts at exactly that path —
constant + mount are aligned in the runtime image, so the probe will
now find the server.
Full workspace test suite: 1209 passed, 2 xfailed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
e87a9c3858 |
fix(a2a): auto-retry transient transport errors in send_a2a_message
Three different intermittent failures observed during a single
manual-test session — RemoteProtocolError, ReadTimeout, ConnectError —
each surfaced as a "Failed to deliver to <peer>" error chip in the
canvas Agent Comms panel even though the next attempt would have
succeeded (verified by direct probes from the same source workspace
to the same peer). The error message even told the user "Usually a
transient network blip — retry once," but it left the retry to a
human reading the error message.
The fix: auto-retry inside send_a2a_message itself, up to 5 attempts (1
initial + 4 retries) with exponential backoff (1s, 2s, 4s, 8s,
16s-capped), each backoff jittered ±25% to break sync across
siblings. Cumulative wall-clock capped at 600s by
_DELEGATE_TOTAL_BUDGET_S so a string of 5×300s ReadTimeouts can't
make the caller wait 25 minutes — once the deadline elapses, retries
stop even if attempts remain.
Retry only on transport-layer transients:
- ConnectError / ConnectTimeout (peer's listening socket not ready)
- RemoteProtocolError (peer closed TCP without writing — observed
when a peer's prior in-flight Claude SDK session aborted)
- ReadError / WriteError (network blip on Docker bridge)
- ReadTimeout (peer wrote no response in 300s)
Application-level errors are NOT retried — they're deterministic and
retrying just wastes wall-clock:
- HTTP 4xx (peer rejected the request format)
- JSON parse failures (peer returned garbage)
- JSON-RPC error in response body (peer's runtime errored cleanly)
- Programmer-bug exceptions (ValueError, etc.)
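As a hedged sketch, the classification above maps onto httpx's
exception hierarchy (the constant name is hypothetical):
    import httpx

    _RETRYABLE_TRANSPORT_ERRORS = (
        httpx.ConnectError, httpx.ConnectTimeout,  # socket not ready
        httpx.RemoteProtocolError,                 # peer closed TCP early
        httpx.ReadError, httpx.WriteError,         # network blip
        httpx.ReadTimeout,                         # no response in 300s
    )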
8 new tests pin the contract:
- retry succeeds after 2 RemoteProtocolErrors
- retry succeeds after 1 ConnectError
- all 5 attempts fail → returns formatted last-error
- capped at exactly _DELEGATE_MAX_ATTEMPTS (regression cover for
"did someone bump the constant accidentally?")
- JSON-RPC error response NOT retried (1 attempt only)
- non-httpx exception NOT retried (programmer bugs stay loud)
- total budget caps the loop even if attempts remain
- backoff schedule grows exponentially with ±25% jitter
Refactor: extracted _format_a2a_error() so the success and exhausted
paths share one error-formatting routine. _delegate_backoff_seconds()
is a pure function so the schedule is unit-testable without monkey-
patching asyncio.sleep.
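A sketch matching the stated schedule (the real function's exact
signature is an assumption):
    import random

    def _delegate_backoff_seconds(attempt: int) -> float:
        # attempt is 1-based: 1s, 2s, 4s, 8s, then capped at 16s
        base = min(2 ** (attempt - 1), 16)
        return base * random.uniform(0.75, 1.25)  # ±25% jitter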
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
81c4c1321c |
fix(runtime): use lowercase wire role for v0.3 JSON-RPC compat layer
Manual-test failure surfaced what was hidden behind the MCP-path bug:
once delegate_task could actually fire, every cross-workspace call
came back as JSON-RPC -32600 "Invalid Request" with the underlying
pydantic ValidationError:
params.message.role
Input should be 'agent' or 'user' [type=enum,
input_value='ROLE_USER', input_type=str]
PR #2184's a2a-sdk 1.x migration sweep over-corrected: it changed
every `"role": "user"` literal in JSON-RPC payload construction to
`"role": "ROLE_USER"` to match the protobuf enum names of the 1.x
native types (a2a.types.Role.ROLE_USER / ROLE_AGENT). That was
correct for in-process Message construction (which the SDK
serialises before wire transmission) but WRONG for the 8 sites that
hand-build JSON-RPC payloads. The workspace's own a2a-sdk runs
inbound requests through the v0.3 compat adapter
(/usr/local/lib/python3.11/site-packages/a2a/compat/v0_3/) because
main.py sets enable_v0_3_compat=True for backwards compatibility,
and that adapter validates against the v0.3 Pydantic Role enum
(`agent` | `user` lowercase). The protobuf-style names blow it up.
Reverted the 8 wire-payload sites to lowercase:
- workspace/a2a_client.py:74
- workspace/a2a_cli.py:74, 111
- workspace/heartbeat.py:378
- workspace/main.py:464, 563
- workspace/builtin_tools/a2a_tools.py:60
- workspace/builtin_tools/delegation.py:272
Native-type usage at workspace/a2a_executor.py:471 (`Role.ROLE_AGENT`)
stays — that's an in-process Message construction; the SDK handles
wire serialisation correctly.
Updated the misleading comment at main.py:255-257 (which said
"outbound payloads are now 1.x-shaped (ROLE_USER)") to spell out
the actual rule: outbound JSON-RPC wire payloads MUST use the v0.3
shape; native types are only for in-process construction.
New regression test test_jsonrpc_wire_role_format.py greps the 6
wire-payload-emitting files for any "ROLE_USER" / "ROLE_AGENT"
string literal and fails loud — cheapest possible drift detector.
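A hedged sketch of that detector's shape (the file list is from this
commit; the test layout is hypothetical):
    from pathlib import Path

    WIRE_PAYLOAD_FILES = [
        "a2a_client.py", "a2a_cli.py", "heartbeat.py", "main.py",
        "builtin_tools/a2a_tools.py", "builtin_tools/delegation.py",
    ]

    def test_wire_payloads_use_v0_3_lowercase_roles():
        pkg = Path(__file__).resolve().parents[1]  # hypothetical layout
        for rel in WIRE_PAYLOAD_FILES:
            text = (pkg / rel).read_text()
            assert "ROLE_USER" not in text and "ROLE_AGENT" not in text, rel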
Why E2E missed it: the priority-runtimes harness sends a single
message canvas → workspace, but the canvas already used lowercase
"user" (it never went through the migration sweep). The bug only
surfaces on workspace → workspace delegation, which the harness
doesn't exercise. Same gap as #131 (extend smoke to call main()
against a stub).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
9c3695df6d |
test(runtime): update molecule_ai_status test for renamed error prefix
Pre-existing test_set_status_exception_prints_to_stderr asserted on the legacy "molecule-monorepo-status: failed to update" prefix string. The prior commit renamed it to "molecule_ai_status: failed to update" so the printed label matches the canonical module-form invocation (`python3 -m molecule_runtime.molecule_ai_status`) instead of a shell alias that only ever existed in the dev-only base image. Updating the expected substring in lockstep. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
28fc7a8cbd |
fix(runtime): replace remaining /app/ legacy paths in agent prompts + docstrings
Comprehensive sweep follow-up to the MCP server path fix. Audited
every /app/ reference in the runtime source against the live
claude-code template image and confirmed the actual /app/ contents
post-#87 are ONLY: __init__.py, adapter.py, claude_sdk_executor.py,
requirements.txt — every other workspace module ships in the wheel
under site-packages/molecule_runtime/.
Two more leaks found:
1. executor_helpers.py:_A2A_INSTRUCTIONS_CLI — the inter-agent
   system prompt for non-MCP runtimes (Ollama, custom) had 5 lines
   telling the model to run `python3 /app/a2a_cli.py X`. Models copy
   these examples verbatim, so every CLI-runtime delegation would
   fail at the shell layer (no such file). Replaced with the
   `python3 -m molecule_runtime.a2a_cli` form, which works
   regardless of where the wheel is installed.
2. molecule_ai_status.py docstring — usage examples invoked
   `python3 /app/molecule_ai_status.py` and claimed a
   `molecule-monorepo-status` shell alias. Both are broken in
   current templates: the file lives in site-packages, and
   `which molecule-monorepo-status` errors (the legacy symlink only
   existed in the dev-only workspace/Dockerfile base image, not in
   the standalone template Dockerfiles that ship to production).
   Updated the docstring, the __main__ usage banner, and the stderr
   error prefix to use the same `python3 -m molecule_runtime.X` form.
Plugins audited and clean: WORKSPACE_PLUGINS_DIR=/configs/plugins,
SHARED_PLUGINS_DIR=$PLUGINS_DIR with fallback /plugins. No /app/
assumptions.
Regression test:
`test_a2a_cli_instructions_use_module_invocation_not_legacy_app_path`
asserts the legacy /app/a2a_cli.py path can't drift back into the
CLI system prompt and that the canonical module form is present.
The legacy workspace/Dockerfile + workspace/entrypoint.sh +
workspace/scripts/ still contain /app/-shaped paths, but they are
dev-only base-image scaffolding (per workspace/build-all.sh's own
header comment) — not shipped to the standalone template images. Out
of scope here; can be cleaned up in a separate dead-code pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
203a4f0f91 |
fix(runtime): resolve a2a_mcp_server.py path from wheel install location
DEFAULT_MCP_SERVER_PATH was hardcoded to /app/a2a_mcp_server.py,
which was correct under the pre-#87 monolithic-template Docker
layout where the workspace/ tree was COPY'd into /app/. After the
universal-runtime refactor (#87, #117), workspace modules ship
inside the molecule-ai-workspace-runtime wheel under
site-packages/molecule_runtime/, while /app/ now holds only
template-specific files (adapter.py plus the runtime-native executor
for that template).
Net effect: in every workspace built since the wheel cutover, the
Claude Code SDK's mcp_servers={"a2a": {"command": python, "args":
["/app/a2a_mcp_server.py"]}} pointed at a missing file. The
subprocess launch failed silently, the SDK registered zero MCP
tools, and the agent's list_peers / delegate_task /
a2a_send_message / a2a_send_signal all disappeared.
Symptom observed today: the Design Director said "I tried to reach
the perf auditor via the inter-agent MCP tools (list_peers,
delegate_task) but those tools didn't resolve in this environment"
and fell back to running the audit itself with WebFetch.
Why this slipped through E2E: the priority-runtimes harness sends a
single message and verifies a reply — it does not exercise
inter-agent delegation, so the missing MCP tools are invisible at
that layer.
Fix: resolve the path relative to executor_helpers.py via __file__,
which tracks wherever the wheel is installed (site-packages today,
anywhere else tomorrow). The A2A_MCP_SERVER_PATH env override is
preserved for tests / non-default layouts.
Regression test: assert os.path.exists(DEFAULT_MCP_SERVER_PATH), so
any future move of a2a_mcp_server.py out of the package directory
fails at unit-test time instead of silently disabling delegation in
production.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
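A hedged sketch of the resolution (the constant and the env override
are named in the commit; the exact expression is an assumption):
    import os

    DEFAULT_MCP_SERVER_PATH = os.environ.get(
        "A2A_MCP_SERVER_PATH",
        os.path.join(os.path.dirname(os.path.abspath(__file__)),
                     "a2a_mcp_server.py"),
    )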
||
|
|
dd57a840b6 |
fix: comprehensive a2a-sdk 1.x migration sweep across workspace/
Audited every a2a-sdk surface in workspace/ against the installed
1.0.2 wheel. Found and fixed:
main.py (the live workspace startup path):
• create_jsonrpc_routes(rpc_url='/', enable_v0_3_compat=True) —
rpc_url required in 1.x; v0.3 compat enables inbound legacy
clients (`"role": "user"` lowercase) without forcing them to
upgrade. Pairs with the outbound rename below.
a2a_executor.py:
• TextPart/FilePart/FileWithUri removed in 1.x. Part is now a
flat proto message: Part(text=…) / Part(url=…, filename=…,
media_type=…). Updated the file-attachment branch (only
reachable when an agent emits files; the harness's PONG path
didn't exercise this, but it's a latent crash).
• Message field names: messageId/taskId/contextId →
message_id/task_id/context_id (proto3 snake_case).
• Role enum: Role.agent → Role.ROLE_AGENT (proto enum).
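The 1.x construction shapes above, assembled into one hedged example
(the field names are from this commit; import locations for
Message/Part and the parts kwarg are assumptions):
    from a2a.types import Message, Part, Role

    msg = Message(
        message_id="m-1",            # was messageId
        task_id="t-1",               # was taskId
        context_id="c-1",            # was contextId
        role=Role.ROLE_AGENT,        # was Role.agent
        parts=[Part(text="done")],   # TextPart removed in 1.x
    )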
Outbound JSON-RPC payloads (8 files):
• "role": "user" → "role": "ROLE_USER" — proto3 JSON serialization
is strict about enum values. Sites: a2a_client, a2a_cli, main
(initial+idle prompts), heartbeat, builtin_tools/a2a_tools,
builtin_tools/delegation. Wire JSON keys stay camelCase
(proto3 default), only the role enum value changed.
google-adk/adapter.py:
• new_agent_text_message → new_text_message (4 sites). This
adapter's directory has a hyphen, so it can't be imported as a
Python module — effectively dead code, but the wheel ships the
file and a future fix should keep it correct against 1.x.
Why one PR instead of seven: every previous a2a-sdk migration find
landed as its own publish → cascade → harness → next-bug cycle.
Today's audit ran every a2a-sdk symbol/type/method in workspace/
against the installed 1.0.2 wheel in a single sweep + tested the
critical paths (Message construction, Part construction, Role enum
parsing) against the actual SDK. Should be the last migration PR.
Verified locally:
python3 scripts/build_runtime_package.py --version 0.1.99 \
--out /tmp/build-final
pip install /tmp/build-final
python -c "import molecule_runtime.main; \
from molecule_runtime.a2a_executor import LangGraphA2AExecutor"
→ ✓ all imports clean against a2a-sdk 1.0.2
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
c80b3ff0eb |
fix: pass rpc_url='/' to create_jsonrpc_routes (a2a-sdk 1.x requirement)
7th a2a-sdk migration find from the v0 → v1 transition.
create_jsonrpc_routes() now requires rpc_url as a positional arg (it
was implicitly root in 0.x). Pass '/' to match
a2a.utils.constants.DEFAULT_RPC_URL — that's also what
workspace-server's a2a_proxy.go forwards to (it POSTs to the
workspace URL without appending a path).
Symptom before fix: every workspace startup crashed with
TypeError: create_jsonrpc_routes() missing 1 required positional
argument: 'rpc_url'
Caught by harness 9 phase 4 (claude-code + langgraph both on
0.1.24). The user's "use langgraph for fast iteration" call cut the
diagnose cycle from 15 min to ~30 s — without that, this would have
taken another hermes round-trip to surface.
Updated the reference_a2a_sdk_v0_to_v1_migration.md memory with this
entry alongside the previous 6 finds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
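The call shape the fix lands on, as a hedged sketch (where
create_jsonrpc_routes is imported from is not shown in this log;
enable_v0_3_compat is taken from the sweep commit above):
    from a2a.utils.constants import DEFAULT_RPC_URL  # '/'

    routes = create_jsonrpc_routes(rpc_url=DEFAULT_RPC_URL,
                                   enable_v0_3_compat=True)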