Commit Graph

3242 Commits

Author SHA1 Message Date
Hongming Wang
49ded74876 docs(cli-runtime): use module-form invocation, drop dead shell-alias claim
Same root cause as the workspace/molecule_ai_status.py docstring fix
in this PR: this doc claimed `molecule-monorepo-status` was a usable
shell alias and `from molecule_ai_status import set_status` was a
usable Python import. Both worked under the pre-#87 monolithic-template
layout (where workspace/Dockerfile created the symlink and COPY'd the
modules into /app/) but neither works in current standalone template
images that install the runtime as a wheel:

- `which molecule-monorepo-status` errors — only `a2a-db` and
  `molecule-runtime` are registered console scripts.
- `from molecule_ai_status` raises ImportError — modules are under the
  `molecule_runtime` package now.

Switched both examples to the canonical `python3 -m
molecule_runtime.molecule_ai_status` form (CLI) and `from
molecule_runtime.molecule_ai_status import set_status` (Python). Same
form the runtime ships in its own usage banner, so anyone discovering
this doc gets a runnable example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 12:27:50 -07:00
Hongming Wang
9c3695df6d test(runtime): update molecule_ai_status test for renamed error prefix
Pre-existing test_set_status_exception_prints_to_stderr asserted on the
legacy "molecule-monorepo-status: failed to update" prefix string. The
prior commit renamed it to "molecule_ai_status: failed to update" so
the printed label matches the canonical module-form invocation
(`python3 -m molecule_runtime.molecule_ai_status`) instead of a shell
alias that only ever existed in the dev-only base image. Updating the
expected substring in lockstep.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 11:48:05 -07:00
Hongming Wang
28fc7a8cbd fix(runtime): replace remaining /app/ legacy paths in agent prompts + docstrings
Comprehensive sweep follow-up to the MCP server path fix. Audited every
/app/ reference in the runtime source against the live claude-code
template image and confirmed the actual /app/ contents post-#87 are
ONLY: __init__.py, adapter.py, claude_sdk_executor.py, requirements.txt
— every other workspace module ships in the wheel under
site-packages/molecule_runtime/. Two more leaks found:

1. executor_helpers.py:_A2A_INSTRUCTIONS_CLI — inter-agent system prompt
   for non-MCP runtimes (Ollama, custom) had 5 lines telling the model
   `python3 /app/a2a_cli.py X`. Models copy these examples verbatim, so
   every CLI-runtime delegation would fail at the shell layer (no such
   file). Replaced with `python3 -m molecule_runtime.a2a_cli` form,
   which works regardless of where the wheel is installed.

2. molecule_ai_status.py docstring — usage examples invoked
   `python3 /app/molecule_ai_status.py` and claimed a
   `molecule-monorepo-status` shell alias. Both broken in current
   templates: the file's at site-packages, and `which
   molecule-monorepo-status` errors (the legacy symlink only existed
   in the dev-only workspace/Dockerfile base image, not in the
   standalone template Dockerfiles that ship to production).
   Updated docstring + the __main__ usage banner + the stderr error
   prefix to use the same `python3 -m molecule_runtime.X` form.

Plugins audited and clean: WORKSPACE_PLUGINS_DIR=/configs/plugins,
SHARED_PLUGINS_DIR=$PLUGINS_DIR fallback /plugins. No /app/
assumptions.

Regression test: `test_a2a_cli_instructions_use_module_invocation_not_legacy_app_path`
asserts the legacy /app/a2a_cli.py path can't drift back into the CLI
system prompt and that the canonical module form is present.

The legacy workspace/Dockerfile + workspace/entrypoint.sh + workspace/scripts/
still contain /app/-shaped paths but are dev-only base-image scaffolding
(per workspace/build-all.sh's own header comment) — not shipped to the
standalone template images. Out of scope here; can be cleaned up in a
separate dead-code pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 11:22:00 -07:00
Hongming Wang
203a4f0f91 fix(runtime): resolve a2a_mcp_server.py path from wheel install location
DEFAULT_MCP_SERVER_PATH was hardcoded to /app/a2a_mcp_server.py, which
was correct under the pre-#87 monolithic-template Docker layout where
the workspace/ tree was COPY'd into /app/. After the universal-runtime
refactor (#87, #117), workspace modules ship inside the
molecule-ai-workspace-runtime wheel under
site-packages/molecule_runtime/, while /app/ now holds only
template-specific files (adapter.py + the runtime-native executor for
that template).

Net effect: in every workspace built since the wheel cutover, Claude
Code SDK's mcp_servers={"a2a": {"command": python, "args":
["/app/a2a_mcp_server.py"]}} pointed at a missing file. The subprocess
launch failed silently, the SDK registered zero MCP tools, and the
agent's list_peers / delegate_task / a2a_send_message / a2a_send_signal
all disappeared. Symptom observed today: Design Director said
"I tried to reach the perf auditor via the inter-agent MCP tools
(list_peers, delegate_task) but those tools didn't resolve in this
environment" and fell back to running the audit itself with WebFetch.

Why this slipped through E2E: the priority-runtimes harness sends a
single message and verifies a reply — it does not exercise inter-agent
delegation, so the missing MCP tools are invisible at that layer.

Fix: resolve the path relative to executor_helpers.py via __file__,
which tracks wherever the wheel is installed (site-packages today,
anywhere else tomorrow). The A2A_MCP_SERVER_PATH env override is
preserved for tests / non-default layouts.

Regression test: assert os.path.exists(DEFAULT_MCP_SERVER_PATH) so
any future move of a2a_mcp_server.py out of the package directory
fails at unit-test time instead of silently disabling delegation in
production.
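
A minimal sketch of the resolution order described above, assuming the server script sits next to the helper module. The function name `resolve_mcp_server_path` and its `env` parameter are hypothetical and only illustrate the idea; the runtime's actual code may differ.

```python
import os


def resolve_mcp_server_path(module_file, env=None):
    """Return the MCP server script path, honouring the env override.

    `module_file` stands in for the `__file__` of executor_helpers.py.
    (Hypothetical helper; not the runtime's real function.)
    """
    env = os.environ if env is None else env
    override = env.get("A2A_MCP_SERVER_PATH")
    if override:
        # Tests and non-default layouts can point somewhere else entirely.
        return override
    # Follow the package wherever the wheel was installed.
    package_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(package_dir, "a2a_mcp_server.py")
```

Because the default is derived from the module's own location, the same code works whether the package lives in site-packages or in a source checkout.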

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 11:15:06 -07:00
Hongming Wang
9532890f04
Merge pull request #2184 from Molecule-AI/fix/jsonrpc-routes-rpc-url
fix: pass rpc_url='/' to create_jsonrpc_routes (a2a-sdk 1.x)
2026-04-27 16:45:48 +00:00
Hongming Wang
dd57a840b6 fix: comprehensive a2a-sdk 1.x migration sweep across workspace/
Audited every a2a-sdk surface in workspace/ against the installed
1.0.2 wheel. Found and fixed:

main.py (the live workspace startup path):
  • create_jsonrpc_routes(rpc_url='/', enable_v0_3_compat=True) —
    rpc_url required in 1.x; v0.3 compat enables inbound legacy
    clients (`"role": "user"` lowercase) without forcing them to
    upgrade. Pairs with the outbound rename below.

a2a_executor.py:
  • TextPart/FilePart/FileWithUri removed in 1.x. Part is now a
    flat proto message: Part(text=…) / Part(url=…, filename=…,
    media_type=…). Updated the file-attachment branch (only
    reachable when an agent emits files; the harness's PONG path
    didn't exercise this, but it's a latent crash).
  • Message field names: messageId/taskId/contextId →
    message_id/task_id/context_id (proto3 snake_case).
  • Role enum: Role.agent → Role.ROLE_AGENT (proto enum).

Outbound JSON-RPC payloads (8 files):
  • "role": "user" → "role": "ROLE_USER" — proto3 JSON serialization
    is strict about enum values. Sites: a2a_client, a2a_cli, main
    (initial+idle prompts), heartbeat, builtin_tools/a2a_tools,
    builtin_tools/delegation. Wire JSON keys stay camelCase
    (proto3 default), only the role enum value changed.
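
The role rename above can be sketched as a small mapping applied when building an outbound payload. The helper names and the exact payload shape are assumptions for illustration; only the enum strings and the camelCase key convention come from this commit message.

```python
# Hypothetical helper: map pre-1.x lowercase roles to the proto enum names.
LEGACY_TO_V1_ROLE = {"user": "ROLE_USER", "agent": "ROLE_AGENT"}


def to_v1_role(role):
    """Return the 1.x enum string, leaving already-migrated values alone."""
    return LEGACY_TO_V1_ROLE.get(role, role)


def build_message(text, role="user", message_id="msg-1"):
    """Build an illustrative message dict (shape assumed, not the SDK's)."""
    return {
        "messageId": message_id,        # wire keys stay camelCase
        "role": to_v1_role(role),       # only the enum value changes
        "parts": [{"text": text}],      # flat part, as in Part(text=...)
    }
```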

google-adk/adapter.py:
  • new_agent_text_message → new_text_message (4 sites). This
    adapter's directory has a hyphen, so it can't be imported as a
    Python module — effectively dead code, but the wheel ships the
    file and a future fix should keep it correct against 1.x.

Why one PR instead of seven: every previous a2a-sdk migration find
landed as its own publish → cascade → harness → next-bug cycle.
Today's audit ran every a2a-sdk symbol/type/method in workspace/
against the installed 1.0.2 wheel in a single sweep + tested the
critical paths (Message construction, Part construction, Role enum
parsing) against the actual SDK. Should be the last migration PR.

Verified locally:
  python3 scripts/build_runtime_package.py --version 0.1.99 \
      --out /tmp/build-final
  pip install /tmp/build-final
  python -c "import molecule_runtime.main; \
             from molecule_runtime.a2a_executor import LangGraphA2AExecutor"
  → ✓ all imports clean against a2a-sdk 1.0.2

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 09:42:57 -07:00
Hongming Wang
c80b3ff0eb fix: pass rpc_url='/' to create_jsonrpc_routes (a2a-sdk 1.x requirement)
7th a2a-sdk migration find from the v0 → v1 transition.
create_jsonrpc_routes() now requires rpc_url as a positional arg
(was implicit at root in 0.x). Pass '/' to match
a2a.utils.constants.DEFAULT_RPC_URL — that's also what
workspace-server's a2a_proxy.go forwards to (POSTs to workspace URL
without appending a path).

Symptom before fix: every workspace startup crashed with
  TypeError: create_jsonrpc_routes() missing 1 required positional
  argument: 'rpc_url'

Caught by harness 9 phase 4 (claude-code + langgraph both on
0.1.24). The user's "use langgraph for fast iteration" call cut
the diagnose cycle from 15min to ~30s — without that, this would
have taken another hermes round-trip to surface.

Updated reference_a2a_sdk_v0_to_v1_migration.md memory with this
entry alongside the previous 6 finds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 09:33:23 -07:00
Hongming Wang
d3d57eb3a7
Merge pull request #2183 from Molecule-AI/fix/default-request-handler-agent-card
fix: pass agent_card to DefaultRequestHandler (a2a-sdk 1.x)
2026-04-27 16:06:36 +00:00
Hongming Wang
6859099a08 fix: pass agent_card to DefaultRequestHandler (a2a-sdk 1.x requirement)
a2a-sdk 1.x added agent_card as a required argument to
DefaultRequestHandler.__init__. main.py constructed it with only
agent_executor + task_store, so every workspace startup that reached
the handler init step crashed with:

  TypeError: DefaultRequestHandlerV2.__init__() missing 1 required
  positional argument: 'agent_card'

This is the 6th a2a-sdk migration find from the v0 → v1 transition
(see reference_a2a_sdk_v0_to_v1_migration memory). Pattern is the
same: SDK exposes a new required arg, our call site needs to pass
the existing object we already construct upstream.

Why the import-only smoke gates didn't catch this: it's a call-time
constructor error inside `async def main()`, not a module load
error. The runtime-pin-compat smoke imports main_sync but doesn't
invoke main() against a real config. Worth filing a follow-up to
extend the smoke to a "construct + dispose" cycle.
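
To illustrate the gap, here is a self-contained sketch of why an import-only smoke passes while a construct-level smoke fails. `Handler` and `build_handler` are stand-ins invented for this example, not a2a-sdk or runtime code.

```python
class Handler:
    """Stand-in for a class whose constructor gained a required argument."""

    def __init__(self, agent_executor, task_store, agent_card):
        self.agent_card = agent_card


def build_handler():
    # Stale call site: still passes only the two pre-upgrade arguments.
    return Handler(agent_executor=object(), task_store=object())


def import_only_smoke():
    # Equivalent to importing the module: definitions load, nothing runs.
    return callable(build_handler)


def construct_smoke():
    # Actually invokes the constructor, so the missing argument surfaces.
    try:
        build_handler()
    except TypeError as exc:
        return str(exc)
    return None
```

The import-level check reports success because Python only validates call arguments when the call executes.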

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 08:53:47 -07:00
Hongming Wang
5920fc856d
Merge pull request #2182 from Molecule-AI/ci/agentcard-smoke-followup-2179
fix(workspace): rename supported_protocols → supported_interfaces (CRITICAL — every boot crashes)
2026-04-27 14:58:28 +00:00
Hongming Wang
851fd21fb1 fix(workspace): rename supported_protocols → supported_interfaces (a2a-sdk 1.0)
CRITICAL: every workspace boot since the a2a-sdk 1.0 migration (#1974)
has been crashing at AgentCard construction with:
  ValueError: Protocol message AgentCard has no "supported_protocols" field

The protobuf field is `supported_interfaces` (plural, interfaces — see
a2a-sdk types/a2a_pb2.pyi:189). The 0.3→1.0 migration left the kwarg
as `supported_protocols`, which doesn't exist in the 1.0 schema, so
the constructor raises before any subsequent line of main runs.

Why this hid for so long:
  - publish-runtime.yml's smoke step only IMPORTED molecule_runtime.main;
    importing the module is fine, only CONSTRUCTING the AgentCard fails
  - The user-visible symptom is "Workspace failed: " with empty
    last_sample_error, indistinguishable from generic boot timeouts
  - The state_transition_history=True bug (fixed in #2179) was a
    sibling of this — same migration, same class, just caught first

Fix is symmetric with #2179:
  1. workspace/main.py: rename the kwarg + comment explaining why
  2. .github/workflows/publish-runtime.yml: extend the smoke block to
     instantiate AgentCard with the exact production call shape, so
     the next field-rename of this class fails at publish time
     instead of breaking every workspace startup

Verification:
  - Constructed AgentCard against fresh a2a-sdk 1.0.2 in a clean
    venv with the corrected kwarg → succeeds
  - Constructed it with the original `supported_protocols` kwarg →
    fails immediately with the exact error production sees
  - Smoke test pinned to mirror main.py's exact call shape; main.py
    + smoke must stay in lockstep going forward

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 07:54:23 -07:00
Hongming Wang
2a39061635
Merge pull request #2181 from Molecule-AI/fix/cascade-pypi-wait-and-paths-filter
fix(publish-runtime): wait for PyPI propagation + expand path filter
2026-04-27 14:48:03 +00:00
Hongming Wang
1a703f5687 fix(publish-runtime): wait for PyPI propagation + expand path filter
Two structural fixes for the cascade race conditions that bit us
five times today:

1. **PyPI propagation wait** (cascade job): poll PyPI for the
   just-published version with a 60s budget BEFORE firing
   repository_dispatch. PyPI accepts the upload but takes a few
   seconds to make it available via the package index. Cascade was
   firing too fast — downstream template builds ran `pip install`
   against a stale index, resolved to the previous version, and
   docker layer cache locked that in for subsequent rebuilds.
   Pairs with the build-arg cache invalidation in molecule-ci PR
   (separate change). Wait without invalidation = next build still
   pip-resolves correctly. Invalidation without wait = first cascade
   build may still race PyPI propagation. Together: no race, no
   stale cache.

2. **Path filter expansion**: scripts/build_runtime_package.py is
   the build script and changes to it (e.g. import-rewrite fixes,
   manifest emit, lib/ subpackage move) directly affect what ships
   in the wheel. Was missing from the path filter, so PRs touching
   only scripts/ (like #2174's lib/ fix) didn't auto-publish — the
   operator had to remember a manual dispatch. Add it to the closed
   list of files that trigger auto-publish.
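
The propagation wait in item 1 can be sketched as a bounded polling loop. This is an illustration under assumptions: the function names and the 3-second interval are hypothetical, and the workflow itself may implement the wait differently (for example in shell). The probe uses PyPI's per-version JSON endpoint.

```python
import time
import urllib.request


def version_is_on_pypi(package, version, timeout=5.0):
    """Return True once PyPI serves metadata for this exact version."""
    url = f"https://pypi.org/pypi/{package}/{version}/json"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError (e.g. 404) and socket timeouts are all OSError.
        return False


def wait_for_version(probe, budget_s=60.0, interval_s=3.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll `probe` until it succeeds or the time budget is spent."""
    deadline = clock() + budget_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller would pass something like `lambda: version_is_on_pypi("molecule-ai-workspace-runtime", new_version)` as the probe and only fire the downstream dispatch when the wait returns True.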

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 07:42:37 -07:00
Hongming Wang
2f5ea7a537
Merge pull request #2180 from Molecule-AI/harness/diagnostic-burst-step2-cp-285
test(e2e): diagnostic burst on step-2 provisioning failure (CP #285)
2026-04-27 14:27:15 +00:00
Hongming Wang
3c345f5674 test(e2e): diagnostic burst on step-2 provisioning failure (CP #285)
Closes the molecule-core-side ask of controlplane #285. CP #289 already
landed migration 022 + the handler change exposing `last_error` in
/cp/admin/orgs responses. This makes the canary harness actually USE
that field — pre-fix the harness exited with just "Tenant provisioning
failed for <slug>" and forced operators to scrape CP server logs to
learn WHY.

The diagnostic burst dumps the matched org row from the LIST_JSON
already in scope (no extra HTTP call), pretty-printed and prefixed,
right before `fail`. Mirrors the TLS-readiness burst pattern from
PR #2107 at step 4. Includes a not-found fallback for DB-drift cases.
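
The harness itself is bash; the following Python sketch only illustrates the lookup-and-dump logic described above. The function name, the `[diag] ` prefix, and the assumption that the list response decodes to a list of org dicts are all hypothetical.

```python
import json


def diagnostic_burst(orgs, slug, prefix="[diag] "):
    """Return prefixed, pretty-printed lines for the org row matching `slug`."""
    row = next((o for o in orgs if o.get("slug") == slug), None)
    if row is None:
        # Not-found fallback for the DB-drift case.
        return [f"{prefix}no org row found for slug {slug!r}"]
    pretty = json.dumps(row, indent=2, sort_keys=True)
    return [prefix + line for line in pretty.splitlines()]
```

Reusing the already-fetched list keeps the failure path free of extra HTTP calls, which matters when the control plane is the thing misbehaving.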

No redaction needed — adminOrgSummary is already ops-safe (id, slug,
name, plan, member_count, instance_status, last_error, timestamps;
no tokens, no encrypted fields).

Verification: smoke-tested both branches (org found with last_error +
slug-not-found fallback) with synthetic JSON; bash syntax OK; the only
shellcheck warning is pre-existing on line 93.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 07:22:12 -07:00
Hongming Wang
11e149f05c
Merge pull request #2179 from Molecule-AI/fix/agent-capabilities-state-transition-history
fix: drop state_transition_history (removed in a2a-sdk 1.x)
2026-04-27 14:22:09 +00:00
Hongming Wang
12d446bc8e docs: explain why state_transition_history is gone (research-backed)
Adds a comment block citing a2a-sdk's own
a2a/compat/v0_3/conversions.py, which says verbatim:

  state_transition_history=None,  # No longer supported in v1.0

So a future reader who notices the missing kwarg won't try to add it
back. The capability is now universal: every v1.x Task carries a
history list and tasks/get supports historyLength via the
apply_history_length helper. No flag because nothing's optional.

Confirmed by reading the SDK source directly:
- a2a/types.py AgentCapabilities exposes only: streaming,
  push_notifications, extensions, extended_agent_card.
- a2a/compat/v0_3/conversions.py explicitly maps None when
  down-converting v1 → v0.3 (deliberate removal, not rename).
- a2a/server/request_handlers/default_request_handler_v2.py uses
  apply_history_length(task, params) — agent doesn't opt in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 07:20:05 -07:00
Hongming Wang
f531fe1367 fix: drop state_transition_history field — removed in a2a-sdk 1.x
a2a-sdk 1.x's AgentCapabilities only exposes 4 fields:
streaming, push_notifications, extensions, extended_agent_card.
The state_transition_history field was removed in the v1 protobuf
schema. main.py still passed it as a kwarg, so every workspace
that reached the AgentCard construction step (line 188) crashed:

  ValueError: Protocol message AgentCapabilities has no
  "state_transition_history" field

Symptom: every claude-code + hermes workspace stuck in `provisioning`
forever — caught when the user provisioned a Design Director crew
manually via the canvas while harness 5 was running.

Why every prior smoke gate missed it:
- runtime-pin-compat.yml smokes `from molecule_runtime.main import
  main_sync` — only imports the module. AgentCapabilities() runs
  inside `async def main()`, not at module load.
- Template image boot smoke does `import every /app/*.py` — same
  story. main.py imports fine; the field error only fires at call.

The fix is one line — drop the kwarg. Fields we actually need
(streaming + push_notifications) are still passed.

Follow-up worth filing: smoke step that instantiates Adapter() +
calls a no-op setup() against a stub config. That would have
caught this before publish.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 07:16:16 -07:00
Hongming Wang
3d617ec421
Merge pull request #2178 from Molecule-AI/deps/go-redis-9.7.3-ghsa-92cp-5422-2mw7
deps(redis): bump go-redis/v9 v9.7.0 → v9.7.3 (GHSA-92cp-5422-2mw7, low)
2026-04-27 14:00:37 +00:00
Hongming Wang
7acdd21c88
Merge pull request #2177 from Molecule-AI/docs/pr-merge-safety-guards
docs: document the two PR auto-merge safety guards
2026-04-27 13:55:26 +00:00
Hongming Wang
fa5e0f5e4c deps(redis): bump go-redis/v9 v9.7.0 → v9.7.3 (GHSA-92cp-5422-2mw7)
Closes the LOW-severity dependabot alert on workspace-server's go-redis
pin. Upstream advisory GHSA-92cp-5422-2mw7: "go-redis allows potential
out-of-order responses when CLIENT SETINFO times out" — fixed in 9.7.3.

Patch bump within the v9.7 line; semver guarantees no API change.
Full workspace-server test suite passes (18/18 packages clean).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:54:13 -07:00
Hongming Wang
6589929f87 docs: document the two PR auto-merge safety guards
Adds a section to CONTRIBUTING.md → "Pull Requests" explaining the two
system-level guards that protect against the "I enabled auto-merge then
pushed more commits" race:

1. Repo-wide setting: "Automatically delete head branches" (catches
   pushes to a merged-and-deleted branch — the post-merge orphan case).
2. CI workflow `pr-guards` calling molecule-ci's
   disable-auto-merge-on-push (catches pushes during queue
   processing — disables auto-merge, posts a comment, requires
   explicit re-engage).

Why doc-not-just-memory: my agent-side memory is local. Other
contributors on other machines need this in the repo where they
read it. Cites the 2026-04-27 PR #2174 incident with the
specific commit SHAs that got orphaned.

Companion: molecule-ci README updated separately to document the
reusable workflow under "What each workflow validates" so devs
who land in the molecule-ci repo first can find the contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:45:55 -07:00
Hongming Wang
b96f99da0f
Merge pull request #2175 from Molecule-AI/deps/docker-v28.5.2-ghsa-x4rx-4gw3-53p4
deps(docker): bump docker/docker v28.2.2 → v28.5.2 (GHSA-x4rx-4gw3-53p4, medium)
2026-04-27 13:42:29 +00:00
Hongming Wang
182de6f2b3
Merge pull request #2176 from Molecule-AI/feat/pr-guards-caller
ci: add pr-guards caller (disable auto-merge on push)
2026-04-27 13:42:17 +00:00
Hongming Wang
82b366fce5 ci: add pr-guards caller that disables auto-merge on push
Thin caller for molecule-ci's reusable disable-auto-merge-on-push
workflow. Forces operator re-engagement when a commit is pushed to
an open PR with auto-merge already enabled.

Pairs with the org-wide "Automatically delete head branches" repo
setting (also enabled today). Defense in depth:

1. Repo setting blocks pushes to a merged-and-deleted branch
   (post-merge orphan case — what bit #2174 today: my second
   commit landed on an already-merged-and-deleted branch).
2. This workflow catches in-queue races (push lands while the
   merge queue is processing) by disabling auto-merge so the
   operator must explicitly re-engage.

Together they cover the full lifecycle of "auto-merge enabled →
new commits arrive" without relying on operator discipline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:39:31 -07:00
Hongming Wang
394dda2a4a deps(docker): bump docker/docker v28.2.2 → v28.5.2 (GHSA-x4rx-4gw3-53p4)
Closes the medium-severity dependabot alert #7 on workspace-server's
docker pin: "Moby firewalld reload makes published container ports
accessible from remote hosts" — fixed in v28.3.3, pulling v28.5.2
(latest in the v28 line).

Patch+minor bump within the v28 train; no client-API breaks
(workspace-server only uses docker.Client for container exec /
inspect, all stable since v20+).

Verification: full workspace-server test suite passes (18/18 packages
clean). Build clean.

Out of scope:
  - Alerts #10 and #11 (the AuthZ bypass + plugin-priv off-by-one)
    require v29.3.1, which is not yet published to the Go module
    proxy (latest published is v28.5.2). They'll close in a follow-up
    PR once v29 lands as a Go module.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:26:53 -07:00
Hongming Wang
a354ae2feb
Merge pull request #2174 from Molecule-AI/fix/lib-subpackage-and-drift-gate
fix(build): ship lib/ subpackage + extend drift gate to SUBPACKAGES
2026-04-27 13:07:00 +00:00
Hongming Wang
6e732ab714 fix(build): ship lib/ subpackage + extend drift gate to SUBPACKAGES
Two compounding bugs that bit hermes (and any other workspace that
reaches main.py:142):

1. workspace/lib/ was in EXCLUDE_DIRS so the published wheel didn't
   contain the directory at all. main.py imports `from lib.pre_stop
   import read_snapshot` (and `build_snapshot`, `write_snapshot`) so
   every workspace startup that reaches the snapshot path crashed
   with `ModuleNotFoundError: No module named 'lib'`.

2. Even if lib/ had shipped, `lib` wasn't in SUBPACKAGES so the
   import-rewriter would have left the bare `from lib.pre_stop`
   unqualified — it would still fail because the package would only
   be reachable as `molecule_runtime.lib`.

Fix: move `lib` from EXCLUDE_DIRS to SUBPACKAGES (one entry each).

Drift gate extension: the existing gate I added in #2163 only
asserted TOP_LEVEL_MODULES against workspace/*.py. This change adds
the symmetric assertion for SUBPACKAGES against workspace/<dir>/
(filtered by EXCLUDE_DIRS + presence of __init__.py). Catches all three:
- Subpackage added to workspace/ but missed in SUBPACKAGES
- Subpackage missing from workspace/ but lingering in SUBPACKAGES
- Subpackage wrongly in EXCLUDE_DIRS while also referenced by
  rewritten imports (the lib case)
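
A simplified sketch of the symmetric assertion, assuming SUBPACKAGES and EXCLUDE_DIRS are plain collections of directory names. The function names and report keys are hypothetical, and the third check here only flags names listed in both collections, which is a narrower test than inspecting rewritten imports.

```python
from pathlib import Path


def discover_subpackages(workspace_dir, exclude_dirs):
    """Directories under workspace/ that are importable and not excluded."""
    root = Path(workspace_dir)
    return {
        p.name
        for p in root.iterdir()
        if p.is_dir()
        and p.name not in exclude_dirs
        and (p / "__init__.py").is_file()
    }


def subpackage_drift(workspace_dir, subpackages, exclude_dirs):
    """Compare the declared SUBPACKAGES list against what is on disk."""
    on_disk = discover_subpackages(workspace_dir, exclude_dirs)
    declared = set(subpackages)
    return {
        "missing_from_SUBPACKAGES": sorted(on_disk - declared),
        "stale_in_SUBPACKAGES": sorted(declared - on_disk),
        "declared_but_excluded": sorted(declared & set(exclude_dirs)),
    }
```

A gate built this way would fail the build whenever any of the three lists is non-empty.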

Tested locally: build of 0.1.99 now ships lib/ and main.py contains
`from molecule_runtime.lib.pre_stop import ...` correctly rewritten.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:03:46 -07:00
Hongming Wang
1100c50da8
Merge pull request #2172 from Molecule-AI/feat/e2e-cover-all-8-runtimes
feat(e2e): extend priority-runtimes test to cover all 8 templates
2026-04-27 13:00:43 +00:00
Hongming Wang
c7478af99f feat(e2e): extend priority-runtimes test to cover all 8 templates
Tonight's wire-real E2E sweep exposed 12+ root causes across the post-
#87 template extraction. Most would have been caught by an actual
provision-and-online test running on each template — but the test only
covered claude-code + hermes. Extending it to cover all 8 ensures any
future regression in any template fails the test, not production.

What's added:
- run_openai_runtime(runtime, label): generic provisioner for the 5
  OpenAI-backed templates (langgraph, crewai, autogen, deepagents,
  openclaw). Same shape as run_hermes minus the HERMES_* config block
  that hermes-agent needs.
- run_gemini_cli: separate function — gemini-cli wants a Google AI
  key (E2E_GEMINI_API_KEY), not OpenAI.
- Each new runtime registered in the dispatch loop. New `all` keyword
  for E2E_RUNTIMES runs every covered runtime.

claude-code + hermes keep their dedicated functions; both have unique
provisioning quirks (claude-code OAuth + claude-code-specific volume
mounts; hermes 15-min cold-boot) that don't generalize cleanly.

Skip-if-no-key pattern matches the existing one — partially-keyed CI
gets clean skips, not false-fails.

Usage:
  E2E_OPENAI_API_KEY=... E2E_RUNTIMES=langgraph     ./test_priority_runtimes_e2e.sh
  E2E_OPENAI_API_KEY=... E2E_RUNTIMES=all           ./test_priority_runtimes_e2e.sh

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 05:57:59 -07:00
Hongming Wang
1a2ddb4539
Merge pull request #2171 from Molecule-AI/deps/jwt-go-v5.2.2-cve-2025-30204
deps(jwt): bump golang-jwt/jwt/v5 v5.2.1 → v5.2.2 (CVE-2025-30204, HIGH)
2026-04-27 12:44:54 +00:00
Hongming Wang
e63c3b2044
Merge pull request #2170 from Molecule-AI/fix/a2a-executor-sdk-migration
fix(a2a_executor): migrate to a2a-sdk 1.x API
2026-04-27 12:44:42 +00:00
Hongming Wang
041d255091
Merge pull request #2168 from Molecule-AI/ops/audit-railway-sha-pins
ops: add Railway SHA-pin drift audit script + regression test (#2001)
2026-04-27 12:44:31 +00:00
Hongming Wang
5b05d663ee test: update a2a.helpers mock to export new_text_message
The conftest mock only exposed `new_agent_text_message`, the pre-v1
name. After fixing a2a_executor.py to use the v1 name
`new_text_message`, the mock didn't satisfy the import → CI red.

Mock both names (aliased to the same lambda) so any in-flight test
that still references the old name keeps working until the next
sweep removes those references.
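
A sketch of the aliasing approach, assuming the mock is installed through `sys.modules`. The function name `install_a2a_helpers_mock` and the dict return value are stand-ins; the real conftest and the SDK's message type will differ.

```python
import sys
import types


def install_a2a_helpers_mock():
    """Register a fake `a2a.helpers` exposing both the old and new names."""

    def _text_message(text, **kwargs):
        # Stand-in return value; the real helper returns an SDK message.
        return {"text": text, **kwargs}

    helpers = types.ModuleType("a2a.helpers")
    helpers.new_text_message = _text_message
    helpers.new_agent_text_message = _text_message  # legacy alias

    # Make sure a parent `a2a` module exists so the dotted import resolves.
    pkg = sys.modules.setdefault("a2a", types.ModuleType("a2a"))
    pkg.helpers = helpers
    sys.modules["a2a.helpers"] = helpers
    return helpers
```

Pointing both names at the same callable means a test written against either name exercises identical behaviour.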

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 05:34:28 -07:00
Hongming Wang
86bdfa3b47 deps(jwt): bump golang-jwt/jwt/v5 v5.2.1 → v5.2.2 (CVE-2025-30204)
Closes the HIGH-severity dependabot alert on workspace-server's jwt-go
pin. Upstream advisory GHSA-mh63-6h87-95cp / CVE-2025-30204:
"jwt-go allows excessive memory allocation during header parsing" —
fixed in v5.2.2.

Patch bump within the v5.x line; semver guarantees no API change. Full
workspace-server test suite passes (`go test ./...` clean across all
18 packages).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 05:31:58 -07:00
Hongming Wang
722e1fd175 fix(a2a_executor): migrate to a2a-sdk 1.x API — new_agent_text_message → new_text_message
a2a-sdk v1 renamed `new_agent_text_message` → `new_text_message`
(role=Role.agent is now the default). Same fix landed in the hermes
template earlier today; this is the runtime-side equivalent.

NOT dead code: a2a_executor.py is the LangGraph A2A executor, used by
the langgraph + deepagents templates. Both templates currently import
it via bare `from a2a_executor import LangGraphA2AExecutor` — which is
a separate bug in those templates, filed/fixed separately.

Symptom in a2a_executor.py form: any langgraph or deepagents workspace
that calls create_executor crashes with `ImportError: cannot import
name 'new_agent_text_message' from 'a2a.helpers'`. Doesn't surface for
claude-code or hermes (their templates use their own executors and
don't load a2a_executor).

Five call sites updated, one import line, one comment. Test suite
already passes against the new symbol — `python -c "from
molecule_runtime.a2a_executor import LangGraphA2AExecutor"` resolves
cleanly after this change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 05:29:59 -07:00
Hongming Wang
026f5e51d9 ops: add Railway SHA-pin drift audit script + regression test (#2001)
#2000 fixed one symptom — TENANT_IMAGE pinned to `staging-a14cf86`
(10 days stale) silently no-op'd four upstream fixes on 2026-04-24.
This adds the audit pattern as a re-runnable script so the broader
class is observable on demand without new CI infrastructure.

Audit results today (2026-04-27):
  controlplane / production: 54 vars audited, 0 drift-prone pins
  controlplane / staging:    52 vars audited, 0 drift-prone pins

So the immediate audit deliverable is clean — TENANT_IMAGE is the only
known violation and #2000 already fixed it. The script makes the
ongoing audit a 5-second command instead of a manual one.

Detection regex catches:
  * branch-SHA suffixes (`staging|main|prod|production-<6+ hex>`)
    — the exact 2026-04-24 incident shape
  * version pins after `:` or `=`  (`:v1.2.3`, `=v0.1.16`)
    — same drift class, just rendered differently

Anchoring on `:` or `=` keeps prose like "version 1.2.3 of the api"
out of the false-positive set. UUIDs, ARNs, AMI IDs, secrets, and
floating tags (`:staging-latest`, `:main`) pass through untouched.
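
The two detection shapes can be approximated as follows. These patterns are a reconstruction from the description above, not the script's actual regex; in particular, requiring a leading `v` on version pins is an assumption based on the two examples given.

```python
import re

# Branch name followed by a 6+ character hex SHA, e.g. staging-a14cf86.
BRANCH_SHA = re.compile(r"\b(?:staging|main|prod|production)-[0-9a-f]{6,}\b")

# Version pin anchored on ':' or '=', e.g. :v1.2.3 or =v0.1.16.
VERSION_PIN = re.compile(r"[:=]v\d+\.\d+\.\d+")


def is_drift_prone(value):
    """Return True if the variable value looks like a hard pin."""
    return bool(BRANCH_SHA.search(value) or VERSION_PIN.search(value))
```

Using `search` rather than `match` lets the patterns catch pins that appear in the middle of a longer value, such as a full image reference.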

Regression test (tests/ops/test_audit_railway_sha_pins.sh) pins 20
representative cases — 9 should-flag (covering all four branch
prefixes + semver variants + middle-of-value matches) and 11
should-pass (the false-positive guards).  Same regex inlined in both
files so a future tweak that weakens detection fails the test in
lockstep with weakening the audit.

Both files shellcheck clean.

CI gate (acceptance criterion's "regression: add a CI check") is
deliberately scoped out — querying Railway from CI requires plumbing
RAILWAY_TOKEN as a repo secret, which is multi-step setup. The
re-runnable script + test cover the same surface today; the CI
workflow is a small follow-up once the token is provisioned.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 05:01:23 -07:00
Hongming Wang
7cf77f274a
Merge pull request #2166 from Molecule-AI/test/unblock-resolveandstage-test
test(plugins): unblock TestResolveAndStage_NoInternalErrorsInHTTPErr (#1814)
2026-04-27 11:36:15 +00:00
Hongming Wang
dc2f6bd378
Merge pull request #2167 from Molecule-AI/fix/saas-federation-tutorial-409
docs(saas-federation): fix workspace-limit response code (409, not 402) (#1754)
2026-04-27 11:36:02 +00:00
Hongming Wang
3679a6eff6 docs(saas-federation): fix workspace-limit response code (409, not 402) (#1754)
Quota gates are resource-state conflicts, not payment failures —
RFC 9110 reserves 402 for billing/payment failures specifically. The
canonical Molecule-AI/docs PR #82 already shipped the corrected text;
this brings the molecule-core copy of the tutorial in line.

The inline parenthetical "(not 402 Payment Required — quota gates are
resource-state conflicts, not payment failures, per RFC 9110)" doubles
as a regression anchor: a future edit that flips 409 back to 402 would
have to also reword that explanation, making the change a deliberate
two-step act rather than a casual oversight.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 04:30:46 -07:00
Hongming Wang
a0154ea0b4 test(plugins): unblock TestResolveAndStage_NoInternalErrorsInHTTPErr (#1814)
Closes the second of two skipped tests in workspace_provision_test.go
that were blocked on interface refactors. The Broadcaster + CP
provisioner halves landed in earlier #1814 cycles; this is the
plugin-source-registry half.

Refactor:
  - Add handlers.pluginSources interface with the 3 methods handler
    code actually calls (Register, Resolve, Schemes)
  - Compile-time assertion `var _ pluginSources = (*plugins.Registry)(nil)`
    catches future method-signature drift at build time
  - PluginsHandler.sources narrowed from *plugins.Registry to the
    interface; production wiring (NewPluginsHandler, WithSourceResolver)
    still passes *plugins.Registry — satisfies the interface

Production fix (#1206 leak):
  - resolveAndStage's Fetch-failure path was interpolating err.Error()
    into the HTTP response body via `failed to fetch plugin from %s: %v`.
    Resolver errors routinely contain rate-limit text, github request
    IDs, raw HTTP body fragments, and (for local resolvers) file system
    paths — none has any business landing in a user's browser.
  - Body now carries just `failed to fetch plugin from <scheme>`; the
    status code already differentiates the failure shape (404 not
    found, 504 timeout, 502 generic). Full err detail stays in the
    server-side log line one statement above.
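
The log-in-full, respond-with-a-canned-body pattern can be sketched as
follows. This is a Python illustration, not the Go handler:
`fetch_failure_response` and the exception-to-status mapping are
hypothetical, chosen only to mirror the 404/504/502 split named above.

```python
import logging

logger = logging.getLogger("plugins")


def fetch_failure_response(scheme: str, err: Exception) -> tuple[int, dict[str, str]]:
    """Build the client-facing reply for a failed plugin fetch."""
    # Full error detail goes to the server-side log only.
    logger.error("failed to fetch plugin from %s: %s", scheme, err)

    # Assumed mapping from exception type to status code; the real
    # handler classifies resolver errors in its own way.
    if isinstance(err, FileNotFoundError):
        status = 404
    elif isinstance(err, TimeoutError):
        status = 504
    else:
        status = 502

    # The body names the scheme and carries nothing taken from err.
    return status, {"error": f"failed to fetch plugin from {scheme}"}
```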

Test:
  - 6 sub-tests covering every error path inside resolveAndStage:
    empty source, invalid format, unknown scheme, local
    path-traversal, unpinned github (PLUGIN_ALLOW_UNPINNED unset),
    Fetch failure with a leaky synthetic error
  - The Fetch-failure case plants 5 realistic leak markers in the
    resolver's error string (rate limit text, x-github-request-id,
    auth_token, ghp_-prefixed token, /etc/passwd path); the assertion
    fails if ANY appears in the response body
  - Table-driven so a future error path added to resolveAndStage gets
    one new row, not a copy-paste of the assertion logic

Verification:
  - 6/6 sub-tests pass
  - Full workspace-server test suite passes (interface refactor is
    non-breaking; production caller paths unchanged)
  - go build ./... clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 04:00:39 -07:00
104650941a
Merge pull request #2165 from Molecule-AI/fix/main-sync-entry-point
fix: restore main_sync entry point in workspace/main.py
2026-04-27 10:54:44 +00:00
4c839cb306
Merge pull request #2164 from Molecule-AI/test/unblock-cp-provision-broadcast-test
test(provisioner): unblock TestProvisionWorkspaceCP_NoInternalErrorsInBroadcast (#1814)
2026-04-27 10:54:44 +00:00
Hongming Wang
3df5867b56 fix: restore main_sync entry point in workspace/main.py
The wheel's pyproject.toml has declared
`molecule-runtime = "molecule_runtime.main:main_sync"` since the
publish pipeline was created on 2026-04-26, but the function
itself was never present in workspace/main.py — it lived in the
pre-monorepo molecule-ai-workspace-runtime repo and was lost
during the consolidation that made workspace/ the source of truth.

The 0.1.15 wheel still had main_sync from a leftover snapshot,
so the regression went unnoticed until 0.1.16 (the first wheel
built from the new source-of-truth) shipped. Symptom: every
workspace container restart loops with

  ImportError: cannot import name 'main_sync' from 'molecule_runtime.main'

— the molecule-runtime CLI script's first line tries to import
the missing symbol. Workspaces stay in `provisioning` until the
10-min sweep marks them failed.

Caught by .github/workflows/runtime-pin-compat.yml, which already
imports the symbol by name as its smoke test. (That check kept
failing red on every recent merge_group run; this PR fixes the
underlying symbol-not-found instead of the smoke step.)

Also strengthens publish-runtime.yml's wheel smoke from
`import molecule_runtime.main` (loads the module — passes even
when entry-point target is missing) to `from molecule_runtime.main
import main_sync` (the actual contract the CLI script needs).
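
The difference between the two smoke checks can be reproduced with a
stand-in module. The sketch below is an assumption-laden illustration:
`demo_runtime_main`, `weak_smoke`, and `strict_smoke` are hypothetical
names, and the stand-in module deliberately lacks the entry-point
function.

```python
import importlib
import sys
import types

# Stand-in for a module that loads fine but lacks the entry-point symbol.
sys.modules["demo_runtime_main"] = types.ModuleType("demo_runtime_main")


def weak_smoke(module_name: str) -> bool:
    """Passes as long as the module itself can be imported."""
    importlib.import_module(module_name)
    return True


def strict_smoke(module_name: str, attr: str) -> bool:
    """Passes only when the named entry point exists and is callable."""
    module = importlib.import_module(module_name)
    return callable(getattr(module, attr, None))
```

With the stand-in above, `weak_smoke("demo_runtime_main")` succeeds while
`strict_smoke("demo_runtime_main", "main_sync")` does not, which is the
gap the stronger check closes.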

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 03:35:49 -07:00
Hongming Wang
e15d1182cd test(provisioner): unblock TestProvisionWorkspaceCP_NoInternalErrorsInBroadcast (#1814)
The skipped test exists to assert that provisionWorkspaceCP never
leaks err.Error() in WORKSPACE_PROVISION_FAILED broadcasts (regression
guard for #1206). Writing the test body required substituting a
failing CPProvisioner — but the handler's `cpProv` field was the
concrete *CPProvisioner type, so a mock had nowhere to plug in.

Refactor:
  - Add provisioner.CPProvisionerAPI interface with the 3 methods
    handlers actually call (Start, Stop, GetConsoleOutput)
  - Compile-time assertion `var _ CPProvisionerAPI = (*CPProvisioner)(nil)`
    catches future method-signature drift at build time
  - WorkspaceHandler.cpProv narrowed to the interface; SetCPProvisioner
    accepts the interface (production caller passes *CPProvisioner
    from NewCPProvisioner unchanged)

Test:
  - stubFailingCPProv whose Start returns a deliberately leaky error
    (machine_type=t3.large, ami=…, vpc=…, raw HTTP body fragment)
  - Drive provisionWorkspaceCP via the cpProv.Start failure path
  - Assert broadcast["error"] == "provisioning failed" (canned)
  - Assert no leak markers (machine type, AMI, VPC, subnet, HTTP
    body, raw error head) in any broadcast string value
  - Stop/GetConsoleOutput on the stub panic — flags a future
    regression that reaches into them on this path
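
The shape of that test, sketched in Python rather than Go. Everything
here is hypothetical (the stub class, `provision_workspace`, the marker
list); it only illustrates a stub whose start fails with a leaky error,
whose other methods fail loudly if reached, and a canned broadcast that
is checked for leak markers.

```python
import logging

logger = logging.getLogger("provisioner")

# Assumed markers, matching the synthetic error raised by the stub below.
LEAK_MARKERS = ("t3.large", "machine_type", "HTTP 500", "<html>")


class StubFailingProvisioner:
    """Stand-in provisioner: start always fails with a deliberately leaky error."""

    def start(self, workspace_id: str) -> None:
        raise RuntimeError("machine_type=t3.large launch failed: HTTP 500 <html>")

    def stop(self, workspace_id: str) -> None:
        raise AssertionError("stop must not be reached on the start-failure path")

    def get_console_output(self, workspace_id: str) -> str:
        raise AssertionError("get_console_output must not be reached on this path")


def provision_workspace(prov, workspace_id: str, broadcast) -> bool:
    """Start the workspace; on failure, log the detail and broadcast a canned error."""
    try:
        prov.start(workspace_id)
    except Exception as err:
        logger.error("provision failed for %s: %s", workspace_id, err)
        broadcast({"event": "WORKSPACE_PROVISION_FAILED", "error": "provisioning failed"})
        return False
    return True


def leaked_markers(payload: dict) -> list[str]:
    """List every marker that appears in any string value of the payload."""
    values = [v for v in payload.values() if isinstance(v, str)]
    return [m for m in LEAK_MARKERS if any(m in v for v in values)]
```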

Verification:
  - Full workspace-server test suite passes (interface refactor
    is non-breaking; production caller path unchanged)
  - go build ./... clean
  - The other skipped test in this file (TestResolveAndStage_…)
    is a separate plugins.Registry refactor and remains skipped

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 03:28:25 -07:00
Hongming Wang
5022a740e1
Merge pull request #2163 from Molecule-AI/fix/build-script-drift-gate-and-main-smoke
fix(release): drift-gate TOP_LEVEL_MODULES + smoke-import main (post-0.1.16 incident)
2026-04-27 10:22:06 +00:00
Hongming Wang
c68dc1877f fix(release): drift-gate TOP_LEVEL_MODULES + smoke-import main in publish
Two compounding bugs surfaced when 0.1.16 hit production today:

1. scripts/build_runtime_package.py had a hand-curated TOP_LEVEL_MODULES
   set listing every workspace/*.py that should get its bare imports
   rewritten to `molecule_runtime.X`. The set silently went stale:
   - Missing: transcript_auth (added since #87 phase 1c), runtime_wedge,
     watcher → unrewritten imports shipped, every workspace startup
     died with ModuleNotFoundError.
   - Stale: claude_sdk_executor, cli_executor (both removed in #87),
     hermes_executor (never existed) → harmless but misleading.

2. publish-runtime.yml's wheel-smoke step asserted on stable invariants
   (BaseAdapter, AdapterConfig, a2a_client error sentinel) but never
   imported main. So even though main.py held the broken bare
   `from transcript_auth import ...`, the smoke check passed.

Fixes:

- Build script now derives the on-disk module set from workspace/*.py
  and asserts it matches TOP_LEVEL_MODULES exactly. Drift in either
  direction fails the build with a specific diff message instead of
  shipping a broken wheel. Closed-list typo guard preserved (we still
  edit the set explicitly when a module is added/removed) — the gate
  just makes drift impossible to ignore.

- TOP_LEVEL_MODULES updated to current reality: drop the 3 stale,
  add the 3 missing.

- publish-runtime.yml wheel-smoke now runs `import molecule_runtime.main`
  before the invariant asserts. main is the entry point and
  transitively imports every module, so any bare-import bug surfaces
  as ModuleNotFoundError before PyPI accepts the upload.
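
The drift gate from the first fix can be sketched roughly as below. The
helper names and the error wording are assumptions, and the real build
script may filter files differently; the point is the two-way set
comparison between what is on disk and what is declared.

```python
from pathlib import Path


def modules_on_disk(workspace_dir: Path) -> set[str]:
    """Collect bare module names from the top-level *.py files."""
    return {path.stem for path in workspace_dir.glob("*.py")}


def module_drift(on_disk: set[str], declared: set[str]) -> tuple[list[str], list[str]]:
    """Return (missing, stale): on disk but undeclared, declared but not on disk."""
    return sorted(on_disk - declared), sorted(declared - on_disk)


def assert_no_drift(workspace_dir: Path, declared: set[str]) -> None:
    """Fail the build with a specific diff when the two sets disagree."""
    missing, stale = module_drift(modules_on_disk(workspace_dir), declared)
    if missing or stale:
        raise SystemExit(f"TOP_LEVEL_MODULES drift: missing={missing} stale={stale}")
```

Keeping the declared set hand-edited and gating on equality, rather than
deriving the set automatically, preserves the closed-list typo guard
while still surfacing drift in either direction.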

Tested locally: `python3 scripts/build_runtime_package.py
--version 0.1.99 --out /tmp/build-test` succeeds, and
/tmp/build-test/molecule_runtime/main.py contains the rewritten
`from molecule_runtime.transcript_auth import ...`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 03:19:17 -07:00
Hongming Wang
6f0774c708
Merge pull request #2162 from Molecule-AI/fix/e2e-sanity-rc-normalization
fix(e2e-sanity): normalize unexpected curl exit codes in cleanup trap (#2159)
2026-04-27 10:05:14 +00:00
Hongming Wang
99fb61bb8c fix(e2e-sanity): normalize unexpected curl exit codes in cleanup trap (#2159)
When E2E_INTENTIONAL_FAILURE=1 poisons the tenant token, step 5/11's
`tenant_call POST /workspaces` curl exits 22 (HTTP error under
--fail-with-body). `set -e` propagates rc=22 directly, but the
script's documented contract emits only {0,1,2,3,4}, and the sanity
workflow's case statement only matches those. rc=22 falls through
to "Unexpected rc — investigate harness" and opens a false-positive
priority-high "safety net broken" issue (#2159, weekly run on
2026-04-27).

The trap now captures $? at entry (must be the first statement
before any command clobbers it) and at the end normalizes any
non-contract code to 1 (generic failure). Leak detection continues
to exit 4 directly, so its semantics are preserved.
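
The normalization rule itself is small. The harness is a shell script,
so the Python below is only an illustration of the mapping, with
`normalize_rc` as a hypothetical name:

```python
# Exit codes the harness contract documents.
CONTRACT_CODES = frozenset({0, 1, 2, 3, 4})


def normalize_rc(rc: int) -> int:
    """Keep contracted exit codes as-is; map anything else to 1 (generic failure)."""
    return rc if rc in CONTRACT_CODES else 1
```

Under this rule curl's 22 becomes 1, while the leak-detection code 4
passes through unchanged.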

Adds tests/e2e/test_harness_rc_normalization.sh — a self-contained
regression test that builds a stub harness with the same trap
pattern, triggers controlled exit codes, and asserts the
normalization. Covers the 5 contracted codes + curl-22 (the bug) +
3 representative network-failure codes + sigsegv-139.

Verification:
  - 10/10 regression tests pass
  - shellcheck clean on both modified files
  - production teardown path unchanged for legitimate {1,2,3,4}
    failures and the leak-detection exit 4

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 02:55:44 -07:00
c3d29941b8
Merge pull request #2161 from Molecule-AI/feat/auto-publish-runtime-on-staging
feat(publish-runtime): auto-publish to PyPI on staging pushes touching workspace/
2026-04-27 09:20:12 +00:00