molecule-core

Author	SHA1	Message	Date
Hongming Wang	b96f99da0f	Merge pull request #2175 from Molecule-AI/deps/docker-v28.5.2-ghsa-x4rx-4gw3-53p4 deps(docker): bump docker/docker v28.2.2 → v28.5.2 (GHSA-x4rx-4gw3-53p4, medium)	2026-04-27 13:42:29 +00:00
Hongming Wang	182de6f2b3	Merge pull request #2176 from Molecule-AI/feat/pr-guards-caller ci: add pr-guards caller (disable auto-merge on push)	2026-04-27 13:42:17 +00:00
Hongming Wang	82b366fce5	ci: add pr-guards caller that disables auto-merge on push Thin caller for molecule-ci's reusable disable-auto-merge-on-push workflow. Forces operator re-engagement when a commit is pushed to an open PR with auto-merge already enabled. Pairs with the org-wide "Automatically delete head branches" repo setting (also enabled today). Defense in depth: 1. Repo setting blocks pushes to a merged-and-deleted branch (post-merge orphan case — what bit #2174 today: my second commit landed on an already-merged-and-deleted branch). 2. This workflow catches in-queue races (push lands while the merge queue is processing) by disabling auto-merge so the operator must explicitly re-engage. Together they cover the full lifecycle of "auto-merge enabled → new commits arrive" without relying on operator discipline. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 06:39:31 -07:00
Hongming Wang	394dda2a4a	deps(docker): bump docker/docker v28.2.2 → v28.5.2 (GHSA-x4rx-4gw3-53p4) Closes the medium-severity dependabot alert #7 on workspace-server's docker pin: "Moby firewalld reload makes published container ports accessible from remote hosts" — fixed in v28.3.3, pulling v28.5.2 (latest in the v28 line). Patch+minor bump within the v28 train; no client-API breaks (workspace-server only uses docker.Client for container exec / inspect, all stable since v20+). Verification: full workspace-server test suite passes (18/18 packages clean). Build clean. Out of scope: - Alerts #10 and #11 (the AuthZ bypass + plugin-priv off-by-one) require v29.3.1, which is not yet published to the Go module proxy (latest published is v28.5.2). They'll close in a follow-up PR once v29 lands as a Go module. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 06:26:53 -07:00
Hongming Wang	a354ae2feb	Merge pull request #2174 from Molecule-AI/fix/lib-subpackage-and-drift-gate fix(build): ship lib/ subpackage + extend drift gate to SUBPACKAGES	2026-04-27 13:07:00 +00:00
Hongming Wang	6e732ab714	fix(build): ship lib/ subpackage + extend drift gate to SUBPACKAGES Two compounding bugs that bit hermes (and any other workspace that reaches main.py:142): 1. workspace/lib/ was in EXCLUDE_DIRS so the published wheel didn't contain the directory at all. main.py imports `from lib.pre_stop import read_snapshot` (and `build_snapshot`, `write_snapshot`) so every workspace startup that reaches the snapshot path crashed with `ModuleNotFoundError: No module named 'lib'`. 2. Even if lib/ had shipped, `lib` wasn't in SUBPACKAGES so the import-rewriter would have left the bare `from lib.pre_stop` unqualified — it would still fail because the package would only be reachable as `molecule_runtime.lib`. Fix: move `lib` from EXCLUDE_DIRS to SUBPACKAGES (one entry each). Drift gate extension: the existing gate I added in #2163 only asserted TOP_LEVEL_MODULES against workspace/*.py. This change adds the symmetric assertion for SUBPACKAGES against workspace/<dir>/ (filtered by EXCLUDE_DIRS + presence of __init__.py). Catches both: - Subpackage added to workspace/ but missed in SUBPACKAGES - Subpackage missing from workspace/ but lingering in SUBPACKAGES - Subpackage wrongly in EXCLUDE_DIRS while also referenced by rewritten imports (the lib case) Tested locally: build of 0.1.99 now ships lib/ and main.py contains `from molecule_runtime.lib.pre_stop import ...` correctly rewritten. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 06:03:46 -07:00
Hongming Wang	1100c50da8	Merge pull request #2172 from Molecule-AI/feat/e2e-cover-all-8-runtimes feat(e2e): extend priority-runtimes test to cover all 8 templates	2026-04-27 13:00:43 +00:00
Hongming Wang	c7478af99f	feat(e2e): extend priority-runtimes test to cover all 8 templates Tonight's wire-real E2E sweep exposed 12+ root causes across the post- #87 template extraction. Most would have been caught by an actual provision-and-online test running on each template — but the test only covered claude-code + hermes. Extending it to cover all 8 ensures any future regression in any template fails the test, not production. What's added: - run_openai_runtime(runtime, label): generic provisioner for the 5 OpenAI-backed templates (langgraph, crewai, autogen, deepagents, openclaw). Same shape as run_hermes minus the HERMES_* config block that hermes-agent needs. - run_gemini_cli: separate function — gemini-cli wants a Google AI key (E2E_GEMINI_API_KEY), not OpenAI. - Each new runtime registered in the dispatch loop. New `all` keyword for E2E_RUNTIMES runs every covered runtime. claude-code + hermes keep their dedicated functions; both have unique provisioning quirks (claude-code OAuth + claude-code-specific volume mounts; hermes 15-min cold-boot) that don't generalize cleanly. Skip-if-no-key pattern matches the existing one — partially-keyed CI gets clean skips, not false-fails. Usage: E2E_OPENAI_API_KEY=... E2E_RUNTIMES=langgraph ./test_priority_runtimes_e2e.sh E2E_OPENAI_API_KEY=... E2E_RUNTIMES=all ./test_priority_runtimes_e2e.sh Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 05:57:59 -07:00
Hongming Wang	1a2ddb4539	Merge pull request #2171 from Molecule-AI/deps/jwt-go-v5.2.2-cve-2025-30204 deps(jwt): bump golang-jwt/jwt/v5 v5.2.1 → v5.2.2 (CVE-2025-30204, HIGH)	2026-04-27 12:44:54 +00:00
Hongming Wang	e63c3b2044	Merge pull request #2170 from Molecule-AI/fix/a2a-executor-sdk-migration fix(a2a_executor): migrate to a2a-sdk 1.x API	2026-04-27 12:44:42 +00:00
Hongming Wang	041d255091	Merge pull request #2168 from Molecule-AI/ops/audit-railway-sha-pins ops: add Railway SHA-pin drift audit script + regression test (#2001)	2026-04-27 12:44:31 +00:00
Hongming Wang	5b05d663ee	test: update a2a.helpers mock to export new_text_message The conftest mock only exposed `new_agent_text_message`, the pre-v1 name. After fixing a2a_executor.py to use the v1 name `new_text_message`, the mock didn't satisfy the import → CI red. Mock both names (aliased to the same lambda) so any in-flight test that still references the old name keeps working until the next sweep removes those references. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 05:34:28 -07:00
Hongming Wang	86bdfa3b47	deps(jwt): bump golang-jwt/jwt/v5 v5.2.1 → v5.2.2 (CVE-2025-30204) Closes the HIGH-severity dependabot alert on workspace-server's jwt-go pin. Upstream advisory GHSA-mh63-6h87-95cp / CVE-2025-30204: "jwt-go allows excessive memory allocation during header parsing" — fixed in v5.2.2. Patch bump within the v5.x line; semver guarantees no API change. Full workspace-server test suite passes (\`go test ./...\` clean across all 18 packages). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 05:31:58 -07:00
Hongming Wang	722e1fd175	fix(a2a_executor): migrate to a2a-sdk 1.x API — new_agent_text_message → new_text_message a2a-sdk v1 renamed `new_agent_text_message` → `new_text_message` (role=Role.agent is now the default). Same fix landed in the hermes template earlier today; this is the runtime-side equivalent. NOT dead code: a2a_executor.py is the LangGraph A2A executor, used by the langgraph + deepagents templates. Both templates currently import it via bare `from a2a_executor import LangGraphA2AExecutor` — which is a separate bug in those templates, filed/fixed separately. Symptom in a2a_executor.py form: any langgraph or deepagents workspace that calls create_executor crashes with `ImportError: cannot import name 'new_agent_text_message' from 'a2a.helpers'`. Doesn't surface for claude-code or hermes (their templates use their own executors and don't load a2a_executor). Five call sites updated, one import line, one comment. Test suite already passes against the new symbol — `python -c "from molecule_runtime.a2a_executor import LangGraphA2AExecutor"` resolves cleanly after this change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 05:29:59 -07:00
Hongming Wang	026f5e51d9	ops: add Railway SHA-pin drift audit script + regression test (#2001 ) #2000 fixed one symptom — TENANT_IMAGE pinned to `staging-a14cf86` (10 days stale) silently no-op'd four upstream fixes on 2026-04-24. This adds the audit pattern as a re-runnable script so the broader class is observable on demand without new CI infrastructure. Audit results today (2026-04-27): controlplane / production: 54 vars audited, 0 drift-prone pins controlplane / staging: 52 vars audited, 0 drift-prone pins So the immediate audit deliverable is clean — TENANT_IMAGE is the only known violation and #2000 already fixed it. The script makes the ongoing audit a 5-second command instead of a manual one. Detection regex catches: * branch-SHA suffixes (`staging\|main\|prod\|production-<6+ hex>`) — the exact 2026-04-24 incident shape * version pins after `:` or `=` (`:v1.2.3`, `=v0.1.16`) — same drift class, just rendered differently Anchoring on `:` or `=` keeps prose like "version 1.2.3 of the api" out of the false-positive set. UUIDs, ARNs, AMI IDs, secrets, and floating tags (`:staging-latest`, `:main`) pass through untouched. Regression test (tests/ops/test_audit_railway_sha_pins.sh) pins 20 representative cases — 9 should-flag (covering all four branch prefixes + semver variants + middle-of-value matches) and 11 should-pass (the false-positive guards). Same regex inlined in both files so a future tweak that weakens detection fails the test in lockstep with weakening the audit. Both files shellcheck clean. CI gate (acceptance criterion's "regression: add a CI check") is deliberately scoped out — querying Railway from CI requires plumbing RAILWAY_TOKEN as a repo secret, which is multi-step setup. The re-runnable script + test cover the same surface today; the CI workflow is a small follow-up once the token is provisioned. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 05:01:23 -07:00
Hongming Wang	7cf77f274a	Merge pull request #2166 from Molecule-AI/test/unblock-resolveandstage-test test(plugins): unblock TestResolveAndStage_NoInternalErrorsInHTTPErr (#1814)	2026-04-27 11:36:15 +00:00
Hongming Wang	dc2f6bd378	Merge pull request #2167 from Molecule-AI/fix/saas-federation-tutorial-409 docs(saas-federation): fix workspace-limit response code (409, not 402) (#1754)	2026-04-27 11:36:02 +00:00
Hongming Wang	3679a6eff6	docs(saas-federation): fix workspace-limit response code (409, not 402) (#1754 ) Quota gates are resource-state conflicts, not payment failures — RFC 9110 reserves 402 for billing/payment failures specifically. The canonical Molecule-AI/docs PR #82 already shipped the corrected text; this brings the molecule-core copy of the tutorial in line. The inline parenthetical "(not 402 Payment Required — quota gates are resource-state conflicts, not payment failures, per RFC 9110)" doubles as a regression anchor: a future edit that flips 409 back to 402 would have to also reword that explanation, making the change a deliberate two-step act rather than a casual oversight. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 04:30:46 -07:00
Hongming Wang	a0154ea0b4	test(plugins): unblock TestResolveAndStage_NoInternalErrorsInHTTPErr (#1814 ) Closes the second of two skipped tests in workspace_provision_test.go that were blocked on interface refactors. The Broadcaster + CP provisioner halves landed in earlier #1814 cycles; this is the plugin-source-registry half. Refactor: - Add handlers.pluginSources interface with the 3 methods handler code actually calls (Register, Resolve, Schemes) - Compile-time assertion `var _ pluginSources = (plugins.Registry)(nil)` catches future method-signature drift at build time - PluginsHandler.sources narrowed from plugins.Registry to the interface; production wiring (NewPluginsHandler, WithSourceResolver) still passes *plugins.Registry — satisfies the interface Production fix (#1206 leak): - resolveAndStage's Fetch-failure path was interpolating err.Error() into the HTTP response body via `failed to fetch plugin from %s: %v`. Resolver errors routinely contain rate-limit text, github request IDs, raw HTTP body fragments, and (for local resolvers) file system paths — none has any business landing in a user's browser. - Body now carries just `failed to fetch plugin from <scheme>`; the status code already differentiates the failure shape (404 not found, 504 timeout, 502 generic). Full err detail stays in the server-side log line one statement above. Test: - 6 sub-tests covering every error path inside resolveAndStage: empty source, invalid format, unknown scheme, local path-traversal, unpinned github (PLUGIN_ALLOW_UNPINNED unset), Fetch failure with a leaky synthetic error - The Fetch-failure case plants 5 realistic leak markers in the resolver's error string (rate limit text, x-github-request-id, auth_token, ghp_-prefixed token, /etc/passwd path); the assertion fails if ANY appears in the response body - Table-driven so a future error path added to resolveAndStage gets one new row, not a copy-paste of the assertion logic Verification: - 6/6 sub-tests pass - Full workspace-server test suite passes (interface refactor is non-breaking; production caller paths unchanged) - go build ./... clean Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 04:00:39 -07:00
hongmingwang-moleculeai	104650941a	Merge pull request #2165 from Molecule-AI/fix/main-sync-entry-point fix: restore main_sync entry point in workspace/main.py	2026-04-27 10:54:44 +00:00
hongmingwang-moleculeai	4c839cb306	Merge pull request #2164 from Molecule-AI/test/unblock-cp-provision-broadcast-test test(provisioner): unblock TestProvisionWorkspaceCP_NoInternalErrorsInBroadcast (#1814)	2026-04-27 10:54:44 +00:00
Hongming Wang	3df5867b56	fix: restore main_sync entry point in workspace/main.py The wheel's pyproject.toml has declared `molecule-runtime = "molecule_runtime.main:main_sync"` since the publish pipeline was created on 2026-04-26, but the function itself was never present in workspace/main.py — it lived in the pre-monorepo molecule-ai-workspace-runtime repo and was lost during the consolidation that made workspace/ the source of truth. The 0.1.15 wheel still had main_sync from a leftover snapshot, so the regression went unnoticed until 0.1.16 (the first wheel built from the new source-of-truth) shipped. Symptom: every workspace container restart loops with ImportError: cannot import name 'main_sync' from 'molecule_runtime.main' — the molecule-runtime CLI script's first line tries to import the missing symbol. Workspaces stay in `provisioning` until the 10-min sweep marks them failed. Caught by .github/workflows/runtime-pin-compat.yml, which already imports the symbol by name as its smoke test. (That check kept failing red on every recent merge_group run; this PR fixes the underlying symbol-not-found instead of the smoke step.) Also strengthens publish-runtime.yml's wheel smoke from `import molecule_runtime.main` (loads the module — passes even when entry-point target is missing) to `from molecule_runtime.main import main_sync` (the actual contract the CLI script needs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 03:35:49 -07:00
Hongming Wang	e15d1182cd	test(provisioner): unblock TestProvisionWorkspaceCP_NoInternalErrorsInBroadcast (#1814 ) The skipped test exists to assert that provisionWorkspaceCP never leaks err.Error() in WORKSPACE_PROVISION_FAILED broadcasts (regression guard for #1206). Writing the test body required substituting a failing CPProvisioner — but the handler's `cpProv` field was the concrete CPProvisioner type, so a mock had nowhere to plug in. Refactor: - Add provisioner.CPProvisionerAPI interface with the 3 methods handlers actually call (Start, Stop, GetConsoleOutput) - Compile-time assertion `var _ CPProvisionerAPI = (CPProvisioner)(nil)` catches future method-signature drift at build time - WorkspaceHandler.cpProv narrowed to the interface; SetCPProvisioner accepts the interface (production caller passes *CPProvisioner from NewCPProvisioner unchanged) Test: - stubFailingCPProv whose Start returns a deliberately leaky error (machine_type=t3.large, ami=…, vpc=…, raw HTTP body fragment) - Drive provisionWorkspaceCP via the cpProv.Start failure path - Assert broadcast["error"] == "provisioning failed" (canned) - Assert no leak markers (machine type, AMI, VPC, subnet, HTTP body, raw error head) in any broadcast string value - Stop/GetConsoleOutput on the stub panic — flags a future regression that reaches into them on this path Verification: - Full workspace-server test suite passes (interface refactor is non-breaking; production caller path unchanged) - go build ./... clean - The other skipped test in this file (TestResolveAndStage_…) is a separate plugins.Registry refactor and remains skipped Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 03:28:25 -07:00
Hongming Wang	5022a740e1	Merge pull request #2163 from Molecule-AI/fix/build-script-drift-gate-and-main-smoke fix(release): drift-gate TOP_LEVEL_MODULES + smoke-import main (post-0.1.16 incident)	2026-04-27 10:22:06 +00:00
Hongming Wang	c68dc1877f	fix(release): drift-gate TOP_LEVEL_MODULES + smoke-import main in publish Two compounding bugs surfaced when 0.1.16 hit production today: 1. scripts/build_runtime_package.py had a hand-curated TOP_LEVEL_MODULES set listing every workspace/.py that should get its bare imports rewritten to `molecule_runtime.X`. The set silently went stale: - Missing: transcript_auth (added since #87 phase 1c), runtime_wedge, watcher → unrewritten imports shipped, every workspace startup died with ModuleNotFoundError. - Stale: claude_sdk_executor, cli_executor (both removed in #87), hermes_executor (never existed) → harmless but misleading. 2. publish-runtime.yml's wheel-smoke step asserted on stable invariants (BaseAdapter, AdapterConfig, a2a_client error sentinel) but never imported main. So even though main.py held the broken bare `from transcript_auth import ...`, the smoke check passed. Fixes: - Build script now derives the on-disk module set from workspace/.py and asserts it matches TOP_LEVEL_MODULES exactly. Drift in either direction fails the build with a specific diff message instead of shipping a broken wheel. Closed-list typo guard preserved (we still edit the set explicitly when a module is added/removed) — the gate just makes drift impossible to ignore. - TOP_LEVEL_MODULES updated to current reality: drop the 3 stale, add the 3 missing. - publish-runtime.yml wheel-smoke now `import molecule_runtime.main` before the invariant asserts. main is the entry point and transitively imports every module — any bare-import bug surfaces as ModuleNotFoundError before PyPI accepts the upload. Tested locally: `python3 scripts/build_runtime_package.py --version 0.1.99 --out /tmp/build-test` succeeds, and /tmp/build-test/molecule_runtime/main.py contains the rewritten `from molecule_runtime.transcript_auth import ...`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 03:19:17 -07:00
Hongming Wang	6f0774c708	Merge pull request #2162 from Molecule-AI/fix/e2e-sanity-rc-normalization fix(e2e-sanity): normalize unexpected curl exit codes in cleanup trap (#2159)	2026-04-27 10:05:14 +00:00
Hongming Wang	99fb61bb8c	fix(e2e-sanity): normalize unexpected curl exit codes in cleanup trap (#2159 ) When E2E_INTENTIONAL_FAILURE=1 poisons the tenant token, step 5/11's `tenant_call POST /workspaces` curl exits 22 (HTTP error under --fail-with-body). `set -e` propagates rc=22 directly, but the script's documented contract emits only {0,1,2,3,4}, and the sanity workflow's case statement only matches those. rc=22 falls through to "Unexpected rc — investigate harness" and opens a false-positive priority-high "safety net broken" issue (#2159, weekly run on 2026-04-27). The trap now captures $? at entry (must be the first statement before any command clobbers it) and at the end normalizes any non-contract code to 1 (generic failure). Leak detection continues to exit 4 directly, so its semantics are preserved. Adds tests/e2e/test_harness_rc_normalization.sh — a self-contained regression test that builds a stub harness with the same trap pattern, triggers controlled exit codes, and asserts the normalization. Covers the 5 contracted codes + curl-22 (the bug) + 3 representative network-failure codes + sigsegv-139. Verification: - 10/10 regression tests pass - shellcheck clean on both modified files - production teardown path unchanged for legitimate {1,2,3,4} failures and the leak-detection exit 4 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 02:55:44 -07:00
hongmingwang-moleculeai	c3d29941b8	Merge pull request #2161 from Molecule-AI/feat/auto-publish-runtime-on-staging feat(publish-runtime): auto-publish to PyPI on staging pushes touching workspace/	2026-04-27 09:20:12 +00:00
Hongming Wang	7d872f9661	Merge pull request #2160 from Molecule-AI/feat/skill-runtime-compat feat(skills): per-skill runtime compatibility (#119)	2026-04-27 09:15:01 +00:00
Hongming Wang	0a455b7d71	feat(publish-runtime): auto-publish to PyPI on staging pushes that touch workspace/ Adds a third trigger so any merge to staging that changes workspace/ auto-publishes a new molecule-ai-workspace-runtime patch release. Closes the human-in-loop gap that caused tonight's RuntimeCapabilities ImportError outage. Tonight: #117 added RuntimeCapabilities to molecule_runtime.adapters.base. The merge landed at 02:37 UTC. Templates rebuilt their images at 07:37 UTC (4 hours later) and started importing the new symbol. PyPI was still serving 0.1.15 (pre-#117) because nobody remembered to push a runtime-vX.Y.Z tag or workflow_dispatch the publish. Result: every template image shipped tonight runs `from molecule_runtime.adapters.base import RuntimeCapabilities` against an installed runtime that doesn't export it -> ImportError -> workspace never registers -> stuck in provisioning until 10-min sweep. Mechanism: - New trigger: push to staging filtered to paths: ['workspace/']. Path filter applies only to branch pushes; the existing tag trigger still fires unconditionally. - Version derivation for the auto case: query PyPI's JSON API for current latest, bump the patch component. PyPI is the source of truth so concurrent runs don't double-publish (HTTP 400 on collision). - concurrency: group serializes parallel staging merges so they don't race on the bump computation. cancel-in-progress: false because each workspace/** change deserves its own release. - publish job now exposes its derived version as a job-level output so the cascade reads it cleanly. Fixes a latent bug: cascade tried to read steps.version.outputs.version, which is from a different job's scope and silently resolved to empty -- then re-derived from GITHUB_REF_NAME, which would have been "staging" under the new trigger and produced an invalid version. Tag-driven and manual-dispatch paths are unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 02:11:45 -07:00
Hongming Wang	d19d35f6b3	test(skills): make watcher test fakes accept current_runtime kwarg The runtime-compat change in this branch added a `current_runtime` kwarg to load_skills(); the watcher passes it through. Test mocks that pre-date the kwarg signature broke with TypeError, which the watcher's reload-error try/except swallowed — the symptom was empty callback lists, not a clear failure. Switching the fakes to accept **kwargs keeps them forward-compat for future load_skills additions without another test churn. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 02:04:26 -07:00
Hongming Wang	d0057912d2	feat(skills): per-skill runtime compatibility (#119 , hermes pattern) SKILL.md frontmatter can now declare `runtime: [claude-code]` or `runtime: [hermes, claude-code]` to opt out of incompatible adapters instead of failing at first invocation. Default `[""]` means universal — existing skill libraries need zero migration. Borrowed from hermes' declarative skill-compat pattern surfaced in the hermes architecture survey. The remaining two patterns (event-log layer, observability config block) stay open under #119. Wiring: - SkillMetadata.runtime: list[str] = [""] - _normalize_runtime_field accepts list, string-sugar, missing -> [""]; malformed warns and falls back to universal so a typo never silently drops a skill. - load_skills(..., current_runtime=...) filters out skills whose runtime list lacks "" or current_runtime, with an INFO log line. - BaseAdapter.start passes type(self).name() so the live adapter drives the filter; SkillsWatcher takes the same kwarg so hot-reload honors it. 8 new tests cover default universal, no-field universal, explicit match/mismatch, string sugar, wildcard short-circuit, current_runtime=None (preserves old behavior), and malformed-warns-not-drops. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 01:57:43 -07:00
Hongming Wang	e99f937630	Merge pull request #2157 from Molecule-AI/chore/drop-cli-executor-from-runtime chore(workspace): drop cli_executor — Phase 3 of #87 [DRAFT]	2026-04-27 08:24:30 +00:00
Hongming Wang	4959c37040	Merge pull request #2158 from Molecule-AI/feat/steer-agent-to-attachments-field feat(tools): tighten send_message_to_user description to forbid pasting URLs in body	2026-04-27 08:24:02 +00:00
Hongming Wang	98ca5c50fa	chore(workspace): drop cli_executor — Phase 3 of #87 (DRAFT, blocked on gemini-cli image rebuild) DRAFT — do NOT merge until gemini-cli template image rebuilds with its local cli_executor.py copy (template PR #9 just merged at 07:59 UTC; image build kicks off now). Final adapter-specific deletion from molecule-runtime, completing #87 for the priority adapters (claude-code via PR #2156, plus gemini-cli via this PR + template #9). Deletes: - workspace/cli_executor.py (461 LOC) — CLIAgentExecutor + the RUNTIME_PRESETS dict for codex / ollama / gemini-cli. The file moved to molecule-ai-workspace-template-gemini-cli (PR #9, merged). - workspace/tests/test_agent_base_urls.py — only consumer of CLIAgentExecutor in the test suite. Tests for the executor behavior live in the template repo now. Updates: - workspace/tests/test_executor_helpers.py — docstring refresh: executor_helpers.py is the runtime-agnostic shared helpers; the executor classes themselves live in template repos post-#87. Codex / ollama presets disappear naturally with the file. They never had template repos, so no production path could invoke them anyway — this is dead-code removal as a side effect of the move. Verified-safe-to-delete: - heartbeat.py: doesn't import cli_executor - claude_sdk_executor.py: deleted by PR #2156 (in flight) - preflight.py: only references runtime names by string; no import - main.py: doesn't import cli_executor (uses adapter discovery via ADAPTER_MODULE; the template's adapter constructs the executor) - Only test_agent_base_urls.py + test_executor_helpers.py docstring referenced cli_executor Verification: - 1249/1249 workspace pytest pass (was 1251; -2 = test_agent_base_urls.py cases — exact match) - No live import of cli_executor anywhere in molecule-core after deletion (grep verified) Sequencing: 1. ✅ Template PR #9 (gemini-cli local copy) — MERGED 2. ⏳ Template image rebuild — running 3. THIS PR — wait until image is published, then mark ready-for-review Closes #87 for the priority adapters: workspace/ is now adapter- agnostic except for adapter discovery (ADAPTER_MODULE) + the runtime_wedge primitive. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 01:22:39 -07:00
Hongming Wang	7504aba934	feat(tools): tighten send_message_to_user description to forbid pasting URLs in body Root-cause fix for #118 (chat attachments rendering as plain text links instead of download chips). User flagged with screenshot 2026-04-26 showing the Design Director agent pasting https://files.catbox.moe/… in the message body — chat rendered the URL as plain markdown text, unclickable in the canvas's bubble layout, and unreachable in any SaaS deployment where the user's browser can't egress to catbox. The structured `attachments` field already exists, the canvas's AttachmentChip already renders well, the WebSocket broadcast already carries attachments verbatim — the missing piece was the LLM choosing the body over the structured field. Tighten the tool description so it trains the right behavior. Three targeted strengthenings: 1. Top-level tool description: enumerated use case (4) now reads "via the `attachments` field (NEVER paste file URLs in `message`)". The all-caps NEVER + the explicit field name move the LLM toward the structured path on first read. 2. `message` param: adds an explicit DO NOT rule with rationale. Includes the SaaS-reachability reason so operators can grep for "SaaS" and find this design constraint instead of re-discovering it after a tenant complaint. Calls out catbox.moe + file:// by name as concrete examples of forbidden hosts (those are the two we've seen in production). 3. `attachments` param: leads with REQUIRED, lists the bad alternatives explicitly (pasting URLs, base64-encoding, telling user to look at a path). LLMs handle "use X, NOT Y" framings better than "use X" alone — observed during prompt-engineering iteration on hermes' tool descriptions. Tests pin all three load-bearing phrases (4 new in test_a2a_mcp_server.py) so a future doc edit that softens or drops them fails CI. Brittle by design — these are prompt-engineering invariants, not implementation details. This is the root-cause fix. A defensive canvas-side backstop (auto- detect download-shaped URLs in body and convert to chips) is a follow-up that could land separately if the steering proves insufficient in practice. Verification: - 1190/1190 workspace pytest pass - 4 new test_a2a_mcp_server.py cases all green Closes the steering half of #118. The structured-attachments-only contract was already enforced server-side (PR #2130 added per-attachment validation); this PR closes the prompt-side gap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 01:13:11 -07:00
Hongming Wang	4e6030d783	Merge pull request #2156 from Molecule-AI/chore/drop-claude-sdk-executor-from-runtime chore(workspace): drop claude_sdk_executor — Phase 2 of #87	2026-04-27 08:02:51 +00:00
Hongming Wang	2fbf6b6b27	Merge pull request #2155 from Molecule-AI/feat/preflight-runtime-discovery feat(preflight): replace SUPPORTED_RUNTIMES static list with adapter discovery	2026-04-27 08:02:39 +00:00
Hongming Wang	4b5ac2ebc2	chore(workspace): drop claude_sdk_executor — Phase 2 of #87 Phase 2 of the universal-runtime refactor (task #87). Now that the claude-code template repo ships its own claude_sdk_executor.py (template PR #13 merged + image rebuilt at 07:36 UTC) the molecule-runtime no longer needs to ship the file. Deletes: - workspace/claude_sdk_executor.py (704 LOC) - workspace/tests/test_claude_sdk_executor.py (~1.6K LOC) Updates: - workspace/runtime_wedge.py — drops the "Compatibility shim" docstring section. The shim was time-bounded ("removed once #87 Phase 2 lands"); this is that PR. - workspace/tests/test_runtime_wedge.py — drops the TestClaudeSdkExecutorReExportShim test class (the shim doesn't exist anymore so the identity assertions would fail at import). - workspace/tests/conftest.py — drops the claude_agent_sdk stub. Its only consumer was test_claude_sdk_executor.py which is gone; no other test imports the SDK. - workspace/cli_executor.py — comment refresh: claude-code template repo (not workspace/) is now the home for ClaudeSDKExecutor. Verified-safe-to-delete: - heartbeat.py: migrated to runtime_wedge in PR #2154 (no longer imports from claude_sdk_executor) - cli_executor.py: only comments referenced claude_sdk_executor; its line-117 ValueError defends against accidental routing - tests: only test_claude_sdk_executor.py + test_runtime_wedge.py's shim class consumed the deleted module; both removed in this PR Verification: - 1182/1182 workspace pytest pass (was 1251; -69 = exactly the deleted test cases — zero unexpected regressions) - No live import of claude_sdk_executor anywhere in molecule-core after deletion (grep verified) Closes #87 for the claude-code adapter. Hermes is already template-only. The remaining adapter-specific code in workspace/ is cli_executor.py (codex/ollama/gemini-cli) tracked by task #122. preflight.py's SUPPORTED_RUNTIMES static list is tracked by task #123 (PR #2155 in flight). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:52:55 -07:00
Hongming Wang	7dba700ac3	feat(preflight): replace SUPPORTED_RUNTIMES static list with adapter discovery Closes task #123 — last piece of #87 cleanup. Pre-fix: workspace/preflight.py:11 hardcoded a tuple of "supported" runtime names (claude-code, codex, ollama, langgraph, etc.). Every new template repo required a code change in molecule-runtime to be recognized — direct violation of the universal-runtime principle (#87) where adapters declare themselves and the runtime stays generic. Post-fix: discovery-based validation via the same ADAPTER_MODULE env var that production load paths already consult (workspace/adapters/__init__.py:get_adapter). Distinguished failure modes so operator messages are concrete: - ADAPTER_MODULE unset → "no adapter installed; set the env var" - ADAPTER_MODULE set but module won't import → import error type + message - module imports but no Adapter class → "convention violation, add `Adapter = YourClass`" - Adapter.name() raises → caught with operator message - Adapter.name() returns non-string → contract violation message - Adapter.name() doesn't match config.runtime → drift WARNING (not fatal; the adapter wins in production, config.yaml is just documentation) The drift case is the one behavioral change worth calling out: the prior static-list path would have hard-failed config.runtime values not in the allowlist. With discovery, an unknown runtime in config.yaml is just a documentation drift — the adapter that's actually installed runs regardless. Operator gets a warning naming both the configured and installed names so they can fix whichever is stale. Tests: - Replaces the obsolete "static list pass/fail" tests with 6 new cases covering each distinguished failure mode, plus a positive test for the adapter-matches-config happy path - Adds an autouse `_default_langgraph_adapter` fixture that pre-installs a fake adapter via sys.modules monkey-patching, so existing tests building default WorkspaceConfig (runtime="langgraph") inherit a valid adapter without each test setting ADAPTER_MODULE - Failure-mode tests opt out of the default fixture via @pytest.mark.no_default_adapter (registered in pytest.ini) - Sentinel pattern (`_UNSET = object()`) for `name_returns` so None is a passable test value (otherwise `is not None` would skip the None branch — exact bug the sentinel avoids) Verification: - 22/22 preflight tests pass (was 16; +6 new failure-path tests) - 1256/1256 workspace pytest pass (was 1251; +5 net) - No production code path other than preflight changed Source: 2026-04-27 #87 cleanup audit after PR #2154 (wedge extraction). This change is independent of the cli_executor.py template moves (task #122) — completes one of the two remaining cleanup items. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:44:51 -07:00
Hongming Wang	66b9c04057	Merge pull request #2154 from Molecule-AI/refactor/extract-wedge-state-from-claude-sdk refactor(wedge): extract claude_sdk_executor wedge state into runtime_wedge module	2026-04-27 07:22:20 +00:00
Hongming Wang	5e049244d6	refactor(wedge): mark re-exports explicit via __all__ Addresses github-code-quality unused-import flag on the runtime_wedge re-export shim. Adds __all__ listing the names that exist purely for backwards-compat (is_wedged / wedge_reason / _reset_sdk_wedge_for_test) so static analysis recognizes the imports as deliberate exports. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:20:23 -07:00
Hongming Wang	feb544938b	refactor(wedge): address review feedback — class wrap + import-path doc + dedupe shim rationale Three changes from /code-review-and-quality on PR #2154: 1. Optional (architecture): wrap state in a private _WedgeState class instead of bare module-level globals. Public API (mark_wedged / clear_wedge / is_wedged / wedge_reason / reset_for_test) is unchanged — adapters never see the class. The class is forward-cover for any future per-scope variant (multiple executors per process, a keyed registry, etc.) without churning the call sites. Today there's exactly one instance (_DEFAULT) so behavior is identical. 2. Optional (readability): clarify the import path in the integration recipe — in a TEMPLATE repo it's `from molecule_runtime.runtime_wedge` (PyPI package); in molecule-core itself it's `from runtime_wedge` (top-level module). Removes the trap where a contributor reading the docstring while editing in-repo copies the template-style import and gets ImportError. 3. Nit (readability): dedupe the shim rationale. claude_sdk_executor's re-export comment now points to runtime_wedge's "Compatibility shim" section as the source of truth instead of restating the same content. Avoids docs-in-two-places drift risk. Verification: - 1251/1251 workspace pytest pass (no behavior change — class wrap is pure plumbing; module-level helpers delegate to the singleton) - All shim re-export identity tests still pass (the shim's `is_wedged is runtime_wedge.is_wedged` assertion holds because we re-export the SAME function object that delegates to _DEFAULT) No new tests needed — the existing test suite covers the public API contract; the class is an implementation detail behind that contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:16:33 -07:00
Hongming Wang	cd899c969f	docs(wedge): integration recipe for adapters that want to flip-to-degraded Doc-only follow-up to the wedge-state extraction. Adds proactive guidance so the next adapter (hermes / codex / langgraph / a future template) discovers the runtime_wedge primitive and integrates the ~6 LOC pattern uniformly instead of inventing its own wedge state. Two additions: - workspace/runtime_wedge.py — new "How to use from a NEW adapter" section in the module docstring with the minimum viable integration recipe, what-you-get-for-free list, and explicit DON'TS (don't store local wedge state, don't mark for transient errors, don't write your own clear logic). Plus a "when wedge is the WRONG primitive" note to keep adopters from over-using it. - workspace/adapter_base.py — adds runtime_wedge to the "Cross-cutting capabilities your adapter can opt into" list in BaseAdapter's docstring (alongside capabilities() and idle_timeout_override()). Discoverability path: adapter author reads BaseAdapter docstring → sees runtime_wedge mention → reads runtime_wedge module docstring → has the recipe. Also tightens the "to add a new agent infra" steps in BaseAdapter to match the actual current model (standalone template repo + ADAPTER_MODULE env var) rather than the obsolete workspace/adapters/<infra>/ layout that hasn't been the path since the universal-runtime extraction started. Zero code change. Tests untouched (1251/1251 still pass). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:12:14 -07:00
Hongming Wang	1d231ed295	refactor(wedge): extract claude_sdk_executor wedge state into runtime_wedge module Prerequisite for the universal-runtime refactor (task #87) to move claude_sdk_executor.py out of molecule-runtime into the claude-code template repo. heartbeat.py had a hard import: from claude_sdk_executor import is_wedged, wedge_reason which would break the moment the executor moves out of the runtime package — the heartbeat would lose access to the wedge state used to flip workspace status to degraded. Extract the wedge state to a runtime-side module that the heartbeat can keep importing regardless of which adapter executor is wedged: - workspace/runtime_wedge.py — single-flag state + mark_wedged / clear_wedge / is_wedged / wedge_reason / reset_for_test. Same semantics as the original claude_sdk_executor implementation (sticky first-write-wins, auto-clear on observed success). 100 LOC of pure stateless helpers; lock-free ok because there's one executor per workspace process today. - workspace/claude_sdk_executor.py — drops the in-file definitions; re-exports the same names from runtime_wedge as a backwards-compat shim. Any third-party adapter that imported is_wedged / wedge_reason / _mark_sdk_wedged from claude_sdk_executor keeps working for one release cycle while they migrate to runtime_wedge. - workspace/heartbeat.py — _runtime_state_payload() now imports from runtime_wedge instead of claude_sdk_executor. Lazy-import pattern preserved; the docstring updated to explain the new cross-cutting source-of-truth. Tests (10 new in test_runtime_wedge.py): - Default state (unwedged), mark sets flag, first-write-wins, clear restores healthy, clear-when-not-wedged is no-op, re-marking after clear is allowed - Re-export shim: each old name in claude_sdk_executor IS the runtime_wedge function (identity check), state is shared (marking via the executor shim is observable via runtime_wedge and vice versa) Verification: - 1251/1251 workspace pytest pass (was 1241 after orphan deletion; +10 = exactly the new test_runtime_wedge.py cases) - All existing test_claude_sdk_executor.py cases (which call _mark_sdk_wedged via the shim) still pass After this lands + the claude-code template image rebuilds with the local claude_sdk_executor.py copy (template PR #13), the molecule- core deletion of workspace/claude_sdk_executor.py becomes safe (the shim deletion comes alongside the file deletion, since runtime_wedge is the new public API). See project memory `project_runtime_native_pluggable.md`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:08:53 -07:00
Hongming Wang	c1e9aa7461	Merge pull request #2153 from Molecule-AI/fix/block-internal-paths-shallow-clone-bug fix(ci): block-internal-paths handle merge_group + shallow-clone BASE	2026-04-27 06:58:32 +00:00
hongmingwang-moleculeai	5d49cd7843	Merge pull request #2152 from Molecule-AI/chore/delete-orphan-hermes-executor chore(workspace): delete orphan HermesA2AExecutor (-1.8K LOC dead code)	2026-04-27 06:58:21 +00:00
Hongming Wang	d46d558ca9	Merge pull request #2148 from Molecule-AI/test/canvas-lib-utils-runtime-names-1815 test(canvas): cover utils.cn + runtime-names.runtimeDisplayName (0% → 100%) (#1815)	2026-04-27 06:57:57 +00:00
Hongming Wang	a682dcb502	Merge pull request #2149 from Molecule-AI/test/canvas-actions-1815 test(canvas): cover canvas-actions restart-pending helpers (25% → 100%) (#1815)	2026-04-27 06:55:36 +00:00
Hongming Wang	17a6800374	Merge pull request #2150 from Molecule-AI/feat/priority-runtimes-e2e test(e2e): claude-code + hermes priority-runtimes happy path	2026-04-27 06:55:20 +00:00

1 2 3 4 5 ...

3220 Commits