Thin caller for molecule-ci's reusable disable-auto-merge-on-push
workflow. Forces operator re-engagement when a commit is pushed to
an open PR with auto-merge already enabled.
Pairs with the org-wide "Automatically delete head branches" repo
setting (also enabled today). Defense in depth:
1. Repo setting blocks pushes to a merged-and-deleted branch
(post-merge orphan case — what bit #2174 today: my second
commit landed on an already-merged-and-deleted branch).
2. This workflow catches in-queue races (push lands while the
merge queue is processing) by disabling auto-merge so the
operator must explicitly re-engage.
Together they cover the full lifecycle of "auto-merge enabled →
new commits arrive" without relying on operator discipline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two compounding bugs that bit hermes (and any other workspace that
reaches main.py:142):
1. workspace/lib/ was in EXCLUDE_DIRS so the published wheel didn't
contain the directory at all. main.py imports `from lib.pre_stop
import read_snapshot` (and `build_snapshot`, `write_snapshot`) so
every workspace startup that reaches the snapshot path crashed
with `ModuleNotFoundError: No module named 'lib'`.
2. Even if lib/ had shipped, `lib` wasn't in SUBPACKAGES so the
import-rewriter would have left the bare `from lib.pre_stop`
unqualified — it would still fail because the package would only
be reachable as `molecule_runtime.lib`.
Fix: move `lib` from EXCLUDE_DIRS to SUBPACKAGES (one entry each).
Drift gate extension: the existing gate I added in #2163 only
asserted TOP_LEVEL_MODULES against workspace/*.py. This change adds
the symmetric assertion for SUBPACKAGES against workspace/<dir>/
(filtered by EXCLUDE_DIRS + presence of __init__.py). Catches both:
- Subpackage added to workspace/ but missed in SUBPACKAGES
- Subpackage missing from workspace/ but lingering in SUBPACKAGES
- Subpackage wrongly in EXCLUDE_DIRS while also referenced by
rewritten imports (the lib case)
Tested locally: build of 0.1.99 now ships lib/ and main.py contains
`from molecule_runtime.lib.pre_stop import ...` correctly rewritten.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tonight's wire-real E2E sweep exposed 12+ root causes across the post-
#87 template extraction. Most would have been caught by an actual
provision-and-online test running on each template — but the test only
covered claude-code + hermes. Extending it to cover all 8 ensures any
future regression in any template fails the test, not production.
What's added:
- run_openai_runtime(runtime, label): generic provisioner for the 5
OpenAI-backed templates (langgraph, crewai, autogen, deepagents,
openclaw). Same shape as run_hermes minus the HERMES_* config block
that hermes-agent needs.
- run_gemini_cli: separate function — gemini-cli wants a Google AI
key (E2E_GEMINI_API_KEY), not OpenAI.
- Each new runtime registered in the dispatch loop. New `all` keyword
for E2E_RUNTIMES runs every covered runtime.
claude-code + hermes keep their dedicated functions; both have unique
provisioning quirks (claude-code OAuth + claude-code-specific volume
mounts; hermes 15-min cold-boot) that don't generalize cleanly.
Skip-if-no-key pattern matches the existing one — partially-keyed CI
gets clean skips, not false-fails.
Usage:
E2E_OPENAI_API_KEY=... E2E_RUNTIMES=langgraph ./test_priority_runtimes_e2e.sh
E2E_OPENAI_API_KEY=... E2E_RUNTIMES=all ./test_priority_runtimes_e2e.sh
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The conftest mock only exposed `new_agent_text_message`, the pre-v1
name. After fixing a2a_executor.py to use the v1 name
`new_text_message`, the mock didn't satisfy the import → CI red.
Mock both names (aliased to the same lambda) so any in-flight test
that still references the old name keeps working until the next
sweep removes those references.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the HIGH-severity dependabot alert on workspace-server's jwt-go
pin. Upstream advisory GHSA-mh63-6h87-95cp / CVE-2025-30204:
"jwt-go allows excessive memory allocation during header parsing" —
fixed in v5.2.2.
Patch bump within the v5.x line; semver guarantees no API change. Full
workspace-server test suite passes (\`go test ./...\` clean across all
18 packages).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
a2a-sdk v1 renamed `new_agent_text_message` → `new_text_message`
(role=Role.agent is now the default). Same fix landed in the hermes
template earlier today; this is the runtime-side equivalent.
NOT dead code: a2a_executor.py is the LangGraph A2A executor, used by
the langgraph + deepagents templates. Both templates currently import
it via bare `from a2a_executor import LangGraphA2AExecutor` — which is
a separate bug in those templates, filed/fixed separately.
Symptom in a2a_executor.py form: any langgraph or deepagents workspace
that calls create_executor crashes with `ImportError: cannot import
name 'new_agent_text_message' from 'a2a.helpers'`. Doesn't surface for
claude-code or hermes (their templates use their own executors and
don't load a2a_executor).
Five call sites updated, one import line, one comment. Test suite
already passes against the new symbol — `python -c "from
molecule_runtime.a2a_executor import LangGraphA2AExecutor"` resolves
cleanly after this change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#2000 fixed one symptom — TENANT_IMAGE pinned to `staging-a14cf86`
(10 days stale) silently no-op'd four upstream fixes on 2026-04-24.
This adds the audit pattern as a re-runnable script so the broader
class is observable on demand without new CI infrastructure.
Audit results today (2026-04-27):
controlplane / production: 54 vars audited, 0 drift-prone pins
controlplane / staging: 52 vars audited, 0 drift-prone pins
So the immediate audit deliverable is clean — TENANT_IMAGE is the only
known violation and #2000 already fixed it. The script makes the
ongoing audit a 5-second command instead of a manual one.
Detection regex catches:
* branch-SHA suffixes (`staging|main|prod|production-<6+ hex>`)
— the exact 2026-04-24 incident shape
* version pins after `:` or `=` (`:v1.2.3`, `=v0.1.16`)
— same drift class, just rendered differently
Anchoring on `:` or `=` keeps prose like "version 1.2.3 of the api"
out of the false-positive set. UUIDs, ARNs, AMI IDs, secrets, and
floating tags (`:staging-latest`, `:main`) pass through untouched.
Regression test (tests/ops/test_audit_railway_sha_pins.sh) pins 20
representative cases — 9 should-flag (covering all four branch
prefixes + semver variants + middle-of-value matches) and 11
should-pass (the false-positive guards). Same regex inlined in both
files so a future tweak that weakens detection fails the test in
lockstep with weakening the audit.
Both files shellcheck clean.
CI gate (acceptance criterion's "regression: add a CI check") is
deliberately scoped out — querying Railway from CI requires plumbing
RAILWAY_TOKEN as a repo secret, which is multi-step setup. The
re-runnable script + test cover the same surface today; the CI
workflow is a small follow-up once the token is provisioned.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Quota gates are resource-state conflicts, not payment failures —
RFC 9110 reserves 402 for billing/payment failures specifically. The
canonical Molecule-AI/docs PR #82 already shipped the corrected text;
this brings the molecule-core copy of the tutorial in line.
The inline parenthetical "(not 402 Payment Required — quota gates are
resource-state conflicts, not payment failures, per RFC 9110)" doubles
as a regression anchor: a future edit that flips 409 back to 402 would
have to also reword that explanation, making the change a deliberate
two-step act rather than a casual oversight.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the second of two skipped tests in workspace_provision_test.go
that were blocked on interface refactors. The Broadcaster + CP
provisioner halves landed in earlier #1814 cycles; this is the
plugin-source-registry half.
Refactor:
- Add handlers.pluginSources interface with the 3 methods handler
code actually calls (Register, Resolve, Schemes)
- Compile-time assertion `var _ pluginSources = (*plugins.Registry)(nil)`
catches future method-signature drift at build time
- PluginsHandler.sources narrowed from *plugins.Registry to the
interface; production wiring (NewPluginsHandler, WithSourceResolver)
still passes *plugins.Registry — satisfies the interface
Production fix (#1206 leak):
- resolveAndStage's Fetch-failure path was interpolating err.Error()
into the HTTP response body via `failed to fetch plugin from %s: %v`.
Resolver errors routinely contain rate-limit text, github request
IDs, raw HTTP body fragments, and (for local resolvers) file system
paths — none has any business landing in a user's browser.
- Body now carries just `failed to fetch plugin from <scheme>`; the
status code already differentiates the failure shape (404 not
found, 504 timeout, 502 generic). Full err detail stays in the
server-side log line one statement above.
Test:
- 6 sub-tests covering every error path inside resolveAndStage:
empty source, invalid format, unknown scheme, local
path-traversal, unpinned github (PLUGIN_ALLOW_UNPINNED unset),
Fetch failure with a leaky synthetic error
- The Fetch-failure case plants 5 realistic leak markers in the
resolver's error string (rate limit text, x-github-request-id,
auth_token, ghp_-prefixed token, /etc/passwd path); the assertion
fails if ANY appears in the response body
- Table-driven so a future error path added to resolveAndStage gets
one new row, not a copy-paste of the assertion logic
Verification:
- 6/6 sub-tests pass
- Full workspace-server test suite passes (interface refactor is
non-breaking; production caller paths unchanged)
- go build ./... clean
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The wheel's pyproject.toml has declared
`molecule-runtime = "molecule_runtime.main:main_sync"` since the
publish pipeline was created on 2026-04-26, but the function
itself was never present in workspace/main.py — it lived in the
pre-monorepo molecule-ai-workspace-runtime repo and was lost
during the consolidation that made workspace/ the source of truth.
The 0.1.15 wheel still had main_sync from a leftover snapshot,
so the regression went unnoticed until 0.1.16 (the first wheel
built from the new source-of-truth) shipped. Symptom: every
workspace container restart loops with
ImportError: cannot import name 'main_sync' from 'molecule_runtime.main'
— the molecule-runtime CLI script's first line tries to import
the missing symbol. Workspaces stay in `provisioning` until the
10-min sweep marks them failed.
Caught by .github/workflows/runtime-pin-compat.yml, which already
imports the symbol by name as its smoke test. (That check kept
failing red on every recent merge_group run; this PR fixes the
underlying symbol-not-found instead of the smoke step.)
Also strengthens publish-runtime.yml's wheel smoke from
`import molecule_runtime.main` (loads the module — passes even
when entry-point target is missing) to `from molecule_runtime.main
import main_sync` (the actual contract the CLI script needs).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The skipped test exists to assert that provisionWorkspaceCP never
leaks err.Error() in WORKSPACE_PROVISION_FAILED broadcasts (regression
guard for #1206). Writing the test body required substituting a
failing CPProvisioner — but the handler's `cpProv` field was the
concrete *CPProvisioner type, so a mock had nowhere to plug in.
Refactor:
- Add provisioner.CPProvisionerAPI interface with the 3 methods
handlers actually call (Start, Stop, GetConsoleOutput)
- Compile-time assertion `var _ CPProvisionerAPI = (*CPProvisioner)(nil)`
catches future method-signature drift at build time
- WorkspaceHandler.cpProv narrowed to the interface; SetCPProvisioner
accepts the interface (production caller passes *CPProvisioner
from NewCPProvisioner unchanged)
Test:
- stubFailingCPProv whose Start returns a deliberately leaky error
(machine_type=t3.large, ami=…, vpc=…, raw HTTP body fragment)
- Drive provisionWorkspaceCP via the cpProv.Start failure path
- Assert broadcast["error"] == "provisioning failed" (canned)
- Assert no leak markers (machine type, AMI, VPC, subnet, HTTP
body, raw error head) in any broadcast string value
- Stop/GetConsoleOutput on the stub panic — flags a future
regression that reaches into them on this path
Verification:
- Full workspace-server test suite passes (interface refactor
is non-breaking; production caller path unchanged)
- go build ./... clean
- The other skipped test in this file (TestResolveAndStage_…)
is a separate plugins.Registry refactor and remains skipped
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two compounding bugs surfaced when 0.1.16 hit production today:
1. scripts/build_runtime_package.py had a hand-curated TOP_LEVEL_MODULES
set listing every workspace/*.py that should get its bare imports
rewritten to `molecule_runtime.X`. The set silently went stale:
- Missing: transcript_auth (added since #87 phase 1c), runtime_wedge,
watcher → unrewritten imports shipped, every workspace startup
died with ModuleNotFoundError.
- Stale: claude_sdk_executor, cli_executor (both removed in #87),
hermes_executor (never existed) → harmless but misleading.
2. publish-runtime.yml's wheel-smoke step asserted on stable invariants
(BaseAdapter, AdapterConfig, a2a_client error sentinel) but never
imported main. So even though main.py held the broken bare
`from transcript_auth import ...`, the smoke check passed.
Fixes:
- Build script now derives the on-disk module set from workspace/*.py
and asserts it matches TOP_LEVEL_MODULES exactly. Drift in either
direction fails the build with a specific diff message instead of
shipping a broken wheel. Closed-list typo guard preserved (we still
edit the set explicitly when a module is added/removed) — the gate
just makes drift impossible to ignore.
- TOP_LEVEL_MODULES updated to current reality: drop the 3 stale,
add the 3 missing.
- publish-runtime.yml wheel-smoke now `import molecule_runtime.main`
before the invariant asserts. main is the entry point and
transitively imports every module — any bare-import bug surfaces
as ModuleNotFoundError before PyPI accepts the upload.
Tested locally: `python3 scripts/build_runtime_package.py
--version 0.1.99 --out /tmp/build-test` succeeds, and
/tmp/build-test/molecule_runtime/main.py contains the rewritten
`from molecule_runtime.transcript_auth import ...`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When E2E_INTENTIONAL_FAILURE=1 poisons the tenant token, step 5/11's
`tenant_call POST /workspaces` curl exits 22 (HTTP error under
--fail-with-body). `set -e` propagates rc=22 directly, but the
script's documented contract emits only {0,1,2,3,4}, and the sanity
workflow's case statement only matches those. rc=22 falls through
to "Unexpected rc — investigate harness" and opens a false-positive
priority-high "safety net broken" issue (#2159, weekly run on
2026-04-27).
The trap now captures $? at entry (must be the first statement
before any command clobbers it) and at the end normalizes any
non-contract code to 1 (generic failure). Leak detection continues
to exit 4 directly, so its semantics are preserved.
Adds tests/e2e/test_harness_rc_normalization.sh — a self-contained
regression test that builds a stub harness with the same trap
pattern, triggers controlled exit codes, and asserts the
normalization. Covers the 5 contracted codes + curl-22 (the bug) +
3 representative network-failure codes + sigsegv-139.
Verification:
- 10/10 regression tests pass
- shellcheck clean on both modified files
- production teardown path unchanged for legitimate {1,2,3,4}
failures and the leak-detection exit 4
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a third trigger so any merge to staging that changes workspace/**
auto-publishes a new molecule-ai-workspace-runtime patch release. Closes
the human-in-loop gap that caused tonight's RuntimeCapabilities
ImportError outage.
Tonight: #117 added RuntimeCapabilities to molecule_runtime.adapters.base.
The merge landed at 02:37 UTC. Templates rebuilt their images at 07:37
UTC (4 hours later) and started importing the new symbol. PyPI was
still serving 0.1.15 (pre-#117) because nobody remembered to push a
runtime-vX.Y.Z tag or workflow_dispatch the publish. Result: every
template image shipped tonight runs `from molecule_runtime.adapters.base
import RuntimeCapabilities` against an installed runtime that doesn't
export it -> ImportError -> workspace never registers -> stuck in
provisioning until 10-min sweep.
Mechanism:
- New trigger: push to staging filtered to paths: ['workspace/**'].
Path filter applies only to branch pushes; the existing tag trigger
still fires unconditionally.
- Version derivation for the auto case: query PyPI's JSON API for
current latest, bump the patch component. PyPI is the source of
truth so concurrent runs don't double-publish (HTTP 400 on collision).
- concurrency: group serializes parallel staging merges so they don't
race on the bump computation. cancel-in-progress: false because each
workspace/** change deserves its own release.
- publish job now exposes its derived version as a job-level output so
the cascade reads it cleanly. Fixes a latent bug: cascade tried to
read steps.version.outputs.version, which is from a different job's
scope and silently resolved to empty -- then re-derived from
GITHUB_REF_NAME, which would have been "staging" under the new
trigger and produced an invalid version.
Tag-driven and manual-dispatch paths are unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The runtime-compat change in this branch added a `current_runtime`
kwarg to load_skills(); the watcher passes it through. Test mocks
that pre-date the kwarg signature broke with TypeError, which the
watcher's reload-error try/except swallowed — the symptom was empty
callback lists, not a clear failure.
Switching the fakes to accept **kwargs keeps them forward-compat for
future load_skills additions without another test churn.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SKILL.md frontmatter can now declare `runtime: [claude-code]` or
`runtime: [hermes, claude-code]` to opt out of incompatible adapters
instead of failing at first invocation. Default `["*"]` means universal —
existing skill libraries need zero migration.
Borrowed from hermes' declarative skill-compat pattern surfaced in the
hermes architecture survey. The remaining two patterns (event-log
layer, observability config block) stay open under #119.
Wiring:
- SkillMetadata.runtime: list[str] = ["*"]
- _normalize_runtime_field accepts list, string-sugar, missing -> ["*"];
malformed warns and falls back to universal so a typo never silently
drops a skill.
- load_skills(..., current_runtime=...) filters out skills whose runtime
list lacks "*" or current_runtime, with an INFO log line.
- BaseAdapter.start passes type(self).name() so the live adapter drives
the filter; SkillsWatcher takes the same kwarg so hot-reload honors it.
8 new tests cover default universal, no-field universal, explicit
match/mismatch, string sugar, wildcard short-circuit, current_runtime=None
(preserves old behavior), and malformed-warns-not-drops.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DRAFT — do NOT merge until gemini-cli template image rebuilds with
its local cli_executor.py copy (template PR #9 just merged at
07:59 UTC; image build kicks off now).
Final adapter-specific deletion from molecule-runtime, completing #87
for the priority adapters (claude-code via PR #2156, plus gemini-cli
via this PR + template #9).
Deletes:
- workspace/cli_executor.py (461 LOC) — CLIAgentExecutor + the
RUNTIME_PRESETS dict for codex / ollama / gemini-cli. The file
moved to molecule-ai-workspace-template-gemini-cli (PR #9, merged).
- workspace/tests/test_agent_base_urls.py — only consumer of
CLIAgentExecutor in the test suite. Tests for the executor
behavior live in the template repo now.
Updates:
- workspace/tests/test_executor_helpers.py — docstring refresh:
executor_helpers.py is the runtime-agnostic shared helpers; the
executor classes themselves live in template repos post-#87.
Codex / ollama presets disappear naturally with the file. They never
had template repos, so no production path could invoke them anyway —
this is dead-code removal as a side effect of the move.
Verified-safe-to-delete:
- heartbeat.py: doesn't import cli_executor
- claude_sdk_executor.py: deleted by PR #2156 (in flight)
- preflight.py: only references runtime names by string; no import
- main.py: doesn't import cli_executor (uses adapter discovery via
ADAPTER_MODULE; the template's adapter constructs the executor)
- Only test_agent_base_urls.py + test_executor_helpers.py docstring
referenced cli_executor
Verification:
- 1249/1249 workspace pytest pass (was 1251; -2 = test_agent_base_urls.py
cases — exact match)
- No live import of cli_executor anywhere in molecule-core after deletion
(grep verified)
Sequencing:
1. ✅ Template PR #9 (gemini-cli local copy) — MERGED
2. ⏳ Template image rebuild — running
3. THIS PR — wait until image is published, then mark ready-for-review
Closes#87 for the priority adapters: workspace/ is now adapter-
agnostic except for adapter discovery (ADAPTER_MODULE) + the
runtime_wedge primitive.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Root-cause fix for #118 (chat attachments rendering as plain text links
instead of download chips). User flagged with screenshot 2026-04-26
showing the Design Director agent pasting https://files.catbox.moe/…
in the message body — chat rendered the URL as plain markdown text,
unclickable in the canvas's bubble layout, and unreachable in any SaaS
deployment where the user's browser can't egress to catbox.
The structured `attachments` field already exists, the canvas's
AttachmentChip already renders well, the WebSocket broadcast already
carries attachments verbatim — the missing piece was the LLM choosing
the body over the structured field. Tighten the tool description so it
trains the right behavior.
Three targeted strengthenings:
1. Top-level tool description: enumerated use case (4) now reads
"via the `attachments` field (NEVER paste file URLs in `message`)".
The all-caps NEVER + the explicit field name move the LLM toward
the structured path on first read.
2. `message` param: adds an explicit DO NOT rule with rationale.
Includes the SaaS-reachability reason so operators can grep for
"SaaS" and find this design constraint instead of re-discovering it
after a tenant complaint. Calls out catbox.moe + file:// by name as
concrete examples of forbidden hosts (those are the two we've seen
in production).
3. `attachments` param: leads with REQUIRED, lists the bad
alternatives explicitly (pasting URLs, base64-encoding, telling
user to look at a path). LLMs handle "use X, NOT Y" framings
better than "use X" alone — observed during prompt-engineering
iteration on hermes' tool descriptions.
Tests pin all three load-bearing phrases (4 new in test_a2a_mcp_server.py)
so a future doc edit that softens or drops them fails CI. Brittle by
design — these are prompt-engineering invariants, not implementation
details.
This is the root-cause fix. A defensive canvas-side backstop (auto-
detect download-shaped URLs in body and convert to chips) is a
follow-up that could land separately if the steering proves
insufficient in practice.
Verification:
- 1190/1190 workspace pytest pass
- 4 new test_a2a_mcp_server.py cases all green
Closes the steering half of #118. The structured-attachments-only
contract was already enforced server-side (PR #2130 added per-attachment
validation); this PR closes the prompt-side gap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2 of the universal-runtime refactor (task #87). Now that the
claude-code template repo ships its own claude_sdk_executor.py
(template PR #13 merged + image rebuilt at 07:36 UTC) the
molecule-runtime no longer needs to ship the file.
Deletes:
- workspace/claude_sdk_executor.py (704 LOC)
- workspace/tests/test_claude_sdk_executor.py (~1.6K LOC)
Updates:
- workspace/runtime_wedge.py — drops the "Compatibility shim" docstring
section. The shim was time-bounded ("removed once #87 Phase 2 lands");
this is that PR.
- workspace/tests/test_runtime_wedge.py — drops the
TestClaudeSdkExecutorReExportShim test class (the shim doesn't
exist anymore so the identity assertions would fail at import).
- workspace/tests/conftest.py — drops the claude_agent_sdk stub.
Its only consumer was test_claude_sdk_executor.py which is gone;
no other test imports the SDK.
- workspace/cli_executor.py — comment refresh: claude-code template
repo (not workspace/) is now the home for ClaudeSDKExecutor.
Verified-safe-to-delete:
- heartbeat.py: migrated to runtime_wedge in PR #2154 (no longer
imports from claude_sdk_executor)
- cli_executor.py: only comments referenced claude_sdk_executor;
its line-117 ValueError defends against accidental routing
- tests: only test_claude_sdk_executor.py + test_runtime_wedge.py's
shim class consumed the deleted module; both removed in this PR
Verification:
- 1182/1182 workspace pytest pass (was 1251; -69 = exactly the
deleted test cases — zero unexpected regressions)
- No live import of claude_sdk_executor anywhere in molecule-core
after deletion (grep verified)
Closes#87 for the claude-code adapter. Hermes is already template-only.
The remaining adapter-specific code in workspace/ is cli_executor.py
(codex/ollama/gemini-cli) tracked by task #122. preflight.py's
SUPPORTED_RUNTIMES static list is tracked by task #123 (PR #2155 in
flight).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes task #123 — last piece of #87 cleanup.
Pre-fix: workspace/preflight.py:11 hardcoded a tuple of "supported"
runtime names (claude-code, codex, ollama, langgraph, etc.). Every
new template repo required a code change in molecule-runtime to be
recognized — direct violation of the universal-runtime principle
(#87) where adapters declare themselves and the runtime stays generic.
Post-fix: discovery-based validation via the same ADAPTER_MODULE env
var that production load paths already consult
(workspace/adapters/__init__.py:get_adapter). Distinguished failure
modes so operator messages are concrete:
- ADAPTER_MODULE unset → "no adapter installed; set the env var"
- ADAPTER_MODULE set but module won't import → import error type +
message
- module imports but no Adapter class → "convention violation, add
`Adapter = YourClass`"
- Adapter.name() raises → caught with operator message
- Adapter.name() returns non-string → contract violation message
- Adapter.name() doesn't match config.runtime → drift WARNING (not
fatal; the adapter wins in production, config.yaml is just
documentation)
The drift case is the one behavioral change worth calling out: the
prior static-list path would have hard-failed config.runtime values
not in the allowlist. With discovery, an unknown runtime in
config.yaml is just a documentation drift — the adapter that's
actually installed runs regardless. Operator gets a warning naming
both the configured and installed names so they can fix whichever
is stale.
Tests:
- Replaces the obsolete "static list pass/fail" tests with 6 new
cases covering each distinguished failure mode, plus a positive
test for the adapter-matches-config happy path
- Adds an autouse `_default_langgraph_adapter` fixture that
pre-installs a fake adapter via sys.modules monkey-patching, so
existing tests building default WorkspaceConfig (runtime="langgraph")
inherit a valid adapter without each test setting ADAPTER_MODULE
- Failure-mode tests opt out of the default fixture via
@pytest.mark.no_default_adapter (registered in pytest.ini)
- Sentinel pattern (`_UNSET = object()`) for `name_returns` so None
is a passable test value (otherwise `is not None` would skip the
None branch — exact bug the sentinel avoids)
Verification:
- 22/22 preflight tests pass (was 16; +6 new failure-path tests)
- 1256/1256 workspace pytest pass (was 1251; +5 net)
- No production code path other than preflight changed
Source: 2026-04-27 #87 cleanup audit after PR #2154 (wedge extraction).
This change is independent of the cli_executor.py template moves
(task #122) — completes one of the two remaining cleanup items.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses github-code-quality unused-import flag on the runtime_wedge
re-export shim. Adds __all__ listing the names that exist purely for
backwards-compat (is_wedged / wedge_reason / _reset_sdk_wedge_for_test)
so static analysis recognizes the imports as deliberate exports.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three changes from /code-review-and-quality on PR #2154:
1. Optional (architecture): wrap state in a private _WedgeState class
instead of bare module-level globals. Public API (mark_wedged /
clear_wedge / is_wedged / wedge_reason / reset_for_test) is
unchanged — adapters never see the class. The class is forward-cover
for any future per-scope variant (multiple executors per process, a
keyed registry, etc.) without churning the call sites. Today there's
exactly one instance (_DEFAULT) so behavior is identical.
2. Optional (readability): clarify the import path in the integration
recipe — in a TEMPLATE repo it's `from molecule_runtime.runtime_wedge`
(PyPI package); in molecule-core itself it's `from runtime_wedge`
(top-level module). Removes the trap where a contributor reading the
docstring while editing in-repo copies the template-style import and
gets ImportError.
3. Nit (readability): dedupe the shim rationale. claude_sdk_executor's
re-export comment now points to runtime_wedge's "Compatibility shim"
section as the source of truth instead of restating the same content.
Avoids docs-in-two-places drift risk.
Verification:
- 1251/1251 workspace pytest pass (no behavior change — class wrap
is pure plumbing; module-level helpers delegate to the singleton)
- All shim re-export identity tests still pass (the shim's
`is_wedged is runtime_wedge.is_wedged` assertion holds because we
re-export the SAME function object that delegates to _DEFAULT)
No new tests needed — the existing test suite covers the public API
contract; the class is an implementation detail behind that contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Doc-only follow-up to the wedge-state extraction. Adds proactive
guidance so the next adapter (hermes / codex / langgraph / a future
template) discovers the runtime_wedge primitive and integrates the
~6 LOC pattern uniformly instead of inventing its own wedge state.
Two additions:
- workspace/runtime_wedge.py — new "How to use from a NEW adapter"
section in the module docstring with the minimum viable
integration recipe, what-you-get-for-free list, and explicit
DON'TS (don't store local wedge state, don't mark for transient
errors, don't write your own clear logic). Plus a "when wedge is
the WRONG primitive" note to keep adopters from over-using it.
- workspace/adapter_base.py — adds runtime_wedge to the
"Cross-cutting capabilities your adapter can opt into" list in
BaseAdapter's docstring (alongside capabilities() and
idle_timeout_override()). Discoverability path: adapter author
reads BaseAdapter docstring → sees runtime_wedge mention → reads
runtime_wedge module docstring → has the recipe.
Also tightens the "to add a new agent infra" steps in BaseAdapter to
match the actual current model (standalone template repo + ADAPTER_MODULE
env var) rather than the obsolete workspace/adapters/<infra>/ layout
that hasn't been the path since the universal-runtime extraction
started.
Zero code change. Tests untouched (1251/1251 still pass).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Prerequisite for the universal-runtime refactor (task #87) to move
claude_sdk_executor.py out of molecule-runtime into the claude-code
template repo. heartbeat.py had a hard import:
from claude_sdk_executor import is_wedged, wedge_reason
which would break the moment the executor moves out of the runtime
package — the heartbeat would lose access to the wedge state used to
flip workspace status to degraded.
Extract the wedge state to a runtime-side module that the heartbeat
can keep importing regardless of which adapter executor is wedged:
- workspace/runtime_wedge.py — single-flag state + mark_wedged /
clear_wedge / is_wedged / wedge_reason / reset_for_test. Same
semantics as the original claude_sdk_executor implementation
(sticky first-write-wins, auto-clear on observed success). 100
LOC of pure stateless helpers; lock-free ok because there's one
executor per workspace process today.
- workspace/claude_sdk_executor.py — drops the in-file definitions;
re-exports the same names from runtime_wedge as a backwards-compat
shim. Any third-party adapter that imported is_wedged / wedge_reason
/ _mark_sdk_wedged from claude_sdk_executor keeps working for one
release cycle while they migrate to runtime_wedge.
- workspace/heartbeat.py — _runtime_state_payload() now imports
from runtime_wedge instead of claude_sdk_executor. Lazy-import
pattern preserved; the docstring updated to explain the new
cross-cutting source-of-truth.
Tests (10 new in test_runtime_wedge.py):
- Default state (unwedged), mark sets flag, first-write-wins,
clear restores healthy, clear-when-not-wedged is no-op,
re-marking after clear is allowed
- Re-export shim: each old name in claude_sdk_executor IS the
runtime_wedge function (identity check), state is shared
(marking via the executor shim is observable via runtime_wedge
and vice versa)
Verification:
- 1251/1251 workspace pytest pass (was 1241 after orphan deletion;
+10 = exactly the new test_runtime_wedge.py cases)
- All existing test_claude_sdk_executor.py cases (which call
_mark_sdk_wedged via the shim) still pass
After this lands + the claude-code template image rebuilds with the
local claude_sdk_executor.py copy (template PR #13), the molecule-
core deletion of workspace/claude_sdk_executor.py becomes safe (the
shim deletion comes alongside the file deletion, since runtime_wedge
is the new public API).
See project memory `project_runtime_native_pluggable.md`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
[Molecule-Platform-Evolvement-Manager]
## What was broken
Same bug class as the secret-scan.yml fix in #2120 — block-internal-paths
hit `fatal: bad object <sha>` exit 128 on the staging push at
2026-04-27 06:50:33Z.
Two cases:
1. **`merge_group` events**: BASE/HEAD came from
`github.event.before` / `.after` which are push-event-only
properties. On merge_group both came back empty, the script fell
through to "scan entire tree" mode which is correct but
inefficient. Worse, when this workflow is required for the merge
queue (line 21-22), an empty-BASE entire-tree scan would run on
every queue check.
2. **`push` events with shallow clones**: `fetch-depth: 2` doesn't
always cover BASE across true merge commits. When BASE is in the
payload but absent from the local object DB, `git diff` errors out
with `fatal: bad object <sha>` and the job exits 128. This is what
broke today's staging push.
## Fix
Same shape as the secret-scan.yml fix (#2120):
- Add a dedicated `git fetch` step for `merge_group.base_sha`.
- Move event-specific SHAs into a step `env:` block; script uses a
`case` over `${{ github.event_name }}` covering pull_request /
merge_group / push (rather than `if pull_request / else push`
which left merge_group on the empty-BASE branch).
- On-demand fetch + `git cat-file -e` guard for push BASE so a SHA
that's payload-present-but-DB-absent triggers the fetch, and a
fetch failure falls through cleanly to "scan entire tree" instead
of exiting 128.
## Test plan
- [x] YAML structure preserved (no schema changes)
- [x] Bash logic mirrors the secret-scan recovery path tested in #2120
- [ ] CI green on this PR's pull_request scan + push to staging post-merge
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>