Ryan's bug report (#2934) walked through ~45 min of debugging a stock external-runtime install. This PR fixes the four items he flagged that have a small surface, and stubs out the larger ones for follow-up. Fixed in this PR ================ #1 — Python floor disclosure (README in publish bundle) Add an explicit "Requires Python ≥3.11" section that calls out the cryptic "Could not find a version that satisfies the requirement" failure mode; recommend `pipx install` over `pip install` so the binary lands on PATH automatically; show the explicit `pip install --user` alternative with the PATH caveat. #3 — MOLECULE_WORKSPACE_TOKEN_FILE support (mcp_workspace_resolver.py) Add a third resolution step between the inline env var and the in-container CONFIGS_DIR fallback. Operators can write the bearer to a 0600 file (e.g. ~/.config/molecule/token) and point MOLECULE_WORKSPACE_TOKEN_FILE at it, keeping the secret out of ~/.zsh_history and out of plaintext in MCP-host configs like ~/.claude.json. Inline TOKEN still wins on conflict so rotation flows are predictable. README documents the safer option as the recommended path. 6 new tests pin every leg (file resolves, inline wins, missing/empty file falls through, blank env unset-equivalent, help text advertises it). #4 — Push delivery 3-condition gating (README in publish bundle) Document that real-time push on Claude Code requires (a) the server to declare experimental.claude/channel (we do), (b) the server to be marketplace-plugin-sourced (operators must scaffold their own until the official marketplace lands — see #2934 follow-up), and (c) the --dangerously-load-development-channels flag on the claude invocation. Until any of the three is in place, delivery silently falls back to poll mode with no diagnostic. The README now says all of this explicitly so a new operator doesn't grep the binary for channel_enable to figure it out. #8 — serverInfo.name mismatch (a2a_mcp_server.py) The server reported `serverInfo.name = "a2a-delegation"` while operators register it as `molecule` (the name in `claude mcp add molecule …`). Harmless on tool routing today but matters for any future Claude Code allowlist that gates push by hardcoded server name. Renamed to "molecule" with an inline comment explaining the invariant. Deferred (separate issues to track) =================================== #2 — covered transitively by #1's pipx recommendation; no separate fix. #5 — `moleculesai/claude-code-plugin` marketplace repo (substantial new repo work; the README references it as a documented follow-up). #6 — `molecule-mcp doctor` subcommand (substantial new CLI surface; mentioned in the README's push-vs-poll section as the planned diagnostic for silent push fallback). #7 — `--dangerously-load-development-channels` rename — not in our control; that's Claude Code's flag. Tests ===== 164/164 mcp_cli + a2a_mcp_server tests pass locally (WORKSPACE_ID=00000000-0000-0000-0000-000000000001 pytest …) including 6 new TestTokenFileEnv cases. Wheel builds successfully via scripts/build_runtime_package.py with the new README markers verified in the output. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| demo-freeze-snapshots | ||
| ops | ||
| build_runtime_package.py | ||
| build-images.sh | ||
| bundle-compile.sh | ||
| canary-smoke.sh | ||
| check-cascade-list-vs-manifest.sh | ||
| cleanup-rogue-workspaces.sh | ||
| clone-manifest.sh | ||
| demo-day-runbook.md | ||
| demo-freeze.sh | ||
| demo-thaw.sh | ||
| dev-start.sh | ||
| import-agent.sh | ||
| lockdown-tenant-sg.sh | ||
| measure-coordinator-task-bounds-runner.sh | ||
| measure-coordinator-task-bounds.sh | ||
| nuke-and-rebuild.sh | ||
| post-rebuild-setup.sh | ||
| README.md | ||
| refresh-workspace-images.sh | ||
| rollback-latest.sh | ||
| test_build_runtime_package.py | ||
| test-a2a-cross-runtime.sh | ||
| test-all-adapters.sh | ||
| test-all-runtimes-a2a-e2e.sh | ||
| test-all.sh | ||
| test-cross-agent-chat.sh | ||
| test-hermes-plugin-e2e.sh | ||
| test-nuke-and-rebuild.sh | ||
| test-team-e2e.sh | ||
| wheel_smoke.py | ||
scripts/
Operational and one-off scripts for molecule-core. Most are self-documenting — see the header comments in each file.
RFC #2251 coordinator task-bound harnesses
There are three related scripts; pick the right one:
| Script | Purpose | Targets |
|---|---|---|
measure-coordinator-task-bounds.sh |
Canonical v1 harness for the RFC #2251 / Issue 4 reproduction. Provisions a PM coordinator + Researcher child via claude-code-default + langgraph templates, sends a synthesis-heavy A2A kickoff, observes elapsed time + activity trace. |
OSS-shape platform — localhost or any /workspaces-shaped endpoint. Has tenant/admin-token guards for non-localhost runs. |
measure-coordinator-task-bounds-runner.sh |
Generalised runner for the same measurement contract but with arbitrary template + secret + model combinations (Hermes/MiniMax, etc.). Useful for cross-runtime variants without modifying the canonical harness. | Same as above (local or SaaS via MODE=saas). |
measure-coordinator-task-bounds.sh (in molecule-controlplane) |
Production-shape variant that bootstraps a real staging tenant via POST /cp/admin/orgs, then runs the same measurement against <slug>.staging.moleculesai.app. |
Staging controlplane only — refuses to run against production. |
See reference_harness_pair_pattern (auto-memory) for when to use which
and the cross-repo design rationale.
Common safety pattern across all three
- Cleanup trap on EXIT/INT/TERM auto-deletes provisioned resources.
DRY_RUN=1prints plan + auth fingerprint, exits before any state mutation. Run this before pointing at staging or any shared infrastructure.- Non-target guard refuses arbitrary endpoints (the controlplane
variant is locked to
staging-api.moleculesai.app; the OSS variant requires explicit auth + tenant scoping for non-localhost PLATFORM). - Cleanup failures emit
cleanup_*_failedevents with remediation hints; no silenced curl. ADMIN_TOKEN expiring mid-run surfaces as a structured event rather than a silent leak.
Activity trace caveat
If activity_trace.raw == "<endpoint_unavailable>", the per-workspace
/activity endpoint isn't wired on the target build — the bound
measurement is INCONCLUSIVE on the platform-ceiling question. Either
wire the endpoint or replace with the equivalent Datadog query. Note
that /activity accepts a since_secs query parameter; see the
endpoint handler for the supported range.
Other scripts
cleanup-rogue-workspaces.sh— emergency teardown for leaked workspaces. Prompts for confirmation. Pair with the harnesses if a cleanup trap fails (seecleanup_*_failedevents).canary-smoke.sh— quick smoke test for canary releases.dev-start.sh— local-dev platform bring-up.
The rest are self-documenting in their header comments.