molecule-core

Author	SHA1	Message	Date
Hongming Wang	dc2f6bd378	Merge pull request #2167 from Molecule-AI/fix/saas-federation-tutorial-409 docs(saas-federation): fix workspace-limit response code (409, not 402) (#1754)	2026-04-27 11:36:02 +00:00
Hongming Wang	3679a6eff6	docs(saas-federation): fix workspace-limit response code (409, not 402) (#1754) Quota gates are resource-state conflicts, not payment failures — RFC 9110 reserves 402 for billing/payment failures specifically. The canonical Molecule-AI/docs PR #82 already shipped the corrected text; this brings the molecule-core copy of the tutorial in line. The inline parenthetical "(not 402 Payment Required — quota gates are resource-state conflicts, not payment failures, per RFC 9110)" doubles as a regression anchor: a future edit that flips 409 back to 402 would have to also reword that explanation, making the change a deliberate two-step act rather than a casual oversight. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 04:30:46 -07:00
hongmingwang-moleculeai	104650941a	Merge pull request #2165 from Molecule-AI/fix/main-sync-entry-point fix: restore main_sync entry point in workspace/main.py	2026-04-27 10:54:44 +00:00
hongmingwang-moleculeai	4c839cb306	Merge pull request #2164 from Molecule-AI/test/unblock-cp-provision-broadcast-test test(provisioner): unblock TestProvisionWorkspaceCP_NoInternalErrorsInBroadcast (#1814)	2026-04-27 10:54:44 +00:00
Hongming Wang	3df5867b56	fix: restore main_sync entry point in workspace/main.py The wheel's pyproject.toml has declared `molecule-runtime = "molecule_runtime.main:main_sync"` since the publish pipeline was created on 2026-04-26, but the function itself was never present in workspace/main.py — it lived in the pre-monorepo molecule-ai-workspace-runtime repo and was lost during the consolidation that made workspace/ the source of truth. The 0.1.15 wheel still had main_sync from a leftover snapshot, so the regression went unnoticed until 0.1.16 (the first wheel built from the new source-of-truth) shipped. Symptom: every workspace container restart loops with ImportError: cannot import name 'main_sync' from 'molecule_runtime.main' — the molecule-runtime CLI script's first line tries to import the missing symbol. Workspaces stay in `provisioning` until the 10-min sweep marks them failed. Caught by .github/workflows/runtime-pin-compat.yml, which already imports the symbol by name as its smoke test. (That check kept failing red on every recent merge_group run; this PR fixes the underlying symbol-not-found instead of the smoke step.) Also strengthens publish-runtime.yml's wheel smoke from `import molecule_runtime.main` (loads the module — passes even when entry-point target is missing) to `from molecule_runtime.main import main_sync` (the actual contract the CLI script needs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 03:35:49 -07:00
Hongming Wang	e15d1182cd	test(provisioner): unblock TestProvisionWorkspaceCP_NoInternalErrorsInBroadcast (#1814) The skipped test exists to assert that provisionWorkspaceCP never leaks err.Error() in WORKSPACE_PROVISION_FAILED broadcasts (regression guard for #1206). Writing the test body required substituting a failing CPProvisioner — but the handler's `cpProv` field was the concrete CPProvisioner type, so a mock had nowhere to plug in. Refactor: - Add provisioner.CPProvisionerAPI interface with the 3 methods handlers actually call (Start, Stop, GetConsoleOutput) - Compile-time assertion `var _ CPProvisionerAPI = (*CPProvisioner)(nil)` catches future method-signature drift at build time - WorkspaceHandler.cpProv narrowed to the interface; SetCPProvisioner accepts the interface (production caller passes *CPProvisioner from NewCPProvisioner unchanged) Test: - stubFailingCPProv whose Start returns a deliberately leaky error (machine_type=t3.large, ami=…, vpc=…, raw HTTP body fragment) - Drive provisionWorkspaceCP via the cpProv.Start failure path - Assert broadcast["error"] == "provisioning failed" (canned) - Assert no leak markers (machine type, AMI, VPC, subnet, HTTP body, raw error head) in any broadcast string value - Stop/GetConsoleOutput on the stub panic — flags a future regression that reaches into them on this path Verification: - Full workspace-server test suite passes (interface refactor is non-breaking; production caller path unchanged) - go build ./... clean - The other skipped test in this file (TestResolveAndStage_…) is a separate plugins.Registry refactor and remains skipped Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 03:28:25 -07:00
Hongming Wang	5022a740e1	Merge pull request #2163 from Molecule-AI/fix/build-script-drift-gate-and-main-smoke fix(release): drift-gate TOP_LEVEL_MODULES + smoke-import main (post-0.1.16 incident)	2026-04-27 10:22:06 +00:00
Hongming Wang	c68dc1877f	fix(release): drift-gate TOP_LEVEL_MODULES + smoke-import main in publish Two compounding bugs surfaced when 0.1.16 hit production today: 1. scripts/build_runtime_package.py had a hand-curated TOP_LEVEL_MODULES set listing every workspace/*.py that should get its bare imports rewritten to `molecule_runtime.X`. The set silently went stale: - Missing: transcript_auth (added since #87 phase 1c), runtime_wedge, watcher → unrewritten imports shipped, every workspace startup died with ModuleNotFoundError. - Stale: claude_sdk_executor, cli_executor (both removed in #87), hermes_executor (never existed) → harmless but misleading. 2. publish-runtime.yml's wheel-smoke step asserted on stable invariants (BaseAdapter, AdapterConfig, a2a_client error sentinel) but never imported main. So even though main.py held the broken bare `from transcript_auth import ...`, the smoke check passed. Fixes: - Build script now derives the on-disk module set from workspace/*.py and asserts it matches TOP_LEVEL_MODULES exactly. Drift in either direction fails the build with a specific diff message instead of shipping a broken wheel. Closed-list typo guard preserved (we still edit the set explicitly when a module is added/removed) — the gate just makes drift impossible to ignore. - TOP_LEVEL_MODULES updated to current reality: drop the 3 stale, add the 3 missing. - publish-runtime.yml wheel-smoke now `import molecule_runtime.main` before the invariant asserts. main is the entry point and transitively imports every module — any bare-import bug surfaces as ModuleNotFoundError before PyPI accepts the upload. Tested locally: `python3 scripts/build_runtime_package.py --version 0.1.99 --out /tmp/build-test` succeeds, and /tmp/build-test/molecule_runtime/main.py contains the rewritten `from molecule_runtime.transcript_auth import ...`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 03:19:17 -07:00
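The drift gate described in c68dc1877f could be sketched in Python roughly as below. This is an illustration under assumptions, not the code in scripts/build_runtime_package.py: the helper names `discover_modules` and `drift_report` are hypothetical, and the sketch assumes one top-level module per `.py` file directly under the workspace directory.

```python
from pathlib import Path
from typing import Optional, Set


def discover_modules(workspace_dir: Path) -> Set[str]:
    """Return the module names found on disk (hypothetical helper).

    Assumption: every *.py file directly under workspace_dir is one
    top-level module, named after the file stem.
    """
    return {path.stem for path in workspace_dir.glob("*.py")}


def drift_report(declared: Set[str], on_disk: Set[str]) -> Optional[str]:
    """Compare the hand-edited set with the on-disk set.

    Returns None when both sets match exactly, otherwise a message that
    names the drift in each direction so the build can fail loudly.
    """
    missing = sorted(on_disk - declared)  # on disk, not declared
    stale = sorted(declared - on_disk)    # declared, no longer on disk
    if not missing and not stale:
        return None
    return f"TOP_LEVEL_MODULES drift: missing={missing} stale={stale}"
```

A build script using this shape would raise (or exit non-zero) whenever `drift_report` returns a message, which keeps the closed list hand-edited while making any mismatch impossible to overlook.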
Hongming Wang	6f0774c708	Merge pull request #2162 from Molecule-AI/fix/e2e-sanity-rc-normalization fix(e2e-sanity): normalize unexpected curl exit codes in cleanup trap (#2159)	2026-04-27 10:05:14 +00:00
Hongming Wang	99fb61bb8c	fix(e2e-sanity): normalize unexpected curl exit codes in cleanup trap (#2159) When E2E_INTENTIONAL_FAILURE=1 poisons the tenant token, step 5/11's `tenant_call POST /workspaces` curl exits 22 (HTTP error under --fail-with-body). `set -e` propagates rc=22 directly, but the script's documented contract emits only {0,1,2,3,4}, and the sanity workflow's case statement only matches those. rc=22 falls through to "Unexpected rc — investigate harness" and opens a false-positive priority-high "safety net broken" issue (#2159, weekly run on 2026-04-27). The trap now captures $? at entry (must be the first statement before any command clobbers it) and at the end normalizes any non-contract code to 1 (generic failure). Leak detection continues to exit 4 directly, so its semantics are preserved. Adds tests/e2e/test_harness_rc_normalization.sh — a self-contained regression test that builds a stub harness with the same trap pattern, triggers controlled exit codes, and asserts the normalization. Covers the 5 contracted codes + curl-22 (the bug) + 3 representative network-failure codes + sigsegv-139. Verification: - 10/10 regression tests pass - shellcheck clean on both modified files - production teardown path unchanged for legitimate {1,2,3,4} failures and the leak-detection exit 4 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 02:55:44 -07:00
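The normalization rule from 99fb61bb8c, taken on its own and apart from the shell trap that applies it, can be sketched in Python. The function name `normalize_rc` is hypothetical. The contract set {0,1,2,3,4} and the fallback to 1 come from the commit message.

```python
# Exit codes the harness contract documents, per the commit message.
CONTRACT_CODES = frozenset({0, 1, 2, 3, 4})


def normalize_rc(rc: int) -> int:
    """Map any exit code outside the contract to 1 (generic failure).

    Contracted codes pass through unchanged, so the leak-detection
    exit 4 keeps its meaning.
    """
    return rc if rc in CONTRACT_CODES else 1
```

Under this rule curl's exit 22 becomes 1, and so does a segfault's 139.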
hongmingwang-moleculeai	c3d29941b8	Merge pull request #2161 from Molecule-AI/feat/auto-publish-runtime-on-staging feat(publish-runtime): auto-publish to PyPI on staging pushes touching workspace/	2026-04-27 09:20:12 +00:00
Hongming Wang	7d872f9661	Merge pull request #2160 from Molecule-AI/feat/skill-runtime-compat feat(skills): per-skill runtime compatibility (#119)	2026-04-27 09:15:01 +00:00
Hongming Wang	0a455b7d71	feat(publish-runtime): auto-publish to PyPI on staging pushes that touch workspace/ Adds a third trigger so any merge to staging that changes workspace/ auto-publishes a new molecule-ai-workspace-runtime patch release. Closes the human-in-loop gap that caused tonight's RuntimeCapabilities ImportError outage. Tonight: #117 added RuntimeCapabilities to molecule_runtime.adapters.base. The merge landed at 02:37 UTC. Templates rebuilt their images at 07:37 UTC (4 hours later) and started importing the new symbol. PyPI was still serving 0.1.15 (pre-#117) because nobody remembered to push a runtime-vX.Y.Z tag or workflow_dispatch the publish. Result: every template image shipped tonight runs `from molecule_runtime.adapters.base import RuntimeCapabilities` against an installed runtime that doesn't export it -> ImportError -> workspace never registers -> stuck in provisioning until 10-min sweep. Mechanism: - New trigger: push to staging filtered to paths: ['workspace/']. Path filter applies only to branch pushes; the existing tag trigger still fires unconditionally. - Version derivation for the auto case: query PyPI's JSON API for current latest, bump the patch component. PyPI is the source of truth so concurrent runs don't double-publish (HTTP 400 on collision). - concurrency: group serializes parallel staging merges so they don't race on the bump computation. cancel-in-progress: false because each workspace/** change deserves its own release. - publish job now exposes its derived version as a job-level output so the cascade reads it cleanly. Fixes a latent bug: cascade tried to read steps.version.outputs.version, which is from a different job's scope and silently resolved to empty -- then re-derived from GITHUB_REF_NAME, which would have been "staging" under the new trigger and produced an invalid version. Tag-driven and manual-dispatch paths are unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 02:11:45 -07:00
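The version derivation in 0a455b7d71 (read the current latest from PyPI, bump the patch component) could look roughly like the sketch below. It assumes a plain `MAJOR.MINOR.PATCH` version string and works on an already-fetched JSON payload, so no network call is made. Both function names are hypothetical, and the actual workflow step may be written quite differently.

```python
from typing import Any, Dict


def latest_from_pypi_payload(payload: Dict[str, Any]) -> str:
    """Extract the latest released version from a PyPI JSON API response.

    The project endpoint (https://pypi.org/pypi/<project>/json) reports
    the latest version under info.version.
    """
    return payload["info"]["version"]


def next_patch_version(latest: str) -> str:
    """Bump the patch component of a MAJOR.MINOR.PATCH version string.

    Assumption: exactly three numeric components, with no pre-release
    or build suffixes.
    """
    major, minor, patch = (int(part) for part in latest.split("."))
    return f"{major}.{minor}.{patch + 1}"
```

For example, a payload reporting `0.1.15` yields `0.1.16` as the next release.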
Hongming Wang	d19d35f6b3	test(skills): make watcher test fakes accept current_runtime kwarg The runtime-compat change in this branch added a `current_runtime` kwarg to load_skills(); the watcher passes it through. Test mocks that pre-date the kwarg signature broke with TypeError, which the watcher's reload-error try/except swallowed — the symptom was empty callback lists, not a clear failure. Switching the fakes to accept **kwargs keeps them forward-compat for future load_skills additions without another test churn. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 02:04:26 -07:00
Hongming Wang	d0057912d2	feat(skills): per-skill runtime compatibility (#119, hermes pattern) SKILL.md frontmatter can now declare `runtime: [claude-code]` or `runtime: [hermes, claude-code]` to opt out of incompatible adapters instead of failing at first invocation. Default `["*"]` means universal — existing skill libraries need zero migration. Borrowed from hermes' declarative skill-compat pattern surfaced in the hermes architecture survey. The remaining two patterns (event-log layer, observability config block) stay open under #119. Wiring: - SkillMetadata.runtime: list[str] = ["*"] - _normalize_runtime_field accepts list, string-sugar, missing -> ["*"]; malformed warns and falls back to universal so a typo never silently drops a skill. - load_skills(..., current_runtime=...) filters out skills whose runtime list lacks "*" or current_runtime, with an INFO log line. - BaseAdapter.start passes type(self).name() so the live adapter drives the filter; SkillsWatcher takes the same kwarg so hot-reload honors it. 8 new tests cover default universal, no-field universal, explicit match/mismatch, string sugar, wildcard short-circuit, current_runtime=None (preserves old behavior), and malformed-warns-not-drops. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 01:57:43 -07:00
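The normalization and filtering rules listed in d0057912d2 can be sketched as follows. This is a minimal illustration, not the repository's `_normalize_runtime_field` or `load_skills`: the function names, the logger name, and the use of `"*"` as the universal marker are assumptions of this sketch, and treating an empty list as malformed is a choice made here rather than something the commit message states.

```python
import logging
from typing import Any, List, Optional

UNIVERSAL = "*"  # assumed wildcard marker meaning "any runtime"
logger = logging.getLogger("skills_sketch")  # hypothetical logger name


def normalize_runtime_field(raw: Any) -> List[str]:
    """Coerce a frontmatter `runtime` value into a list of runtime names."""
    if raw is None:
        return [UNIVERSAL]  # field missing: universal
    if isinstance(raw, str):
        return [raw]  # string sugar for a one-element list
    if isinstance(raw, list) and raw and all(isinstance(x, str) for x in raw):
        return list(raw)
    # Malformed value: warn and fall back to universal, so a typo never
    # silently drops a skill.
    logger.warning("malformed runtime field %r; treating as universal", raw)
    return [UNIVERSAL]


def is_compatible(runtimes: List[str], current_runtime: Optional[str]) -> bool:
    """Decide whether a skill should load under the current runtime."""
    if current_runtime is None:
        return True  # no runtime given: keep the old load-everything behavior
    return UNIVERSAL in runtimes or current_runtime in runtimes
```

A loader built on this shape would skip any skill for which `is_compatible` returns `False` and log the skip at INFO level.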
Hongming Wang	e99f937630	Merge pull request #2157 from Molecule-AI/chore/drop-cli-executor-from-runtime chore(workspace): drop cli_executor — Phase 3 of #87 [DRAFT]	2026-04-27 08:24:30 +00:00
Hongming Wang	4959c37040	Merge pull request #2158 from Molecule-AI/feat/steer-agent-to-attachments-field feat(tools): tighten send_message_to_user description to forbid pasting URLs in body	2026-04-27 08:24:02 +00:00
Hongming Wang	98ca5c50fa	chore(workspace): drop cli_executor — Phase 3 of #87 (DRAFT, blocked on gemini-cli image rebuild) DRAFT — do NOT merge until gemini-cli template image rebuilds with its local cli_executor.py copy (template PR #9 just merged at 07:59 UTC; image build kicks off now). Final adapter-specific deletion from molecule-runtime, completing #87 for the priority adapters (claude-code via PR #2156, plus gemini-cli via this PR + template #9). Deletes: - workspace/cli_executor.py (461 LOC) — CLIAgentExecutor + the RUNTIME_PRESETS dict for codex / ollama / gemini-cli. The file moved to molecule-ai-workspace-template-gemini-cli (PR #9, merged). - workspace/tests/test_agent_base_urls.py — only consumer of CLIAgentExecutor in the test suite. Tests for the executor behavior live in the template repo now. Updates: - workspace/tests/test_executor_helpers.py — docstring refresh: executor_helpers.py is the runtime-agnostic shared helpers; the executor classes themselves live in template repos post-#87. Codex / ollama presets disappear naturally with the file. They never had template repos, so no production path could invoke them anyway — this is dead-code removal as a side effect of the move. Verified-safe-to-delete: - heartbeat.py: doesn't import cli_executor - claude_sdk_executor.py: deleted by PR #2156 (in flight) - preflight.py: only references runtime names by string; no import - main.py: doesn't import cli_executor (uses adapter discovery via ADAPTER_MODULE; the template's adapter constructs the executor) - Only test_agent_base_urls.py + test_executor_helpers.py docstring referenced cli_executor Verification: - 1249/1249 workspace pytest pass (was 1251; -2 = test_agent_base_urls.py cases — exact match) - No live import of cli_executor anywhere in molecule-core after deletion (grep verified) Sequencing: 1. ✅ Template PR #9 (gemini-cli local copy) — MERGED 2. ⏳ Template image rebuild — running 3. THIS PR — wait until image is published, then mark ready-for-review Closes #87 for the priority adapters: workspace/ is now adapter-agnostic except for adapter discovery (ADAPTER_MODULE) + the runtime_wedge primitive. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 01:22:39 -07:00
Hongming Wang	7504aba934	feat(tools): tighten send_message_to_user description to forbid pasting URLs in body Root-cause fix for #118 (chat attachments rendering as plain text links instead of download chips). User flagged with screenshot 2026-04-26 showing the Design Director agent pasting https://files.catbox.moe/… in the message body — chat rendered the URL as plain markdown text, unclickable in the canvas's bubble layout, and unreachable in any SaaS deployment where the user's browser can't egress to catbox. The structured `attachments` field already exists, the canvas's AttachmentChip already renders well, the WebSocket broadcast already carries attachments verbatim — the missing piece was the LLM choosing the body over the structured field. Tighten the tool description so it trains the right behavior. Three targeted strengthenings: 1. Top-level tool description: enumerated use case (4) now reads "via the `attachments` field (NEVER paste file URLs in `message`)". The all-caps NEVER + the explicit field name move the LLM toward the structured path on first read. 2. `message` param: adds an explicit DO NOT rule with rationale. Includes the SaaS-reachability reason so operators can grep for "SaaS" and find this design constraint instead of re-discovering it after a tenant complaint. Calls out catbox.moe + file:// by name as concrete examples of forbidden hosts (those are the two we've seen in production). 3. `attachments` param: leads with REQUIRED, lists the bad alternatives explicitly (pasting URLs, base64-encoding, telling user to look at a path). LLMs handle "use X, NOT Y" framings better than "use X" alone — observed during prompt-engineering iteration on hermes' tool descriptions. Tests pin all three load-bearing phrases (4 new in test_a2a_mcp_server.py) so a future doc edit that softens or drops them fails CI. Brittle by design — these are prompt-engineering invariants, not implementation details. This is the root-cause fix. A defensive canvas-side backstop (auto-detect download-shaped URLs in body and convert to chips) is a follow-up that could land separately if the steering proves insufficient in practice. Verification: - 1190/1190 workspace pytest pass - 4 new test_a2a_mcp_server.py cases all green Closes the steering half of #118. The structured-attachments-only contract was already enforced server-side (PR #2130 added per-attachment validation); this PR closes the prompt-side gap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 01:13:11 -07:00
Hongming Wang	4e6030d783	Merge pull request #2156 from Molecule-AI/chore/drop-claude-sdk-executor-from-runtime chore(workspace): drop claude_sdk_executor — Phase 2 of #87	2026-04-27 08:02:51 +00:00
Hongming Wang	2fbf6b6b27	Merge pull request #2155 from Molecule-AI/feat/preflight-runtime-discovery feat(preflight): replace SUPPORTED_RUNTIMES static list with adapter discovery	2026-04-27 08:02:39 +00:00
Hongming Wang	4b5ac2ebc2	chore(workspace): drop claude_sdk_executor — Phase 2 of #87 Phase 2 of the universal-runtime refactor (task #87). Now that the claude-code template repo ships its own claude_sdk_executor.py (template PR #13 merged + image rebuilt at 07:36 UTC) the molecule-runtime no longer needs to ship the file. Deletes: - workspace/claude_sdk_executor.py (704 LOC) - workspace/tests/test_claude_sdk_executor.py (~1.6K LOC) Updates: - workspace/runtime_wedge.py — drops the "Compatibility shim" docstring section. The shim was time-bounded ("removed once #87 Phase 2 lands"); this is that PR. - workspace/tests/test_runtime_wedge.py — drops the TestClaudeSdkExecutorReExportShim test class (the shim doesn't exist anymore so the identity assertions would fail at import). - workspace/tests/conftest.py — drops the claude_agent_sdk stub. Its only consumer was test_claude_sdk_executor.py which is gone; no other test imports the SDK. - workspace/cli_executor.py — comment refresh: claude-code template repo (not workspace/) is now the home for ClaudeSDKExecutor. Verified-safe-to-delete: - heartbeat.py: migrated to runtime_wedge in PR #2154 (no longer imports from claude_sdk_executor) - cli_executor.py: only comments referenced claude_sdk_executor; its line-117 ValueError defends against accidental routing - tests: only test_claude_sdk_executor.py + test_runtime_wedge.py's shim class consumed the deleted module; both removed in this PR Verification: - 1182/1182 workspace pytest pass (was 1251; -69 = exactly the deleted test cases — zero unexpected regressions) - No live import of claude_sdk_executor anywhere in molecule-core after deletion (grep verified) Closes #87 for the claude-code adapter. Hermes is already template-only. The remaining adapter-specific code in workspace/ is cli_executor.py (codex/ollama/gemini-cli) tracked by task #122. preflight.py's SUPPORTED_RUNTIMES static list is tracked by task #123 (PR #2155 in flight). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:52:55 -07:00
Hongming Wang	7dba700ac3	feat(preflight): replace SUPPORTED_RUNTIMES static list with adapter discovery Closes task #123 — last piece of #87 cleanup. Pre-fix: workspace/preflight.py:11 hardcoded a tuple of "supported" runtime names (claude-code, codex, ollama, langgraph, etc.). Every new template repo required a code change in molecule-runtime to be recognized — direct violation of the universal-runtime principle (#87) where adapters declare themselves and the runtime stays generic. Post-fix: discovery-based validation via the same ADAPTER_MODULE env var that production load paths already consult (workspace/adapters/__init__.py:get_adapter). Distinguished failure modes so operator messages are concrete: - ADAPTER_MODULE unset → "no adapter installed; set the env var" - ADAPTER_MODULE set but module won't import → import error type + message - module imports but no Adapter class → "convention violation, add `Adapter = YourClass`" - Adapter.name() raises → caught with operator message - Adapter.name() returns non-string → contract violation message - Adapter.name() doesn't match config.runtime → drift WARNING (not fatal; the adapter wins in production, config.yaml is just documentation) The drift case is the one behavioral change worth calling out: the prior static-list path would have hard-failed config.runtime values not in the allowlist. With discovery, an unknown runtime in config.yaml is just a documentation drift — the adapter that's actually installed runs regardless. Operator gets a warning naming both the configured and installed names so they can fix whichever is stale. Tests: - Replaces the obsolete "static list pass/fail" tests with 6 new cases covering each distinguished failure mode, plus a positive test for the adapter-matches-config happy path - Adds an autouse `_default_langgraph_adapter` fixture that pre-installs a fake adapter via sys.modules monkey-patching, so existing tests building default WorkspaceConfig (runtime="langgraph") inherit a valid adapter without each test setting ADAPTER_MODULE - Failure-mode tests opt out of the default fixture via @pytest.mark.no_default_adapter (registered in pytest.ini) - Sentinel pattern (`_UNSET = object()`) for `name_returns` so None is a passable test value (otherwise `is not None` would skip the None branch — exact bug the sentinel avoids) Verification: - 22/22 preflight tests pass (was 16; +6 new failure-path tests) - 1256/1256 workspace pytest pass (was 1251; +5 net) - No production code path other than preflight changed Source: 2026-04-27 #87 cleanup audit after PR #2154 (wedge extraction). This change is independent of the cli_executor.py template moves (task #122) — completes one of the two remaining cleanup items. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:44:51 -07:00
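The sentinel pattern mentioned in 7dba700ac3's test notes can be illustrated with a small sketch. The factory `make_fake_adapter` and the default name `"langgraph"` are assumptions made for this example. Only the `_UNSET = object()` sentinel and the `name_returns` parameter are named in the commit message.

```python
_UNSET = object()  # unique sentinel: distinguishes "not passed" from None


def make_fake_adapter(name_returns=_UNSET):
    """Build a fake Adapter class whose name() returns a chosen value.

    Comparing against the sentinel (rather than `is not None`) lets a
    test pass None deliberately to exercise the non-string branch.
    """

    class FakeAdapter:
        @staticmethod
        def name():
            if name_returns is _UNSET:
                return "langgraph"  # assumed default for the happy path
            return name_returns

    return FakeAdapter
```

With an `is not None` check, `make_fake_adapter(name_returns=None)` would fall back to the default and the non-string failure path would never run.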
Hongming Wang	66b9c04057	Merge pull request #2154 from Molecule-AI/refactor/extract-wedge-state-from-claude-sdk refactor(wedge): extract claude_sdk_executor wedge state into runtime_wedge module	2026-04-27 07:22:20 +00:00
Hongming Wang	5e049244d6	refactor(wedge): mark re-exports explicit via __all__ Addresses github-code-quality unused-import flag on the runtime_wedge re-export shim. Adds __all__ listing the names that exist purely for backwards-compat (is_wedged / wedge_reason / _reset_sdk_wedge_for_test) so static analysis recognizes the imports as deliberate exports. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:20:23 -07:00
Hongming Wang	feb544938b	refactor(wedge): address review feedback — class wrap + import-path doc + dedupe shim rationale Three changes from /code-review-and-quality on PR #2154: 1. Optional (architecture): wrap state in a private _WedgeState class instead of bare module-level globals. Public API (mark_wedged / clear_wedge / is_wedged / wedge_reason / reset_for_test) is unchanged — adapters never see the class. The class is forward-cover for any future per-scope variant (multiple executors per process, a keyed registry, etc.) without churning the call sites. Today there's exactly one instance (_DEFAULT) so behavior is identical. 2. Optional (readability): clarify the import path in the integration recipe — in a TEMPLATE repo it's `from molecule_runtime.runtime_wedge` (PyPI package); in molecule-core itself it's `from runtime_wedge` (top-level module). Removes the trap where a contributor reading the docstring while editing in-repo copies the template-style import and gets ImportError. 3. Nit (readability): dedupe the shim rationale. claude_sdk_executor's re-export comment now points to runtime_wedge's "Compatibility shim" section as the source of truth instead of restating the same content. Avoids docs-in-two-places drift risk. Verification: - 1251/1251 workspace pytest pass (no behavior change — class wrap is pure plumbing; module-level helpers delegate to the singleton) - All shim re-export identity tests still pass (the shim's `is_wedged is runtime_wedge.is_wedged` assertion holds because we re-export the SAME function object that delegates to _DEFAULT) No new tests needed — the existing test suite covers the public API contract; the class is an implementation detail behind that contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:16:33 -07:00
Hongming Wang	cd899c969f	docs(wedge): integration recipe for adapters that want to flip-to-degraded Doc-only follow-up to the wedge-state extraction. Adds proactive guidance so the next adapter (hermes / codex / langgraph / a future template) discovers the runtime_wedge primitive and integrates the ~6 LOC pattern uniformly instead of inventing its own wedge state. Two additions: - workspace/runtime_wedge.py — new "How to use from a NEW adapter" section in the module docstring with the minimum viable integration recipe, what-you-get-for-free list, and explicit DON'TS (don't store local wedge state, don't mark for transient errors, don't write your own clear logic). Plus a "when wedge is the WRONG primitive" note to keep adopters from over-using it. - workspace/adapter_base.py — adds runtime_wedge to the "Cross-cutting capabilities your adapter can opt into" list in BaseAdapter's docstring (alongside capabilities() and idle_timeout_override()). Discoverability path: adapter author reads BaseAdapter docstring → sees runtime_wedge mention → reads runtime_wedge module docstring → has the recipe. Also tightens the "to add a new agent infra" steps in BaseAdapter to match the actual current model (standalone template repo + ADAPTER_MODULE env var) rather than the obsolete workspace/adapters/<infra>/ layout that hasn't been the path since the universal-runtime extraction started. Zero code change. Tests untouched (1251/1251 still pass). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:12:14 -07:00
Hongming Wang	1d231ed295	refactor(wedge): extract claude_sdk_executor wedge state into runtime_wedge module Prerequisite for the universal-runtime refactor (task #87) to move claude_sdk_executor.py out of molecule-runtime into the claude-code template repo. heartbeat.py had a hard import: from claude_sdk_executor import is_wedged, wedge_reason which would break the moment the executor moves out of the runtime package — the heartbeat would lose access to the wedge state used to flip workspace status to degraded. Extract the wedge state to a runtime-side module that the heartbeat can keep importing regardless of which adapter executor is wedged: - workspace/runtime_wedge.py — single-flag state + mark_wedged / clear_wedge / is_wedged / wedge_reason / reset_for_test. Same semantics as the original claude_sdk_executor implementation (sticky first-write-wins, auto-clear on observed success). 100 LOC of pure stateless helpers; lock-free ok because there's one executor per workspace process today. - workspace/claude_sdk_executor.py — drops the in-file definitions; re-exports the same names from runtime_wedge as a backwards-compat shim. Any third-party adapter that imported is_wedged / wedge_reason / _mark_sdk_wedged from claude_sdk_executor keeps working for one release cycle while they migrate to runtime_wedge. - workspace/heartbeat.py — _runtime_state_payload() now imports from runtime_wedge instead of claude_sdk_executor. Lazy-import pattern preserved; the docstring updated to explain the new cross-cutting source-of-truth. Tests (10 new in test_runtime_wedge.py): - Default state (unwedged), mark sets flag, first-write-wins, clear restores healthy, clear-when-not-wedged is no-op, re-marking after clear is allowed - Re-export shim: each old name in claude_sdk_executor IS the runtime_wedge function (identity check), state is shared (marking via the executor shim is observable via runtime_wedge and vice versa) Verification: - 1251/1251 workspace pytest pass (was 1241 after orphan deletion; +10 = exactly the new test_runtime_wedge.py cases) - All existing test_claude_sdk_executor.py cases (which call _mark_sdk_wedged via the shim) still pass After this lands + the claude-code template image rebuilds with the local claude_sdk_executor.py copy (template PR #13), the molecule-core deletion of workspace/claude_sdk_executor.py becomes safe (the shim deletion comes alongside the file deletion, since runtime_wedge is the new public API). See project memory `project_runtime_native_pluggable.md`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:08:53 -07:00
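The wedge semantics described in 1d231ed295 (a single flag, sticky first-write-wins, clearable, re-markable after a clear) can be sketched as below. The public function names follow the commit message, but the signatures and bodies are assumptions of this sketch, not the contents of workspace/runtime_wedge.py. It uses a small private class holding the state, which is one possible layout.

```python
from typing import Optional


class _WedgeState:
    """Holds one wedge flag; the first recorded reason wins until cleared."""

    def __init__(self) -> None:
        self._reason: Optional[str] = None

    def mark(self, reason: str) -> None:
        if self._reason is None:  # sticky: later marks do not overwrite
            self._reason = reason

    def clear(self) -> None:
        self._reason = None  # no-op when not wedged

    def reason(self) -> Optional[str]:
        return self._reason


_DEFAULT = _WedgeState()  # single instance: one executor per process


def mark_wedged(reason: str) -> None:
    _DEFAULT.mark(reason)


def clear_wedge() -> None:
    _DEFAULT.clear()


def is_wedged() -> bool:
    return _DEFAULT.reason() is not None


def wedge_reason() -> Optional[str]:
    return _DEFAULT.reason()


def reset_for_test() -> None:
    _DEFAULT.clear()
```

A heartbeat built on these helpers would read `is_wedged()` and `wedge_reason()` when assembling its status payload, without needing to know which adapter executor set the flag.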
Hongming Wang	c1e9aa7461	Merge pull request #2153 from Molecule-AI/fix/block-internal-paths-shallow-clone-bug fix(ci): block-internal-paths handle merge_group + shallow-clone BASE	2026-04-27 06:58:32 +00:00
hongmingwang-moleculeai	5d49cd7843	Merge pull request #2152 from Molecule-AI/chore/delete-orphan-hermes-executor chore(workspace): delete orphan HermesA2AExecutor (-1.8K LOC dead code)	2026-04-27 06:58:21 +00:00
Hongming Wang	d46d558ca9	Merge pull request #2148 from Molecule-AI/test/canvas-lib-utils-runtime-names-1815 test(canvas): cover utils.cn + runtime-names.runtimeDisplayName (0% → 100%) (#1815)	2026-04-27 06:57:57 +00:00
Hongming Wang	a682dcb502	Merge pull request #2149 from Molecule-AI/test/canvas-actions-1815 test(canvas): cover canvas-actions restart-pending helpers (25% → 100%) (#1815)	2026-04-27 06:55:36 +00:00
Hongming Wang	17a6800374	Merge pull request #2150 from Molecule-AI/feat/priority-runtimes-e2e test(e2e): claude-code + hermes priority-runtimes happy path	2026-04-27 06:55:20 +00:00
Hongming Wang	ae029f8c3f	Merge pull request #2151 from Molecule-AI/test/canvas-class-names-1815 test(canvas): cover store/classNames helpers (17% → 100%) (#1815)	2026-04-27 06:54:37 +00:00
Hongming Wang	516b58dcd7	Merge pull request #2147 from Molecule-AI/feat/canvas-coverage-instrumentation-1815 feat(canvas): vitest coverage instrumentation (#1815, no CI gate yet)	2026-04-27 06:54:22 +00:00
Hongming Wang	7ac7a010fa	fix(ci): block-internal-paths handle merge_group + shallow-clone BASE [Molecule-Platform-Evolvement-Manager] ## What was broken Same bug class as the secret-scan.yml fix in #2120 — block-internal-paths hit `fatal: bad object <sha>` exit 128 on the staging push at 2026-04-27 06:50:33Z. Two cases: 1. `merge_group` events: BASE/HEAD came from `github.event.before` / `.after` which are push-event-only properties. On merge_group both came back empty, the script fell through to "scan entire tree" mode which is correct but inefficient. Worse, when this workflow is required for the merge queue (line 21-22), an empty-BASE entire-tree scan would run on every queue check. 2. `push` events with shallow clones: `fetch-depth: 2` doesn't always cover BASE across true merge commits. When BASE is in the payload but absent from the local object DB, `git diff` errors out with `fatal: bad object <sha>` and the job exits 128. This is what broke today's staging push. ## Fix Same shape as the secret-scan.yml fix (#2120): - Add a dedicated `git fetch` step for `merge_group.base_sha`. - Move event-specific SHAs into a step `env:` block; script uses a `case` over `${{ github.event_name }}` covering pull_request / merge_group / push (rather than `if pull_request / else push` which left merge_group on the empty-BASE branch). - On-demand fetch + `git cat-file -e` guard for push BASE so a SHA that's payload-present-but-DB-absent triggers the fetch, and a fetch failure falls through cleanly to "scan entire tree" instead of exiting 128. ## Test plan - [x] YAML structure preserved (no schema changes) - [x] Bash logic mirrors the secret-scan recovery path tested in #2120 - [ ] CI green on this PR's pull_request scan + push to staging post-merge 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:54:00 -07:00
Hongming Wang	fa8deb9d16	chore(workspace): delete orphan HermesA2AExecutor (dead code, 1.8K LOC) Removes: - workspace/hermes_executor.py (545 LOC) — HermesA2AExecutor, an OpenAI-compat direct-call executor that was the original hermes integration before the template was rewritten to bridge to hermes-agent's sidecar API server. - workspace/tests/test_hermes_executor.py (1307 LOC) — its test file. Verified-dead-code analysis: - Zero `from hermes_executor` / `import hermes_executor` imports anywhere in workspace/, workspace-server/, or workspace-configs-templates/ (excluding the file itself + its test). - The hermes template (workspace-configs-templates/hermes/executor.py) uses HermesAgentProxyExecutor, NOT HermesA2AExecutor — they're independent implementations. The executor.py file imports from `executor` (local), not from molecule_runtime. - Last touched in PR #1974 (2026 a2a-sdk migration to 1.0.0) for SDK compatibility — kept compiling but never wired into any code path. - Older than that, only the 2026 open-source restructure rename. Why now: starting task #87 (universal-runtime violation, move adapter- specific code out of workspace/). Dead-code deletion is the safest first step and motivates the broader refactor by clearing the landscape — no risk of someone defending HermesA2AExecutor as "actually used somewhere." Verification: - 1241/1241 workspace pytest pass (was 1312; the 71 dropped tests are exactly test_hermes_executor.py's coverage) - No new failures, no broken imports anywhere The remaining adapter-specific executors in workspace/ that #87 will eventually relocate (per the user's scope: claude-code + hermes priority, others later): - workspace/claude_sdk_executor.py (757 LOC) → claude-code template repo - workspace/cli_executor.py (461 LOC) → defer (codex/ollama/etc still use the runtime presets here; comes back later when those bump versions) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:52:10 -07:00
Hongming Wang	679e30538a	test(canvas): cover store/classNames helpers (17% → 100%) (#1815 ) [Molecule-Platform-Evolvement-Manager] Continues the #1815 coverage rollup. classNames.ts was at 17% in the baseline; this PR brings it to full coverage. 16 cases across 3 helpers: appendClass (6): - undefined / empty existing → just `cls` - single-class → "a b" join - DEDUP: existing already contains `cls` → existing unchanged. This is the load-bearing reason classNames.ts exists. Pre-helper the call sites inlined `${existing} ${cls}` with no dedup, so a tick that fired the same class twice produced "a a" and React Flow's className-equality diff saw it as a change every render. - whitespace normalization (multi-space, leading/trailing) removeClass (7): - undefined / empty existing → "" - removes named class - exact match only ("spawn" must NOT match "spawn-fast") - removing the only class → "" - no-op when class absent - whitespace normalization scheduleNodeClassRemoval (3): - after delayMs: calls set() with className-removed on target node; OTHER nodes untouched (the per-id pruning is the contract — pin it so a future refactor that maps over all nodes doesn't silently strip classes from siblings) - does NOT fire before the delay elapses (vi.useFakeTimers + advance) - SSR safety: when window is undefined, function is a no-op (neither get nor set fires) ## Note on test environment Added `// @vitest-environment jsdom` directive — the file's default `node` environment leaves `window` undefined, which would make the SSR-guard happy-path test pass for the wrong reason (every test would short-circuit). With jsdom, the SSR test explicitly stubs `window` to undefined to exercise the guard. 
## Test plan - [x] All 16 cases pass locally (~1.1s with jsdom env spin-up) - [x] No SUT changes - [ ] CI green ## #1815 progress - [x] Step 1+2: instrumentation (#2147) - [x] utils.ts + runtime-names.ts (#2148) - [x] canvas-actions.ts (#2149) - [x] store/classNames.ts (this PR) - [ ] store/canvas.ts (73% — biggest absolute gap; bigger surface, separate cycle) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:50:00 -07:00
Hongming Wang	a4b3ebf951	test(e2e): claude-code + hermes priority-runtimes happy path Self-contained happy-path E2E for the two runtimes the project commits to first-class support for (task #116, completes the loop on the "both must work end-to-end with tests" requirement). What it proves per runtime: 1. POST /workspaces succeeds with the runtime + secrets 2. Workspace reaches status=online within its cold-boot window (claude-code: 240s, hermes: 900s on cold apt + uv + sidecar) 3. POST /a2a (message/send "Reply with PONG") returns a non-error, non-empty reply 4. activity_logs row written with method=message/send and ok\|error status (a2a_proxy.LogActivity contract) Skip semantics: each phase independently checks for its required env key (CLAUDE_CODE_OAUTH_TOKEN / E2E_OPENAI_API_KEY) and skips cleanly if absent. The script always exit-0s if every phase either passed or skipped — so wiring it into a no-keys CI job validates the script itself stays clean without false-failing. Idempotent: pre-sweeps any prior "Priority E2E (claude-code)" / "Priority E2E (hermes)" workspaces so a run interrupted by SIGPIPE / kill -9 (which bypasses the EXIT trap) doesn't poison the next run. Same defensive pattern as test_notify_attachments_e2e.sh. CI wiring: - e2e-api.yml — runs on every PR with no LLM keys, both phases skip, catches script-level regressions (set -u bugs, syntax issues, etc.) - canary-staging.yml + e2e-staging-saas.yml already have the keys via secrets.MOLECULE_STAGING_OPENAI_KEY and exercise wire-real behavior — could be wired to opt-in if you want claude-code coverage there too. Local runs (from this branch, no keys): === Results: 0 passed, 0 failed, 2 skipped === Validates the capability primitives shipped in PRs #2137-2144: once template PRs #12 (claude-code) + #25 (hermes) merge with their declared provides_native_session=True + idle_timeout_override=900, a manual run with both keys validates the full native+pluggable chain. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:48:54 -07:00
Hongming Wang	e5e4eb4d2a	test(canvas): cover canvas-actions restart-pending helpers (25% → 100%) (#1815 ) [Molecule-Platform-Evolvement-Manager] Continues the #1815 coverage rollup. canvas-actions.ts was at 25% in the baseline run from #2147; this PR brings the file's two helpers to full coverage. 5 cases: markAllWorkspacesNeedRestart (3): - calls updateNodeData on every node with `{needsRestart: true}` - no-op when the canvas has zero workspaces - preserves call ordering — matters because the toolbar's Restart Pending pill observes per-node data changes incrementally; a refactor that shuffled iteration order would silently change which workspaces flash first markWorkspaceNeedsRestart (2): - targeted call: updateNodeData fires exactly once on the named id - defensive: regardless of how many other workspaces exist in the store, only the target workspace gets updated. Pre-this-test, a refactor that accidentally wired this function through the per-node iteration path of markAll would silently mark every workspace — pinning the cardinality here catches that. ## Mock strategy Standard pattern for canvas store: mock useCanvasStore as both the selector function AND a getState()-bearing object. updateNodeData is a vi.fn() spy so the test asserts on calls + args directly. ## Test plan - [x] All 5 cases pass locally (~132ms) - [x] No SUT changes — pure additive coverage - [ ] CI green ## #1815 progress - [x] Step 1+2: instrumentation + script (#2147) - [x] utils.ts + runtime-names.ts (#2148) - [x] canvas-actions.ts (this PR) - [ ] Remaining low-coverage targets: store/classNames.ts (17%), store/canvas.ts (73% — largest absolute gap by lines) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:47:49 -07:00
Hongming Wang	4fc37a76d9	Merge pull request #2143 from Molecule-AI/test/canvas-a2a-edge-2071 test(canvas): unit tests for A2AEdge — selection + Activity-tab routing (#2071)	2026-04-27 06:45:58 +00:00
Hongming Wang	bfbbe57610	test(canvas): cover utils.cn + runtime-names.runtimeDisplayName (0% → 100%) (#1815 ) [Molecule-Platform-Evolvement-Manager] Closes two of the 0%-coverage files surfaced by the baseline run in PR #2147 (vitest coverage instrumentation). Both files are tiny utility helpers with high-touch read paths. ## utils.cn (8 cases) Wraps `twMerge(clsx(inputs))` — every conditionally-styled component flows through here. The load-bearing case is the last-wins Tailwind dedup: `cn("p-2", "p-4")` → "p-4". A regression that lost twMerge would silently double-apply utilities (cosmetically broken, breaks `:where()` rules + theme overrides). Cases: - single class unchanged - multiple positional classes joined - array input flattening (clsx) - object syntax with truthy/falsy keys - last-wins dedup on conflicting Tailwind utilities (the regression-locked guarantee) - non-conflicting utilities both survive (p-2 + m-4) - mixed input shapes (string + array + object + string) - nullish / empty inputs don't throw ## runtime-names.runtimeDisplayName (4 it.each cases + 3 it()) Friendly-name lookup that surfaces the workspace runtime in the chat indicator, details tab, and a few component labels. Cases: - known runtimes map to display strings (claude-code → Claude Code, langgraph → LangGraph, etc.) - unknown runtime falls back to input string verbatim (a NEW runtime not yet in the lookup still renders something operator-debuggable rather than a generic placeholder) - empty string falls back to "agent" (final default) - case-sensitivity pinned: "Claude-Code" / "LANGGRAPH" miss the lookup. The upstream slug is already normalized lowercase, so a future refactor that lowercases input "for safety" would silently change behavior — pinning the contract here. 
## Test plan - [x] All 17 cases pass locally (~129ms) - [x] No SUT changes — pure additive coverage - [ ] CI green ## #1815 progress - [x] Step 1+2: coverage instrumentation + script (#2147) - [x] 0%-file gaps utils.ts + runtime-names.ts (this PR) - [ ] More 0%/low-coverage files: lib/canvas-actions.ts (25%), store/classNames.ts (17%) — separate PRs - [ ] Step 3b: thresholds + CI gate once baseline catches up 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:45:51 -07:00
Hongming Wang	d64ee7b4e4	Merge pull request #2145 from Molecule-AI/test/canvas-org-cancel-button-2071 test(canvas): unit tests for OrgCancelButton — cascade-delete + optimistic store (#2071)	2026-04-27 06:45:47 +00:00
Hongming Wang	e06bc4f832	Merge pull request #2146 from Molecule-AI/test/canvas-drag-utils-2071 test(canvas): unit tests for dragUtils — nest hysteresis + clamp geometry (#2071)	2026-04-27 06:45:37 +00:00
Hongming Wang	57457899a1	feat(canvas): vitest coverage instrumentation (#1815 , no CI gate yet) [Molecule-Platform-Evolvement-Manager] Closes step 1+2 of #1815. Step 3 (CI gate + threshold) is split into a follow-up because today's baseline is ~46% lines / ~45% statements, not the 70% the issue's draft thresholds assumed. ## What this lands - `canvas/vitest.config.ts` — `coverage` block with v8 provider, reporters: text (terminal) / html (./coverage/index.html) / json-summary (machine-readable for tooling). NO threshold — pure observability. - `canvas/package.json` — adds `test:coverage` script (`vitest run --coverage`); existing `test` script is unchanged so the default workflow is identical. - `canvas/package-lock.json` — adds @vitest/coverage-v8@^4.1.5 (the v8 provider Vitest uses for native coverage). ## Why no threshold yet Issue draft threshold was 70%/70%/65%/70% (lines/funcs/branches/stmts). Local baseline today: ``` Statements : 45.19% (3248/7186) Branches : 39.87% (2034/5101) Functions : 40.99% (724/1766) Lines : 46.36% (2905/6265) ``` Turning on a 70% gate today would either fail CI immediately or get papered over with an ad-hoc exclude list. Better path: land observability now, run coverage in PR review for any new code (via the new script), gate later when the baseline catches up. ## Heatmap (from local run, top gaps) - `src/lib/runtime-names.ts` — 0% (untouched by tests) - `src/lib/utils.ts` — 0% - `src/lib/canvas-actions.ts` — 25% - `src/store/classNames.ts` — 17% - `src/store/canvas.ts` — 73% (already-tested but the largest absolute gap by lines) Each is a concrete follow-up issue / PR target. 
## Test plan - [x] `npx vitest run --coverage` runs cleanly locally (~10s) and produces `./coverage/index.html` + a `coverage-summary.json` - [x] Existing `npm run test` workflow unchanged — instrumentation only activates with `--coverage` flag - [x] No production-code changes — pure tooling addition ## Follow-ups (each tracked separately; this PR keeps minimal scope) - Step 3a — write tests for the 0% files above (~tiny each) - Step 3b — once baseline ≥ thresholds, add `thresholds` block to vitest.config.ts + a `npm run test:coverage` step in `.github/workflows/ci.yml`'s Canvas job 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:44:07 -07:00
Hongming Wang	e3d3b48e8c	test(canvas): unit tests for dragUtils — nest hysteresis + clamp geometry (#2071 ) [Molecule-Platform-Evolvement-Manager] Closes the fourth and final item from #2071 — but at a slightly different layer than the issue listed: tests `dragUtils.ts` (the 74-LOC pure-ish geometry helpers) instead of the full 296-LOC `useDragHandlers` hook. Rationale below. 15 cases across 2 buckets: shouldDetach (8): - child fully inside parent → false - child drifted slightly past edge but under DETACH_FRACTION → false - child past 20% threshold on X → true (un-nest) - child past 20% threshold on Y → true (un-nest) - missing child node → true (conservative fallback per source comment) - missing parent node → true (same) - measured size absent → falls back to React Flow's 220x120 defaults (mirrors initial-mount race where measurement hasn't run yet) - DETACH_FRACTION constant pinned at 0.2 (Miro/tldraw convention) clampChildIntoParent (7): - child already inside bounds → no-op (no setState — proven by reference equality on mockState.nodes) - drifted past top-left → clamps to (0, 0) - drifted past bottom-right → clamps to (parentW - childW, parentH - childH) - per-axis independence: X past edge + Y inside → only X clamps - child not in store → early return, no setState - child internalNode missing → early return, no setState - multi-node store: clamping one node MUST NOT touch siblings ## Why dragUtils, not the full useDragHandlers hook The hook (296 LOC) orchestrates React Flow drag events + Zustand mutations. Testing it would need heavyweight `useReactFlow` + internal-node + `setDragOverNode` / `nestNode` / `batchNest` / `isDescendant` mocks just to drive event handlers — and the decisions the hook makes all delegate to these two helpers: - `shouldDetach` decides "is this a real un-nest?" - `clampChildIntoParent` snaps the child back when the user drifted slightly past the edge without holding Alt/Cmd Pinning these locks the hot path the user feels. 
The hook's remaining surface (modifier-key snapshotting, drop-target broadcasting, commit-on-release grow pass) is plumbing — worth testing as a follow-up if it ever regresses, but lower correctness leverage per LOC of test setup. ## #2071 status after this PR - [x] useTemplateDeploy (#2121) - [x] A2AEdge (#2143) - [x] OrgCancelButton (#2145) - [x] dragUtils geometry helpers (this PR) - [ ] Full useDragHandlers hook orchestration — explicit deferral with rationale above ## Test plan - [x] All 15 cases pass locally (`vitest run dragUtils.test.ts` — 131ms) - [x] No changes to the SUT — pure additive coverage - [ ] CI green 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:41:37 -07:00
hongmingwang-moleculeai	34b92c33b7	Merge pull request #2144 from Molecule-AI/feat/native-session-skip-queue feat(runtime): native_session skips a2a_queue — primitive #5 of 6	2026-04-27 06:40:09 +00:00
Hongming Wang	39eb3eb2e4	test(canvas): unit tests for OrgCancelButton — cascade-delete + optimistic store (#2071) [Molecule-Platform-Evolvement-Manager] Closes the third item from #2071 (Canvas test gaps follow-up). Builds on the A2AEdge tests in PR #2143. 10 cases across 4 buckets: Render (2): - Default pill with `Cancel (N)` text + correct ARIA label - Confirm dialog NOT visible until pill click Pill click (3): - Click flips to confirming view + stops propagation (so React Flow doesn't interpret the click as a node selection) - Confirm copy pluralizes correctly: count=1 → "Delete 1 workspace?", count>1 → "Delete N workspaces?". Negative assertion guards against the wrong-form regressing in either direction. No / cancel-confirm (1): - Click No → returns to pill, no API call, no store mutation Yes / cascade-delete (4): - Happy path: beginDelete locks the WHOLE subtree (root + children, NOT unrelated workspace) → api.del("/workspaces/<id>?confirm=true") → optimistic store filter strips subtree, keeps unrelated → success toast → endDelete in finally - WS-event race: WS_REMOVED handler clears the root mid-flight. The bail-out branch (`!postDeleteState.nodes.some(n => n.id === rootId)`) must NOT then run a second optimistic filter. Pre-fix the post-await subtree walk would miss any orphaned descendants whose parentId got reparented upward by handleCanvasEvent — pinned now. - Error path: api.del rejects → endDelete UNDOes the lock + error toast surfaces the message → subtree STAYS in the store so the user can retry / interact with the still-deploying nodes - Non-Error rejection (e.g. string thrown directly): toast surfaces the canned "Cancel failed" fallback instead of attempting `.message` ## Mocking - `@/lib/api`, `@/components/Toaster`: simple spy mocks - `@/store/canvas`: object that satisfies BOTH the selector pattern (`useCanvasStore(s => s.x)`) AND `getState()` / `setState()` since the cascade-delete handler walks the subtree via `getState()` and mutates via `setState()` for the optimistic removal. `vi.hoisted` preserves referential identity so the mock fns wired into the state object are observed by every consumer. ## Test plan - [x] All 10 cases pass locally (`vitest run OrgCancelButton.test.tsx` — ~990ms) - [x] No changes to the SUT — pure additive coverage - [ ] CI green ## #2071 progress after this PR - [x] useTemplateDeploy (PR #2121) - [x] A2AEdge (PR #2143) - [x] OrgCancelButton (this PR) - [ ] useDragHandlers — separate PR 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:38:59 -07:00
Hongming Wang	ae64fe340a	feat(runtime): native_session skips a2a_queue enqueue — primitive #5 of 6 When a target workspace's adapter has declared provides_native_session=True (claude-code SDK's streaming session, hermes-agent's in-container event log), the SDK owns its own queue/ session state. Adding the platform's a2a_queue layer on top would double-buffer the same in-flight state — and worse, the platform queue's drain timing has no relationship to the SDK's actual readiness, so the queued request might dispatch while the SDK is STILL busy. Behavior change: in handleA2ADispatchError, when isUpstreamBusyError(err) fires and the target declared native_session, return 503 + Retry-After directly without enqueueing. The caller's adapter handles retry on its own schedule, and the SDK's own queue absorbs the request when ready. Response body carries native_session=true so callers can distinguish this from queue-failure 503s. Observability is preserved: logA2AFailure still runs above; the broadcaster still fires; the activity_logs row records the busy event just like the platform-fallback path. This is the consumer that validates the template-side declarations already shipped in: - molecule-ai-workspace-template-claude-code PR #12 - molecule-ai-workspace-template-hermes PR #25 Once those merge + image tags bump, claude-code + hermes workspaces' busy 503s skip the platform queue end-to-end. End-to-end validation of capability primitive #5. Tests (2 new): - NativeSession_SkipsEnqueue: cache pre-populated, deliberate sqlmock with NO INSERT INTO a2a_queue expected — implicit regression cover (sqlmock fails on unexpected queries). Asserts 503 + Retry-After + native_session=true marker in body. - NoNativeSession_StillEnqueues: negative pin — empty cache, same busy error → falls through to EnqueueA2A (which fails in this test, falls through to legacy 503 without native_session marker). 
Verification: - All Go handlers tests pass (2 new + existing) - go build + go vet clean See project memory `project_runtime_native_pluggable.md`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:34:04 -07:00
Hongming Wang	c7185ece80	test(canvas): unit tests for A2AEdge — selection + Activity-tab routing (#2071 ) [Molecule-Platform-Evolvement-Manager] Closes the second item from #2071 (Canvas test gaps follow-up): adds behavioural coverage for the custom React Flow edge that renders delegation counts between workspaces and routes a click into the source workspace's Activity feed. 10 cases across 2 buckets: Render (6): - Empty label → BaseEdge only, NO portaled HTML pill (the most common state for cold edges; pill must not render-through-empty) - Non-empty label → pill renders with the exact label text - isHot=true → violet accent classes; blue accent NOT present - isHot=false → blue accent classes - ARIA pluralization: count=1 → "1 delegation from …" (singular) - ARIA pluralization: count=7 → "7 delegations from …" (plural) Click behaviour (4): - Click → selectNode(source) - FRESH selection (selectedNodeId != source) → also setPanelTab("activity") - RE-click of already-selected source → setPanelTab MUST NOT fire (this is the regression-locked guarantee — preserves the user's current tab when they intentionally moved to Chat / Memory while inspecting the same peer) - stopPropagation: parent onClick must NOT see the event (otherwise the canvas Pane's clear-selection handler would fire and undo the edge's own selectNode call) ## Mocking strategy - `@xyflow/react`: BaseEdge → <g data-testid>, EdgeLabelRenderer → inline pass-through (no portal), getBezierPath → fixed [path, x, y]. Lets the test render the component without a ReactFlow provider. - `@/store/canvas`: vi.hoisted-shared mock state with selectNode + setPanelTab spies and a mutable selectedNodeId. The store's getState() returns the same object so the click handler's `useCanvasStore.getState().selectedNodeId` lookup works. Pattern matches the existing `A2ATopologyOverlay.test.tsx` setup in the same module. 
## Test plan - [x] All 10 cases pass locally (`vitest run A2AEdge.test.tsx` — ~1.3s) - [x] No changes to the SUT — pure additive coverage - [ ] CI green ## Remaining #2071 items - OrgCancelButton tests - useDragHandlers tests Each is a separate PR. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:33:28 -07:00
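Illustrative note on commit 679e30538a (store/classNames helpers): the dedup and exact-match contract that message describes for `appendClass` and `removeClass` can be sketched as below. This is a minimal sketch written from the commit message alone, not the code in `canvas/src/store/classNames.ts`; the helper `splitClasses` and the exact signatures are assumptions made for illustration.

```typescript
// Hypothetical re-implementation for illustration only; the real helpers may differ.

// Split a className string into tokens, dropping empty tokens so that
// leading, trailing, and repeated whitespace are normalized away.
function splitClasses(existing: string | undefined): string[] {
  return (existing || "").split(/\s+/).filter((token) => token.length > 0);
}

// Append `cls` unless it is already present, so firing the same class twice
// yields "a" rather than "a a".
function appendClass(existing: string | undefined, cls: string): string {
  const tokens = splitClasses(existing);
  if (tokens.indexOf(cls) === -1) {
    tokens.push(cls);
  }
  return tokens.join(" ");
}

// Remove `cls` by exact token match only: "spawn" must not strip "spawn-fast".
function removeClass(existing: string | undefined, cls: string): string {
  return splitClasses(existing)
    .filter((token) => token !== cls)
    .join(" ");
}
```

Comparing whole tokens rather than substrings is what keeps `removeClass("spawn-fast", "spawn")` a no-op, and the membership check in `appendClass` is what keeps the resulting string stable when the same class is applied repeatedly.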


3203 Commits