molecule-core

Author	SHA1	Message	Date
claude-ceo-assistant	5ecec3f253	Merge pull request 'fix(a2a): reject delegate_task to your own workspace ID (self-deadlock guard)' (#291) from fix/self-delegation-guard into main	2026-05-10 10:53:18 +00:00
claude-ceo-assistant	f58a11d171	Merge pull request 'fix(runtime): MODEL_PROVIDER env is misnamed — accept MODEL/MOLECULE_MODEL, deprecate legacy name' (#280) from fix/model-provider-misnomer into main	2026-05-10 10:52:40 +00:00
hongming-pc2	31ed137b74	fix(a2a): reject delegate_task / delegate_task_async to your own workspace ID Self-delegation deadlocks: the sending turn holds `_run_lock`, the receive handler waits for the same lock, the A2A request 30s-times-out, and the whole cycle is wasted (the Dev Lead system prompt warns agents off this by hand — "Never delegate_task to your own workspace ID … there is no peer who is also you"). The platform/runtime had no guard. Now both `tool_delegate_task` and `tool_delegate_task_async` early-return an actionable error when `workspace_id == effective_source` (`source_workspace_id or _peer_to_source[target] or WORKSPACE_ID`) — before `discover_peer`, so no network round-trip is wasted either. A genuinely different target (incl. another of a multi-workspace agent's own registered workspaces) is unaffected. Tests: tests/test_a2a_tools_delegation.py — new TestSelfDelegationGuard (4 cases: rejects own ID; rejects when source_workspace_id explicitly == target; async path rejects; a different target passes the guard through to discover_peer). `pytest tests/test_a2a_tools_delegation.py` → 12 passed. (tests/test_a2a_tools_impl.py's TestToolDelegateTask* suite is red on this PC2/Windows checkout — same on `main` without this change; httpx-mock infra, not this PR — CI validates on Linux.) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 03:46:59 -07:00
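The guard described in 31ed137b74 might look roughly like the sketch below. This is an illustration under stated assumptions, not the runtime's code: `effective_source`, `check_self_delegation`, the example IDs, and the error text are hypothetical. Only the precedence `source_workspace_id or _peer_to_source[target] or WORKSPACE_ID` is taken from the commit message.

```python
from typing import Dict, Optional

WORKSPACE_ID = "ws-self"                # assumed: this runtime's own workspace ID
_peer_to_source: Dict[str, str] = {}    # assumed: target ID -> registered source ID


def effective_source(target: str, source_workspace_id: Optional[str] = None) -> str:
    """Resolve the sending workspace in the order given in the commit message."""
    return source_workspace_id or _peer_to_source.get(target) or WORKSPACE_ID


def check_self_delegation(target: str, source_workspace_id: Optional[str] = None) -> Optional[str]:
    """Return an error string when the target is the sender itself, else None.

    Intended to run before any peer discovery, so a rejected call costs no
    network round-trip.
    """
    if target == effective_source(target, source_workspace_id):
        return f"Error: cannot delegate to your own workspace ID ({target})."
    return None
```

A caller would return the error string immediately when it is not `None` and continue to peer discovery otherwise.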
Molecule AI · core-lead	fe1b3d9a82	Merge branch 'main' into fix/a2a-tools-and-workflow-cleanup	2026-05-10 10:12:50 +00:00
Molecule AI Integration Tester	e647efe7c5	fix(a2a): handle string error in a2a_tools.py + remove dead staging trigger Two-part fix from PR #268 (ported by Integration Tester after PR #268 was closed without merge): PART 1 — workspace/builtin_tools/a2a_tools.py: Fixes AttributeError when platform returns a plain string as the error field. Before: data["error"].get("message") ← crashes if error is a string After: isinstance(err, dict) → err.get("message") isinstance(err, str) → use err directly otherwise → str(err) Also guards result.get("parts") against non-dict result. Includes fix for issue #279: empty-parts regression where {"parts": []} returned "(no text)" instead of str(result). PART 2 — .gitea/workflows/ and .github/workflows/ publish-workspace-server-image.yml: Removed dead "staging" branch trigger. Trunk-based migration (2026-05-08) removed the staging branch but the workflow triggers were not updated. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 09:52:36 +00:00
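The before/after in PART 1 of e647efe7c5 can be read as a three-way type dispatch on the `error` field. The sketch below is a hedged reconstruction: the function name `error_message` is hypothetical, and falling back to the whole dict when `message` is absent is an assumption that the commit message does not confirm.

```python
from typing import Any, Dict


def error_message(data: Dict[str, Any]) -> str:
    """Extract a readable message from an `error` field of any type."""
    err = data.get("error")
    if isinstance(err, dict):
        # JSON-RPC style: {"error": {"message": "..."}}
        # Assumption: show the whole dict if "message" is missing.
        return str(err.get("message", err))
    if isinstance(err, str):
        # Platform-level style: {"error": "plain text"}
        return err
    # Anything else (None, list, number) is stringified rather than raising.
    return str(err)
```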
hongming-pc2	2ba3af5330	fix(runtime): MODEL_PROVIDER env is misnamed — accept MODEL/MOLECULE_MODEL, deprecate the legacy name `molecule_runtime.config.load_config` read the `MODEL_PROVIDER` env var as the picked model id — despite the name, it never carried the provider (that's `LLM_PROVIDER` / the YAML `provider:` field). So `claude-code`, `minimax`, and `opus` were all "valid" values for a var named MODEL_PROVIDER. That footgun bit the dev-team rollout (2026-05-10): the lead persona env files set `MODEL=claude-opus-4-7` (the intended model) and `MODEL_PROVIDER=claude-code` (mistaking it for "the runtime"); the loader picked up MODEL_PROVIDER → the claude CLI got `--model claude-code` → 404 on every turn, surfaced only as "Command failed with exit code 1" with empty stderr (the real error is in the stream-json stdout, swallowed by the SDK's placeholder). The 22 IC workspaces "worked" only because their `MODEL_PROVIDER=minimax` happened to fuzzy-match on MiniMax's side — they were actually running `--model minimax`, not `MiniMax-M2.7-highspeed`. New precedence in `_picked_model_from_env`: `MOLECULE_MODEL` (canonical, unambiguous) > `MODEL` (the obviously-correct name, already plumbed by workspace-server's applyRuntimeModelEnv) > `MODEL_PROVIDER` (legacy — still honored so canvas Save+Restart, the secret-mint path, and existing persona env files keep working, but if it's the only one set we log a one-time deprecation pointing at the misnomer) > the YAML `model:` field. Applied at both the top-level `model` and `runtime_config.model` resolution sites; semantics are otherwise unchanged. 
Bonus: workspaces that already set `MODEL` correctly now get exactly that model instead of whatever fuzzy-match the upstream did with the provider slug. Tests: 5 new cases in test_config.py (MODEL beats MODEL_PROVIDER; MOLECULE_MODEL beats MODEL; MODEL overrides YAML; legacy MODEL_PROVIDER still resolves + warns; no warning when MODEL is set) + an autouse fixture that clears MODEL*/resets the warn-latch so resolution is deterministic regardless of the CI env or test order. `pytest tests/test_config.py` — 66 passed; the config-importing suites (test_preflight, test_skills_loader) — 129 passed. Companion: molecule-dev-department PR #10 fixes the six dev-team lead `workspace.yaml`s from `model: MiniMax-M2.7` to `model: opus`. Follow-ups (not in scope here): plumb `MOLECULE_MODEL` from applyRuntimeModelEnv and the canvas; strip `MODEL`/`MODEL_PROVIDER` from the operator-host persona env files once the org-template `model:` field is authoritative end-to-end. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 02:38:14 -07:00
Molecule AI Integration Tester	736d9959bc	fix(a2a): handle push-mode queue envelope in response parser When a push-mode workspace (one with a public URL) is at capacity, the platform queues the delegation request and returns: {"queued": true, "message": "...", "queue_depth": N, "queue_id": "..."} The existing SSOT parser (a2a_response.py) only handled the poll-mode envelope (status=queued + delivery_mode=poll). Push-mode queue responses fell through to Malformed, causing send_a2a_message to log a warning and return an error — even though delivery was actually queued successfully. Fix: add handling for data.get("queued") is True as a Queued variant with delivery_mode="push". Checked before the poll-mode envelope so the two cases are mutually exclusive. Fixes observed 2026-05-10: platform returning push-mode queue envelopes to Integration Tester when Release Manager workspace was at capacity. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 09:28:51 +00:00
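The check order from 736d9959bc (push-mode envelope first, then poll-mode) could be sketched as follows. The helper name `queued_delivery_mode` is hypothetical, and the real parser returns a `Queued` variant rather than a string. The two envelope shapes are the ones quoted in the commit messages.

```python
from typing import Any, Optional


def queued_delivery_mode(data: Any) -> Optional[str]:
    """Return "push" or "poll" for a queued envelope, else None."""
    if not isinstance(data, dict):
        return None
    # Push-mode: {"queued": true, "message": ..., "queue_depth": N, "queue_id": ...}
    # `is True` rejects truthy non-booleans such as 1 or "yes".
    if data.get("queued") is True:
        return "push"
    # Poll-mode: {"status": "queued", "delivery_mode": "poll", "method": ...}
    if data.get("status") == "queued" and data.get("delivery_mode") == "poll":
        return "poll"
    return None
```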
Molecule AI Core-DevOps	7ae3ee786f	feat(workspace): add static .github-token fallback to git credential helper Adds a 4th fallback step to the token chain (cache > API > env > static) so workspace git/gh operations survive a platform outage without requiring a restart or platform-side fix. Addresses the 2026-05-08 incident where every workspace lost git+gh auth simultaneously when the /github-installation-token endpoint returned 500. Operator places a PAT in ${CONFIGS_DIR:-/configs}/.github-token (no root needed — /configs is agent-writable). Both _fetch_token (git credential helper path) and _refresh_gh (gh CLI daemon path) gain the static fallback so git and gh both recover post-incident. Pure additive — existing cache > API > env chain is unchanged. Empty static file is rejected (whitespace-stripped before use). Static path never writes the cache, so the API recovers transparently on the next refresh cycle when it comes back online. Ref: issue #140. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 02:17:22 +00:00
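The fallback order in 7ae3ee786f (cache > API > env > static) can be illustrated with a pure function. This is a Python sketch of the ordering only. The actual helper's language and internals are not shown in this log, and `resolve_token` together with its parameters is hypothetical.

```python
from typing import Callable, Optional, Tuple


def resolve_token(
    cached: Optional[str],
    fetch_api: Callable[[], Optional[str]],
    env_token: Optional[str],
    static_text: Optional[str],
) -> Tuple[Optional[str], str]:
    """Return (token, source) using the cache > API > env > static order."""
    if cached:
        return cached, "cache"
    try:
        api_token = fetch_api()
    except Exception:
        api_token = None  # e.g. the token endpoint returned 500
    if api_token:
        return api_token, "api"
    if env_token:
        return env_token, "env"
    # The static file content is whitespace-stripped; an empty file is rejected.
    static = (static_text or "").strip()
    if static:
        # The caller must not write this value to the cache, so the API
        # path is retried on the next refresh cycle.
        return static, "static"
    return None, "none"
```

Returning the source label alongside the token lets the caller decide whether to cache the value.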
Molecule AI Core-DevOps	1492b40b38	ci(docker): pin base image digests in all Dockerfiles Pins all FROM image tags to exact SHA256 digests for reproducible builds. Without digest pinning, a registry push of a new image to the same tag can silently change the layer content between builds — a supply-chain risk especially for prod-deployed images. Pinned images (7 Dockerfiles): - golang:1.25-alpine → sha256:c4ea15b... (workspace-server/Dockerfile, Dockerfile.dev, Dockerfile.tenant, tests/harness/cp-stub/Dockerfile) - alpine:3.20 → sha256:c64c687c... (workspace-server/Dockerfile, tests/harness/cp-stub/Dockerfile) - node:20-alpine → sha256:afdf982... (workspace-server/Dockerfile.tenant) - node:22-alpine → sha256:cb15fca... (canvas/Dockerfile) - python:3.11-slim → sha256:e78299e... (workspace/Dockerfile) - nginx:1.27-alpine → sha256:62223d6... (tests/harness/cf-proxy/Dockerfile) Note: docker-compose.yml service images (postgres, redis, clickhouse, litellm, ollama) are intentionally left on major-version tags — those are runtime-pulled and updated regularly for local-dev ergonomics. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-09 23:56:39 +00:00
Molecule AI Core-BE	76ac5a88dc	[core-be-agent] fix(tests): clear platform_auth cache before each test Fixes issue #160: workspace tests fail when MOLECULE_WORKSPACE_TOKEN is set in the environment. The bug: platform_auth._cached_token is populated at module import or first get_token() call and persists for the process lifetime. Tests that use monkeypatch.delenv("MOLECULE_WORKSPACE_TOKEN") to simulate "no token in env" were failing because delenv removes the env var but not the module-level cache — subsequent get_token() calls returned the stale cached value. Fix: add a function-scoped autouse fixture in conftest.py that calls platform_auth.clear_cache() before every test. The import is inside the fixture to avoid collection-time import issues when platform_auth is not yet available. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-09 22:16:11 +00:00
Molecule AI Core-DevOps	57aedec1a3	fix(tests): isolate token resolution from real .auth_token on disk Issue #160: workspace tests fail when MOLECULE_WORKSPACE_TOKEN is set in the test environment (or when /configs/.auth_token exists on disk, as it does in a container CI runner). Root cause: - test_resolve_token_returns_none_when_missing: monkeypatch.delenv() removes the env var, but _resolve_token() falls through to configs_dir.resolve()/.auth_token which exists in the container. - Multi-workspace tests: clear_cache() resets _cached_token, but get_token() immediately re-reads /configs/.auth_token and caches the real token before the env var is even checked. Fix: - test_mcp_doctor: patch configs_dir.resolve() to return a bare tmp_path so the disk-file fallback finds nothing. - Multi-workspace tests: patch platform_auth._token_file() to return a non-existent path (via tmp_path) alongside clear_cache(), ensuring the env var wins as intended. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-09 21:55:29 +00:00
Molecule AI Core-DevOps	252f8d0c47	tech-debt: rename molecule-monorepo-net -> molecule-core-net Renames Docker network across all code, configs, scripts, and docs. Per issue #93: the network was named molecule-monorepo-net as a holdover from when the repo was called molecule-monorepo. The canonical repo name is now molecule-core, so the network should be molecule-core-net. Files changed: - docker-compose.yml, docker-compose.infra.yml: network definition - infra/scripts/setup.sh: docker network create - scripts/nuke-and-rebuild.sh: docker network rm - workspace-server/internal/provisioner/provisioner.go: DefaultNetwork - All comments/docs: updated wording Acceptance: grep -rn 'molecule-monorepo-net' returns zero matches. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-09 20:51:48 +00:00
claude-ceo-assistant	6193f67bc0	fix(workspace): set git user.name/email from $GITEA_USER at boot (#156) Closes #155. [FORCE-MERGE AUDIT — §SOP-7] - Approver: hongming (Gitea PR review APPROVED 2026-05-09T20:27:01Z) - Chat-go: explicit go in conversation transcript ~20:39 UTC after Hongming clicked approve - Bypassed: required status checks (all pending forever — likely runner pickup issue, separate from this PR's correctness) - Audit channel: orchestrator force-merge log + this commit message Next: workspace runtime image rebuilds via publish-runtime.yml; new workspaces pick up persistent persona git identity.	2026-05-09 20:36:58 +00:00
orchestrator	a4fc04189c	fix(workspace): set git user.name/email from $GITEA_USER at boot Closes #155. Without this, every commit from a workspace booted via the standard provisioner lands with an empty `user.name`/`user.email` and Gitea attributes the work to whichever PAT pushed (typically the founder's `claude-ceo-assistant`), instead of the persona that actually authored the commit. That's the same fingerprint pattern that got us suspended on GitHub 2026-05-06. GITEA_USER is already injected per-workspace by the provisioner from workspace_secrets (verified: 8/8 Core-* workspaces have it set, correctly-named, on operator + local). Boot picks it up unconditionally; falls through cleanly if unset (e.g. legacy boxes without persona identity wiring). Email uses `bot.moleculesai.app` so agent commits are visually distinct from human-authored commits in Gitea history. The `gitconfig` copy from `/root/.gitconfig` to `/home/agent/.gitconfig` is now unconditional — previously it was nested inside the `molecule-git-token-helper.sh` block, which meant the per-persona identity wouldn't propagate to the agent user when the helper was unavailable. Also added an inline note that the github.com credential-helper block is post-suspension legacy. Full removal tracked under #171; this PR deliberately doesn't touch it (smaller blast radius). 
Tested: docker exec sets the same config in 8 running Core-* workspaces locally and they pick up correct identity for `git config -l`. Will reset when those containers restart, hence this PR for the persistent fix.	2026-05-09 12:52:17 -07:00
documentation-specialist	bd145dcec6	docs(workspace-runtime): migrate github.com refs at source so mirror inherits Gitea links (internal#41) The molecule-ai-workspace-runtime mirror is regenerated on every runtime-v* tag from this monorepo's workspace/. 
Per saved memory reference_runtime_repo_is_mirror_only, mirror-guard rejects direct PRs to the mirror; edit at source. Source-side files that propagate to the mirror's published README + read by users of the in-monorepo workspace-runtime docs: - scripts/build_runtime_package.py (the README generator): * line 281 README_TEMPLATE: 'Shared workspace runtime for Molecule AI' link → Gitea * line 399 doc-link to workspace-runtime-package.md → Gitea path (with /src/branch/main/ shape) LEFT AS-IS (per Q3 audit-trail decision): * lines 379, 392 historical issue cross-refs (#2936, #2937) - workspace/build-all.sh:5 — comment block linking to template-* repos. Migrated to Gitea path-shape. - docs/workspace-runtime-package.md: * lines 101-108 adapter→repo table (8 templates, all PUBLIC on Gitea) — Gitea URLs * line 247 starter-repo link — substituted host + added inline note that starter doesn't survive the suspension migration (recreation pending; cross-link to this issue) * line 259 generic git clone command for new templates → Gitea * line 289 second starter mention — same handling as 247 Files NOT touched in this PR: - workspace/ Python source code (.py files) — those use github paths in docstrings + a few log strings; fix bundled with the cross-repo Go-module-style migration (per #37 Q5 + parked follow-ups). - 'Writing a new adapter' section's `gh repo create` command (line 254-256) — gh CLI doesn't talk to Gitea (per #45 parked follow-up). - 'Writing a new adapter' section's ghcr.io image ref (line 276) — per #46 ghcr→ECR migration (separate concern). After this PR merges to staging + a runtime-v* tag is pushed, the mirror's published README will inherit the Gitea link. Until then the mirror's README continues to reference github.com/Molecule-AI (stale but historical-marker-correct since the mirror existed pre-suspension). 
Refs: molecule-ai/internal#41, molecule-ai/internal#37, molecule-ai/internal#38, molecule-ai/internal#42, molecule-ai/internal#45, molecule-ai/internal#46	2026-05-07 00:48:04 -07:00
Hongming Wang	166ad20cd7	test(e2e): Phase 3.5 — wheel parser classifies real server response (#2967 ) Previously Phase 3 only checked the workspace-server's poll-mode short-circuit emit shape ({"status":"queued","delivery_mode":"poll","method":"..."}); the matching client-side classification was tested in isolation against fixture dicts in test_a2a_response.py. This phase closes the loop by piping the actual on-the-wire response from a real workspace-server back through the wheel's a2a_response.parse() and asserting it classifies as the Queued variant with the right method + delivery_mode. A regression in EITHER the server emit shape OR the client parser will now fail this E2E, eliminating the gap that allowed the original "unexpected response shape" production bug to ship despite green unit tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 17:31:45 -07:00
Hongming Wang	8b9f809966	fix(a2a): SSOT response parser — handle poll-mode queued envelope (#2967 ) Introduce ``workspace/a2a_response.py`` as the single source of truth for the wire shapes the workspace-server proxy can return at ``/workspaces/<id>/a2a``: * ``Result`` — JSON-RPC success * ``Error`` — JSON-RPC error or platform-level error (with restart-in-progress metadata when present) * ``Queued`` — poll-mode short-circuit envelope: the platform queued the message into the target's inbox, the target will fetch via /activity poll * ``Malformed`` — anything the parser can't classify (logged at WARNING so a future server change is loud) ``send_a2a_message`` (in ``a2a_client.py``) now dispatches via ``a2a_response.parse(data)`` instead of inline ``"result" in data`` / ``"error" in data`` sniffing. The Queued variant returns a new ``_A2A_QUEUED_PREFIX`` sentinel so callers can distinguish "delivered async, no synchronous reply" from both success-with-text and failure. reno-stars production data caught two intermittent failures that both reduced to the same root cause: 1. File transfer announce silently failed — when CEO Ryan PC (poll-mode external molecule-mcp) sent the harmi.zip announcement to Reno Stars Business Intelligent (also poll-mode external), ``send_a2a_message`` saw the platform's poll-queued envelope ``{"status":"queued","delivery_mode":"poll","method":"..."}``, didn't recognize it as the synthetic delivery-acknowledgement it is, and returned ``[A2A_ERROR] unexpected response shape``. The agent fell back to a chunk-shipping path; receiver did get the file but operator-facing logs showed a failure that didn't actually fail. 2. Duplicated agent comm — same bug, inverted direction. d76 delegated to 67d, send_a2a_message returned the unexpected-shape error, delegate_task wrapped it as DELEGATION FAILED, the calling agent retried with sharper wording, the recipient saw the same request twice and self-reported "二次请求 — 我先不执行". 
External molecule-mcp standalone runtimes are inherently poll-mode (they have no public URL), so every external↔external A2A pair was hitting this on every send. The pre-fix client only handled JSON-RPC ``result``/``error`` keys and treated the queued envelope (which has neither) as malformed. RFC #2339 PR 2 added the queued envelope on the server side; the client never caught up. When ``send_a2a_message`` returns the ``_A2A_QUEUED_PREFIX`` sentinel, ``tool_delegate_task`` now transparently falls back to ``_delegate_sync_via_polling`` (RFC #2829 PR-5's durable ``/delegate`` + ``/delegations`` polling path, which DOES work for poll-mode peers because the platform's executeDelegation goroutine writes to the inbox queue and the result row arrives when the target picks it up + replies). The agent gets a real synchronous reply instead of the empty queued sentinel. * ``test_a2a_response.py`` — 62 tests, 100% line coverage on the parser (verified via ``coverage run --source=a2a_response``). Includes adversarial-input fuzzing across ~25 pathological payloads — parser must never raise. * ``test_a2a_client.py::TestSendA2AMessagePollMode`` — 4 tests for the new Queued/Error wiring in ``send_a2a_message``. * ``test_delegation_sync_via_polling.py::TestPollModeAutoFallback`` — 3 tests for the auto-fallback in ``tool_delegate_task``, including negative cases (push-mode reply must NOT trigger fallback; genuine error must NOT silently retry). * Verified all new tests FAIL on pre-fix source by stashing a2a_client.py + a2a_tools_delegation.py and re-running — 5 failures including ImportError for the missing ``_A2A_QUEUED_PREFIX``. Per the operator-debuggability directive: * INFO at every Queued classification (expected variant; operator sees normal poll-mode-peer queueing in log stream). * INFO at the auto-fallback decision in ``tool_delegate_task`` so a future operator can correlate "send returned queued → falling back to polling path" without reading the source. 
* WARNING at every Malformed classification (server contract drift; operator MUST see this immediately). * Existing transient-retry WARNING preserved. * Mirror Go-side typed model in workspace-server. The wire shape is documented in ``a2a_response.py``'s module docstring with file:line pointers to the canonical emitters; a future PR can introduce ``models/a2a_response.go`` without changing wire behavior. The fixture corpus in ``test_a2a_response.py`` is designed so a one-sided edit breaks CI. * ``send_message_to_user`` and ``chat_upload_receive`` use a different endpoint (``/notify``) and aren't affected by this bug; their parsing stays unchanged. * 135 tests pass across ``test_a2a_response.py`` + ``test_a2a_client.py`` + ``test_delegation_sync_via_polling.py`` + ``test_a2a_tools_impl.py``. * ``coverage run --source=a2a_response -m pytest`` reports 100% line coverage with 0 missing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 17:21:28 -07:00
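The four wire-shape variants named in 8b9f809966 could be modeled roughly as below. This is a hedged sketch, not the contents of `workspace/a2a_response.py`: the dataclass fields are assumptions, restart-in-progress metadata is omitted, and only the variant names, the poll-mode envelope shape, and the "never raise" requirement come from the commit message.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class Result:
    value: Any


@dataclass
class Error:
    message: str


@dataclass
class Queued:
    method: str
    delivery_mode: str


@dataclass
class Malformed:
    raw: Any


Parsed = Union[Result, Error, Queued, Malformed]


def parse(data: Any) -> Parsed:
    """Classify a proxy response; unknown shapes become Malformed, never an exception."""
    if not isinstance(data, dict):
        return Malformed(data)
    if "result" in data:
        return Result(data["result"])
    if "error" in data:
        err = data["error"]
        msg = err.get("message") if isinstance(err, dict) else err
        return Error(str(msg))
    # Both keys are required; a partial envelope falls through to Malformed.
    if data.get("status") == "queued" and data.get("delivery_mode") == "poll":
        return Queued(method=str(data.get("method", "")), delivery_mode="poll")
    return Malformed(data)
```

A caller can then dispatch with `isinstance` checks instead of sniffing for `"result"` or `"error"` keys inline.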
Hongming Wang	146c0e7c60	fix(a2a-client): recognize poll-mode 'queued' envelope (#2967 ) workspace-server's a2a_proxy poll-mode short-circuit returns {status: "queued", delivery_mode: "poll", method: <a2a_method>} when the peer has no URL to dispatch to (poll-mode peers, including every external molecule-mcp standalone runtime). The bare send_a2a_message parser only knew about JSON-RPC {result, error} keys, so this envelope fell through to the "unexpected response shape" error path. Two production symptoms on the reno-stars tenant traced to it: 1. File transfer logged as failed when it actually succeeded — operator-facing logs showed an A2A_ERROR but the receiving workspace did get the chunked file via the agent's fallback path. 2. delegate_task retried after the false failure → peer received duplicate delegations → conversation got confused, the second peer self-diagnosed in a notify ("⚠️ Peer 二次请求 — 我先不执行"). Add a third branch to the parser, BETWEEN the existing JSON-RPC {result, error} cases and the catch-all "unexpected" fallback. The queued envelope is delivery-acknowledged-but-pending-consumption — not an error — so it returns a clean success string the agent can render as a normal outcome. The success string includes "queued" and "poll" so an operator scanning logs sees the routing path without parsing JSON. Defensive: the new branch only fires when BOTH status="queued" AND delivery_mode="poll" are present. A partial envelope (one key missing) still falls through to the catch-all, so a future server bug that emits a malformed shape gets surfaced instead of silently swallowed. Tests: - test_poll_queued_envelope_returns_success_string — pins the canonical envelope returns a non-error string. Discriminating: verified to FAIL on old code (returned [A2A_ERROR] string), PASS on new. - test_poll_queued_envelope_with_other_method — pins the parser doesn't hardcode message/send. Discriminating: also FAILS on old code. 
- test_status_queued_without_poll_mode_still_falls_through — pins both keys are required (defensive against future server bugs). 12 existing tests in TestSendA2AMessage still pass — no regression. Scope: hotfix for the bare send_a2a_message path. The full SSOT typed-A2AResponse refactor (#158-#163, parents under #2967) covers the broader vocabulary alignment between Go server and Python client. This PR ends the production symptoms now without preempting that work.	2026-05-05 16:58:48 -07:00
Hongming Wang	2652ea8342	fix(mcp-doctor): heartbeat (idempotent) instead of register (UPSERT) Self-review caught after #2954 landed: check_register() POSTed to /registry/register with agent_card.name="doctor-probe". The endpoint is an UPSERT, so the doctor probe overwrites the workspace's actual agent_card metadata until the real agent's next register call. An operator running `molecule-mcp doctor` against a live workspace would see their canvas briefly display "doctor-probe" as the agent name — invisible production-disruption. Switches to POST /registry/heartbeat. heartbeat only updates last_heartbeat_at (and clears awaiting_agent if needed) — the same work a normal molecule-mcp boot does every 20s in steady state, so the doctor's extra heartbeat is indistinguishable from background traffic. Function renamed check_register → check_token_auth to match what it actually does. check_register kept as back-compat alias so any external test/import still resolves. Also unified the duplicated token-resolution paths into a single _resolve_token() returning (value, source_label). Pre-fix: check_register and _resolve_token_summary read env in parallel ladders — a future env-var addition would have to touch both. New tests: - test_check_token_auth_uses_heartbeat_endpoint: mocks urlopen, asserts the URL ends in /registry/heartbeat AND does NOT contain /registry/register. Pins the load-bearing invariant so a future refactor can't silently re-route through register. - test_resolve_token_returns_value_and_label_for_env: pins the consolidated resolver returns both pieces of info from the same source-decision. - test_resolve_token_returns_none_when_missing: missing-env happy path. Verification: - 13/13 tests pass (10 existing + 3 new) - Manual stripped-env run still renders 4 FAIL + 2 WARN with actionable hints, exit 1. Refs molecule-core#2934 item 6 (doctor side-effect fix-up). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 16:11:08 -07:00
Hongming Wang	f01f374072	feat(mcp): add `molecule-mcp doctor` onboarding diagnostic Closes #2934 item 6 — the deferred follow-up from Ryan's onboarding-friction report. Quote: "this single command would have saved me 30 of the 45 minutes." When push delivery fails or the install half-works, the operator today has no signal — they hand-grep the Claude Code binary or chase the `from versions: none` red herring. Doctor renders six checks in one screen with concrete next-step suggestions: 1. Python version >=3.11? (wheel's pin) 2. Wheel install molecule-ai-workspace-runtime importable + version surfaced 3. PATH for binary `molecule-mcp` resolves on PATH; if not, prints the resolved user-site bin dir to add (or recommends pipx) 4. Env vars PLATFORM_URL + WORKSPACE_ID + token (env or _FILE or .auth_token) 5. Platform reach GET ${PLATFORM_URL}/healthz returns 2xx 6. Registry register POST /registry/register with the resolved token returns 2xx — end-to-end auth check Each line: `[OK|WARN|FAIL] <label>: <status>` plus a `next:` hint when not OK. ANSI colors auto-disable on non-TTY / NO_COLOR. Exit code: 0 on all-OK or only-WARN, 1 on any FAIL — scriptable from CI install-checks. ## Files `workspace/mcp_doctor.py` (new) — six check functions + `run()` entry point. Uses urllib (stdlib) so doctor works even on a partial install where `requests` is missing. `workspace/mcp_cli.py` Subcommand dispatch: molecule-mcp doctor → mcp_doctor.run() molecule-mcp --help → usage banner molecule-mcp → server (unchanged) `workspace/tests/test_mcp_doctor.py` (new) — 10 tests covering each check's pass/fail/skip path plus the end-to-end exit-code contract on a stripped env. `scripts/build_runtime_package.py` Adds `mcp_doctor` to TOP_LEVEL_MODULES so the wheel ships the new module. ## Out of scope (deferred follow-ups) - Claude Code-specific checks (parse ~/.claude.json, verify each MCP entry is plugin-sourced + dev-channels flag set).
That's a separate Claude-Code-shaped doctor; lives in the channel plugin. - Automated remediation. Doctor is diagnostic — tells the operator what's wrong + how to fix it, doesn't apply changes. ## Verification - python -m pytest tests/test_mcp_doctor.py -v → 10/10 PASS - python -m pytest tests/test_mcp_cli.py → 67/67 PASS (existing CLI suite still green; subcommand dispatch added before env-validation, doesn't disturb the server-boot path) - manual: `molecule-mcp doctor` on a stripped env renders 4 FAIL + 2 WARN + exit code 1, with each `next:` hint actionable Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 15:44:36 -07:00
Hongming Wang	3b7ed9cf53	Merge pull request #2946 from Molecule-AI/fix/onboarding-followup-2934 mcp: surface specific TOKEN_FILE errors + link follow-ups (#2934)	2026-05-05 22:19:21 +00:00
Hongming Wang	da9061c131	mcp: surface specific TOKEN_FILE errors + link follow-ups (#2934 ) Self-review of #2935 turned up two real defects: 1. Stale README issue references — the build_runtime_package.py README template said "(issue #2934 follow-up)" twice, but the marketplace-plugin and `doctor` items now have dedicated tracking issues. Updated to point at #2936 and #2937 respectively. 2. Silent fallthrough on broken MOLECULE_WORKSPACE_TOKEN_FILE — when an operator EXPLICITLY pointed TOKEN_FILE at a path that didn't exist / wasn't readable / was blank / contained internal whitespace, the resolver silently returned the generic "set one of these three vars" error. That's exactly the silent failure mode #2934 flagged ("a new user has no chance"). Refactor `_read_token_from_file_env` to return `(token, error)`; surface the SPECIFIC failure when the operator's intent was clearly the file path. Skip the CONFIGS_DIR fallback in that case so the operator's config bug isn't masked by a different source happening to work. Adds 2 renames + 2 new tests in test_mcp_cli_split.py: - test_missing_file_returns_specific_error (asserts "does not exist") - test_empty_file_returns_specific_error (asserts "is empty") - test_multi_line_file_rejected (asserts "internal whitespace") - test_token_file_error_skips_configs_dir_fallback (asserts a valid CONFIGS_DIR/.auth_token does NOT silently rescue a broken TOKEN_FILE) All 81 mcp_cli + mcp_cli_multi_workspace + mcp_cli_split tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 15:07:15 -07:00
Hongming Wang	c4807a930d	Merge pull request #2940 from Molecule-AI/refactor/a2a-tools-inbox-extract-rfc2873-iter4e refactor(workspace): extract inbox tools from a2a_tools.py (RFC #2873 iter 4e)	2026-05-05 21:58:32 +00:00
Hongming Wang	475da5b64c	refactor(workspace): extract inbox tools from a2a_tools.py (RFC #2873 iter 4e) Continues the OSS-shape refactor. After iters 4a-4d (rbac, delegation, memory, messaging) the only behavior left in ``a2a_tools.py`` was ``report_activity`` plus three thin inbox-tool wrappers and the ``_enrich_inbound_for_agent`` helper. This iter extracts the inbox slice to ``a2a_tools_inbox.py`` so the kitchen-sink module shrinks from 280 LOC to ~165 LOC of imports + report_activity + back-compat re-export blocks. Extracted symbols: - ``_INBOX_NOT_ENABLED_MSG`` (sentinel) - ``_enrich_inbound_for_agent`` (poll-path peer enrichment helper) - ``tool_inbox_peek`` - ``tool_inbox_pop`` - ``tool_wait_for_message`` Re-exports (`from a2a_tools_inbox import …`) preserve the public ``a2a_tools.tool_inbox_*`` surface so existing tests + call sites continue to resolve unchanged. New tests in test_a2a_tools_inbox_split.py: 1. Drift gate (5) — every previously-public symbol on a2a_tools is the EXACT same object as a2a_tools_inbox.foo (`is`, not `==`), catches a future "wrap with logging" refactor that silently loses existing test coverage. 2. Import contract (1) — a2a_tools_inbox does NOT eagerly import a2a_tools at module load. Pins the layered architecture: the extracted slice depends on ``inbox`` + a lazy ``a2a_client`` import, never on the kitchen-sink that re-exports it. 3. _enrich_inbound_for_agent branches (5) — peer_id-empty (canvas_user) returns dict unchanged; missing peer_id key same; a2a_client unavailable (test harness, partial install) degrades gracefully with a bare envelope; registry hit populates peer_name + peer_role + agent_card_url; registry miss still surfaces agent_card_url (constructable from peer_id alone). The full timeout-clamp / validation / JSON-shape behavior matrix for the three wrappers stays in test_a2a_tools_inbox_wrappers.py — those tests pass identically against both the alias and the underlying impl.
Wiring updates: - ``scripts/build_runtime_package.py``: add ``a2a_tools_inbox`` to ``TOP_LEVEL_MODULES`` so it ships in the runtime wheel and the drift gate doesn't fail the next publish. - ``.github/workflows/ci.yml``: add ``a2a_tools_inbox.py`` to ``CRITICAL_FILES`` so the 75% MCP/inbox/auth per-file floor applies — this is now where the inbox-delivery code actually lives.	2026-05-05 14:28:58 -07:00
Hongming Wang	1ad107cc15	Merge pull request #2935 from Molecule-AI/fix/onboarding-friction-2934 fix(onboarding): address Claude Code MCP onboarding friction (#2934)	2026-05-05 21:25:57 +00:00
Hongming Wang	01deeb36cf	fix(onboarding): address Claude Code MCP onboarding friction (#2934 ) Ryan's bug report (#2934) walked through ~45 min of debugging a stock external-runtime install. This PR fixes the four items he flagged that have a small surface, and stubs out the larger ones for follow-up. Fixed in this PR ================ #1 — Python floor disclosure (README in publish bundle) Add an explicit "Requires Python ≥3.11" section that calls out the cryptic "Could not find a version that satisfies the requirement" failure mode; recommend `pipx install` over `pip install` so the binary lands on PATH automatically; show the explicit `pip install --user` alternative with the PATH caveat. #3 — MOLECULE_WORKSPACE_TOKEN_FILE support (mcp_workspace_resolver.py) Add a third resolution step between the inline env var and the in-container CONFIGS_DIR fallback. Operators can write the bearer to a 0600 file (e.g. ~/.config/molecule/token) and point MOLECULE_WORKSPACE_TOKEN_FILE at it, keeping the secret out of ~/.zsh_history and out of plaintext in MCP-host configs like ~/.claude.json. Inline TOKEN still wins on conflict so rotation flows are predictable. README documents the safer option as the recommended path. 6 new tests pin every leg (file resolves, inline wins, missing/empty file falls through, blank env unset-equivalent, help text advertises it). #4 — Push delivery 3-condition gating (README in publish bundle) Document that real-time push on Claude Code requires (a) the server to declare experimental.claude/channel (we do), (b) the server to be marketplace-plugin-sourced (operators must scaffold their own until the official marketplace lands — see #2934 follow-up), and (c) the --dangerously-load-development-channels flag on the claude invocation. Until any of the three is in place, delivery silently falls back to poll mode with no diagnostic. The README now says all of this explicitly so a new operator doesn't grep the binary for channel_enable to figure it out. 
#8 — serverInfo.name mismatch (a2a_mcp_server.py) The server reported `serverInfo.name = "a2a-delegation"` while operators register it as `molecule` (the name in `claude mcp add molecule …`). Harmless on tool routing today but matters for any future Claude Code allowlist that gates push by hardcoded server name. Renamed to "molecule" with an inline comment explaining the invariant. Deferred (separate issues to track) =================================== #2 — covered transitively by #1's pipx recommendation; no separate fix. #5 — `moleculesai/claude-code-plugin` marketplace repo (substantial new repo work; the README references it as a documented follow-up). #6 — `molecule-mcp doctor` subcommand (substantial new CLI surface; mentioned in the README's push-vs-poll section as the planned diagnostic for silent push fallback). #7 — `--dangerously-load-development-channels` rename — not in our control; that's Claude Code's flag. Tests ===== 164/164 mcp_cli + a2a_mcp_server tests pass locally (WORKSPACE_ID=00000000-0000-0000-0000-000000000001 pytest …) including 6 new TestTokenFileEnv cases. Wheel builds successfully via scripts/build_runtime_package.py with the new README markers verified in the output. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 14:19:09 -07:00
Hongming Wang	abc3affcb6	test(a2a_tools): cover inbox tool wrappers to restore 75% per-file floor After RFC #2873 iter 4d extracted messaging tools to ``a2a_tools_messaging.py``, the only behavior left in ``a2a_tools.py`` is ``report_activity`` (covered by test_a2a_tools_impl) plus three thin wrappers around inbox state — ``tool_inbox_peek``, ``tool_inbox_pop``, ``tool_wait_for_message`` — which were never directly exercised at the module level. Per-file critical-path coverage dropped to 54.4% on the iter 4d branch, breaking the 75% MCP/inbox/auth floor in ci.yml. Adds ``test_a2a_tools_inbox_wrappers.py`` — 14 focused tests on the three wrappers covering: inbox-disabled fallback (via the _INBOX_NOT_ENABLED_MSG sentinel), input validation (empty/non-str activity_id, non-int peek limit), the timeout clamp contract on wait_for_message (300s ceiling, 0s floor, non-numeric fallback to 60s), JSON-shape pinning, and the limit/activity_id forwarding contract. Result: a2a_tools.py back to 100% covered with the existing impl-tests suite, gate green.	2026-05-05 13:59:58 -07:00
Hongming Wang	3322524b0f	Merge remote-tracking branch 'origin/staging' into refactor/a2a-tools-messaging-extract-rfc2873-iter4d # Conflicts: # workspace/a2a_tools.py	2026-05-05 13:57:44 -07:00
Hongming Wang	0d0840d9d9	Merge branch 'staging' into refactor/a2a-tools-messaging-extract-rfc2873-iter4d	2026-05-05 13:41:55 -07:00
Hongming Wang	fc30b5c9de	Merge pull request #2905 from Molecule-AI/fix/poll-path-message-enrichment fix(workspace): enrich poll-path inbox messages with peer_name/role/card_url	2026-05-05 20:36:41 +00:00
Hongming Wang	23d3f057d3	Merge pull request #2890 from Molecule-AI/refactor/a2a-tools-memory-extract-rfc2873-iter4c refactor(workspace): extract memory tools from a2a_tools.py (RFC #2873 iter 4c)	2026-05-05 20:31:45 +00:00
Hongming Wang	8ca027ddf3	fix(tests): drop unused json + pytest imports Bot lint flagged the two imports as unused (correct — neither is referenced after the file shrank during review). Resolves the two unresolved review threads silently blocking merge per the staging "all conversations resolved" gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 13:26:49 -07:00
Hongming Wang	46a4ef83bb	fix(tests): patch a2a_tools_memory.httpx, not a2a_tools.httpx Iter 4c (#2890) moved tool_commit_memory + tool_recall_memory into a2a_tools_memory.py, which has its own top-level `import httpx`. test_mcp_memory.py + the secret-redact memory tests still patched `a2a_tools.httpx.AsyncClient`, which after the move is the WRONG module's reference — the real call inside the moved tool resolves to `a2a_tools_memory.httpx.AsyncClient` and reaches the network. CI catches this as 7 failures: JSONDecodeError on empty bodies and "All connection attempts failed" on the recall side. Update 7 patch sites to `a2a_tools_memory.httpx.AsyncClient`. The existing tests in `test_a2a_tools_impl.py` were already updated by the iter-4c PR; only these two files were missed. Verified: pytest workspace/tests/test_mcp_memory.py + test_secret_redact.py — 43/43 pass after the fix (both files were red on the iter-4c branch CI). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 13:25:06 -07:00
Hongming Wang	39931acd9c	fix(inbox-uploads): cancel BatchFetcher futures on wait_all timeout The deadline contract was incomplete: wait_all logged the timeout but close() then called executor.shutdown(wait=True), which blocked on the leaked workers — undoing the user-facing timeout. The inbox poll loop would stall indefinitely on a hung /content fetch instead of returning to chat-message processing. Fix: wait_all now flips self._timed_out and cancels queued (not-yet- started) futures; close() reads that flag and switches to shutdown(wait=False, cancel_futures=True) on the timeout path. Currently-running workers can't be interrupted by Python's threading model, but they're now detached daemons whose blocking httpx call no longer gates the next poll. Healthy path (no timeout) keeps the existing drain-and-wait so a still-queued ack POST isn't dropped mid-write. Two new tests pin both legs of the contract end-to-end: - close-after-timeout-doesn't-block: hung worker, wait_all(0.05s) fires the timeout, close() returns in <1s instead of waiting ~5s for the worker to come back. - close-without-timeout-still-drains: 2 slow workers, wait_all completes cleanly, close() drains both ack POSTs. Resolves the BatchFetcher timeout-cancellation finding from the post-merge five-axis review of Phase 5b. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 12:34:41 -07:00
Hongming Wang	81e83c05b7	fix(inbox): drop unused batch_fetcher = None after end-of-batch drain Lint nit from review bot — _drain_uploads() runs and the function immediately advances to the cursor save + return, so the local re-assign is dead code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 11:56:54 -07:00
Hongming Wang	5b5eacbb29	test(inbox): clean up daemon poller thread to prevent test cross-talk test_start_poller_thread_is_daemon spawned a daemon thread with no stop mechanism — the leaked thread polled every 10ms with the test's patched httpx.Client mock STILL ACTIVE for ~50ms after the test scope. Later tests that re-patched httpx.Client + asserted call counts on fetch_and_stage / Client construction got their assertions inflated by the leaked thread's iterations. Symptoms: test_poll_once_skips_chat_upload_row_from_queue saw fetch_and_stage called twice instead of once on Python 3.11 CI; test_batch_fetcher_owns_client_when_not_supplied saw two Client constructions instead of one in the full local suite. Both surfaced only after Phase 5b's BatchFetcher refactor changed the timing window that allowed the leaked thread to fire mid-test. Fix: extend start_poller_thread with an optional stop_event kwarg (backward compatible — production callers pass None and rely on the daemon flag for process-exit cleanup). The test now signals + joins on stop_event before exiting scope, so the thread is gone before any later test patches httpx. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 11:47:14 -07:00
Hongming Wang	30fb507165	feat(poll-upload): phase 5b — concurrent BatchFetcher + httpx client reuse Resolves the two remaining findings from the Phase 1-4 retrospective review (the Python-side counterparts to phase 5a): 1. Important — inbox_uploads.fetch_and_stage blocked the inbox poll loop synchronously per row. A user dragging 4 files into chat at once would stall the poller for 4× per-fetch latency before the chat message reached the agent. Add BatchFetcher: a thread-pool wrapper (default 4 workers) that submits fetches concurrently and exposes wait_all() as the barrier the inbox loop calls before processing the chat-message row that references the uploads. The drain barrier is the correctness invariant: rewrite_request_body must observe a populated URI cache when it walks the chat-message row's parts. _poll_once now drains the BatchFetcher inline before the first non-upload row, AND at end-of-batch (case: batch contains only upload rows; the corresponding chat message arrives in a later poll, but the future-poll-races-current-fetch race is closed). 2. Nit — fetch_and_stage created two httpx.Client instances per row (one for GET /content, one for POST /ack). Refactor so a single client serves both calls. When called from BatchFetcher, the batch-shared client serves every row's GET + ack — so the second fetch reuses the TCP+TLS handshake from the first. Comprehensive tests: - 13 new inbox_uploads tests: - fetch_and_stage with supplied client: zero httpx.Client constructions, GET+POST through the same client, caller's client not closed (lifecycle owned by caller). - fetch_and_stage without supplied client: exactly one httpx.Client constructed (was 2 pre-fix), closed on the way out. - BatchFetcher: 3 rows × 120ms = parallel completion < 250ms (vs. 
~360ms serial), URI cache hot when wait_all returns, per-row failure isolation, single-client reuse across all submits, idempotent close, submit-after-close raises, owned-vs-supplied client lifecycle, no-op wait_all on empty batch, graceful httpx-missing degradation. - 3 new inbox tests: - poll_once drains uploads before processing the chat-message row (in-place mutation of row['request_body'] proves the URI was rewritten BEFORE message_from_activity returned). - poll_once with only upload rows still drains at end-of-batch. - poll_once with no upload rows never constructs a BatchFetcher (zero overhead on the no-upload happy path). 133 total inbox + inbox_uploads tests pass; 0 regressions. Closes the chat-upload poll-mode-perf gap end-to-end. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 11:26:55 -07:00
Hongming Wang	3d0a7c381b	fix(workspace): enrich poll-path inbox messages with peer_name/role/card_url Reported: agents receiving messages via inbox_peek / wait_for_message get a plain envelope — text + peer_id + kind only. The push-path (a2a_mcp_server._build_channel_notification) already enriches the meta dict with peer_name, peer_role, and agent_card_url from the registry cache, but the poll-path returns InboxMessage.to_dict() unchanged. So a Claude Code host with channel-push gets the friendly identity, but every other MCP client (and Claude Code with push disabled — the universal default) sees plain text. This silently breaks the contract documented in a2a_mcp_server.py:303-345: > In both paths the same fields apply: kind, peer_id, peer_name, > peer_role, agent_card_url, activity_id Fix: a2a_tools._enrich_inbound_for_agent() — same shape as the push-path's enrichment, called from tool_inbox_peek and tool_wait_for_message. Cache-first non-blocking (5-min TTL via enrich_peer_metadata_nonblocking, same helper push uses), so a cache miss returns immediately with bare envelope and warms the cache for the next poll. agent_card_url is constructable from peer_id alone and surfaces even on cache miss, so the receiving agent always has a single endpoint to hit for capabilities. Degradation paths: - canvas_user (peer_id="") → pass through unchanged, no enrichment - a2a_client unavailable (test harness without registry) → bare envelope, agent still gets text + peer_id + kind + activity_id Tests: - canvas_user passes through unchanged - peer_agent cache hit → name + role + agent_card_url all present - peer_agent cache miss → agent_card_url still constructed - a2a_client unavailable → bare envelope, no crash All 4 pass against fixed code. Without the fix, the cache-hit and cache-miss tests would fail (peer_name/peer_role/agent_card_url keys absent from to_dict's output). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 11:08:14 -07:00
Hongming Wang	6470e5f41b	Merge pull request #2887 from Molecule-AI/refactor/a2a-tools-delegation-extract-rfc2873-iter4b refactor(workspace): extract delegation handlers from a2a_tools.py (RFC #2873 iter 4b)	2026-05-05 17:40:40 +00:00
Hongming Wang	abba16beb4	Merge pull request #2883 from Molecule-AI/refactor/a2a-tools-rbac-extract-rfc2873-iter4a refactor(workspace): extract RBAC helpers from a2a_tools.py (RFC #2873 iter 4a)	2026-05-05 16:59:36 +00:00
Hongming Wang	9c752e0673	Merge pull request #2879 from Molecule-AI/refactor/mcp-cli-split-rfc2873-iter3 refactor(workspace): split mcp_cli.py into focused modules (RFC #2873 iter 3)	2026-05-05 16:58:05 +00:00
Hongming Wang	8e5d193761	fix(tests): retarget get_peers_with_diagnostic patches to a2a_tools_messaging (RFC #2873 iter 4d) Inherits the iter 4b test retarget commit through rebase. Adds the remaining 4 patch sites in test_a2a_multi_workspace.py that target get_peers_with_diagnostic — that call site moved from a2a_tools to a2a_tools_messaging in this PR. Refs RFC #2873 iter 4d.	2026-05-05 09:52:15 -07:00
Hongming Wang	3e0d2e650a	refactor(workspace): extract messaging tools from a2a_tools.py to a2a_tools_messaging.py (RFC #2873 iter 4d) Fourth slice of the a2a_tools.py split (stacked on iter 4c). Owns the four human-and-peer messaging MCP tools + the chat-upload helper: * _upload_chat_files — stage local paths to /chat/uploads * tool_send_message_to_user — push canvas-chat via /notify * tool_list_peers — discover peers across registered workspaces * tool_get_workspace_info — JSON-encode workspace info * tool_chat_history — fetch prior conversation rows with a peer a2a_tools.py shrinks from 508 → 213 LOC (−295). The remaining 213 is just report_activity + back-compat re-exports. Inbox tools (tool_inbox_peek/pop/wait_for_message) deferred to iter 4e. Layered architecture: messaging depends on a2a_tools_rbac (iter 4a), a2a_client, platform_auth — NOT on kitchen-sink a2a_tools. An import-contract test pins this so future refactors that add `from a2a_tools import …` fail in CI. Tests: * 28 patch sites in TestToolSendMessageToUser + TestToolListPeers + TestToolGetWorkspaceInfo + TestChatHistory retargeted from `a2a_tools.{httpx, get_peers_*, get_workspace_info, _upload_chat_files, _peer_*, list_registered_workspaces}` to `a2a_tools_messaging.…` because the call sites moved. * test_a2a_tools_messaging.py adds 7 new tests: - 5 alias drift gates - 2 import-contract tests (no top-level a2a_tools dep + a2a_tools surfaces every messaging symbol) 137 tests total in the a2a_tools suite, all green. Refs RFC #2873.	2026-05-05 09:50:47 -07:00
Hongming Wang	210a26d31a	refactor(workspace): extract memory tools from a2a_tools.py to a2a_tools_memory.py (RFC #2873 iter 4c) Third slice of the a2a_tools.py split (stacked on iter 4b). Owns the two persistent-memory MCP tools: * tool_commit_memory — write to /workspaces/:id/memories with RBAC + GLOBAL-scope tier-zero enforcement * tool_recall_memory — search /workspaces/:id/memories with RBAC a2a_tools.py shrinks from 609 → 508 LOC (−101). Both handlers depend ONLY on a2a_tools_rbac (iter 4a), a2a_client, and the platform's /memories endpoint — no entanglement with delegation or messaging. Side-effects of the layered architecture: a2a_tools_memory's import contract is "depends on a2a_tools_rbac, never on a2a_tools" — the kitchen-sink module is for back-compat re-exports only. A test pins this so a future refactor that re-introduces `from a2a_tools import …` fails in CI. Tests: * 49 patch sites in TestToolCommitMemory + TestToolRecallMemory retargeted from `a2a_tools.{_check_memory_*, _is_root_workspace, httpx.AsyncClient}` to `a2a_tools_memory.…` because the call sites moved. * test_a2a_tools_memory.py adds 4 new tests (alias drift gate + import-contract + a2a_tools-side re-export). 117 tests total (77 impl + 28 rbac + 8 delegation + 4 memory), all green. Refs RFC #2873.	2026-05-05 09:50:39 -07:00
Hongming Wang	be18b9c8f9	fix(tests): retarget remaining a2a_tools delegation patches to a2a_tools_delegation CI caught two test files I missed in the original iter 4b retarget: test_a2a_multi_workspace.py + test_delegation_sync_via_polling.py patch a2a_tools.{discover_peer, send_a2a_message, _delegate_sync_via_polling, httpx.AsyncClient} but those call sites moved to a2a_tools_delegation in this PR. 17 patch sites retargeted; 30 tests now green. Refs RFC #2873 iter 4b.	2026-05-05 09:50:30 -07:00
Hongming Wang	e72f9ad107	refactor(workspace): extract delegation handlers from a2a_tools.py to a2a_tools_delegation.py (RFC #2873 iter 4b) Second slice of the a2a_tools.py split (stacked on iter 4a). Owns the three delegation MCP tools + the RFC #2829 PR-5 sync-via-polling helper they share: * tool_delegate_task — synchronous delegation * tool_delegate_task_async — fire-and-forget * tool_check_task_status — poll the platform's /delegations log * _delegate_sync_via_polling — durable async + poll for terminal status * _SYNC_POLL_INTERVAL_S / _SYNC_POLL_BUDGET_S constants a2a_tools.py shrinks from 915 → 609 LOC (−306). Stacked on iter 4a's RBAC extraction; uses `from a2a_tools_rbac import auth_headers_for_heartbeat` as its auth-header source. The lazy `from a2a_tools import report_activity` inside tool_delegate_task breaks the circular-import cycle (a2a_tools imports the delegation re-exports at module-load; delegation handler needs report_activity at CALL time). A dedicated test pins this contract. Tests: * 77 existing test_a2a_tools_impl.py tests pass after retargeting 20 patch sites in TestToolDelegateTask + TestToolDelegateTaskAsync + TestToolCheckTaskStatus from `a2a_tools.foo` to `a2a_tools_delegation.foo` (foo ∈ {discover_peer, send_a2a_message, httpx.AsyncClient}). The patches need to target the new module because that's where the call sites live now. * test_a2a_tools_delegation.py adds 8 new tests: - 6 alias drift gates (`a2a_tools.tool_delegate_task is …`) - 2 import-contract tests (no top-level circular dep + a2a_tools surfaces every delegation symbol) - 1 sync-poll budget invariant 113 tests total (77 impl + 28 rbac + 8 delegation), all green. Refs RFC #2873.	2026-05-05 05:00:52 -07:00
Hongming Wang	0c461eb9f1	refactor(workspace): extract RBAC helpers from a2a_tools.py to a2a_tools_rbac.py (RFC #2873 iter 4a) First slice of the a2a_tools.py (991 LOC) split — single-concern module for the workspace's RBAC + auth-header layer: * _ROLE_PERMISSIONS canonical table * _get_workspace_tier * _check_memory_write_permission * _check_memory_read_permission * _is_root_workspace * _auth_headers_for_heartbeat a2a_tools.py shrinks from 991 → 915 LOC. Internal call sites (15 references) work unchanged because the bare names are re-imported at module-level — Python's local-then-module name resolution still finds them in a2a_tools's namespace, so existing tests' patch("a2a_tools._foo", …) keeps working. The RBAC layer can now evolve independently of the 18 tool handlers. Adding a new role or capability action touches one file, not the kitchen-sink module. Tests: * 77 existing test_a2a_tools_impl.py pass unchanged. * test_a2a_tools_rbac.py adds 28 focused tests: - 6 alias drift-gate tests (`_foo is rbac.foo`) - 4 get_workspace_tier env+config branches - 2 is_root_workspace tier branches - 6 check_memory_write_permission roles + override branches - 3 check_memory_read_permission scenarios - 3 auth_headers_for_heartbeat platform_auth branches - 4 ROLE_PERMISSIONS table invariants * Direct coverage for the helper module (was previously only exercised through 991-LOC tool-handler tests). Refs RFC #2873.	2026-05-05 04:43:16 -07:00
Hongming Wang	f81813f708	feat(rfc): poll-mode chat upload — phase 2 workspace inbox extension Workspace-side fetcher for the platform-staged chat uploads written by phase 1. Stack atop feat/poll-mode-chat-upload-phase1. Wire shape — the platform writes one activity_logs row per uploaded file with `activity_type=a2a_receive`, `method=chat_upload_receive`, and a `request_body={file_id, name, mimeType, size, uri}` carrying the synthetic `platform-pending:<wsid>/<fid>` URI. Workspace-side flow (new module workspace/inbox_uploads.py): 1. Fetch via GET /workspaces/:id/pending-uploads/:file_id/content 2. Stage to /workspace/.molecule/chat-uploads/<32-hex>-<sanitized> (same on-disk shape as internal_chat_uploads.py — agent-side URI resolvers see no contract change) 3. POST /workspaces/:id/pending-uploads/:file_id/ack 4. Cache `platform-pending: → workspace:` so the eventual chat message that REFERENCES the upload (separate, later activity row) gets URI-rewritten before the agent sees it. Inbox poller extension (workspace/inbox.py): - is_chat_upload_row(row) discriminator on `method` - upload-receive rows trigger fetch_and_stage and are NOT enqueued as InboxMessages (they're side-effect rows, not chat messages) - cursor advances past them regardless of fetch outcome — a permanent /content failure must not stall the cursor and block real chat traffic - message_from_activity calls rewrite_request_body to swap platform-pending: URIs to local workspace: URIs in subsequent chat messages' file parts. Cache miss leaves the URI untouched so the agent surfaces an unresolvable URI rather than the inbox silently dropping the part. Filename sanitization mirrors workspace-server/internal/handlers/chat_files.go::SanitizeFilename and workspace/internal_chat_uploads.py::sanitize_filename — pinned by the existing parity test suites.
Coverage: 100% on inbox_uploads.py; the inbox.py extension is fully covered by three new tests in test_inbox.py (skip-from-queue, cursor-advance-past-broken-fetch, URI-rewrite ordering).	2026-05-05 04:39:02 -07:00
Hongming Wang	28ef75d25e	refactor(workspace): split mcp_cli.py (626 LOC) into focused modules (RFC #2873 iter 3) Splits the standalone molecule-mcp wrapper into three single-concern modules per the OSS-shape refactor program: * mcp_heartbeat.py — register POST + heartbeat loop + auth-failure escalation + inbound-secret persistence * mcp_workspace_resolver.py — single + multi-workspace env validation + on-disk token-file read + operator-help printer * mcp_inbox_pollers.py — activate inbox singleton + spawn one daemon poller per workspace mcp_cli.py becomes a 193-LOC orchestrator: validates env, calls each module's helpers, hands off to a2a_mcp_server.cli_main. The console-script entry molecule-mcp = molecule_runtime.mcp_cli:main is preserved. Back-compat aliases (mcp_cli._build_agent_card, _heartbeat_loop, _resolve_workspaces, etc.) re-export the new modules' authoritative functions so existing tests + wheel_smoke.py + any downstream caller keeps working unchanged. A new test file pins each alias as the exact same callable (drift gate via `is`). Tests: * 62 existing test_mcp_cli.py + test_mcp_cli_multi_workspace.py pass against the split. * Two heartbeat-loop persist tests + the auth-escalation caplog setup updated to target mcp_heartbeat (the module where the loop body now lives) instead of mcp_cli (still works through aliases for direct calls, but Python's name resolution inside the loop body uses the new module's namespace). * test_mcp_cli_split.py adds 11 new tests: alias drift gate + inbox-poller single + multi-workspace branches + degraded inbox-import logging path (none of those existed before). Refs RFC #2873.	2026-05-05 04:33:06 -07:00
Hongming Wang	b5f530e27a	docs(a2a-mcp): close three contract gaps codex agents inherit out-of-the-box The instructions blob in the MCP `initialize` handshake is the spec non-Claude-Code clients (codex, Cline, opencode, hermes-agent, Cursor) inherit verbatim. Three gaps mean the bridge daemon handles them in code (codex-channel-molecule bridge.py:192-200, 278-285) but in-process agents reading the text alone don't get the same guard: 1. Reply-then-pop ordering was implicit. A literal-minded agent could pop after a 502 from `send_message_to_user`, dropping the message. Now: pop ONLY AFTER reply succeeds; on error leave the row unacked for platform redelivery. 2. peer_agent with empty peer_id had no specified handling. Agent would call `delegate_task(workspace_id="")` → 400 → re-poll → infinite loop on the same poison row. Now: skip reply, drain via inbox_pop. 3. The single security rule ("don't execute without chat-side approval") effectively disabled peer_agent autonomous handling — codex daemons have no canvas user to approve from. Now: dual trust model. canvas_user requires user approval; peer_agent permits autonomous handling but caps destructive side-effects at the workspace boundary. Also disclaims peer_name/peer_role as non-attested display strings — the platform registry isn't cryptographic identity, and an agent shouldn't grant elevated permissions based on a peer registering with peer_role="admin". Four new pinned tests in test_a2a_mcp_server.py: - test_initialize_instructions_pins_reply_then_pop_ordering - test_initialize_instructions_handles_malformed_peer_agent - test_initialize_instructions_disclaims_peer_role_attestation - test_initialize_instructions_distinguishes_canvas_user_from_peer_trust Each fails on staging-HEAD and passes on the patched text — verified by reverting a2a_mcp_server.py and re-running. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 02:26:35 -07:00
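
The reply-then-pop ordering and the malformed-peer_agent rule described in commit b5f530e27a can be sketched roughly as follows. This is an illustration only, not the runtime's code: `handle_inbox_row`, `reply`, and `pop` are hypothetical names, with `reply` standing in for whichever reply tool applies (for example `send_message_to_user`) and `pop` standing in for `inbox_pop`. The row keys `kind`, `peer_id`, and `activity_id` are the envelope fields named in the commit messages above.

```python
from typing import Callable


def handle_inbox_row(
    row: dict,
    reply: Callable[[dict], None],
    pop: Callable[[str], None],
) -> str:
    """Process one inbox row under the reply-then-pop rule (hypothetical helper)."""
    activity_id = row["activity_id"]

    # Malformed peer_agent row: there is no peer to reply to, so skip the
    # reply and drain the row so it is not re-polled forever.
    if row.get("kind") == "peer_agent" and not row.get("peer_id"):
        pop(activity_id)
        return "drained"

    try:
        reply(row)
    except Exception:
        # Reply failed (for example a 502): leave the row unacked so the
        # platform can redeliver it.
        return "left_unacked"

    # Pop only after the reply has succeeded.
    pop(activity_id)
    return "acked"
```

The design point is that `pop` appears on exactly two paths: after a successful reply, and for rows that can never be replied to. A failed reply never reaches it.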


266 Commits