forked from molecule-ai/molecule-core
Closes part of #2790 (Phase B).

Prevents a recurrence of the PR #2766 → PR #2771 cycle: PR #2766 added ``source_workspace_id`` to four tools' ``input_schema`` and tool implementations, but the dispatcher in ``a2a_mcp_server.handle_tool_call`` silently dropped the kwarg for ``commit_memory`` / ``recall_memory`` / ``chat_history`` / ``get_workspace_info``. Schema lied; LLMs populated the param; every call fell back to ``WORKSPACE_ID``, defeating multi-tenant isolation. Existing dispatcher tests asserted return-value substrings (``"working" in result``) instead of kwarg flow, so the bug shipped to main and was only caught by re-reviewing post-merge.

This change adds an AST-driven gate. For every ToolSpec in platform_tools.registry.TOOLS, the gate finds the matching ``elif name == "<tool>"`` arm in a2a_mcp_server.py and asserts that every property declared in input_schema.properties is read by an ``arguments.get("<property>", ...)`` call inside that arm. A new schema field the dispatcher forgets to forward fails CI loudly.

Three tests:

- test_every_dispatch_arm_reads_every_schema_property: main drift gate. Walks registry, matches dispatch arms by name, diffs declared vs read keys.
- test_dispatch_arms_reach_every_registered_tool: inverse direction. A registered tool with no dispatch arm is "Unknown tool" at runtime, even though docs/wrappers/schema all advertise it. Catches PRs that add a ToolSpec but forget the dispatcher.
- test_drift_gate_self_check_finds_known_arms: pin the AST parser. If handle_tool_call is refactored into a different shape (dict dispatch, registry-driven, etc.) and _load_dispatch_arms returns {}, the main gate vacuously passes; this self-check makes that failure mode explicit by requiring 12 known arms to be discovered.

Verified the gate catches the PR #2766 bug: stripping ``source_workspace_id=arguments.get(...)`` from the commit_memory arm fails the gate with a descriptive error pointing at the missing kwarg and referencing the prior incident. Restored → 3 tests pass.

Suite: 1733 passed (was 1730 + 3 new), 3 skipped, 2 xfailed.

Why AST, not runtime invocation: the runtime mock-based tests in test_a2a_mcp_server.py already assert kwargs flow correctly for four explicitly-tested tools. This gate is cheaper (~1ms), catches new properties before someone has to remember the runtime test, and runs as a structural invariant.

Phase A (Python coverage floor) and Phase C (molecule-mcp e2e harness) remain in #2790 as separate follow-ups.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

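
The gate described in the commit message above can be illustrated with a minimal sketch. It is not the code from this change: ``DISPATCHER_SOURCE`` and ``DECLARED`` are hypothetical stand-ins for a2a_mcp_server.py and the registry's input_schema properties, and ``load_dispatch_arms`` / ``find_drift`` are illustrative names. It assumes the dispatcher is an ``if``/``elif`` chain comparing ``name`` to string literals and reading values via ``arguments.get("<key>", ...)``.

```python
import ast

# Hypothetical stand-in for the dispatcher source; the real file is not shown here.
DISPATCHER_SOURCE = '''
async def handle_tool_call(name, arguments):
    if name == "commit_memory":
        return await tool_commit_memory(
            content=arguments.get("content", ""),
            source_workspace_id=arguments.get("source_workspace_id"),
        )
    elif name == "recall_memory":
        # Bug of the PR #2766 kind: source_workspace_id is never forwarded.
        return await tool_recall_memory(query=arguments.get("query", ""))
    return "Unknown tool"
'''

# Hypothetical declared schema properties, keyed by tool name.
DECLARED = {
    "commit_memory": {"content", "source_workspace_id"},
    "recall_memory": {"query", "source_workspace_id"},
}


def _arm_name(test):
    """Return the tool name if `test` is `name == "<literal>"`, else None."""
    if (
        isinstance(test, ast.Compare)
        and isinstance(test.left, ast.Name)
        and test.left.id == "name"
        and len(test.ops) == 1
        and isinstance(test.ops[0], ast.Eq)
        and isinstance(test.comparators[0], ast.Constant)
        and isinstance(test.comparators[0].value, str)
    ):
        return test.comparators[0].value
    return None


def _keys_read(body):
    """Collect string keys passed to arguments.get(...) inside one arm body."""
    keys = set()
    for stmt in body:
        for node in ast.walk(stmt):
            if (
                isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "get"
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == "arguments"
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)
            ):
                keys.add(node.args[0].value)
    return keys


def load_dispatch_arms(source, func_name="handle_tool_call"):
    """Map each tool name to the set of argument keys its arm reads."""
    arms = {}
    for node in ast.walk(ast.parse(source)):
        is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_func and node.name == func_name:
            # An `elif` is an ast.If nested in the parent's orelse, so
            # walking the function visits every arm of the chain.
            for inner in ast.walk(node):
                if isinstance(inner, ast.If):
                    tool = _arm_name(inner.test)
                    if tool is not None:
                        arms[tool] = _keys_read(inner.body)
    return arms


def find_drift(declared, arms):
    """Return {tool: [declared properties the arm never reads]}."""
    drift = {}
    for tool, props in declared.items():
        missing = props - arms.get(tool, set())
        if missing:
            drift[tool] = sorted(missing)
    return drift


arms = load_dispatch_arms(DISPATCHER_SOURCE)
assert arms, "no dispatch arms found; the dispatcher shape may have changed"
print(find_drift(DECLARED, arms))
```

The ``assert arms`` line plays the role of the self-check test: without it, an empty result from the parser would let a drift comparison pass vacuously. A tool with no arm at all shows up in this sketch with every declared property missing, which loosely mirrors the inverse-direction test.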
||
|---|---|---|
| adapters | ||
| builtin_tools | ||
| lib | ||
| molecule_audit | ||
| platform_tools | ||
| plugins_registry | ||
| policies | ||
| scripts | ||
| skill_loader | ||
| tests | ||
| .coveragerc | ||
| a2a_cli.py | ||
| a2a_client.py | ||
| a2a_executor.py | ||
| a2a_mcp_server.py | ||
| a2a_tools.py | ||
| adapter_base.py | ||
| agent.py | ||
| agents_md.py | ||
| boot_routes.py | ||
| build-all.sh | ||
| card_helpers.py | ||
| config.py | ||
| configs_dir.py | ||
| consolidation.py | ||
| coordinator.py | ||
| Dockerfile | ||
| entrypoint.sh | ||
| event_log.py | ||
| events.py | ||
| executor_helpers.py | ||
| heartbeat.py | ||
| inbox.py | ||
| initial_prompt.py | ||
| internal_chat_uploads.py | ||
| internal_file_read.py | ||
| main.py | ||
| mcp_cli.py | ||
| molecule_ai_status.py | ||
| not_configured_handler.py | ||
| platform_auth.py | ||
| platform_inbound_auth.py | ||
| plugins.py | ||
| preflight.py | ||
| prompt.py | ||
| pytest.ini | ||
| rebuild-runtime-images.sh | ||
| requirements.txt | ||
| runtime_wedge.py | ||
| secret_redactor.py | ||
| shared_runtime.py | ||
| smoke_mode.py | ||
| transcript_auth.py | ||
| watcher.py | ||