molecule-core/workspace/tests/conftest.py
Hongming Wang e9a59cda3b feat(platform): single-source-of-truth tool registry — adapters consume, no drift
Establishes workspace/platform_tools/registry.py as THE place tool
naming and docs live. Every consumer reads from it; nothing duplicates
the source. Closes the architectural gap behind the doc/tool drift
discussion 2026-04-28 — adding hundreds of future runtime SDK adapters
should not require touching tool names anywhere except the registry.

What the registry owns

  ToolSpec dataclass with: name, short (one-line description), when_to_use
  (multi-paragraph agent-facing usage guidance), input_schema (JSON Schema),
  impl (the actual coroutine in a2a_tools.py), section ('a2a' | 'memory').

  TOOLS list with 8 entries — delegate_task, delegate_task_async,
  check_task_status, list_peers, get_workspace_info, send_message_to_user,
  commit_memory, recall_memory.

What now reads from the registry

  - workspace/a2a_mcp_server.py
      The hardcoded TOOLS list (167 lines of hand-maintained dicts) is
      gone. Replaced with a 6-line list comprehension over the registry.
      MCP description = spec.short. inputSchema = spec.input_schema.

  - workspace/executor_helpers.py
      get_a2a_instructions(mcp=True) and get_hma_instructions() now
      GENERATE the agent-facing system-prompt text from the registry.
      Heading + per-tool bullet (spec.short) + per-tool when_to_use +
      a section-specific footer. No more hand-maintained instruction
      blocks that drift from reality.

  - workspace/builtin_tools/delegation.py
      Renamed delegate_to_workspace -> delegate_task_async to match
      registry. check_delegation_status -> check_task_status. Added
      sync delegate_task @tool wrapping a2a_tools.tool_delegate_task
      (was missing for LangChain runtimes — CP review Issue 3).

  - workspace/builtin_tools/memory.py
      Renamed search_memory -> recall_memory to match registry.

  - workspace/adapter_base.py, workspace/main.py
      Bundle all 7 core tools (was 6) into all_tools / base_tools.

  - workspace/coordinator.py, shared_runtime.py, policies/routing.py
      Updated system-prompt-text references to use the registry names.

Structural alignment tests

  workspace/tests/test_platform_tools.py — 9 tests pin every
  registry-to-adapter mapping:
    - registry names are unique
    - a2a + memory partition is complete (no orphans)
    - by_name lookup works
    - MCP server registers exactly the registry's tool set
    - MCP description equals registry.short for every tool
    - MCP inputSchema equals registry.input_schema for every tool
    - get_a2a_instructions text contains every a2a tool name
    - get_hma_instructions text contains every memory tool name
    - pre-rename names (delegate_to_workspace, search_memory,
      check_delegation_status) cannot leak back

  Adding a future tool means adding one ToolSpec; the test failure
  list tells the author exactly which adapter to update.

Adapter pattern for future SDK support

  When (e.g.) AutoGen or Pydantic AI gets adapters, the only work
  needed for tool surfacing is "wrap registry.TOOLS in your SDK's
  tool format." Names, descriptions, schemas, impl come from the
  registry — adapter author writes zero strings.

Why this needed to ship now

  PR #2237 (already in staging) injected MCP-world docs as the
  default system-prompt content. Without the registry, those docs
  said "delegate_task" while LangChain runtimes only had
  "delegate_to_workspace" — workers see docs for tools that don't
  exist (CP review Issue 1+3). PR #2239 was a tactical rename;
  this PR is the structural fix that prevents the same class of
  drift from recurring as new adapters ship.

  PR #2239 was closed in favor of this — same renames, plus the
  registry, plus structural tests. Single coherent change.

Tests: 1232 pass, 2 xfailed (pre-existing). 9 new in
test_platform_tools.py; 4 alignment tests in test_prompt.py from
#2237 still pass; original test_executor_helpers tests adapted to
the registry-driven world.

Refs: CP review Issues 1, 2, 3, 5; project memory
project_runtime_native_pluggable.md (platform owns A2A);
project memory feedback_doc_tool_alignment.md (this is the structural
fix for the tactical lesson).
2026-04-28 17:11:36 -07:00

271 lines
12 KiB
Python

"""Shared fixtures and module mocks for workspace-template tests.
Mocks the a2a SDK modules before any test imports a2a_executor,
since the a2a SDK is a heavy external dependency.
"""
import sys
from types import ModuleType
from unittest.mock import MagicMock
def _make_a2a_mocks():
"""Create mock modules for the a2a SDK with real base classes."""
# a2a.server.agent_execution needs a real AgentExecutor base class
agent_execution_mod = ModuleType("a2a.server.agent_execution")
class AgentExecutor:
"""Stub base class for LangGraphA2AExecutor."""
pass
class RequestContext:
"""Stub for type hints."""
pass
agent_execution_mod.AgentExecutor = AgentExecutor
agent_execution_mod.RequestContext = RequestContext
# a2a.server.events needs a real EventQueue reference
events_mod = ModuleType("a2a.server.events")
class EventQueue:
"""Stub for type hints."""
pass
events_mod.EventQueue = EventQueue
# a2a.server.tasks needs a TaskUpdater stub whose async methods are no-ops.
# In tests, TaskUpdater calls go to this stub rather than the real SDK so
# event_queue.enqueue_event is only called via explicit executor code paths.
tasks_mod = ModuleType("a2a.server.tasks")
class TaskUpdater:
"""Stub TaskUpdater — no-op async methods for unit tests."""
def __init__(self, event_queue, task_id, context_id, *args, **kwargs):
self.event_queue = event_queue
self.task_id = task_id
self.context_id = context_id
async def start_work(self, message=None):
pass
async def complete(self, message=None):
pass
async def failed(self, message=None):
pass
async def add_artifact(
self, parts, artifact_id=None, name=None, metadata=None,
append=None, last_chunk=None, extensions=None
):
pass
tasks_mod.TaskUpdater = TaskUpdater
# a2a.types needs Part stub for artifact construction (v1: Part takes text= directly, no TextPart)
types_mod = ModuleType("a2a.types")
class Part:
"""Stub for A2A Part (v1: takes text= kwarg directly)."""
def __init__(self, text=None, root=None, **kwargs):
self.text = text
types_mod.Part = Part
# a2a.helpers (v1: moved from a2a.utils, renamed new_agent_text_message
# → new_text_message). Mock both names — production code only calls
# new_text_message, but if any test still references the old name it
# gets the same lambda for backward compat during the rename rollout.
helpers_mod = ModuleType("a2a.helpers")
helpers_mod.new_text_message = lambda text, **kwargs: text
helpers_mod.new_agent_text_message = helpers_mod.new_text_message
# Register all module paths
a2a_mod = ModuleType("a2a")
a2a_server_mod = ModuleType("a2a.server")
sys.modules["a2a"] = a2a_mod
sys.modules["a2a.server"] = a2a_server_mod
sys.modules["a2a.server.agent_execution"] = agent_execution_mod
sys.modules["a2a.server.events"] = events_mod
sys.modules["a2a.server.tasks"] = tasks_mod
sys.modules["a2a.types"] = types_mod
sys.modules["a2a.helpers"] = helpers_mod
def _make_langchain_mocks():
"""Create mock modules for langchain_core so coordinator.py can be imported."""
langchain_core_mod = ModuleType("langchain_core")
langchain_core_tools_mod = ModuleType("langchain_core.tools")
# Make @tool a no-op decorator
langchain_core_tools_mod.tool = lambda f: f
sys.modules["langchain_core"] = langchain_core_mod
sys.modules["langchain_core.tools"] = langchain_core_tools_mod
def _make_tools_mocks():
"""Create mock modules for tools.* so adapters can be imported in tests."""
tools_mod = ModuleType("builtin_tools")
tools_mod.__path__ = [] # Make it a proper package
tools_delegation_mod = ModuleType("builtin_tools.delegation")
tools_delegation_mod.delegate_task = MagicMock()
tools_delegation_mod.delegate_task.name = "delegate_task"
tools_delegation_mod.delegate_task_async = MagicMock()
tools_delegation_mod.delegate_task_async.name = "delegate_task_async"
tools_delegation_mod.check_task_status = MagicMock()
tools_delegation_mod.check_task_status.name = "check_task_status"
tools_approval_mod = ModuleType("builtin_tools.approval")
tools_approval_mod.request_approval = MagicMock()
tools_approval_mod.request_approval.name = "request_approval"
tools_memory_mod = ModuleType("builtin_tools.memory")
tools_memory_mod.commit_memory = MagicMock()
tools_memory_mod.commit_memory.name = "commit_memory"
tools_memory_mod.recall_memory = MagicMock()
tools_memory_mod.recall_memory.name = "recall_memory"
tools_sandbox_mod = ModuleType("builtin_tools.sandbox")
tools_sandbox_mod.run_code = MagicMock()
tools_sandbox_mod.run_code.name = "run_code"
tools_a2a_mod = ModuleType("builtin_tools.a2a_tools")
tools_a2a_mod.delegate_task = MagicMock()
tools_a2a_mod.list_peers = MagicMock()
tools_a2a_mod.get_peers_summary = MagicMock()
tools_awareness_mod = ModuleType("builtin_tools.awareness_client")
tools_awareness_mod.get_awareness_config = MagicMock(return_value=None)
# tools.telemetry — provide constants and no-op callables used by a2a_executor
from contextvars import ContextVar
tools_telemetry_mod = ModuleType("builtin_tools.telemetry")
tools_telemetry_mod.GEN_AI_SYSTEM = "gen_ai.system"
tools_telemetry_mod.GEN_AI_REQUEST_MODEL = "gen_ai.request.model"
tools_telemetry_mod.GEN_AI_OPERATION_NAME = "gen_ai.operation.name"
tools_telemetry_mod.GEN_AI_USAGE_INPUT_TOKENS = "gen_ai.usage.input_tokens"
tools_telemetry_mod.GEN_AI_USAGE_OUTPUT_TOKENS = "gen_ai.usage.output_tokens"
tools_telemetry_mod.GEN_AI_RESPONSE_FINISH_REASONS = "gen_ai.response.finish_reasons"
tools_telemetry_mod.WORKSPACE_ID_ATTR = "workspace.id"
tools_telemetry_mod.A2A_TASK_ID = "a2a.task_id"
tools_telemetry_mod.A2A_SOURCE_WORKSPACE = "a2a.source_workspace_id"
tools_telemetry_mod.A2A_TARGET_WORKSPACE = "a2a.target_workspace_id"
tools_telemetry_mod.MEMORY_SCOPE = "memory.scope"
tools_telemetry_mod.MEMORY_QUERY = "memory.query"
tools_telemetry_mod._incoming_trace_context = ContextVar("otel_incoming_trace_context", default=None)
tools_telemetry_mod.get_tracer = MagicMock(return_value=MagicMock())
tools_telemetry_mod.setup_telemetry = MagicMock()
tools_telemetry_mod.make_trace_middleware = MagicMock(side_effect=lambda app: app)
tools_telemetry_mod.inject_trace_headers = MagicMock(side_effect=lambda h: h)
tools_telemetry_mod.extract_trace_context = MagicMock(return_value=None)
tools_telemetry_mod.get_current_traceparent = MagicMock(return_value=None)
tools_telemetry_mod.gen_ai_system_from_model = lambda m: m.split(":")[0] if ":" in m else "unknown"
tools_telemetry_mod.record_llm_token_usage = MagicMock()
# tools.audit — provide RBAC helpers and log_event as no-ops
tools_audit_mod = ModuleType("builtin_tools.audit")
tools_audit_mod.log_event = MagicMock(return_value="mock-trace-id")
tools_audit_mod.check_permission = MagicMock(return_value=True)
tools_audit_mod.get_workspace_roles = MagicMock(return_value=(["operator"], {}))
tools_audit_mod.ROLE_PERMISSIONS = {
"admin": {"delegate", "approve", "memory.read", "memory.write"},
"operator": {"delegate", "approve", "memory.read", "memory.write"},
"read-only": {"memory.read"},
}
# tools.hitl — lightweight stubs for the HITL tools
tools_hitl_mod = ModuleType("builtin_tools.hitl")
tools_hitl_mod.pause_task = MagicMock()
tools_hitl_mod.pause_task.name = "pause_task"
tools_hitl_mod.resume_task = MagicMock()
tools_hitl_mod.resume_task.name = "resume_task"
tools_hitl_mod.list_paused_tasks = MagicMock()
tools_hitl_mod.list_paused_tasks.name = "list_paused_tasks"
tools_hitl_mod.requires_approval = MagicMock(side_effect=lambda *a, **kw: (lambda f: f))
tools_hitl_mod.pause_registry = MagicMock()
# builtin_tools.security — load the real module so _redact_secrets is
# available to executor_helpers, a2a_tools, and any other module that
# imports from it. The module is pure-Python with no external deps.
import importlib.util as _ilu
import os as _os
_sec_path = _os.path.join(
_os.path.dirname(_os.path.dirname(_os.path.abspath(__file__))),
"builtin_tools", "security.py",
)
_sec_spec = _ilu.spec_from_file_location("builtin_tools.security", _sec_path)
_sec_mod = _ilu.module_from_spec(_sec_spec)
_sec_spec.loader.exec_module(_sec_mod)
sys.modules["builtin_tools"] = tools_mod
sys.modules["builtin_tools.delegation"] = tools_delegation_mod
sys.modules["builtin_tools.approval"] = tools_approval_mod
sys.modules["builtin_tools.memory"] = tools_memory_mod
sys.modules["builtin_tools.sandbox"] = tools_sandbox_mod
sys.modules["builtin_tools.a2a_tools"] = tools_a2a_mod
sys.modules["builtin_tools.awareness_client"] = tools_awareness_mod
sys.modules["builtin_tools.telemetry"] = tools_telemetry_mod
sys.modules["builtin_tools.audit"] = tools_audit_mod
sys.modules["builtin_tools.hitl"] = tools_hitl_mod
sys.modules["builtin_tools.security"] = _sec_mod
# Install mocks before any test collection imports a2a_executor
if "a2a" not in sys.modules:
_make_a2a_mocks()
# Note: the claude_agent_sdk stub was removed alongside
# workspace/claude_sdk_executor.py (#87 Phase 2). The executor + its
# tests now live in the claude-code template repo, where the real SDK
# IS installed via Dockerfile, so no stub is needed.
if "langchain_core" not in sys.modules:
_make_langchain_mocks()
if "builtin_tools" not in sys.modules or not hasattr(sys.modules.get("builtin_tools"), "__path__"):
_make_tools_mocks()
# Mock additional modules needed by _common_setup in base.py
if "plugins" not in sys.modules:
plugins_mod = ModuleType("plugins")
plugins_mod.load_plugins = MagicMock()
sys.modules["plugins"] = plugins_mod
if "skill_loader" not in sys.modules:
# Add workspace-template to path so real skills.loader can be imported
import importlib.util
_ws_root = str(MagicMock.__module__).replace("unittest.mock", "") # just a trick to get path
import os as _os
_ws_root = _os.path.dirname(_os.path.dirname(_os.path.abspath(__file__)))
if _ws_root not in sys.path:
sys.path.insert(0, _ws_root)
# Import real skills module so LoadedSkill/SkillMetadata are available
skills_mod = ModuleType("skill_loader")
skills_mod.__path__ = [_os.path.join(_ws_root, "skill_loader")]
sys.modules["skill_loader"] = skills_mod
_spec = importlib.util.spec_from_file_location("skill_loader.loader", _os.path.join(_ws_root, "skill_loader", "loader.py"))
_loader_mod = importlib.util.module_from_spec(_spec)
sys.modules["skill_loader.loader"] = _loader_mod
_spec.loader.exec_module(_loader_mod)
if "coordinator" not in sys.modules:
# Try importing real coordinator first
try:
import coordinator as _coord # noqa: F401
except (ImportError, RuntimeError):
coordinator_mod = ModuleType("coordinator")
coordinator_mod.get_children = MagicMock()
coordinator_mod.get_parent_context = MagicMock()
coordinator_mod.build_children_description = MagicMock()
coordinator_mod.route_task_to_team = MagicMock()
coordinator_mod.route_task_to_team.name = "route_task_to_team"
sys.modules["coordinator"] = coordinator_mod
# Don't mock prompt or coordinator if they can be imported from the workspace-template dir
# test_prompt.py and test_coordinator.py need the real modules