molecule-core/workspace/tests
Hongming Wang 7504aba934 feat(tools): tighten send_message_to_user description to forbid pasting URLs in body
Root-cause fix for #118 (chat attachments rendering as plain text links
instead of download chips). User flagged with screenshot 2026-04-26
showing the Design Director agent pasting https://files.catbox.moe/…
in the message body — chat rendered the URL as plain markdown text,
unclickable in the canvas's bubble layout, and unreachable in any SaaS
deployment where the user's browser can't egress to catbox.

The structured `attachments` field already exists, the canvas's
AttachmentChip already renders well, the WebSocket broadcast already
carries attachments verbatim — the missing piece was the LLM choosing
the body over the structured field. Tighten the tool description so it
trains the right behavior.

Three targeted strengthenings:

  1. Top-level tool description: enumerated use case (4) now reads
     "via the `attachments` field (NEVER paste file URLs in `message`)".
     The all-caps NEVER + the explicit field name move the LLM toward
     the structured path on first read.

  2. `message` param: adds an explicit DO NOT rule with rationale.
     Includes the SaaS-reachability reason so operators can grep for
     "SaaS" and find this design constraint instead of re-discovering it
     after a tenant complaint. Calls out catbox.moe + file:// by name as
     concrete examples of forbidden hosts (those are the two we've seen
     in production).

  3. `attachments` param: leads with REQUIRED, lists the bad
     alternatives explicitly (pasting URLs, base64-encoding, telling
     user to look at a path). LLMs handle "use X, NOT Y" framings
     better than "use X" alone — observed during prompt-engineering
     iteration on hermes' tool descriptions.

Tests pin all three load-bearing phrases (4 new in test_a2a_mcp_server.py)
so a future doc edit that softens or drops them fails CI. Brittle by
design — these are prompt-engineering invariants, not implementation
details.

This is the root-cause fix. A defensive canvas-side backstop (auto-
detect download-shaped URLs in body and convert to chips) is a
follow-up that could land separately if the steering proves
insufficient in practice.

Verification:
  - 1190/1190 workspace pytest pass
  - 4 new test_a2a_mcp_server.py cases all green

Closes the steering half of #118. The structured-attachments-only
contract was already enforced server-side (PR #2130 added per-attachment
validation); this PR closes the prompt-side gap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 01:13:11 -07:00
..
adapters chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
__init__.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
conftest.py chore(workspace): drop claude_sdk_executor — Phase 2 of #87 2026-04-27 00:52:55 -07:00
test_a2a_cli.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_a2a_client.py fix(a2a): review-driven hardening — prefix-anchored type check, error_detail cap, shared hint module 2026-04-24 23:47:44 -07:00
test_a2a_executor.py feat(workspace): migrate a2a-sdk from 0.3.x to 1.0.0 (KI-009) (#1974) 2026-04-24 04:43:17 +00:00
test_a2a_mcp_server.py feat(tools): tighten send_message_to_user description to forbid pasting URLs in body 2026-04-27 01:13:11 -07:00
test_a2a_tools_impl.py feat(tools): borrow hermes-style discipline — error/summary caps + sharper MCP descriptions 2026-04-26 23:25:54 -07:00
test_a2a_tools_module.py fix(workspace): tag self-originated A2A POSTs with X-Workspace-ID 2026-04-24 19:54:43 -07:00
test_agent_base_urls.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_agent.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_agents_md.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_approval.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_audit_ledger.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_audit.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_awareness_client_full.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_compliance.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_config.py refactor(test_config): parametrize the 3 yaml-default cases (simplify on #2085) 2026-04-26 02:03:59 -07:00
test_consolidation.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_coordinator_parent.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_coordinator_routing.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_delegation.py fix(delegation): lazy-refresh QUEUED state from platform; live DELEGATION_* events 2026-04-26 16:05:04 -07:00
test_events.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_executor_helpers.py feat(canvas+platform): chat attachments, model selection, deploy/delete UX 2026-04-24 13:27:51 -07:00
test_gh_wrapper.sh chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_governance.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_heartbeat_runtime_metadata.py fix(test): drop unused MagicMock import in test_heartbeat_runtime_metadata 2026-04-26 22:58:21 -07:00
test_heartbeat.py fix(workspace): tag self-originated A2A POSTs with X-Workspace-ID 2026-04-24 19:54:43 -07:00
test_hitl.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_main_initial_prompt.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_mcp_memory.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_memory.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_molecule_ai_status.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_namespaces.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_openclaw_adapter.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_platform_auth.py chore: final open-source cleanup — binary, stale paths, private refs 2026-04-18 00:38:55 -07:00
test_plugins_builtins.py feat(plugin): implement MCPServerAdaptor (issue #847) 2026-04-24 01:42:13 +00:00
test_plugins_registry.py chore: final open-source cleanup — binary, stale paths, private refs 2026-04-18 00:38:55 -07:00
test_plugins.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_pre_stop.py feat(workspace): pre-stop serialization for pause/resume (closes #1386) 2026-04-21 12:40:44 +00:00
test_preflight.py feat(preflight): replace SUPPORTED_RUNTIMES static list with adapter discovery 2026-04-27 00:44:51 -07:00
test_prompt.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_routing_policy.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_runtime_capabilities.py feat(runtime): adapter-declared idle_timeout_override end-to-end 2026-04-26 22:38:01 -07:00
test_runtime_wedge.py chore(workspace): drop claude_sdk_executor — Phase 2 of #87 2026-04-27 00:52:55 -07:00
test_safe_env.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_sandbox.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_secret_redact.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_security_scan.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_skills_loader.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_skills_watcher.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_snapshot_scrub.py feat(workspace): snapshot secret scrubber (closes #823) 2026-04-19 00:32:42 -07:00
test_telemetry.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_temporal_workflow.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_transcript_auth.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
test_watcher.py chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00