RFC: Push instruction layer — tell agents how to reply, not just what arrived #1830

Open
opened 2026-05-25 03:25:21 +00:00 by hongming · 2 comments
Owner

Summary

Add a server-rendered <instructions> block (or a top-level field in the JSON-shaped push) to every outbound channel push. Today, the push tag tells the receiving agent the message kind, sender, and body, but says nothing about what to do with it. New agents (and current ones, on bad days) reply via local stdout/TUI, which the user never sees if they're on mobile.

Problem

Observed today

Verified live 2026-05-25 in 091a9180-805f-4c82-90c7-0c7961b0c007 (CTO's Claude Code instance):

<channel
  source="molecule" source="molecule"          ← bug: duplicated attribute
  kind="canvas_user"
  workspace_id="091a9180-805f-4c82-90c7-0c7961b0c007"
  watching_as="091a9180-805f-4c82-90c7-0c7961b0c007"
  peer_id=""
  method="message/send"
  activity_id="3ff7d004-671a-4b66-af69-3980b7796ef1"
  ts="2026-05-25T01:21:28.304111Z">
hi
</channel>

Nothing in the tag tells me:

  • Which MCP tool to reply with (reply_to_workspace vs send_message_to_user)
  • How to set peer_id: pass peer_id="" for canvas_user, pass peer_id=<from_tag> for peer_agent
  • Not to print to stdout, since the user may be on mobile and never see terminal output
  • Where the agent docs live (when in doubt)

Failure mode

An agent that doesn't know the convention will route the reply to its local terminal. From the user's perspective: the message was delivered, the agent went silent. Silent failure, no error path, no observability.

This isn't hypothetical — I almost did it in this very session before catching myself.

Related context

  • feedback_obs_first_debugging_all_agents says agents must respond via the platform, but the rule lives in agent system prompts, not in the inbound push. New agent runtimes have no way to learn the rule from the message alone.
  • Task #16 (#1675 canvas push regression) shows the canvas/push path is already known-fragile — file + test landed, fix still pending.

Proposal

Shape

Wrap the existing body with a server-rendered preamble:

<channel kind="canvas_user" ...>
  <instructions>
    Reply via mcp__molecule__reply_to_workspace (kind=canvas_user → omit peer_id; kind=peer_agent → pass peer_id from tag).
    Do NOT emit text in your local terminal/stdout — user may not be at the keyboard.
    For attachments, the upload arrives as a separate `chat_upload_receive` push with the attachments array.
    Docs: https://docs.moleculesai.app/agent/replies
    Capabilities visible to you: <list of MCP tools the platform knows you have>
  </instructions>
  <body>hi</body>
</channel>

Equivalent JSON-shaped payload (poll path, wait_for_message):

{
  "kind": "canvas_user",
  "workspace_id": "...",
  "peer_id": "",
  "method": "message/send",
  "activity_id": "...",
  "ts": "...",
  "instructions": {
    "reply_via": "mcp__molecule__reply_to_workspace",
    "reply_args": {"peer_id": ""},
    "stdout_warning": "User may not be at the terminal — route reply through platform.",
    "docs_url": "https://docs.moleculesai.app/agent/replies",
    "available_tools": ["reply_to_workspace", "send_message_to_user", "inbox_pop", "present_options"]
  },
  "body": "hi"
}
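
A minimal sketch of how a poll-path consumer might decode this payload in Go. The JSON field names come from the example above; the Go type and function names (Push, Instructions, decodePush) are hypothetical and not part of any existing client. The instructions field is a pointer so that payloads without it decode to nil, which is how an older payload would look.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Instructions mirrors the proposed "instructions" object. Field names follow
// the JSON example; the Go type names are illustrative only.
type Instructions struct {
	ReplyVia       string            `json:"reply_via"`
	ReplyArgs      map[string]string `json:"reply_args"`
	StdoutWarning  string            `json:"stdout_warning"`
	DocsURL        string            `json:"docs_url"`
	AvailableTools []string          `json:"available_tools"`
}

// Push is one poll-path message. Instructions is a pointer so that payloads
// without the new field decode to nil rather than to an empty struct.
type Push struct {
	Kind         string        `json:"kind"`
	WorkspaceID  string        `json:"workspace_id"`
	PeerID       string        `json:"peer_id"`
	Method       string        `json:"method"`
	ActivityID   string        `json:"activity_id"`
	TS           string        `json:"ts"`
	Instructions *Instructions `json:"instructions,omitempty"`
	Body         string        `json:"body"`
}

// decodePush parses a single poll-path payload.
func decodePush(raw []byte) (Push, error) {
	var p Push
	err := json.Unmarshal(raw, &p)
	return p, err
}

func main() {
	raw := []byte(`{"kind":"canvas_user","peer_id":"","body":"hi",
	  "instructions":{"reply_via":"mcp__molecule__reply_to_workspace",
	  "reply_args":{"peer_id":""},"available_tools":["reply_to_workspace"]}}`)
	p, err := decodePush(raw)
	if err != nil {
		panic(err)
	}
	if p.Instructions != nil {
		fmt.Println("reply via:", p.Instructions.ReplyVia)
	} else {
		fmt.Println("no instructions; fall back to system-prompt rules")
	}
}
```

A consumer that never looks at Instructions behaves exactly as before, which is the backwards-compatibility property the RFC relies on.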

Where it's rendered

Server-side, in the same handler that emits the push tag today. Workspace-templated:

  • Each workspace registers an instruction_template_id at create time (default: claude-code-default, codex-default, etc., based on runtime).
  • Server stores templates in a new instruction_templates table keyed by template_id.
  • On every outbound push, the server renders the template with {kind, peer_id, available_tools} substitutions and prepends/embeds it in the payload.

Two implementation choices

| Choice | How it surfaces | Pros | Cons |
|---|---|---|---|
| A. Wrap body with <instructions> block | Preamble visible in the channel tag XML | No new field; agents who already parse the channel tag see it for free | Changes the visible body shape; may surprise existing prompts |
| B. Add instructions JSON field | Sibling to other attributes, agents must explicitly look at it | Cleanly separated; no body mutation | Existing agents won't read it unless they update; graceful but slow rollout |

Recommend B, with a 2-week shadow period that emits both shapes; after that, drop option A (it was never shipped on its own).

API shape (new)

Workspace template registration (admin API)

POST /cp/admin/workspace/{workspace_id}/instruction-template
Body: { "template_id": "claude-code-default" } | { "inline_template": "..." }
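
One way the handler could validate this body, sketched in Go. It assumes the "|" in the body description means exactly one of the two fields must be set; the RFC does not say what happens when both or neither are present, so the rejection rule here is an assumption. The names templateRequest and parseTemplateRequest are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// templateRequest models the two body shapes from the endpoint description.
type templateRequest struct {
	TemplateID     string `json:"template_id"`
	InlineTemplate string `json:"inline_template"`
}

var errExactlyOne = errors.New("exactly one of template_id or inline_template is required")

// parseTemplateRequest decodes the body and enforces the either/or rule.
func parseTemplateRequest(body []byte) (templateRequest, error) {
	var req templateRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return req, err
	}
	hasID := req.TemplateID != ""
	hasInline := req.InlineTemplate != ""
	if hasID == hasInline { // both set, or both empty
		return req, errExactlyOne
	}
	return req, nil
}

func main() {
	bodies := []string{
		`{"template_id":"claude-code-default"}`,
		`{"inline_template":"Reply via {{.ReplyTool}}"}`,
		`{}`,
	}
	for _, body := range bodies {
		_, err := parseTemplateRequest([]byte(body))
		fmt.Printf("%s -> err=%v\n", body, err)
	}
}
```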

Template variables

| Var | Source |
|---|---|
| {kind} | from inbound message |
| {peer_id} | from inbound message (empty for canvas_user) |
| {reply_tool} | hardcoded per template |
| {available_tools} | resolved from agent_card MCP capabilities at push time |
| {docs_url} | hardcoded per template |
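
A rough sketch of rendering these variables with Go's stdlib text/template, which the open questions below recommend. Note that text/template uses {{.Kind}} rather than the {kind} shorthand in the table. The template text, the renderVars struct, and renderPreamble are all hypothetical; the peer id in main is a placeholder.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderVars carries the substitutions listed in the variable table.
type renderVars struct {
	Kind           string
	PeerID         string
	ReplyTool      string
	AvailableTools []string
	DocsURL        string
}

// preambleTmpl is a hypothetical template body. It branches on Kind because
// the peer_id rule differs between canvas_user and peer_agent.
const preambleTmpl = `Reply via {{.ReplyTool}}{{if eq .Kind "peer_agent"}} with peer_id={{.PeerID}}{{else}} and omit peer_id{{end}}.
Do NOT emit text in your local terminal/stdout.
Docs: {{.DocsURL}}
Capabilities visible to you: {{join .AvailableTools ", "}}`

// renderPreamble parses and executes the template for one push. A real
// server would likely parse once at startup and reuse the parsed template.
func renderPreamble(v renderVars) (string, error) {
	t, err := template.New("preamble").
		Funcs(template.FuncMap{"join": strings.Join}).
		Parse(preambleTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, v); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderPreamble(renderVars{
		Kind:           "peer_agent",
		PeerID:         "agent-123", // placeholder peer id
		ReplyTool:      "mcp__molecule__reply_to_workspace",
		AvailableTools: []string{"reply_to_workspace", "send_message_to_user"},
		DocsURL:        "https://docs.moleculesai.app/agent/replies",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Branching on kind inside the template keeps a single template per runtime while still covering both peer_id rules.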

Default templates (ship 3)

  • claude-code-default — covers Claude Code CLI runtime
  • codex-default — covers codex app-server runtime
  • generic-mcp-default — covers everything else (Cursor, custom MCP clients)

Migration / rollout

  1. Phase 1 (this RFC): land the schema + server-side rendering + 3 default templates, gated by env var PUSH_INSTRUCTIONS_ENABLED (default false).
  2. Phase 2: flip to true in staging, observe agents in the agents-team tenant for one week.
  3. Phase 3: production rollout. Existing workspaces without a template assignment get a default inferred from the runtime column, falling back to generic-mcp-default.
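
The gate and the runtime-based default could look roughly like this in Go. This is a sketch under assumptions: the runtime strings ("claude-code", "codex") are guesses at what the runtime column holds, treating an unset or unparsable env value as false is a choice made here, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// pushInstructionsEnabled reads the PUSH_INSTRUCTIONS_ENABLED gate.
// Unset or unparsable values count as false (an assumption matching Phase 1).
func pushInstructionsEnabled() bool {
	on, err := strconv.ParseBool(os.Getenv("PUSH_INSTRUCTIONS_ENABLED"))
	return err == nil && on
}

// defaultTemplateFor maps a workspace runtime to a default template ID.
// The runtime strings are assumed; real values live in the runtime column.
func defaultTemplateFor(runtime string) string {
	switch runtime {
	case "claude-code":
		return "claude-code-default"
	case "codex":
		return "codex-default"
	default:
		return "generic-mcp-default"
	}
}

// templateIDFor prefers an explicit assignment and otherwise infers the
// template from the runtime.
func templateIDFor(assigned, runtime string) string {
	if assigned != "" {
		return assigned
	}
	return defaultTemplateFor(runtime)
}

func main() {
	fmt.Println("enabled:", pushInstructionsEnabled())
	fmt.Println(templateIDFor("", "codex"))
	fmt.Println(templateIDFor("", "cursor"))
	fmt.Println(templateIDFor("my-custom-template", "codex"))
}
```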

Backwards compat

Agents that ignore the new instructions field continue working exactly as today. No-op for them.

Open questions

  1. Token cost — preamble adds ~200 tokens per inbound push. Across many inbox messages this adds up. Worth it? (Yes, IMO — silent-failure cost is higher.) Mitigation: workspace-level instruction_compact=true flag for hot paths.
  2. Templating engine — Go text/template? Liquid? Or just printf-style substitution? Recommend stdlib text/template (already in workspace-server).
  3. Override hierarchy — workspace-level template overrides runtime default, OR workspace template extends runtime default? Recommend extends — workspace adds workspace-specific tools/docs to the runtime base.
  4. Do peer_agent pushes need a different preamble than canvas_user? — yes, the peer_id requirement is different. Template branches on kind.
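
To make the "extends" option in question 3 concrete, here is a small Go sketch of layering workspace additions over a runtime base. Everything in it is hypothetical (the instructionSet type, extend, appendUnique), and it assumes the workspace layer may only add tools and docs, never replace the base reply tool; the RFC leaves that detail open.

```go
package main

import "fmt"

// instructionSet is a hypothetical in-memory form of a template's contents.
type instructionSet struct {
	ReplyVia string
	DocsURLs []string
	Tools    []string
}

// extend layers workspace-specific additions on top of the runtime base.
// The base reply tool is kept; workspace tools and docs are appended
// without duplicates. Neither input is modified.
func extend(base, ws instructionSet) instructionSet {
	out := instructionSet{
		ReplyVia: base.ReplyVia,
		DocsURLs: appendUnique(nil, base.DocsURLs...),
		Tools:    appendUnique(nil, base.Tools...),
	}
	out.DocsURLs = appendUnique(out.DocsURLs, ws.DocsURLs...)
	out.Tools = appendUnique(out.Tools, ws.Tools...)
	return out
}

// appendUnique appends items to dst, skipping values already present.
func appendUnique(dst []string, items ...string) []string {
	seen := make(map[string]bool, len(dst))
	for _, d := range dst {
		seen[d] = true
	}
	for _, it := range items {
		if !seen[it] {
			seen[it] = true
			dst = append(dst, it)
		}
	}
	return dst
}

func main() {
	base := instructionSet{
		ReplyVia: "mcp__molecule__reply_to_workspace",
		DocsURLs: []string{"https://docs.moleculesai.app/agent/replies"},
		Tools:    []string{"reply_to_workspace", "send_message_to_user"},
	}
	ws := instructionSet{Tools: []string{"send_message_to_user", "present_options"}}
	merged := extend(base, ws)
	fmt.Println(merged.ReplyVia, merged.Tools)
}
```

With a pure override, the workspace template would have to restate everything in the runtime base; with this kind of merge it only lists what it adds.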

Scope guardrails

  • Out of scope: changing the existing push transport, WebSocket protocol, or activity_logs schema.
  • Out of scope: rewriting agent system prompts — the preamble augments, doesn't replace.
  • Out of scope: per-message agent intelligence — preamble is workspace-templated, same for every message in a workspace.

Pairs with

  • Push attachments regression (task #16, #1675) — once attachments are correctly projected, the preamble can include "an attachments=[] array is present on this push; fetch via uri" hint.
  • RFC#645 (present_options MCP) — preamble would advertise present_options as a capability when the agent has it, encouraging button-style replies over open-ended text.
  • RFC #637 (canvas-user identity capture) — preamble could surface "this message is from user_id=X" once that lands.

Cost estimate

  • Schema migration + server-side render path: ~2 days
  • 3 default templates: ~0.5 day
  • Admin API for template assignment: ~1 day
  • Tests + canary in agents-team: ~1 day
  • Total: ~1 week of one engineer
Author
Owner

Phase-2 implementation pointer

Integration site identified: workspace-server/internal/handlers/activity.go:579-609 (the same projection block that already does peer_info and attachments enrichment per Layers 1/2/3).

Cleanest path is to extend the existing ?include= query parameter pattern:

GET /workspaces/{id}/activities?include=peer_info,instructions

Add a new conditional block parallel to if includePeerInfo { ... }:

if includeInstructions {
    if tmpl := h.instructionTemplate(workspaceID, kind, methodFromRow); tmpl != nil {
        entry["instructions"] = tmpl  // {reply_via, reply_args, stdout_warning, docs_url, available_tools}
    }
}
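
For context, the includeInstructions flag used above could be derived from the query string roughly as follows. This is a self-contained sketch, not the handler's actual code: parseInclude is a hypothetical helper, the workspace id in the URL is a placeholder, and the value attached to the entry stands in for whatever h.instructionTemplate would return.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseInclude splits a comma-separated include value into a set,
// ignoring surrounding whitespace and empty items.
func parseInclude(raw string) map[string]bool {
	set := make(map[string]bool)
	for _, part := range strings.Split(raw, ",") {
		if name := strings.TrimSpace(part); name != "" {
			set[name] = true
		}
	}
	return set
}

func main() {
	u, err := url.Parse("/workspaces/ws-1/activities?include=peer_info,instructions")
	if err != nil {
		panic(err)
	}
	include := parseInclude(u.Query().Get("include"))

	entry := map[string]any{"kind": "canvas_user"}
	if include["instructions"] {
		// Placeholder value; the real handler would attach the rendered template.
		entry["instructions"] = map[string]string{
			"reply_via": "mcp__molecule__reply_to_workspace",
		}
	}
	fmt.Println(include["peer_info"], include["instructions"], len(entry))
}
```

Because the field is opt-in per request, callers that do not ask for instructions get the same response shape as today.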

Three companion changes needed:

  1. New instruction_templates table + workspaces.instruction_template_id column (migration)
  2. Default templates seeded for claude-code, codex, generic-mcp runtimes
  3. molecule-channel plugin (client-side) updated to render the instructions field as part of the <channel> synthetic tag

Phase 2 (this layer) only touches the server projection. Plugin change is Phase 3 and gates on workspaces having templates assigned (defaults inferred from workspaces.runtime).

Coordination

Kimi (agent-dev-a) currently at capacity (CI runner investigation queued). Once cleared, well-suited to take Phase 2. MM (agent-dev-b) also available.

CTO authorization captured this session — proceed with workspaces.runtime → template_id inference; do NOT require admin to explicitly assign templates per workspace for v1.

Member

RCA: root cause

The push-instruction RFC is addressing a real split-brain: reply-routing rules exist as install docs/templates, but they are not attached to every inbound push as machine-readable runtime instructions. Agents that do not already know the convention can still receive a canvas/peer event without an inline contract telling them which MCP tool to call back through.

Evidence

  • workspace-server/internal/handlers/external_connection.go:258 - Claude channel docs say inbound A2A becomes synthetic <channel ...> tags and replies route through MCP tools.
  • workspace-server/internal/handlers/external_connection.go:262 - multi-workspace replies require _as_workspace, but that is documentation text, not per-message payload.
  • workspace-server/internal/handlers/external_connection.go:597 - codex bridge daemon resumes sessions and routes replies through send_message_to_user / delegate_task, again in operator template text.
  • workspace-server/internal/handlers/a2a_proxy_helpers.go:615 - poll-mode delivery persists only the inbound activity_logs row; no instruction-template metadata is added to that queued message.

Suggested fix

Implement the instruction layer at the push/queue boundary in molecule-core rather than in each runtime template. The responsible area is workspace-server/internal/handlers/a2a_proxy_helpers.go plus the push/channel emitters and external-connection templates: store/render a compact instructions object with reply_tool, workspace routing, and available capabilities, then include it in both push-mode and poll-mode messages. Keep existing docs as fallback, but make the runtime-visible message self-describing.

Confidence

Medium: the cited code shows the convention lives in docs/templates and queued activity rows, but I did not trace every websocket/channel emitter in this tick.

Reference: molecule-ai/molecule-core#1830