forked from molecule-ai/molecule-core
Merge pull request #2512 from Molecule-AI/feat/register-codex-runtime
feat: register codex runtime + runtime native-MCP design docs
commit 35cb6ba089

docs/integrations/codex-app-server-adapter-design.md (new file, 360 lines)
@@ -0,0 +1,360 @@
# Codex CLI workspace adapter — app-server design

**Status:** Design draft — pre-implementation
**Owner:** Molecule AI (hongmingwang@moleculesai.app)
**Date:** 2026-05-02
**Codex version validated against:** `codex-cli 0.72.0`
**Related:** `docs/integrations/hermes-platform-plugins-upstream-pr.md`,
`molecule-ai-workspace-template-openclaw/packages/openclaw-channel-plugin/`

---
## Goal

Add a Molecule workspace template for the OpenAI Codex CLI runtime
(`@openai/codex` v0.72+). The template should give Codex agents the
same A2A inbox + mid-session push behavior the other supported
runtimes have:

- **claude-code:** MCP `notifications/claude/channel`
- **OpenClaw:** channel-plugin webhook into the gateway kernel
- **hermes:** `BasePlatformAdapter` (pending upstream PR; polling fallback today)
- **codex (this design):** persistent `codex app-server` stdio JSON-RPC
  client; A2A messages become `turn/start` calls against a long-lived
  thread

Today there is no codex template. The legacy fallback registry entry
at `workspace-server/internal/handlers/runtime_registry.go:83` exists
only to keep old workspaces from crashing — there is no live adapter,
no Dockerfile, nothing in `manifest.json`. This design covers the
fresh build.

---
## Architecture decision: app-server, not `codex exec`

`codex exec --json` is the obvious shape — one CLI subprocess per
A2A message, the same anti-pattern OpenClaw used to have and that we are
replacing. It loses session continuity (no shared thread), pays
process-spawn cost on every turn, and gives no path to mid-turn
interruption.

`codex app-server` is a long-running JSON-RPC server over stdio that
holds thread state in memory. The v2 protocol (validated below) gives
us:

- `thread/start` → returns `threadId`
- `turn/start` → input array, threadId required → returns `turnId`
- `turn/interrupt` → cancel a running turn by `(threadId, turnId)`
- Server-pushed notifications: `agent_message_delta`, `turn/started`,
  `turn/completed`, `reasoning_text_delta`,
  `command_execution_output_delta`, `mcp_tool_call_progress`,
  `error_notification`, etc.

A persistent app-server child plus a small async stdio reader gives us
session continuity AND mid-turn injection. Same dual-win shape we got
from migrating OpenClaw away from `openclaw agent`.

### Why not v1?

v1 of the protocol exposes `newConversation` + `sendUserMessage` /
`sendUserTurn` (one-shot per message, no streaming notifications). v2
introduces threads + turns + delta notifications. v2 is the
forward-looking surface; we build against v2 from the start.

---
## RPC sequence

### 1. Boot

```
adapter spawn ▶ codex app-server (stdio NDJSON)
        ◀ ready (process up)
adapter ▶ {"jsonrpc":"2.0","id":1,"method":"initialize",
           "params":{"clientInfo":{"name":"molecule-runtime","version":"…"}}}
adapter ◀ {"id":1,"result":{"userAgent":"codex_cli_rs/0.72.0 …"}}
```

Validated 2026-05-02 against the installed binary — NDJSON framing and
initialize work as shown.
### 2. Thread per workspace session

```
adapter ▶ thread/start
          params: {model, sandboxPolicy, approvalPolicy, cwd,
                   baseInstructions, developerInstructions, …}
adapter ◀ {result: {thread: {threadId: "th_…"}}}
```

`threadId` is cached on the adapter for the workspace's lifetime. On
adapter restart we use `thread/resume` against the persisted ID
(written to disk under `~/.codex/sessions/` by codex itself, but we
also keep our own pointer in workspace state for fast restore).
### 3. A2A message → turn/start

For each inbound A2A message:

```
adapter ▶ turn/start
          params: {threadId, input: [{type:"text", text:"…"}], …}
adapter ◀ {result: {turn: {turnId: "tu_…"}}}

(server pushes notifications)
adapter ◀ turn/started
adapter ◀ agent_message_delta (text chunk)
adapter ◀ agent_message_delta (text chunk)
…
adapter ◀ turn/completed
```

The adapter accumulates `agent_message_delta` chunks into a buffer
keyed by `turnId`, emits them onto the A2A response queue (streamed if
the molecule-runtime contract supports streaming, otherwise assembled
into a single final message on `turn/completed`).
### 4. Mid-turn injection — the load-bearing case

**Default policy: per-thread serialization.** If a turn is already
running when a second A2A message arrives, queue the new message and
fire `turn/start` once the current `turn/completed` lands. This
matches OpenClaw's per-chat sequentializer behavior — the A2A peer
sees their messages handled in order, and we don't need
`turn/interrupt` for the common case.

**Opt-in policy: interrupt-and-rerun.** For workspaces that prefer
"latest message wins" semantics (rare; configurable), the adapter
fires `turn/interrupt` with `(threadId, currentTurnId)`, waits for
`turn/completed` (with cancelled status), then fires `turn/start` with
the combined context: previous user message + agent's partial response
so far + new message, so the agent has full context of what got
interrupted. Off by default.
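A sketch of the interrupt-and-rerun path under these assumptions; `app_server.request` follows the executor skeleton later in this doc, while the helper names here (`build_interrupt_context`, `turn_done`) are illustrative, not a fixed API:

```python
import asyncio


def build_interrupt_context(last_prompt: str, partial_chunks: list[str],
                            new_message: str) -> str:
    """Assemble the rerun prompt: previous user message + the agent's
    partial response so far + the newly arrived message."""
    return (
        "[previous turn was interrupted]\n"
        f"Previous user message:\n{last_prompt}\n\n"
        f"Your partial response so far:\n{''.join(partial_chunks)}\n\n"
        f"New message:\n{new_message}"
    )


async def interrupt_and_rerun(app_server, thread_id: str, turn_id: str,
                              turn_done: asyncio.Event, context: str) -> None:
    # Cancel the running turn, wait for its (cancelled) turn/completed,
    # then start a fresh turn carrying the combined context.
    await app_server.request("turn/interrupt",
                             {"threadId": thread_id, "turnId": turn_id})
    await turn_done.wait()
    await app_server.request("turn/start", {
        "threadId": thread_id,
        "input": [{"type": "text", "text": context}],
    })
```

The context-assembly half is pure string work and easy to unit-test; the RPC half only touches the two methods validated above.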

### 5. Shutdown

```
adapter ▶ {"method":"shutdown"} (if v2 exposes one; otherwise SIGTERM)
adapter ▶ close stdio
adapter ▶ wait(child, timeout=5s); on timeout SIGKILL
```
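That escalation ladder, sketched with `asyncio`'s subprocess API — assuming app-server exits cleanly on stdin EOF; if v2 turns out to expose an explicit `shutdown` request, it goes first:

```python
import asyncio
import contextlib


async def shutdown_app_server(proc: asyncio.subprocess.Process,
                              timeout: float = 5.0) -> None:
    """Close stdio, then escalate: wait for clean exit, SIGTERM on
    timeout, SIGKILL as a last resort."""
    if proc.stdin is not None:
        proc.stdin.close()          # EOF lets the child exit cleanly
    try:
        await asyncio.wait_for(proc.wait(), timeout)
        return
    except asyncio.TimeoutError:
        with contextlib.suppress(ProcessLookupError):
            proc.terminate()        # SIGTERM
    try:
        await asyncio.wait_for(proc.wait(), timeout)
    except asyncio.TimeoutError:
        with contextlib.suppress(ProcessLookupError):
            proc.kill()             # SIGKILL
        await proc.wait()
```

The `ProcessLookupError` suppression covers the race where the child exits between the timeout firing and the signal landing.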

---
## File layout (new template repo)

```
molecule-ai-workspace-template-codex/
├── adapter.py        # BaseAdapter shell, thin (~50 LOC)
├── executor.py       # CodexAppServerExecutor — the RPC client (~300 LOC)
├── app_server.py     # AppServerProcess — stdio child + NDJSON reader (~150 LOC)
├── config.yaml
├── Dockerfile        # node:20 + npm i -g @openai/codex@0.72
├── start.sh          # boots adapter; codex app-server is spawned per session by executor
├── requirements.txt
├── README.md
└── tests/
    ├── test_app_server.py  # mocks stdio; tests framing, request/notification dispatch
    └── test_executor.py    # mocks AppServerProcess; tests turn lifecycle, interrupt
```

Modeled on the hermes template (which is the closest existing shape:
adapter.py + executor.py separation; daemon proxy via local IPC). The
extra `app_server.py` exists because the JSON-RPC client + child
process management is non-trivial enough to warrant its own module
with its own tests.

---
## Executor skeleton

```python
# executor.py — A2A → codex app-server bridge
import asyncio
# (plus project imports: AgentExecutor, RequestContext, EventQueue,
#  AdapterConfig, new_agent_text_message, extract_message_text,
#  AppServerProcess, MOLECULE_RUNTIME_VERSION, _TURN_TIMEOUT)


class CodexAppServerExecutor(AgentExecutor):
    """Holds one app-server child + thread, dispatches A2A turns as turn/start RPCs."""

    def __init__(self, config: AdapterConfig):
        self._config = config
        self._app_server: AppServerProcess | None = None
        self._thread_id: str | None = None
        self._current_turn_id: str | None = None  # set while a turn is running
        self._turn_lock = asyncio.Lock()  # serialize per-thread by default

    async def _ensure_thread(self) -> str:
        if self._app_server is None:
            self._app_server = await AppServerProcess.start()
            await self._app_server.initialize(client_info={
                "name": "molecule-runtime",
                "version": MOLECULE_RUNTIME_VERSION,
            })
        if self._thread_id is None:
            resp = await self._app_server.request("thread/start", {
                "model": self._config.model or None,
                "developerInstructions": self._config.system_prompt or None,
                # other policy fields (sandbox, approval) — Molecule defaults
            })
            self._thread_id = resp["thread"]["threadId"]
        return self._thread_id

    async def execute(self, context: RequestContext, event_queue: EventQueue) -> None:
        prompt = extract_message_text(context.message) or ""
        if not prompt.strip():
            await event_queue.enqueue_event(new_agent_text_message("(empty prompt)"))
            return

        async with self._turn_lock:  # per-thread serialization
            thread_id = await self._ensure_thread()

            # Subscribe to delta notifications BEFORE starting the turn so we
            # don't race the first agent_message_delta.
            buffer: list[str] = []
            done = asyncio.Event()
            error: Exception | None = None

            def on_notification(method: str, params: dict) -> None:
                nonlocal error
                if method == "agent_message_delta":
                    buffer.append(params.get("delta", ""))
                elif method == "turn/completed":
                    done.set()
                elif method == "error_notification":
                    error = RuntimeError(params.get("message", "unknown app-server error"))
                    done.set()

            unsub = self._app_server.subscribe(on_notification)
            try:
                resp = await self._app_server.request("turn/start", {
                    "threadId": thread_id,
                    "input": [{"type": "text", "text": prompt}],
                })
                # Record the running turn so cancel() can interrupt it.
                self._current_turn_id = resp["turn"]["turnId"]
                await asyncio.wait_for(done.wait(), timeout=_TURN_TIMEOUT)
            finally:
                self._current_turn_id = None
                unsub()

            if error:
                await event_queue.enqueue_event(
                    new_agent_text_message(f"[codex error] {error}"))
                return
            await event_queue.enqueue_event(new_agent_text_message("".join(buffer)))

    async def cancel(self, context: RequestContext, event_queue: EventQueue) -> None:
        # When the molecule-runtime cancels a request, fire turn/interrupt
        # against the currently-running turn. Best-effort — racing
        # turn/completed is fine, app-server returns a noop in that case.
        if self._app_server and self._thread_id and self._current_turn_id:
            await self._app_server.request("turn/interrupt", {
                "threadId": self._thread_id,
                "turnId": self._current_turn_id,
            })
```

The `AppServerProcess` class encapsulates: stdio child management,
NDJSON line reader/writer, request-id correlation, notification
subscriber registry, and graceful shutdown. Standard async stdio
JSON-RPC client — nothing exotic.

---
## Open questions to resolve before implementation

1. **MoleculeRuntime streaming contract.** Does our A2A executor
   contract support emitting incremental events (so the user sees
   partial responses as the agent streams), or do we always assemble
   on `turn/completed`? If streaming is supported, we want to forward
   each `agent_message_delta` as an A2A event for parity with hermes
   gateway streaming. (Cross-reference: the hermes adapter currently
   doesn't stream either — `executor.py:122` sets `stream=False` —
   so non-streaming is the safe v1 baseline.)
2. **Sandbox policy default.** Codex defaults to `read-only` for safety
   in CLI mode; for workspace use we need write access to the
   workspace tree. Pick a sensible default in `thread/start` —
   probably `workspace-write` scoped to the workspace cwd.
3. **Approval policy default.** Codex's `--ask-for-approval` modes are
   `untrusted`, `on-failure`, and `never`. Workspace agents need
   `never` (they can't prompt a human). Confirm this is exposed via
   `approvalPolicy` in `thread/start`.
4. **Auth — login flow.** Codex supports `login api-key` (env
   `OPENAI_API_KEY`) and `login chatgpt` (interactive OAuth). For
   workspace use we mandate the API key. Document this in the template's
   README and surface it as a required env in config.yaml.
5. **MCP server passthrough.** Codex's own `mcp_servers` config lets
   the agent call out to MCP servers as a CLIENT. Should the workspace
   adapter automatically wire `~/.codex/config.toml` so the agent can
   reach the molecule MCP server (chat_history, recall_memory,
   delegate_task)? Almost certainly yes — but verify the env-var
   substitution pattern works in TOML.
6. **Thread persistence across workspace restarts.** Codex stores
   sessions on disk under `~/.codex/sessions/`. The adapter should
   persist the threadId in workspace state so a restart resumes the
   thread (`thread/resume`) rather than starting fresh. This matches
   the existing molecule-runtime convention for session continuity.
7. **Token usage / cost reporting.** v2 emits
   `ThreadTokenUsageUpdatedNotification`. Plumb this into our usage
   tracking — the same path the other runtimes use.
8. **MCP push notifications inbound.** Earlier research established
   that codex's own MCP server mode does NOT support
   `notifications/*` for push. So the path for unsolicited mid-session
   A2A messages is NOT "codex's MCP client receives notifications from
   our MCP server" — it's "molecule-runtime polls the inbox via
   `wait_for_message`, and on each polled message fires `turn/start`
   on the existing thread." The "MCP native" framing here is satisfied
   not by codex receiving MCP push, but by the persistent thread +
   turn/start delivering the same UX (session continuity + queued or
   interrupted handling of new messages mid-thread).

---
## Why this design satisfies "MCP native push parity"

User goal: every runtime delivers A2A inbox messages with the same
quality of experience as claude-code's MCP `notifications/claude/channel`.

claude-code path: the MCP server pushes a notification → the claude-code
SDK injects a synthetic user turn into the running session.

Codex path: molecule-runtime polls the inbox (universal poll path) →
the adapter fires `turn/start` on the existing app-server thread → codex
processes the message in-thread with full context. The "push" happens
at the molecule-runtime ↔ adapter boundary; the "native" part is that
codex's own session model handles it as an in-thread turn, not as a
fresh subprocess.

For mid-turn arrivals: per-thread serialization (or the opt-in
interrupt) gives us behavior equivalent to OpenClaw's per-chat
sequentializer. In practice this is equivalent UX to claude-code's
mid-session notification injection — one is a kernel-level interrupt,
the other is queue-then-dispatch, but the user-visible behavior
("the agent processes my message after the current turn finishes") is
identical.

---
## Sequencing

This is post-demo work. Order:

1. **Spec the executor lifecycle** — pin down the open questions
   above (especially #1 streaming, #5 MCP passthrough, #6 thread
   persistence) before any code lands.
2. **Implement `AppServerProcess`** with thorough unit tests against a
   mock stdio. This is the riskiest module (concurrency around
   request-id correlation + notification dispatch); land it first
   with high coverage.
3. **Implement `CodexAppServerExecutor`** on top.
4. **Build the template repo skeleton** (Dockerfile, config.yaml,
   start.sh, README) once the Python side runs locally.
5. **Add codex to `manifest.json`** and the runtime registry.
6. **End-to-end verify** per `feedback_close_on_user_visible_not_merge`
   — boot a real workspace, send A2A messages, observe streamed
   responses + thread continuity + queued mid-turn handling.

Estimated total: 3-5 engineering days for v1, plus E2E hardening.

docs/integrations/hermes-platform-plugins-upstream-pr.md (new file, 191 lines)
@@ -0,0 +1,191 @@
# Upstream PR draft: Pluggable platform adapters for hermes-agent

**Status:** Draft — pre-submission review
**Target repo:** `NousResearch/hermes-agent`
**Owner:** Molecule AI (hongmingwang@moleculesai.app)
**Date drafted:** 2026-05-02

---

## Why this draft exists

Molecule needs to deliver A2A inbox messages to a hermes-hosted agent the same way Telegram messages reach it today — through `_handle_message`, with `set_busy_session_handler` semantics for mid-turn arrivals. Today this requires forking `gateway/run.py` because the platform adapter system is closed (`_create_adapter` is a hardcoded if/elif chain at lines 2424-2578).

But hermes already ships a working plugin discovery system for memory backends (`plugins/memory/__init__.py`). Extending the same pattern to platforms is a small, symmetric change — not novel architecture. This draft documents the proposed upstream PR before we open it, so we can iterate locally on tone, scope, and code shape.

---
## Proposed PR title

> Pluggable platform adapters via `plugins/platforms/` discovery

(Mirrors the existing `plugins/memory/` shape so the title alone signals "this is the same pattern, just for the other subsystem.")

---
## PR body

### Problem

Hermes ships 19 in-tree platform adapters (Telegram, Discord, WhatsApp, Slack, Signal, Mattermost, Matrix, Email, SMS, DingTalk, Feishu, WeCom variants, Weixin, BlueBubbles, QQBot, HomeAssistant, API server, Webhook). Each is wired by editing two files:

- `gateway/config.py:48-69` — append a `Platform` enum value
- `gateway/run.py:2424-2578` — append an `elif platform == Platform.X:` branch in `_create_adapter()`

For platforms with broad demand (Telegram, Slack, etc.) this is fine: the maintenance load lives upstream, every user benefits. For platforms with narrow but real demand — enterprise-internal channels (Rocket.Chat, RingCentral, Zulip), agent-to-agent inbox protocols (e.g. Molecule's A2A), niche regional platforms, or experimental transports — the only path today is forking `gateway/run.py`. Forks drift, defeat the purpose of an OSS gateway, and discourage contribution back upstream.

### Prior art (already in hermes)

The memory subsystem solved exactly this problem at `plugins/memory/__init__.py`:

1. **Two-tier discovery** — bundled providers in `plugins/memory/<name>/` plus user-installed providers in `$HERMES_HOME/plugins/<name>/`. Bundled wins on name collision.
2. **`register(ctx)` collector pattern** (`plugins/memory/__init__.py:264-305`) — a plugin's `__init__.py` exposes a `register(ctx)` function; `ctx` already supports `register_memory_provider`, `register_tool`, `register_hook`, `register_cli_command`.
3. **`plugin.yaml` manifest** for description and metadata.
4. **Config-driven activation** (`memory.provider: honcho` selects which provider loads).

Adding `register_platform_adapter` to the same collector and a `plugins/platforms/` discovery directory extends this pattern symmetrically.
### Proposal

**Four small changes:**

1. **New collector method** in `plugins/memory/__init__.py:_ProviderCollector` (or a new shared `plugins/_collector.py` if maintainers prefer cleaner separation):

```python
def register_platform_adapter(self, name: str, adapter_class: type, requirements_check=None):
    """Register a platform adapter loadable as a plugin.

    name: unique platform identifier (matches gateway.platforms.<name> in config)
    adapter_class: subclass of BasePlatformAdapter
    requirements_check: optional callable returning bool — same shape as
        existing check_telegram_requirements() etc.
    """
    self.platform_adapters[name] = (adapter_class, requirements_check)
```

2. **New `plugins/platforms/__init__.py`** mirroring `plugins/memory/__init__.py` — `discover_platform_adapters()`, `load_platform_adapter(name)`, two-tier (bundled + `$HERMES_HOME/plugins/`) discovery.
3. **`_create_adapter()` fallback** at `gateway/run.py:2578` — after the in-tree if/elif chain returns None, attempt plugin lookup:

```python
# Existing in-tree adapters checked first (precedence preserved).
# If no match, fall through to plugin discovery.
from plugins.platforms import load_platform_adapter

plugin_entry = load_platform_adapter(platform.value)
if plugin_entry:
    adapter_class, req_check = plugin_entry
    if req_check and not req_check():
        logger.warning(f"{platform.value}: plugin requirements not met")
        return None
    return adapter_class(config)
return None
```

4. **`Platform` enum becomes open-set.** Today it's `Enum`; switch to a string-backed pattern that accepts unknown values (still validated against the union of in-tree + discovered plugins at config-load time):

```python
# gateway/config.py — replace Enum with frozen dataclass + dynamic registry.
# Keeps the in-tree values as module-level singletons for backward compat:
# Platform.TELEGRAM still works as today.
```

This is the only "shape change" in the PR. Backward compat is straightforward: every existing `Platform.TELEGRAM` reference continues to work because the module exports the same names.
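One possible shape for that refactor, sketched under the constraint that downstream code keeps using `Platform.TELEGRAM`, `==`, and dict keys unchanged — the names here are illustrative, not the final design:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """String-backed, open-set replacement for the Platform enum.
    frozen=True keeps instances hashable with equality by value."""
    value: str


_REGISTRY: dict[str, Platform] = {}


def register_platform_value(name: str) -> Platform:
    return _REGISTRY.setdefault(name, Platform(name))


def parse_platform(name: str, plugin_names: set[str]) -> Platform:
    # Config-load validation: accept in-tree values plus discovered
    # plugin names; reject anything else.
    if name in _REGISTRY or name in plugin_names:
        return register_platform_value(name)
    raise ValueError(f"unknown platform: {name}")


# In-tree values stay reachable exactly as before:
Platform.TELEGRAM = register_platform_value("telegram")
Platform.DISCORD = register_platform_value("discord")
# … and so on for the other in-tree adapters.
```

Attaching the singletons as class attributes keeps `Platform.TELEGRAM` working verbatim; the frozen dataclass preserves the hashability and value-equality the enum provided.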

### Backward compatibility

- All 19 in-tree adapters keep their hardcoded path in `_create_adapter()` (precedence: in-tree wins on name collision, exactly like memory plugins).
- Existing config files (`gateway.platforms.telegram.enabled: true`) continue to work unchanged.
- No new mandatory config keys.
- Plugin discovery only runs if the platform name doesn't match an in-tree value, so cold-start cost is zero for users who don't use plugins.
- Fork-then-add-platform users can migrate to plugins at their own pace; the in-tree path isn't deprecated.
### Test plan

- **Unit**: discovery scans both bundled and user dirs, respects precedence.
- **Unit**: `_create_adapter()` falls through to plugin lookup only when in-tree doesn't match.
- **Integration**: ship a minimal `plugins/platforms/example/` in-tree (read-only, returns canned messages) so CI exercises the full plugin code path. Same approach `plugins/memory/holographic/` takes today.
- **Manual**: Molecule will publish `hermes-platform-molecule-a2a` as the first external consumer once this lands.

### Documentation

- Extend `CONTRIBUTING.md`'s "Should it be a Skill or a Tool?" section with "Should it be a Platform Plugin or an in-tree Platform?" — same shape, same decision tree.
- Add `plugins/platforms/README.md` mirroring `plugins/memory/`'s convention.

### Out of scope (intentionally)

- **Setuptools `entry_points`** — could be added later as a third discovery tier (after bundled + `$HERMES_HOME/plugins/`). Skipping for v1 because directory-based discovery already covers the demand and matches the memory pattern. Adding entry_points is a non-breaking extension.
- **Hot-reload** — plugins are discovered at gateway boot, with no live re-scan. Matches memory plugins.
- **Sandboxing** — plugins run with full hermes process privileges. Same trust model as memory plugins; documented in the new README.
### Reference consumer

Molecule AI will ship `hermes-platform-molecule-a2a` as the first external consumer. Use case: deliver agent-to-agent inbox messages (from peer agents authenticated at the platform layer, not the Telegram-user level) into the same `_handle_message` dispatch Telegram uses, with `internal=True` events to bypass user-auth. Expected timeline: within 2 weeks of merge.

---
## Open questions for upstream maintainers

Per `CONTRIBUTING.md`, the right channel for design proposals is
**GitHub Discussions**, not Discord (Discord is for "questions,
showcasing projects, and sharing skills" — Discussions is the
documented channel for "design proposals and architecture discussions").

Open a Discussion at `NousResearch/hermes-agent/discussions` titled
"RFC: pluggable platform adapters via `plugins/platforms/`" with the
problem + proposal + open questions before filing the PR. This gives
maintainers space to weigh in on shape before code is in flight.

Open questions to put in the Discussion:

1. **Preferred naming.** `register_platform_adapter` vs `register_platform` vs `register_channel`. Consistency with memory's `register_memory_provider` argues for the long form.
2. **Enum vs string.** Is the maintainer team open to making `Platform` open-set? If not, the fallback design is: keep the enum, add a single `Platform.PLUGIN` sentinel + a `plugin_name` field on `PlatformConfig`. Slightly uglier but smaller blast radius.
3. **Testing.** Should `plugins/platforms/example/` be checked into the repo, or exist as test fixtures only? Memory plugins are real (mem0, honcho, supermemory bundled), so a real example seems consistent.
4. **Discovery ordering.** Confirm the desired precedence: bundled-wins (matches memory) vs user-can-override-bundled (which would let downstream patch a buggy in-tree adapter without forking). The current memory pattern is bundled-wins; we'll match it unless told otherwise.

---
## Effort estimate

- **Code change**: ~150 LOC across `plugins/platforms/__init__.py` (new), `gateway/config.py` (Platform refactor), `gateway/run.py` (10-line fallback in `_create_adapter`), and tests (~50 LOC).
- **Docs**: ~80 LOC across the `CONTRIBUTING.md` extension and the new `plugins/platforms/README.md`.
- **Review cycle**: depends on maintainer responsiveness. The memory plugin system shipped in the v0.5–0.7 era; the platform plugin system would land for v0.11 if accepted.

---
## After this PR lands (Molecule-side follow-up)

1. Publish `hermes-platform-molecule-a2a` (PyPI + `~/.hermes/plugins/molecule-a2a/`).
2. Bump our hermes workspace template to declare `plugins.platforms.molecule_a2a.enabled: true`.
3. Remove the polling shim from `molecule-ai-workspace-template-hermes/adapter.py` once the plugin path is verified end-to-end.

---
## Status checklist (for our own tracking)

Per the user's gating: "if the plugin works locally in our docker setup
and e2e testing works, yes [submit]". Validation prerequisites:

- [ ] Build a working `plugins/platforms/molecule_a2a/` plugin against
      a forked hermes-agent with the proposed change applied
- [ ] Bake the forked hermes + plugin into a local copy of our
      `molecule-ai-workspace-template-hermes` Docker image
- [ ] E2E: boot the local image, send A2A messages from a peer agent,
      observe `_handle_message` dispatch + reply through the A2A queue
- [ ] Confirm the `Platform` enum refactor doesn't break downstream — grep
      for `Platform.X` usages across hermes
- [ ] Confirm `$HERMES_HOME` is the right user-plugin root for
      platforms (matches memory convention)
- [ ] Open a GitHub Discussion at
      `NousResearch/hermes-agent/discussions` titled
      "RFC: pluggable platform adapters via plugins/platforms/" with
      design + open questions; wait for maintainer feedback
- [ ] Branch name: `feat/pluggable-platform-adapters` per
      CONTRIBUTING.md branch convention
- [ ] Commit prefix: `feat(gateway): pluggable platform adapters
      via plugins/platforms/` per Conventional Commits + scope `gateway`
- [ ] PR description covers what/why + how-to-test + platforms tested,
      per CONTRIBUTING.md PR-description requirements
- [ ] Open PR against `NousResearch/hermes-agent` main once the
      Discussion lands consensus
- [ ] Track the PR; bump cadence weekly; if stalled past 4 weeks,
      propose fork-and-bundle as a fallback for our hermes template image

docs/integrations/runtime-native-mcp-status.md (new file, 162 lines)
@@ -0,0 +1,162 @@
# Runtime native-MCP push parity — status

**Goal:** every workspace runtime delivers Molecule A2A inbox messages
with the same UX as claude-code's MCP `notifications/claude/channel`
push: session continuity + queued or interrupted handling of new
messages mid-thread, no fresh subprocess per message.

Tracked across four runtime streams. Updated 2026-05-02.

---
## claude-code

**Status:** ✅ Done. Native MCP `notifications/claude/channel` push
shipped via `workspace/a2a_mcp_server.py`. Requires the host to launch
with `--dangerously-load-development-channels server:molecule`.

No further work.

---
## OpenClaw

**Status:** Scaffolded; awaiting validation + companion adapter rewrite.

**Path:** Channel-plugin SDK (`openclaw/plugin-sdk`), auto-discovered
from `~/.openclaw/plugins/<name>/` or workspace `.openclaw/`. The plugin
registers an HTTP webhook on `openclaw gateway`; the Molecule workspace
adapter POSTs A2A messages to it; the gateway dispatches through the same
`dispatchReplyWithBufferedBlockDispatcher` kernel call that the native
channels (Telegram, Lark, Slack, Discord) use.

**Artifacts landed:**
- `molecule-ai-workspace-template-openclaw/packages/openclaw-channel-plugin/`
  - `package.json`, `openclaw.plugin.json` (manifest), `index.ts`
    (channel + webhook handler), `README.md`, `tsconfig.json`
- Pre-release `v0.1.0-pre`. Mirrors the `rabbit-lark-bot` reference
  plugin shape.
**Remaining (task #84, #87):**
1. Validate against a running OpenClaw gateway. Open questions in the
   plugin README: `resolveAgentRoute` peer-id shape,
   `dispatchReplyWithBufferedBlockDispatcher` async semantics,
   `outbound.sendText` no-op safety.
2. Rewrite the Python adapter (`adapter.py`) to stop shelling out to
   `openclaw agent --message ...` and instead POST to the plugin's
   webhook + run an `/agent-reply` callback HTTP server. **Post-demo
   work** (touches a working integration).

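The adapter rewrite has two small moving parts: a webhook POST out to the gateway plugin, and an `/agent-reply` callback endpoint back in. A minimal sketch, assuming illustrative payload field names (`peerId`, `text`) that are not the plugin's confirmed contract:

```python
# Sketch of the rewritten OpenClaw adapter flow. Payload field names and
# the /agent-reply path are assumptions pending gateway validation.
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def build_payload(peer_id: str, text: str) -> bytes:
    """Serialize one inbound A2A message for the channel plugin's webhook."""
    return json.dumps({"peerId": peer_id, "text": text}).encode("utf-8")

def post_to_webhook(url: str, peer_id: str, text: str) -> int:
    """POST the A2A message to the gateway plugin instead of shelling out."""
    req = urllib.request.Request(
        url,
        data=build_payload(peer_id, text),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.status

class AgentReplyHandler(BaseHTTPRequestHandler):
    """Callback endpoint the plugin would hit with the agent's reply."""
    def do_POST(self):
        if self.path != "/agent-reply":
            self.send_error(404)
            return
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        reply = json.loads(body)
        # Hand the reply back to the Molecule A2A response queue here.
        self.send_response(204)
        self.end_headers()
```

`HTTPServer(("127.0.0.1", 0), AgentReplyHandler)` would then run alongside the adapter's main loop; the real wiring belongs in the post-demo rewrite.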
---

## hermes

**Status:** Upstream PR drafted; short-term shim deemed unnecessary.

**Path:** Open the upstream `BasePlatformAdapter` system to external
plugins. Hermes already ships a working plugin-discovery system for
memory backends (`plugins/memory/`, `register(ctx)` collector pattern,
`$HERMES_HOME/plugins/<name>/` user-installed tier). The PR extends
the same shape to platforms — `register_platform_adapter(...)` on the
existing collector, a new `plugins/platforms/` discovery directory, and a
3-line fallback in `_create_adapter()`. Symmetric, not novel.

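For illustration, a platform plugin under the proposed discovery directory might look like the sketch below. The `register(ctx)` entry point copies the existing memory-plugin shape, but the `register_platform_adapter` hook name and the adapter class are the PR's proposal, not shipped hermes API:

```python
# Hypothetical plugins/platforms/molecule_a2a/plugin.py under the
# proposed collector pattern (hook names are assumptions until the
# upstream PR lands).

class MoleculeA2AAdapter:
    """Stand-in for a BasePlatformAdapter subclass."""
    platform_name = "molecule-a2a"

    def __init__(self, config: dict):
        self.config = config

def register(ctx):
    # Same collector shape as memory plugins, extended to platforms:
    # _create_adapter() would fall back to this registry when the
    # built-in Platform lookup misses.
    ctx.register_platform_adapter(MoleculeA2AAdapter.platform_name,
                                  MoleculeA2AAdapter)
```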
**Artifacts landed:**
- `docs/integrations/hermes-platform-plugins-upstream-pr.md` — full
  PR draft including problem, prior art, proposal, code shape,
  backward compat, test plan, and four open questions to resolve in
  Discord before submitting.

**Why no short-term polling shim:** the earlier framing was wrong. The
Molecule runtime already polls the inbox via `wait_for_message` each
turn; each polled message fires a fresh `execute()` on the adapter, which
proxies to hermes's stateless `/v1/chat/completions`. Adding adapter-side
polling would duplicate that work. The genuine short-term gap is
**session continuity** (the hermes daemon doesn't see a single
conversation across turns because chat/completions is stateless), not
push latency. That gap is solved by the upstream PR; no
intermediate shim earns its complexity.
**Remaining (task #83):**
1. Reach out in the Nous Research Discord to validate the open questions
   (Platform enum-vs-string refactor, naming, example-plugin scope).
2. Submit the PR to `NousResearch/hermes-agent`. **Requires user
   confirmation** — opening an upstream PR is an action visible to
   others.
3. Once merged: ship `hermes-platform-molecule-a2a` as the first
   external consumer, bump our hermes workspace template to enable
   it, and remove any transitional code.

---

## Codex (OpenAI Codex CLI)

**Status:** Template structurally complete (12 files, 12/12 tests passing,
validated against real codex-cli 0.72.0). Awaiting molecule-core
registry integration + E2E.

**Path:** Persistent `codex app-server` stdio JSON-RPC client
(NDJSON-framed, v2 protocol). One app-server child per workspace
session; one `thread/start` per session; each A2A message becomes a
`turn/start` RPC; agent responses arrive as `agent_message_delta`
notifications. Per-thread serialization for mid-turn arrivals
(matches OpenClaw's per-chat sequentializer). Optional
`turn/interrupt` for "latest message wins" workspaces.

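The wire mechanics above (one JSON-RPC message per newline-delimited line over the child's stdio) can be sketched as follows; the helper names and the `turn/start` param shape are illustrative, not the template's actual API:

```python
# Minimal NDJSON JSON-RPC framing for a codex app-server client.
# The real client lives in app_server.py; names here are illustrative.
import itertools
import json

_ids = itertools.count(1)

def encode_request(method: str, params: dict) -> bytes:
    """Frame one JSON-RPC request as a single newline-terminated line."""
    msg = {"jsonrpc": "2.0", "id": next(_ids),
           "method": method, "params": params}
    return (json.dumps(msg) + "\n").encode("utf-8")

def decode_line(line: bytes) -> dict:
    """Each stdout line from the child is one complete JSON-RPC message."""
    return json.loads(line.decode("utf-8"))

# An inbound A2A message becomes a turn/start RPC against the
# session's long-lived thread:
frame = encode_request("turn/start",
                       {"threadId": "thr_123", "input": "new A2A message"})
```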
**Artifacts landed:**
- `docs/integrations/codex-app-server-adapter-design.md` — full design
  including RPC sequence, executor skeleton, eight open questions.
- `molecule-ai-workspace-template-codex/` — full template repo
  scaffolded:
  - `app_server.py` (286 LOC) — async JSON-RPC over NDJSON stdio
  - `executor.py` (~270 LOC) — thread bootstrap, turn dispatch,
    notification accumulation, mid-turn serialization
  - `adapter.py` — thin `BaseAdapter` shell + preflight
  - `Dockerfile`, `start.sh`, `config.yaml`, `requirements.txt`,
    `README.md`
  - `tests/` — **12/12 tests pass** (7 vs the NDJSON mock child, 5 vs a
    fake AppServerProcess covering executor logic)

**Validated against live `codex-cli 0.72.0`:** NDJSON framing, the
`initialize` handshake, and `thread/start` all work end-to-end.
**Schema-runtime drift caught:** the real binary returns `thread.id`,
not `thread.threadId` as the JSON schema claims. The executor now
accepts both shapes; without the smoke test this would have been
a production bug.

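The drift fix reduces to tolerating both result shapes when extracting the thread id; a sketch (the executor's real helper may differ):

```python
# Accept both the documented and the observed thread/start result shapes:
# the JSON schema says thread.threadId, but codex-cli 0.72.0 returns
# thread.id.
def extract_thread_id(result: dict) -> str:
    thread = result["thread"]
    thread_id = thread.get("threadId") or thread.get("id")
    if not thread_id:
        raise ValueError(f"thread/start result missing thread id: {result!r}")
    return thread_id
```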
**Remaining (task #85, #86):**
1. Register `codex` in molecule-core's `manifest.json` +
   `workspace-server/internal/handlers/runtime_registry.go`.
   **Defer to post-demo** — touches the working live registry.
2. E2E verification with a real Molecule workspace + peer A2A
   traffic, per `feedback_close_on_user_visible_not_merge`.

---
## Cross-cutting (task #86)

End-to-end verification per `feedback_close_on_user_visible_not_merge`.
For each runtime, the closure criterion is not "code merged" but
"observed: real workspace boots → A2A message from peer agent →
delivered to running session → reply returned through A2A response
queue → peer agent receives". No runtime stream closes until that
chain is observed.

---
## What's blocking what

| Stream | Blocked on |
|---|---|
| claude-code | (done) |
| OpenClaw plugin | live gateway validation, then post-demo adapter rewrite |
| OpenClaw adapter rewrite | post-demo timing |
| hermes upstream PR | user confirmation to submit + Discord pre-validation |
| hermes consumer plugin | upstream PR merging |
| codex implementation | resolve 8 open questions, then post-demo eng time |
| E2E verification | each runtime stream completing |

Three of the four runtime streams are at decision points needing user
input. Pre-demo (T-4d to 2026-05-06), the safe move is to land the
remaining design + scaffolding work and defer all behavioral changes to
post-demo.
@@ -32,7 +32,8 @@
     {"name": "deepagents", "repo": "Molecule-AI/molecule-ai-workspace-template-deepagents", "ref": "main"},
     {"name": "hermes", "repo": "Molecule-AI/molecule-ai-workspace-template-hermes", "ref": "main"},
     {"name": "gemini-cli", "repo": "Molecule-AI/molecule-ai-workspace-template-gemini-cli", "ref": "main"},
-    {"name": "openclaw", "repo": "Molecule-AI/molecule-ai-workspace-template-openclaw", "ref": "main"}
+    {"name": "openclaw", "repo": "Molecule-AI/molecule-ai-workspace-template-openclaw", "ref": "main"},
+    {"name": "codex", "repo": "Molecule-AI/molecule-ai-workspace-template-codex", "ref": "main"}
   ],
   "org_templates": [
     {"name": "molecule-dev", "repo": "Molecule-AI/molecule-ai-org-template-molecule-dev", "ref": "main"},