Merge pull request #2513 from Molecule-AI/auto-sync/main-35cb6ba0

chore: sync main → staging (auto, merge 35cb6ba0)
This commit is contained in:
Hongming Wang 2026-05-02 10:53:27 +00:00 committed by GitHub
commit 15dd1f26c3
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
4 changed files with 715 additions and 1 deletions

View File

@ -0,0 +1,360 @@
# Codex CLI workspace adapter — app-server design
**Status:** Design draft — pre-implementation
**Owner:** Molecule AI (hongmingwang@moleculesai.app)
**Date:** 2026-05-02
**Codex version validated against:** `codex-cli 0.72.0`
**Related:** `docs/integrations/hermes-platform-plugins-upstream-pr.md`,
`molecule-ai-workspace-template-openclaw/packages/openclaw-channel-plugin/`
---
## Goal
Add a Molecule workspace template for the OpenAI Codex CLI runtime
(`@openai/codex` v0.72+). The template should give Codex agents the
same A2A inbox + mid-session push behavior the other supported
runtimes have:
- **claude-code:** MCP `notifications/claude/channel`
- **OpenClaw:** channel-plugin webhook into the gateway kernel
- **hermes:** `BasePlatformAdapter` (pending upstream PR; polling fallback today)
- **codex (this design):** persistent `codex app-server` stdio JSON-RPC
client; A2A messages become `turn/start` calls against a long-lived
thread
Today there is no codex template. The legacy fallback registry entry
at `workspace-server/internal/handlers/runtime_registry.go:83` exists
only to keep old workspaces from crashing — there is no live adapter,
no Dockerfile, nothing in `manifest.json`. This design covers the
fresh build.
---
## Architecture decision: app-server, not `codex exec`
`codex exec --json` is the obvious shape — one CLI subprocess per
A2A message, same anti-pattern OpenClaw used to have and that we are
replacing. It loses session continuity (no shared thread), pays
process-spawn cost on every turn, and gives no path to mid-turn
interruption.
`codex app-server` is a long-running JSON-RPC server over stdio that
holds thread state in memory. The v2 protocol (validated below) gives
us:
- `thread/start` → returns `threadId`
- `turn/start` → input array, threadId required → returns `turnId`
- `turn/interrupt` → cancel a running turn by `(threadId, turnId)`
- Server-pushed notifications: `agent_message_delta`, `turn/started`,
`turn/completed`, `reasoning_text_delta`,
`command_execution_output_delta`, `mcp_tool_call_progress`,
`error_notification`, etc.
A persistent app-server child plus a small async stdio reader gives us
session continuity AND mid-turn injection. Same dual-win shape we got
from migrating OpenClaw away from `openclaw agent`.
### Why not v1?
v1 of the protocol exposes `newConversation` + `sendUserMessage` /
`sendUserTurn` (one-shot per message, no streaming notifications). v2
introduces threads + turns + delta notifications. v2 is the
forward-looking surface; we build against v2 from the start.
---
## RPC sequence
### 1. Boot
```
adapter spawn ▶ codex app-server (stdio NDJSON)
◀ ready (process up)
adapter ▶ {"jsonrpc":"2.0","id":1,"method":"initialize",
"params":{"clientInfo":{"name":"molecule-runtime","version":"…"}}}
adapter ◀ {"id":1,"result":{"userAgent":"codex_cli_rs/0.72.0 …"}}
```
Validated 2026-05-02 against the installed binary — NDJSON framing,
initialize works as shown.
### 2. Thread per workspace session
```
adapter ▶ thread/start
params: {model, sandboxPolicy, approvalPolicy, cwd,
baseInstructions, developerInstructions, …}
adapter ◀ {result: {thread: {threadId: "th_…"}}}
```
`threadId` is cached on the adapter for the workspace's lifetime. On
adapter restart we use `thread/resume` against the persisted ID
(written to disk under `~/.codex/sessions/` by codex itself, but we
also keep our own pointer in workspace state for fast restore).
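A minimal restore-or-start sketch, assuming `thread/resume` accepts the persisted ID and that `load_thread_id`/`save_thread_id` stand in for whatever workspace-state accessor molecule-runtime provides (both are assumptions to confirm before implementation):
```python
# Restore-or-start sketch. load_thread_id/save_thread_id are hypothetical
# workspace-state helpers; the exact thread/resume param name should be
# confirmed against the v2 schema.
async def ensure_thread(app_server, workspace_state) -> str:
    thread_id = workspace_state.load_thread_id()
    if thread_id:
        await app_server.request("thread/resume", {"threadId": thread_id})
        return thread_id
    resp = await app_server.request("thread/start", {})
    thread_id = resp["thread"]["threadId"]
    workspace_state.save_thread_id(thread_id)
    return thread_id
```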
### 3. A2A message → turn/start
For each inbound A2A message:
```
adapter ▶ turn/start
params: {threadId, input: [{type:"text", text:"…"}], …}
adapter ◀ {result: {turn: {turnId: "tu_…"}}}
(server pushes notifications)
adapter ◀ turn/started
adapter ◀ agent_message_delta (text chunk)
adapter ◀ agent_message_delta (text chunk)
adapter ◀ turn/completed
```
The adapter accumulates `agent_message_delta` chunks into a buffer
keyed by `turnId`, emits them onto the A2A response queue (streamed if
the molecule-runtime contract supports streaming, otherwise assembled
into a single final message on `turn/completed`).
### 4. Mid-turn injection — the load-bearing case
**Default policy: per-thread serialization.** If a turn is already
running when a second A2A message arrives, queue the new message and
fire `turn/start` once the current `turn/completed` lands. This
matches OpenClaw's per-chat sequentializer behavior — the A2A peer
sees their messages handled in order, and we don't need
`turn/interrupt` for the common case.
**Opt-in policy: interrupt-and-rerun.** For workspaces that prefer
"latest message wins" semantics (rare; configurable), the adapter
fires `turn/interrupt` with `(threadId, currentTurnId)`, waits for
`turn/completed` (with cancelled status), then `turn/start` with the
combined context: previous user message + agent's partial response so
far + new message, so the agent has full context of what got
interrupted. Off by default.
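A sketch of the opt-in path, assuming the executor already tracks the running turn's id, its originating prompt, and its accumulated deltas (the parameter names here are illustrative, not final):
```python
import asyncio

# Opt-in "latest message wins" flow. `done` is the per-turn completion
# event the executor already waits on; prompt/partial are the interrupted
# user message and the deltas accumulated so far.
async def interrupt_and_rerun(app_server, thread_id: str, current_turn_id: str,
                              done: asyncio.Event, prompt: str,
                              partial: str, new_message: str) -> str:
    await app_server.request("turn/interrupt", {
        "threadId": thread_id,
        "turnId": current_turn_id,
    })
    await done.wait()  # turn/completed lands with cancelled status
    combined = (
        "[Previous message, interrupted]\n" + prompt +
        "\n\n[Partial response so far]\n" + partial +
        "\n\n[New message]\n" + new_message
    )
    resp = await app_server.request("turn/start", {
        "threadId": thread_id,
        "input": [{"type": "text", "text": combined}],
    })
    return resp["turn"]["turnId"]
```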
### 5. Shutdown
```
adapter ▶ {"method":"shutdown"} (if v2 exposes one; otherwise SIGTERM)
adapter ▶ close stdio
adapter ▶ wait(child, timeout=5s); on timeout SIGKILL
```
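A process-level sketch of that escalation, assuming the child is held as an `asyncio.subprocess.Process` (no reliance on a v2 `shutdown` RPC existing):
```python
import asyncio

# Close stdin, give the child a grace period, then escalate SIGTERM,
# then SIGKILL. Purely process-level; works whether or not a shutdown
# RPC exists.
async def stop_app_server(proc: asyncio.subprocess.Process, grace: float = 5.0) -> None:
    if proc.stdin:
        proc.stdin.close()
    try:
        await asyncio.wait_for(proc.wait(), timeout=grace)
        return
    except asyncio.TimeoutError:
        proc.terminate()  # SIGTERM
    try:
        await asyncio.wait_for(proc.wait(), timeout=grace)
    except asyncio.TimeoutError:
        proc.kill()  # SIGKILL
        await proc.wait()
```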
---
## File layout (new template repo)
```
molecule-ai-workspace-template-codex/
├── adapter.py # BaseAdapter shell, thin (~50 LOC)
├── executor.py # AppServerProxyExecutor — the RPC client (~300 LOC)
├── app_server.py # AppServerProcess — stdio child + NDJSON reader (~150 LOC)
├── config.yaml
├── Dockerfile # node:20 + npm i -g @openai/codex@0.72
├── start.sh # boots adapter; codex app-server is spawned per session by executor
├── requirements.txt
├── README.md
└── tests/
├── test_app_server.py # mocks stdio; tests framing, request/notification dispatch
└── test_executor.py # mocks AppServerProcess; tests turn lifecycle, interrupt
```
Modeled on the hermes template (which is the closest existing shape:
adapter.py + executor.py separation; daemon proxy via local IPC). The
extra `app_server.py` exists because the JSON-RPC client + child
process management is non-trivial enough to warrant its own module
with its own tests.
---
## Executor skeleton
```python
# executor.py — A2A → codex app-server bridge
import asyncio

# AgentExecutor, RequestContext, EventQueue, new_agent_text_message,
# extract_message_text, AdapterConfig, AppServerProcess, MOLECULE_RUNTIME_VERSION
# and _TURN_TIMEOUT come from the molecule-runtime / template modules
# (imports elided in this skeleton).


class CodexAppServerExecutor(AgentExecutor):
    """Holds one app-server child + thread, dispatches A2A turns as turn/start RPCs."""

    def __init__(self, config: AdapterConfig):
        self._config = config
        self._app_server: AppServerProcess | None = None
        self._thread_id: str | None = None
        self._current_turn_id: str | None = None  # needed by cancel()
        self._turn_lock = asyncio.Lock()  # serialize per-thread by default

    async def _ensure_thread(self) -> str:
        if self._app_server is None:
            self._app_server = await AppServerProcess.start()
            await self._app_server.initialize(client_info={
                "name": "molecule-runtime",
                "version": MOLECULE_RUNTIME_VERSION,
            })
        if self._thread_id is None:
            resp = await self._app_server.request("thread/start", {
                "model": self._config.model or None,
                "developerInstructions": self._config.system_prompt or None,
                # other policy fields (sandbox, approval) — Molecule defaults
            })
            self._thread_id = resp["thread"]["threadId"]
        return self._thread_id

    async def execute(self, context: RequestContext, event_queue: EventQueue) -> None:
        prompt = extract_message_text(context.message) or ""
        if not prompt.strip():
            await event_queue.enqueue_event(new_agent_text_message("(empty prompt)"))
            return

        async with self._turn_lock:  # per-thread serialization
            thread_id = await self._ensure_thread()

            # Subscribe to delta notifications BEFORE starting the turn so we
            # don't race the first agent_message_delta.
            buffer: list[str] = []
            done = asyncio.Event()
            error: Exception | None = None

            def on_notification(method: str, params: dict) -> None:
                nonlocal error
                if method == "agent_message_delta":
                    buffer.append(params.get("delta", ""))
                elif method == "turn/completed":
                    done.set()
                elif method == "error_notification":
                    error = RuntimeError(params.get("message", "unknown app-server error"))
                    done.set()

            unsub = self._app_server.subscribe(on_notification)
            try:
                resp = await self._app_server.request("turn/start", {
                    "threadId": thread_id,
                    "input": [{"type": "text", "text": prompt}],
                })
                self._current_turn_id = resp["turn"]["turnId"]
                await asyncio.wait_for(done.wait(), timeout=_TURN_TIMEOUT)
            finally:
                unsub()
                self._current_turn_id = None

            if error:
                await event_queue.enqueue_event(
                    new_agent_text_message(f"[codex error] {error}"))
                return

            await event_queue.enqueue_event(new_agent_text_message("".join(buffer)))

    async def cancel(self, context: RequestContext, event_queue: EventQueue) -> None:
        # When the molecule-runtime cancels a request, fire turn/interrupt
        # against the currently-running turn. Best-effort — racing
        # turn/completed is fine, app-server returns a noop in that case.
        if self._app_server and self._thread_id and self._current_turn_id:
            await self._app_server.request("turn/interrupt", {
                "threadId": self._thread_id,
                "turnId": self._current_turn_id,
            })
```
The `AppServerProcess` class encapsulates: stdio child management,
NDJSON line reader/writer, request-id correlation, notification
subscriber registry, and graceful shutdown. Standard async stdio
JSON-RPC client — nothing exotic.
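A minimal sketch of that client, assuming the NDJSON framing validated above (one JSON object per line; responses carry `id`, notifications carry only `method`). Error handling, restart, and `thread/resume` plumbing are elided:
```python
import asyncio
import itertools
import json
from typing import Any, Callable


class AppServerProcess:
    """Minimal async NDJSON JSON-RPC client around a `codex app-server` child."""

    def __init__(self, proc: asyncio.subprocess.Process):
        self._proc = proc
        self._ids = itertools.count(1)
        self._pending: dict[int, asyncio.Future] = {}
        self._subscribers: list[Callable[[str, dict], None]] = []
        self._reader_task = asyncio.create_task(self._read_loop())

    @classmethod
    async def start(cls) -> "AppServerProcess":
        proc = await asyncio.create_subprocess_exec(
            "codex", "app-server",
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
        )
        return cls(proc)

    async def initialize(self, client_info: dict) -> Any:
        return await self.request("initialize", {"clientInfo": client_info})

    async def request(self, method: str, params: dict | None = None) -> Any:
        req_id = next(self._ids)
        fut: asyncio.Future = asyncio.get_running_loop().create_future()
        self._pending[req_id] = fut
        line = json.dumps({"jsonrpc": "2.0", "id": req_id,
                           "method": method, "params": params or {}})
        self._proc.stdin.write(line.encode() + b"\n")
        await self._proc.stdin.drain()
        return await fut

    def subscribe(self, callback: Callable[[str, dict], None]) -> Callable[[], None]:
        self._subscribers.append(callback)
        return lambda: self._subscribers.remove(callback)

    async def _read_loop(self) -> None:
        # One NDJSON message per line: responses carry "id", notifications don't.
        while True:
            raw = await self._proc.stdout.readline()
            if not raw:
                break  # child exited / stdout closed
            msg = json.loads(raw)
            if "id" in msg and msg["id"] in self._pending:
                fut = self._pending.pop(msg["id"])
                if "error" in msg:
                    fut.set_exception(RuntimeError(str(msg["error"])))
                else:
                    fut.set_result(msg.get("result"))
            elif "method" in msg:
                for cb in list(self._subscribers):
                    cb(msg["method"], msg.get("params", {}))
```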
---
## Open questions to resolve before implementation
1. **MoleculeRuntime streaming contract.** Does our A2A executor
contract support emitting incremental events (so the user sees
partial responses as the agent streams), or do we always assemble
on `turn/completed`? If streaming is supported, we want to forward
each `agent_message_delta` as an A2A event for parity with hermes
gateway streaming. (Cross-reference: hermes adapter currently
doesn't stream either — `executor.py:122` sets `stream=False`
so non-streaming is the safe v1 baseline.)
2. **Sandbox policy default.** Codex defaults to `read-only` for safety
in CLI mode; for workspace use we need write access to the
workspace tree. Pick a sensible default in `thread/start`,
probably `workspace-write` scoped to the workspace cwd.
3. **Approval policy default.** Codex's `--ask-for-approval` modes are
`untrusted`, `on-failure`, and `never`. Workspace agents need
`never` (they can't prompt a human). Confirm this is exposed via
`approvalPolicy` in `thread/start`. (A sketch of the proposed
defaults for both policies follows this list.)
4. **Auth — login flow.** Codex supports `login api-key` (env
`OPENAI_API_KEY`) and `login chatgpt` (interactive OAuth). For
workspace use we mandate API key. Document this in the template's
README and surface it as a required env in config.yaml.
5. **MCP server passthrough.** Codex's own `mcp_servers` config lets
the agent call out to MCP servers as a CLIENT. Should the workspace
adapter automatically wire `~/.codex/config.toml` so the agent can
reach the molecule MCP server (chat_history, recall_memory,
delegate_task)? Almost certainly yes — but verify the env-var
substitution pattern works in TOML.
6. **Thread persistence across workspace restarts.** Codex stores
sessions on disk under `~/.codex/sessions/`. The adapter should
persist the threadId in workspace state so a restart resumes the
thread (`thread/resume`) rather than starting fresh. This matches
the existing molecule-runtime convention for session continuity.
7. **Token usage / cost reporting.** v2 emits
`ThreadTokenUsageUpdatedNotification`. Plumb this into our usage
tracking — same path the other runtimes use.
8. **MCP push notifications inbound.** Earlier research established
that codex's own MCP server mode does NOT support
`notifications/*` for push. So the path for unsolicited mid-session
A2A messages is NOT "codex's MCP client receives notifications from
our MCP server" — it's "molecule-runtime polls inbox via
`wait_for_message`, and on each polled message fires `turn/start`
on the existing thread." The "MCP native" framing here is satisfied
not by codex receiving MCP push, but by the persistent thread +
turn/start delivering the same UX (session continuity + queued or
interrupted handling of new messages mid-thread).
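For open questions 2-4, a hedged sketch of the Molecule-default `thread/start` params. The field names follow the param list earlier in this doc; the exact value strings for sandbox and approval are assumptions to verify against the v2 schema:
```python
# Hypothetical Molecule defaults for thread/start (open questions 2-4).
# Value strings for sandboxPolicy/approvalPolicy are assumptions, not
# confirmed schema values.
def default_thread_start_params(config, workspace_cwd: str) -> dict:
    return {
        "model": config.model or None,
        "cwd": workspace_cwd,
        # Write access scoped to the workspace tree (open question 2).
        "sandboxPolicy": "workspace-write",
        # Workspace agents cannot prompt a human (open question 3).
        "approvalPolicy": "never",
        # API-key auth is mandated for workspace use (open question 4);
        # OPENAI_API_KEY is read by codex itself from the environment.
        "developerInstructions": config.system_prompt or None,
    }
```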
---
## Why this design satisfies "MCP native push parity"
User goal: every runtime delivers A2A inbox messages with the same
quality of experience as claude-code's MCP `notifications/claude/channel`.
claude-code path: MCP server pushes notification → claude-code SDK
injects synthetic user turn into running session.
Codex path: molecule-runtime polls inbox (universal poll path) →
adapter fires `turn/start` on the existing app-server thread → codex
processes the message in-thread with full context. The "push" happens
at the molecule-runtime ↔ adapter boundary; the "native" part is that
codex's own session model handles it as an in-thread turn, not as a
fresh subprocess.
For mid-turn arrivals: the per-thread serialization (or opt-in
interrupt) gives us behavior equivalent to OpenClaw's per-chat
sequentializer. Equivalent UX to claude-code's mid-session
notification injection in practice — one is a kernel-level interrupt,
the other is a queue-then-dispatch, but the user-visible behavior
("the agent processes my message after the current turn finishes") is
identical.
---
## Sequencing
This is post-demo work. Order:
1. **Spec the executor lifecycle** — pin down the open questions
above (especially #1 streaming, #5 MCP passthrough, #6 thread
persistence) before any code lands.
2. **Implement `AppServerProcess`** with thorough unit tests against a
mock stdio. This is the riskiest module (concurrency around
request-id correlation + notification dispatch); land it first
with high coverage.
3. **Implement `CodexAppServerExecutor`** on top.
4. **Build the template repo skeleton** (Dockerfile, config.yaml,
start.sh, README) once the Python side runs locally.
5. **Add codex to `manifest.json`** and the runtime registry.
6. **End-to-end verify** per `feedback_close_on_user_visible_not_merge`
— boot a real workspace, send A2A messages, observe streamed
responses + thread continuity + queued mid-turn handling.
Estimated total: 3-5 engineering days for v1, plus E2E hardening.

View File

@ -0,0 +1,191 @@
# Upstream PR draft: Pluggable platform adapters for hermes-agent
**Status:** Draft — pre-submission review
**Target repo:** `NousResearch/hermes-agent`
**Owner:** Molecule AI (hongmingwang@moleculesai.app)
**Date drafted:** 2026-05-02
---
## Why this draft exists
Molecule needs to deliver A2A inbox messages to a hermes-hosted agent the same way Telegram messages reach it today — through `_handle_message`, with `set_busy_session_handler` semantics for mid-turn arrivals. Today this requires forking `gateway/run.py` because the platform adapter system is closed (`_create_adapter` is a hardcoded if/elif chain at lines 2424-2578).
But hermes already ships a working plugin discovery system for memory backends (`plugins/memory/__init__.py`). Extending the same pattern to platforms is a small, symmetric change — not novel architecture. This draft documents the proposed upstream PR before we open it, so we can iterate locally on tone, scope, and code shape.
---
## Proposed PR title
> Pluggable platform adapters via `plugins/platforms/` discovery
(Mirrors the existing `plugins/memory/` shape so the title alone signals "this is the same pattern, just for the other subsystem.")
---
## PR body
### Problem
Hermes ships 19 in-tree platform adapters (Telegram, Discord, WhatsApp, Slack, Signal, Mattermost, Matrix, Email, SMS, DingTalk, Feishu, WeCom variants, Weixin, BlueBubbles, QQBot, HomeAssistant, API server, Webhook). Each is wired by editing two files:
- `gateway/config.py:48-69` — append a `Platform` enum value
- `gateway/run.py:2424-2578` — append an `elif platform == Platform.X:` branch in `_create_adapter()`
For platforms with broad demand (Telegram, Slack, etc.) this is fine: the maintenance load lives upstream, every user benefits. For platforms with narrow but real demand — enterprise-internal channels (Rocket.Chat, RingCentral, Zulip), agent-to-agent inbox protocols (e.g. Molecule's A2A), niche regional platforms, or experimental transports — the only path today is forking `gateway/run.py`. Forks drift, defeat the purpose of an OSS gateway, and discourage contribution back upstream.
### Prior art (already in hermes)
The memory subsystem solved exactly this problem at `plugins/memory/__init__.py`:
1. **Two-tier discovery** — bundled providers in `plugins/memory/<name>/` plus user-installed providers in `$HERMES_HOME/plugins/<name>/`. Bundled wins on name collision.
2. **`register(ctx)` collector pattern** (`plugins/memory/__init__.py:264-305`) — a plugin's `__init__.py` exposes a `register(ctx)` function; `ctx` already supports `register_memory_provider`, `register_tool`, `register_hook`, `register_cli_command`.
3. **`plugin.yaml` manifest** for description and metadata.
4. **Config-driven activation** (`memory.provider: honcho` selects which provider loads).
Adding `register_platform_adapter` to the same collector and a `plugins/platforms/` discovery directory extends this pattern symmetrically.
### Proposal
**Four small changes:**
1. **New collector method** in `plugins/memory/__init__.py:_ProviderCollector` (or a new shared `plugins/_collector.py` if maintainers prefer cleaner separation):
```python
def register_platform_adapter(self, name: str, adapter_class: type, requirements_check=None):
"""Register a platform adapter loadable as plugin.
name: unique platform identifier (matches gateway.platforms.<name> in config)
adapter_class: subclass of BasePlatformAdapter
requirements_check: optional callable returning bool — same shape as
existing check_telegram_requirements() etc.
"""
self.platform_adapters[name] = (adapter_class, requirements_check)
```
2. **New `plugins/platforms/__init__.py`** mirroring `plugins/memory/__init__.py`: `discover_platform_adapters()`, `load_platform_adapter(name)`, two-tier (bundled + `$HERMES_HOME/plugins/`) discovery. (A plugin-author sketch follows this list.)
3. **`_create_adapter()` fallback** at `gateway/run.py:2578` — after the in-tree if/elif chain returns None, attempt plugin lookup:
```python
# Existing in-tree adapters checked first (precedence preserved).
# If no match, fall through to plugin discovery.
from plugins.platforms import load_platform_adapter
plugin_entry = load_platform_adapter(platform.value)
if plugin_entry:
adapter_class, req_check = plugin_entry
if req_check and not req_check():
logger.warning(f"{platform.value}: plugin requirements not met")
return None
return adapter_class(config)
return None
```
4. **`Platform` enum becomes open-set.** Today it's `Enum`; switch to a string-backed pattern that accepts unknown values (still validates against the union of in-tree + discovered plugins at config-load time):
```python
# gateway/config.py — replace Enum with frozen dataclass + dynamic registry.
# Keeps the in-tree values as module-level singletons for backward compat:
# Platform.TELEGRAM still works as today.
```
This is the only "shape change" in the PR. Backward compat is straightforward: every existing `Platform.TELEGRAM` reference continues to work because the module exports the same names.
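For concreteness, a hypothetical external plugin written against the proposed hook; the module path, env var, and adapter body below are illustrative rather than a real contract:
```python
# plugins/platforms/molecule_a2a/__init__.py (hypothetical external plugin
# written against the proposed register_platform_adapter hook).
import os


class MoleculeA2AAdapter:
    """Stand-in for a real BasePlatformAdapter subclass; body elided.

    A real plugin would subclass hermes's BasePlatformAdapter and feed
    inbound A2A messages into the same _handle_message dispatch the
    in-tree Telegram adapter uses.
    """

    def __init__(self, config):
        self.config = config


def check_molecule_requirements() -> bool:
    # Same shape as the existing check_telegram_requirements() helpers:
    # return False (gateway logs a warning) when required config is absent.
    # MOLECULE_A2A_TOKEN is an illustrative env var, not a real contract.
    return bool(os.environ.get("MOLECULE_A2A_TOKEN"))


def register(ctx) -> None:
    # Called during plugins/platforms/ discovery at gateway boot.
    ctx.register_platform_adapter(
        "molecule_a2a",
        MoleculeA2AAdapter,
        requirements_check=check_molecule_requirements,
    )
```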
### Backward compatibility
- All 19 in-tree adapters keep their hardcoded path in `_create_adapter()` (precedence: in-tree wins on name collision, exactly like memory plugins).
- Existing config files (`gateway.platforms.telegram.enabled: true`) continue to work unchanged.
- No new mandatory config keys.
- Plugin discovery only runs if the platform name doesn't match an in-tree value, so cold-start cost is zero for users who don't use plugins.
- Fork-then-add-platform users can migrate to plugins at their own pace; the in-tree path isn't deprecated.
### Test plan
- **Unit**: discovery scans both bundled and user dirs, respects precedence.
- **Unit**: `_create_adapter()` falls through to plugin lookup only when in-tree doesn't match.
- **Integration**: ship a minimal `plugins/platforms/example/` in-tree (read-only, returns canned messages) so CI exercises the full plugin code path. Same approach `plugins/memory/holographic/` takes today.
- **Manual**: Molecule will publish `hermes-platform-molecule-a2a` as the first external consumer once this lands.
### Documentation
- Extend `CONTRIBUTING.md`'s "Should it be a Skill or a Tool?" section with "Should it be a Platform Plugin or an in-tree Platform?" — same shape, same decision tree.
- Add `plugins/platforms/README.md` mirroring `plugins/memory/`'s convention.
### Out of scope (intentionally)
- **Setuptools `entry_points`** — could be added later as a third discovery tier (after bundled + `$HERMES_HOME/plugins/`). Skipping for v1 because the directory-based discovery already covers the demand and matches the memory pattern. Adding entry_points is a non-breaking extension.
- **Hot-reload** — plugins discovered at gateway boot, no live re-scan. Matches memory plugins.
- **Sandboxing** — plugins run with full hermes process privileges. Same trust model as memory plugins; documented in the new README.
### Reference consumer
Molecule AI will ship `hermes-platform-molecule-a2a` as the first external consumer. Use case: deliver agent-to-agent inbox messages (from peer agents authenticated at the platform layer, not the Telegram-user level) into the same `_handle_message` dispatch Telegram uses, with `internal=True` events to bypass user-auth. Expected timeline: within 2 weeks of merge.
---
## Open questions for upstream maintainers
Per `CONTRIBUTING.md`, the right channel for design proposals is
**GitHub Discussions**, not Discord (Discord is for "questions,
showcasing projects, and sharing skills" — Discussions is the
documented channel for "design proposals and architecture discussions").
Open a Discussion at `NousResearch/hermes-agent/discussions` titled
"RFC: pluggable platform adapters via `plugins/platforms/`" with the
problem + proposal + open questions before filing the PR. This gives
maintainers space to weigh in on shape before code is in flight.
Open questions to put in the Discussion:
1. **Preferred naming.** `register_platform_adapter` vs `register_platform` vs `register_channel`. Consistency with memory's `register_memory_provider` argues for the long form.
2. **Enum vs string.** Is the maintainer team open to making `Platform` open-set? If not, fallback design: keep enum, add a single `Platform.PLUGIN` sentinel + a `plugin_name` field on `PlatformConfig`. Slightly uglier but smaller blast radius.
3. **Testing**: `plugins/platforms/example/` checked into the repo, or test-fixtures-only? Memory plugins are real (mem0, honcho, supermemory bundled), so a real example seems consistent.
4. **Discovery ordering**: confirm whether bundled-wins precedence is the desired behavior (matches memory) or user-can-override-bundled (which would let downstream patch a buggy in-tree adapter without forking). The current memory pattern is bundled-wins; we'll match it unless told otherwise.
---
## Effort estimate
- **Code change**: ~150 LOC across `plugins/platforms/__init__.py` (new), `gateway/config.py` (Platform refactor), `gateway/run.py` (10-line fallback in `_create_adapter`), tests (~50 LOC).
- **Docs**: ~80 LOC across `CONTRIBUTING.md` extension and new `plugins/platforms/README.md`.
- **Review cycle**: depends on maintainer responsiveness. Memory plugin system shipped in v0.50.7 era; platform plugin system would land for v0.11 if accepted.
---
## After this PR lands (Molecule-side follow-up)
1. Publish `hermes-platform-molecule-a2a` (PyPI + `~/.hermes/plugins/molecule-a2a/`).
2. Bump our hermes workspace template to declare `plugins.platforms.molecule_a2a.enabled: true`.
3. Remove the polling shim from `molecule-ai-workspace-template-hermes/adapter.py` once the plugin path is verified end-to-end.
---
## Status checklist (for our own tracking)
Per user's gating: "if the plugin works locally in our docker setup
and e2e testing works, yes [submit]". Validation prerequisites:
- [ ] Build a working `plugins/platforms/molecule_a2a/` plugin against
a forked hermes-agent with the proposed change applied
- [ ] Bake the forked hermes + plugin into a local copy of our
`molecule-ai-workspace-template-hermes` Docker image
- [ ] E2E: boot the local image, send A2A messages from a peer agent,
observe `_handle_message` dispatch + reply through A2A queue
- [ ] Confirm `Platform` enum refactor doesn't break downstream — grep
for `Platform.X` usages across hermes
- [ ] Confirm `$HERMES_HOME` is the right user-plugin root for
platforms (matches memory convention)
- [ ] Open a GitHub Discussion at
`NousResearch/hermes-agent/discussions` titled
"RFC: pluggable platform adapters via plugins/platforms/" with
design + open questions; wait for maintainer feedback
- [ ] Branch name: `feat/pluggable-platform-adapters` per
CONTRIBUTING.md branch convention
- [ ] Commit prefix: `feat(gateway): pluggable platform adapters
via plugins/platforms/` per Conventional Commits + scope `gateway`
- [ ] PR description covers what/why + how-to-test + platforms tested,
per CONTRIBUTING.md PR-description requirements
- [ ] Open PR against `NousResearch/hermes-agent` main once Discussion
lands consensus
- [ ] Track PR; bump cadence weekly; if stalled past 4 weeks, propose
fork-and-bundle as fallback for our hermes template image

View File

@ -0,0 +1,162 @@
# Runtime native-MCP push parity — status
**Goal:** every workspace runtime delivers Molecule A2A inbox messages
with the same UX as claude-code's MCP `notifications/claude/channel`
push: session continuity + queued or interrupted handling of new
messages mid-thread, no fresh subprocess per message.
Tracked across four runtime streams. Updated 2026-05-02.
---
## claude-code
**Status:** ✅ Done. Native MCP `notifications/claude/channel` push
shipped via `workspace/a2a_mcp_server.py`. Requires the host to launch
with `--dangerously-load-development-channels server:molecule`.
No further work.
---
## OpenClaw
**Status:** Scaffolded; awaiting validation + companion adapter rewrite.
**Path:** Channel-plugin SDK (`openclaw/plugin-sdk`), auto-discovered
from `~/.openclaw/plugins/<name>/` or workspace `.openclaw/`. Plugin
registers an HTTP webhook on `openclaw gateway`; Molecule workspace
adapter POSTs A2A messages to it; gateway dispatches through the same
`dispatchReplyWithBufferedBlockDispatcher` kernel call native channels
(Telegram, Lark, Slack, Discord) use.
**Artifacts landed:**
- `molecule-ai-workspace-template-openclaw/packages/openclaw-channel-plugin/`
- `package.json`, `openclaw.plugin.json` (manifest), `index.ts`
(channel + webhook handler), `README.md`, `tsconfig.json`
- Pre-release `v0.1.0-pre`. Mirrors `rabbit-lark-bot` reference
plugin shape.
**Remaining (task #84, #87):**
1. Validate against a running OpenClaw gateway. Open questions in the
plugin README: `resolveAgentRoute` peer-id shape,
`dispatchReplyWithBufferedBlockDispatcher` async semantics,
`outbound.sendText` no-op safety.
2. Rewrite Python adapter (`adapter.py`) to stop shelling out
`openclaw agent --message ...` and instead POST to the plugin's
webhook + run `/agent-reply` callback HTTP server. **Post-demo
work** (touches a working integration).
---
## hermes
**Status:** Upstream PR drafted; short-term shim deemed unnecessary.
**Path:** Open the upstream `BasePlatformAdapter` system to external
plugins. Hermes already ships a working plugin discovery system for
memory backends (`plugins/memory/`, `register(ctx)` collector pattern,
`$HERMES_HOME/plugins/<name>/` user-installed tier). The PR extends
the same shape to platforms — `register_platform_adapter(...)` on the
existing collector, new `plugins/platforms/` discovery directory,
and a short fallback in `_create_adapter()`. Symmetric, not novel.
**Artifacts landed:**
- `docs/integrations/hermes-platform-plugins-upstream-pr.md` — full
PR draft including problem, prior art, proposal, code shape,
backward compat, test plan, and four open questions to resolve in a
GitHub Discussion before submitting.
**Why no short-term polling shim:** earlier framing was wrong. Molecule
runtime already polls the inbox via `wait_for_message` per turn; each
polled message fires a fresh `execute()` on the adapter, which
proxies to hermes's stateless `/v1/chat/completions`. Adding
adapter-side polling would be duplicate work. The genuine short-term gap is
**session continuity** (hermes daemon doesn't see a single
conversation across turns because chat/completions is stateless), not
push latency. That gap is solved by the upstream PR; no
intermediate shim earns its complexity.
**Remaining (task #83):**
1. Open a GitHub Discussion on `NousResearch/hermes-agent` to validate open questions
(Platform enum-vs-string refactor, naming, example-plugin scope).
2. Submit PR to `NousResearch/hermes-agent`. **Requires user
confirmation** — opening an upstream PR is an action visible to
others.
3. Once merged: ship `hermes-platform-molecule-a2a` as the first
external consumer, bump our hermes workspace template to enable
it, remove any transitional code.
---
## Codex (OpenAI Codex CLI)
**Status:** Template structurally complete (12 files, 12/12 tests passing,
validated against real codex-cli 0.72.0). Awaiting molecule-core
registry integration + E2E.
**Path:** Persistent `codex app-server` stdio JSON-RPC client
(NDJSON-framed, v2 protocol). One app-server child per workspace
session; one `thread/start` per session; each A2A message becomes a
`turn/start` RPC; agent responses arrive as
`agent_message_delta` notifications. Per-thread serialization for
mid-turn arrivals (matches OpenClaw's per-chat sequentializer).
Optional `turn/interrupt` for "latest message wins" workspaces.
**Artifacts landed:**
- `docs/integrations/codex-app-server-adapter-design.md` — full design
including RPC sequence, executor skeleton, eight open questions.
- `molecule-ai-workspace-template-codex/` — full template repo
scaffolded:
- `app_server.py` (286 LOC) — async JSON-RPC over NDJSON stdio
- `executor.py` (~270 LOC) — thread bootstrap, turn dispatch,
notification accumulation, mid-turn serialization
- `adapter.py` — thin `BaseAdapter` shell + preflight
- `Dockerfile`, `start.sh`, `config.yaml`, `requirements.txt`,
`README.md`
- `tests/`: **12/12 tests pass** (7 vs NDJSON mock child, 5 vs
fake AppServerProcess covering executor logic)
**Validated against live `codex-cli 0.72.0`:** NDJSON framing,
`initialize` handshake, AND `thread/start` all work end-to-end.
**Schema-runtime drift caught:** real binary returns `thread.id`,
not `thread.threadId` as the JSON schema claims. Executor now
accepts both shapes; without the smoke test this would have been
a production bug.
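A minimal sketch of a both-shapes accessor (the landed template's exact code may differ):
```python
# Accept both the documented and the observed response shapes:
# the JSON schema says thread.threadId, codex-cli 0.72.0 returns thread.id.
def extract_thread_id(resp: dict) -> str:
    thread = resp.get("thread", {})
    thread_id = thread.get("threadId") or thread.get("id")
    if not thread_id:
        raise ValueError(f"thread/start response missing thread id: {resp!r}")
    return thread_id
```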
**Remaining (task #85, #86):**
1. Register `codex` in molecule-core's `manifest.json` +
`workspace-server/internal/handlers/runtime_registry.go`.
**Defer to post-demo** — touches working live registry.
2. E2E verification with a real Molecule workspace + peer A2A
traffic, per `feedback_close_on_user_visible_not_merge`.
---
## Cross-cutting (task #86)
End-to-end verification per `feedback_close_on_user_visible_not_merge`.
For each runtime, the closure criterion is not "code merged" but
"observed: real workspace boots → A2A message from peer agent →
delivered to running session → reply returned through A2A response
queue → peer agent receives". No runtime stream closes until that
chain is observed.
---
## What's blocking what
| Stream | Blocked on |
|---|---|
| claude-code | (done) |
| OpenClaw plugin | live gateway validation, then post-demo adapter rewrite |
| OpenClaw adapter rewrite | post-demo timing |
| hermes upstream PR | user confirmation to submit + GitHub Discussion pre-validation |
| hermes consumer plugin | upstream PR merging |
| codex implementation | resolve 8 open questions, then post-demo eng time |
| E2E verification | each runtime stream completing |
Three of four runtime streams are at decision points needing user
input. Pre-demo (T-4d to 2026-05-06), the safe move is to land the
remaining design + scaffolding work and defer all behavioral changes to
post-demo.

View File

@ -32,7 +32,8 @@
{"name": "deepagents", "repo": "Molecule-AI/molecule-ai-workspace-template-deepagents", "ref": "main"},
{"name": "hermes", "repo": "Molecule-AI/molecule-ai-workspace-template-hermes", "ref": "main"},
{"name": "gemini-cli", "repo": "Molecule-AI/molecule-ai-workspace-template-gemini-cli", "ref": "main"},
{"name": "openclaw", "repo": "Molecule-AI/molecule-ai-workspace-template-openclaw", "ref": "main"}
{"name": "openclaw", "repo": "Molecule-AI/molecule-ai-workspace-template-openclaw", "ref": "main"},
{"name": "codex", "repo": "Molecule-AI/molecule-ai-workspace-template-codex", "ref": "main"}
],
"org_templates": [
{"name": "molecule-dev", "repo": "Molecule-AI/molecule-ai-org-template-molecule-dev", "ref": "main"},