Merge pull request #2513 from Molecule-AI/auto-sync/main-35cb6ba0

chore: sync main → staging (auto, merge 35cb6ba0)
This commit is contained in:
Hongming Wang 2026-05-02 10:53:27 +00:00 committed by GitHub
commit 15dd1f26c3
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
4 changed files with 715 additions and 1 deletions

View File

@ -0,0 +1,360 @@
# Codex CLI workspace adapter — app-server design
**Status:** Design draft — pre-implementation
**Owner:** Molecule AI (hongmingwang@moleculesai.app)
**Date:** 2026-05-02
**Codex version validated against:** `codex-cli 0.72.0`
**Related:** `docs/integrations/hermes-platform-plugins-upstream-pr.md`,
`molecule-ai-workspace-template-openclaw/packages/openclaw-channel-plugin/`
---
## Goal
Add a Molecule workspace template for the OpenAI Codex CLI runtime
(`@openai/codex` v0.72+). The template should give Codex agents the
same A2A inbox + mid-session push behavior the other supported
runtimes have:
- **claude-code:** MCP `notifications/claude/channel`
- **OpenClaw:** channel-plugin webhook into the gateway kernel
- **hermes:** `BasePlatformAdapter` (pending upstream PR; polling fallback today)
- **codex (this design):** persistent `codex app-server` stdio JSON-RPC
client; A2A messages become `turn/start` calls against a long-lived
thread
Today there is no codex template. The legacy fallback registry entry
at `workspace-server/internal/handlers/runtime_registry.go:83` exists
only to keep old workspaces from crashing — there is no live adapter,
no Dockerfile, nothing in `manifest.json`. This design covers the
fresh build.
---
## Architecture decision: app-server, not `codex exec`
`codex exec --json` is the obvious shape — one CLI subprocess per
A2A message, same anti-pattern OpenClaw used to have and that we are
replacing. It loses session continuity (no shared thread), pays
process-spawn cost on every turn, and gives no path to mid-turn
interruption.
`codex app-server` is a long-running JSON-RPC server over stdio that
holds thread state in memory. The v2 protocol (validated below) gives
us:
- `thread/start` → returns `threadId`
- `turn/start` → input array, threadId required → returns `turnId`
- `turn/interrupt` → cancel a running turn by `(threadId, turnId)`
- Server-pushed notifications: `agent_message_delta`, `turn/started`,
`turn/completed`, `reasoning_text_delta`,
`command_execution_output_delta`, `mcp_tool_call_progress`,
`error_notification`, etc.
A persistent app-server child plus a small async stdio reader gives us
session continuity AND mid-turn injection. Same dual-win shape we got
from migrating OpenClaw away from `openclaw agent`.
### Why not v1?
v1 of the protocol exposes `newConversation` + `sendUserMessage` /
`sendUserTurn` (one-shot per message, no streaming notifications). v2
introduces threads + turns + delta notifications. v2 is the
forward-looking surface; we build against v2 from the start.
---
## RPC sequence
### 1. Boot
```
adapter spawn ▶ codex app-server (stdio NDJSON)
◀ ready (process up)
adapter ▶ {"jsonrpc":"2.0","id":1,"method":"initialize",
"params":{"clientInfo":{"name":"molecule-runtime","version":"…"}}}
adapter ◀ {"id":1,"result":{"userAgent":"codex_cli_rs/0.72.0 …"}}
```
Validated 2026-05-02 against the installed binary — NDJSON framing,
initialize works as shown.
### 2. Thread per workspace session
```
adapter ▶ thread/start
params: {model, sandboxPolicy, approvalPolicy, cwd,
baseInstructions, developerInstructions, …}
adapter ◀ {result: {thread: {threadId: "th_…"}}}
```
`threadId` is cached on the adapter for the workspace's lifetime. On
adapter restart we use `thread/resume` against the persisted ID
(written to disk under `~/.codex/sessions/` by codex itself, but we
also keep our own pointer in workspace state for fast restore).
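A minimal restore-or-start sketch, assuming `thread/resume` accepts the persisted ID and that `load_thread_id`/`save_thread_id` stand in for whatever workspace-state accessor molecule-runtime provides (both are assumptions to confirm before implementation):
```python
# Restore-or-start sketch. load_thread_id/save_thread_id are hypothetical
# workspace-state helpers; the exact thread/resume param name should be
# confirmed against the v2 schema.
async def ensure_thread(app_server, workspace_state) -> str:
    thread_id = workspace_state.load_thread_id()
    if thread_id:
        await app_server.request("thread/resume", {"threadId": thread_id})
        return thread_id
    resp = await app_server.request("thread/start", {})
    thread_id = resp["thread"]["threadId"]
    workspace_state.save_thread_id(thread_id)
    return thread_id
```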
### 3. A2A message → turn/start
For each inbound A2A message:
```
adapter ▶ turn/start
params: {threadId, input: [{type:"text", text:"…"}], …}
adapter ◀ {result: {turn: {turnId: "tu_…"}}}
(server pushes notifications)
adapter ◀ turn/started
adapter ◀ agent_message_delta (text chunk)
adapter ◀ agent_message_delta (text chunk)
adapter ◀ turn/completed
```
The adapter accumulates `agent_message_delta` chunks into a buffer
keyed by `turnId`, emits them onto the A2A response queue (streamed if
the molecule-runtime contract supports streaming, otherwise assembled
into a single final message on `turn/completed`).
### 4. Mid-turn injection — the load-bearing case
**Default policy: per-thread serialization.** If a turn is already
running when a second A2A message arrives, queue the new message and
fire `turn/start` once the current `turn/completed` lands. This
matches OpenClaw's per-chat sequentializer behavior — the A2A peer
sees their messages handled in order, and we don't need
`turn/interrupt` for the common case.
**Opt-in policy: interrupt-and-rerun.** For workspaces that prefer
"latest message wins" semantics (rare; configurable), the adapter
fires `turn/interrupt` with `(threadId, currentTurnId)`, waits for
`turn/completed` (with cancelled status), then `turn/start` with the
combined context: previous user message + agent's partial response so
far + new message, so the agent has full context of what got
interrupted. Off by default.
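A sketch of the opt-in path, assuming the executor already tracks the running turn's id, its originating prompt, and its accumulated deltas (the parameter names here are illustrative, not final):
```python
import asyncio

# Opt-in "latest message wins" flow. `done` is the per-turn completion
# event the executor already waits on; prompt/partial are the interrupted
# user message and the deltas accumulated so far.
async def interrupt_and_rerun(app_server, thread_id: str, current_turn_id: str,
                              done: asyncio.Event, prompt: str,
                              partial: str, new_message: str) -> str:
    await app_server.request("turn/interrupt", {
        "threadId": thread_id,
        "turnId": current_turn_id,
    })
    await done.wait()  # turn/completed lands with cancelled status
    combined = (
        "[Previous message, interrupted]\n" + prompt +
        "\n\n[Partial response so far]\n" + partial +
        "\n\n[New message]\n" + new_message
    )
    resp = await app_server.request("turn/start", {
        "threadId": thread_id,
        "input": [{"type": "text", "text": combined}],
    })
    return resp["turn"]["turnId"]
```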
### 5. Shutdown
```
adapter ▶ {"method":"shutdown"} (if v2 exposes one; otherwise SIGTERM)
adapter ▶ close stdio
adapter ▶ wait(child, timeout=5s); on timeout SIGKILL
```
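A process-level sketch of that escalation, assuming the child is held as an `asyncio.subprocess.Process` (no reliance on a v2 `shutdown` RPC existing):
```python
import asyncio

# Close stdin, give the child a grace period, then escalate SIGTERM,
# then SIGKILL. Purely process-level; works whether or not a shutdown
# RPC exists.
async def stop_app_server(proc: asyncio.subprocess.Process, grace: float = 5.0) -> None:
    if proc.stdin:
        proc.stdin.close()
    try:
        await asyncio.wait_for(proc.wait(), timeout=grace)
        return
    except asyncio.TimeoutError:
        proc.terminate()  # SIGTERM
    try:
        await asyncio.wait_for(proc.wait(), timeout=grace)
    except asyncio.TimeoutError:
        proc.kill()  # SIGKILL
        await proc.wait()
```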
---
## File layout (new template repo)
```
molecule-ai-workspace-template-codex/
├── adapter.py # BaseAdapter shell, thin (~50 LOC)
├── executor.py # AppServerProxyExecutor — the RPC client (~300 LOC)
├── app_server.py # AppServerProcess — stdio child + NDJSON reader (~150 LOC)
├── config.yaml
├── Dockerfile # node:20 + npm i -g @openai/codex@0.72
├── start.sh # boots adapter; codex app-server is spawned per session by executor
├── requirements.txt
├── README.md
└── tests/
├── test_app_server.py # mocks stdio; tests framing, request/notification dispatch
└── test_executor.py # mocks AppServerProcess; tests turn lifecycle, interrupt
```
Modeled on the hermes template (which is the closest existing shape:
adapter.py + executor.py separation; daemon proxy via local IPC). The
extra `app_server.py` exists because the JSON-RPC client + child
process management is non-trivial enough to warrant its own module
with its own tests.
---
## Executor skeleton
```python
# executor.py — A2A → codex app-server bridge
import asyncio

# AgentExecutor, RequestContext, EventQueue, new_agent_text_message,
# extract_message_text, AdapterConfig, AppServerProcess, MOLECULE_RUNTIME_VERSION
# and _TURN_TIMEOUT come from the molecule-runtime / template modules
# (imports elided in this skeleton).


class CodexAppServerExecutor(AgentExecutor):
    """Holds one app-server child + thread, dispatches A2A turns as turn/start RPCs."""

    def __init__(self, config: AdapterConfig):
        self._config = config
        self._app_server: AppServerProcess | None = None
        self._thread_id: str | None = None
        self._current_turn_id: str | None = None  # needed by cancel()
        self._turn_lock = asyncio.Lock()  # serialize per-thread by default

    async def _ensure_thread(self) -> str:
        if self._app_server is None:
            self._app_server = await AppServerProcess.start()
            await self._app_server.initialize(client_info={
                "name": "molecule-runtime",
                "version": MOLECULE_RUNTIME_VERSION,
            })
        if self._thread_id is None:
            resp = await self._app_server.request("thread/start", {
                "model": self._config.model or None,
                "developerInstructions": self._config.system_prompt or None,
                # other policy fields (sandbox, approval) — Molecule defaults
            })
            self._thread_id = resp["thread"]["threadId"]
        return self._thread_id

    async def execute(self, context: RequestContext, event_queue: EventQueue) -> None:
        prompt = extract_message_text(context.message) or ""
        if not prompt.strip():
            await event_queue.enqueue_event(new_agent_text_message("(empty prompt)"))
            return

        async with self._turn_lock:  # per-thread serialization
            thread_id = await self._ensure_thread()

            # Subscribe to delta notifications BEFORE starting the turn so we
            # don't race the first agent_message_delta.
            buffer: list[str] = []
            done = asyncio.Event()
            error: Exception | None = None

            def on_notification(method: str, params: dict) -> None:
                nonlocal error
                if method == "agent_message_delta":
                    buffer.append(params.get("delta", ""))
                elif method == "turn/completed":
                    done.set()
                elif method == "error_notification":
                    error = RuntimeError(params.get("message", "unknown app-server error"))
                    done.set()

            unsub = self._app_server.subscribe(on_notification)
            try:
                resp = await self._app_server.request("turn/start", {
                    "threadId": thread_id,
                    "input": [{"type": "text", "text": prompt}],
                })
                self._current_turn_id = resp["turn"]["turnId"]
                await asyncio.wait_for(done.wait(), timeout=_TURN_TIMEOUT)
            finally:
                unsub()
                self._current_turn_id = None

            if error:
                await event_queue.enqueue_event(
                    new_agent_text_message(f"[codex error] {error}"))
                return

            await event_queue.enqueue_event(new_agent_text_message("".join(buffer)))

    async def cancel(self, context: RequestContext, event_queue: EventQueue) -> None:
        # When the molecule-runtime cancels a request, fire turn/interrupt
        # against the currently-running turn. Best-effort — racing
        # turn/completed is fine, app-server returns a noop in that case.
        if self._app_server and self._thread_id and self._current_turn_id:
            await self._app_server.request("turn/interrupt", {
                "threadId": self._thread_id,
                "turnId": self._current_turn_id,
            })
```
The `AppServerProcess` class encapsulates: stdio child management,
NDJSON line reader/writer, request-id correlation, notification
subscriber registry, and graceful shutdown. Standard async stdio
JSON-RPC client — nothing exotic.
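A minimal sketch of that client, assuming the NDJSON framing validated above (one JSON object per line; responses carry `id`, notifications carry only `method`). Error handling, restart, and `thread/resume` plumbing are elided:
```python
import asyncio
import itertools
import json
from typing import Any, Callable


class AppServerProcess:
    """Minimal async NDJSON JSON-RPC client around a `codex app-server` child."""

    def __init__(self, proc: asyncio.subprocess.Process):
        self._proc = proc
        self._ids = itertools.count(1)
        self._pending: dict[int, asyncio.Future] = {}
        self._subscribers: list[Callable[[str, dict], None]] = []
        self._reader_task = asyncio.create_task(self._read_loop())

    @classmethod
    async def start(cls) -> "AppServerProcess":
        proc = await asyncio.create_subprocess_exec(
            "codex", "app-server",
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
        )
        return cls(proc)

    async def initialize(self, client_info: dict) -> Any:
        return await self.request("initialize", {"clientInfo": client_info})

    async def request(self, method: str, params: dict | None = None) -> Any:
        req_id = next(self._ids)
        fut: asyncio.Future = asyncio.get_running_loop().create_future()
        self._pending[req_id] = fut
        line = json.dumps({"jsonrpc": "2.0", "id": req_id,
                           "method": method, "params": params or {}})
        self._proc.stdin.write(line.encode() + b"\n")
        await self._proc.stdin.drain()
        return await fut

    def subscribe(self, callback: Callable[[str, dict], None]) -> Callable[[], None]:
        self._subscribers.append(callback)
        return lambda: self._subscribers.remove(callback)

    async def _read_loop(self) -> None:
        # One NDJSON message per line: responses carry "id", notifications don't.
        while True:
            raw = await self._proc.stdout.readline()
            if not raw:
                break  # child exited / stdout closed
            msg = json.loads(raw)
            if "id" in msg and msg["id"] in self._pending:
                fut = self._pending.pop(msg["id"])
                if "error" in msg:
                    fut.set_exception(RuntimeError(str(msg["error"])))
                else:
                    fut.set_result(msg.get("result"))
            elif "method" in msg:
                for cb in list(self._subscribers):
                    cb(msg["method"], msg.get("params", {}))
```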
---
## Open questions to resolve before implementation
1. **MoleculeRuntime streaming contract.** Does our A2A executor
contract support emitting incremental events (so the user sees
partial responses as the agent streams), or do we always assemble
on `turn/completed`? If streaming is supported, we want to forward
each `agent_message_delta` as an A2A event for parity with hermes
gateway streaming. (Cross-reference: hermes adapter currently
doesn't stream either — `executor.py:122` sets `stream=False`
so non-streaming is the safe v1 baseline.)
2. **Sandbox policy default.** Codex defaults to `read-only` for safety
in CLI mode; for workspace use we need write access to the
workspace tree. Pick a sensible default in `thread/start`,
probably `workspace-write` scoped to the workspace cwd.
3. **Approval policy default.** Codex's `--ask-for-approval` modes are
`untrusted`, `on-failure`, and `never`. Workspace agents need
`never` (they can't prompt a human). Confirm this is exposed via
`approvalPolicy` in `thread/start`. (A sketch of the proposed
defaults for both policies follows this list.)
4. **Auth — login flow.** Codex supports `login api-key` (env
`OPENAI_API_KEY`) and `login chatgpt` (interactive OAuth). For
workspace use we mandate API key. Document this in the template's
README and surface it as a required env in config.yaml.
5. **MCP server passthrough.** Codex's own `mcp_servers` config lets
the agent call out to MCP servers as a CLIENT. Should the workspace
adapter automatically wire `~/.codex/config.toml` so the agent can
reach the molecule MCP server (chat_history, recall_memory,
delegate_task)? Almost certainly yes — but verify the env-var
substitution pattern works in TOML.
6. **Thread persistence across workspace restarts.** Codex stores
sessions on disk under `~/.codex/sessions/`. The adapter should
persist the threadId in workspace state so a restart resumes the
thread (`thread/resume`) rather than starting fresh. This matches
the existing molecule-runtime convention for session continuity.
7. **Token usage / cost reporting.** v2 emits
`ThreadTokenUsageUpdatedNotification`. Plumb this into our usage
tracking — same path the other runtimes use.
8. **MCP push notifications inbound.** Earlier research established
that codex's own MCP server mode does NOT support
`notifications/*` for push. So the path for unsolicited mid-session
A2A messages is NOT "codex's MCP client receives notifications from
our MCP server" — it's "molecule-runtime polls inbox via
`wait_for_message`, and on each polled message fires `turn/start`
on the existing thread." The "MCP native" framing here is satisfied
not by codex receiving MCP push, but by the persistent thread +
turn/start delivering the same UX (session continuity + queued or
interrupted handling of new messages mid-thread).
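For open questions 2-4, a hedged sketch of the Molecule-default `thread/start` params. The field names follow the param list earlier in this doc; the exact value strings for sandbox and approval are assumptions to verify against the v2 schema:
```python
# Hypothetical Molecule defaults for thread/start (open questions 2-4).
# Value strings for sandboxPolicy/approvalPolicy are assumptions, not
# confirmed schema values.
def default_thread_start_params(config, workspace_cwd: str) -> dict:
    return {
        "model": config.model or None,
        "cwd": workspace_cwd,
        # Write access scoped to the workspace tree (open question 2).
        "sandboxPolicy": "workspace-write",
        # Workspace agents cannot prompt a human (open question 3).
        "approvalPolicy": "never",
        # API-key auth is mandated for workspace use (open question 4);
        # OPENAI_API_KEY is read by codex itself from the environment.
        "developerInstructions": config.system_prompt or None,
    }
```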
---
## Why this design satisfies "MCP native push parity"
User goal: every runtime delivers A2A inbox messages with the same
quality of experience as claude-code's MCP `notifications/claude/channel`.
claude-code path: MCP server pushes notification → claude-code SDK
injects synthetic user turn into running session.
Codex path: molecule-runtime polls inbox (universal poll path) →
adapter fires `turn/start` on the existing app-server thread → codex
processes the message in-thread with full context. The "push" happens
at the molecule-runtime ↔ adapter boundary; the "native" part is that
codex's own session model handles it as an in-thread turn, not as a
fresh subprocess.
For mid-turn arrivals: the per-thread serialization (or opt-in
interrupt) gives us behavior equivalent to OpenClaw's per-chat
sequentializer. Equivalent UX to claude-code's mid-session
notification injection in practice — one is a kernel-level interrupt,
the other is a queue-then-dispatch, but the user-visible behavior
("the agent processes my message after the current turn finishes") is
identical.
---
## Sequencing
This is post-demo work. Order:
1. **Spec the executor lifecycle** — pin down the open questions
above (especially #1 streaming, #5 MCP passthrough, #6 thread
persistence) before any code lands.
2. **Implement `AppServerProcess`** with thorough unit tests against a
mock stdio. This is the riskiest module (concurrency around
request-id correlation + notification dispatch); land it first
with high coverage.
3. **Implement `CodexAppServerExecutor`** on top.
4. **Build the template repo skeleton** (Dockerfile, config.yaml,
start.sh, README) once the Python side runs locally.
5. **Add codex to `manifest.json`** and the runtime registry.
6. **End-to-end verify** per `feedback_close_on_user_visible_not_merge`
— boot a real workspace, send A2A messages, observe streamed
responses + thread continuity + queued mid-turn handling.
Estimated total: 3-5 engineering days for v1, plus E2E hardening.

View File

@ -0,0 +1,191 @@
# Upstream PR draft: Pluggable platform adapters for hermes-agent
**Status:** Draft — pre-submission review
**Target repo:** `NousResearch/hermes-agent`
**Owner:** Molecule AI (hongmingwang@moleculesai.app)
**Date drafted:** 2026-05-02
---
## Why this draft exists
Molecule needs to deliver A2A inbox messages to a hermes-hosted agent the same way Telegram messages reach it today — through `_handle_message`, with `set_busy_session_handler` semantics for mid-turn arrivals. Today this requires forking `gateway/run.py` because the platform adapter system is closed (`_create_adapter` is a hardcoded if/elif chain at lines 2424-2578).
But hermes already ships a working plugin discovery system for memory backends (`plugins/memory/__init__.py`). Extending the same pattern to platforms is a small, symmetric change — not novel architecture. This draft documents the proposed upstream PR before we open it, so we can iterate locally on tone, scope, and code shape.
---
## Proposed PR title
> Pluggable platform adapters via `plugins/platforms/` discovery
(Mirrors the existing `plugins/memory/` shape so the title alone signals "this is the same pattern, just for the other subsystem.")
---
## PR body
### Problem
Hermes ships 19 in-tree platform adapters (Telegram, Discord, WhatsApp, Slack, Signal, Mattermost, Matrix, Email, SMS, DingTalk, Feishu, WeCom variants, Weixin, BlueBubbles, QQBot, HomeAssistant, API server, Webhook). Each is wired by editing two files:
- `gateway/config.py:48-69` — append a `Platform` enum value
- `gateway/run.py:2424-2578` — append an `elif platform == Platform.X:` branch in `_create_adapter()`
For platforms with broad demand (Telegram, Slack, etc.) this is fine: the maintenance load lives upstream, every user benefits. For platforms with narrow but real demand — enterprise-internal channels (Rocket.Chat, RingCentral, Zulip), agent-to-agent inbox protocols (e.g. Molecule's A2A), niche regional platforms, or experimental transports — the only path today is forking `gateway/run.py`. Forks drift, defeat the purpose of an OSS gateway, and discourage contribution back upstream.
### Prior art (already in hermes)
The memory subsystem solved exactly this problem at `plugins/memory/__init__.py`:
1. **Two-tier discovery** — bundled providers in `plugins/memory/<name>/` plus user-installed providers in `$HERMES_HOME/plugins/<name>/`. Bundled wins on name collision.
2. **`register(ctx)` collector pattern** (`plugins/memory/__init__.py:264-305`) — a plugin's `__init__.py` exposes a `register(ctx)` function; `ctx` already supports `register_memory_provider`, `register_tool`, `register_hook`, `register_cli_command`.
3. **`plugin.yaml` manifest** for description and metadata.
4. **Config-driven activation** (`memory.provider: honcho` selects which provider loads).
Adding `register_platform_adapter` to the same collector and a `plugins/platforms/` discovery directory extends this pattern symmetrically.
### Proposal
**Four small changes:**
1. **New collector method** in `plugins/memory/__init__.py:_ProviderCollector` (or a new shared `plugins/_collector.py` if maintainers prefer cleaner separation):
```python
def register_platform_adapter(self, name: str, adapter_class: type, requirements_check=None):
"""Register a platform adapter loadable as plugin.
name: unique platform identifier (matches gateway.platforms.<name> in config)
adapter_class: subclass of BasePlatformAdapter
requirements_check: optional callable returning bool — same shape as
existing check_telegram_requirements() etc.
"""
self.platform_adapters[name] = (adapter_class, requirements_check)
```
2. **New `plugins/platforms/__init__.py`** mirroring `plugins/memory/__init__.py`: `discover_platform_adapters()`, `load_platform_adapter(name)`, two-tier (bundled + `$HERMES_HOME/plugins/`) discovery. (A plugin-author sketch follows this list.)
3. **`_create_adapter()` fallback** at `gateway/run.py:2578` — after the in-tree if/elif chain returns None, attempt plugin lookup:
```python
# Existing in-tree adapters checked first (precedence preserved).
# If no match, fall through to plugin discovery.
from plugins.platforms import load_platform_adapter
plugin_entry = load_platform_adapter(platform.value)
if plugin_entry:
adapter_class, req_check = plugin_entry
if req_check and not req_check():
logger.warning(f"{platform.value}: plugin requirements not met")
return None
return adapter_class(config)
return None
```
4. **`Platform` enum becomes open-set.** Today it's `Enum`; switch to a string-backed pattern that accepts unknown values (still validates against the union of in-tree + discovered plugins at config-load time):
```python
# gateway/config.py — replace Enum with frozen dataclass + dynamic registry.
# Keeps the in-tree values as module-level singletons for backward compat:
# Platform.TELEGRAM still works as today.
```
This is the only "shape change" in the PR. Backward compat is straightforward: every existing `Platform.TELEGRAM` reference continues to work because the module exports the same names.
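For concreteness, a hypothetical external plugin written against the proposed hook; the module path, env var, and adapter body below are illustrative rather than a real contract:
```python
# plugins/platforms/molecule_a2a/__init__.py (hypothetical external plugin
# written against the proposed register_platform_adapter hook).
import os


class MoleculeA2AAdapter:
    """Stand-in for a real BasePlatformAdapter subclass; body elided.

    A real plugin would subclass hermes's BasePlatformAdapter and feed
    inbound A2A messages into the same _handle_message dispatch the
    in-tree Telegram adapter uses.
    """

    def __init__(self, config):
        self.config = config


def check_molecule_requirements() -> bool:
    # Same shape as the existing check_telegram_requirements() helpers:
    # return False (gateway logs a warning) when required config is absent.
    # MOLECULE_A2A_TOKEN is an illustrative env var, not a real contract.
    return bool(os.environ.get("MOLECULE_A2A_TOKEN"))


def register(ctx) -> None:
    # Called during plugins/platforms/ discovery at gateway boot.
    ctx.register_platform_adapter(
        "molecule_a2a",
        MoleculeA2AAdapter,
        requirements_check=check_molecule_requirements,
    )
```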
### Backward compatibility
- All 19 in-tree adapters keep their hardcoded path in `_create_adapter()` (precedence: in-tree wins on name collision, exactly like memory plugins).
- Existing config files (`gateway.platforms.telegram.enabled: true`) continue to work unchanged.
- No new mandatory config keys.
- Plugin discovery only runs if the platform name doesn't match an in-tree value, so cold-start cost is zero for users who don't use plugins.
- Fork-then-add-platform users can migrate to plugins at their own pace; the in-tree path isn't deprecated.
### Test plan
- **Unit**: discovery scans both bundled and user dirs, respects precedence.
- **Unit**: `_create_adapter()` falls through to plugin lookup only when in-tree doesn't match.
- **Integration**: ship a minimal `plugins/platforms/example/` in-tree (read-only, returns canned messages) so CI exercises the full plugin code path. Same approach `plugins/memory/holographic/` takes today.
- **Manual**: Molecule will publish `hermes-platform-molecule-a2a` as the first external consumer once this lands.
### Documentation
- Extend `CONTRIBUTING.md`'s "Should it be a Skill or a Tool?" section with "Should it be a Platform Plugin or an in-tree Platform?" — same shape, same decision tree.
- Add `plugins/platforms/README.md` mirroring `plugins/memory/`'s convention.
### Out of scope (intentionally)
- **Setuptools `entry_points`** — could be added later as a third discovery tier (after bundled + `$HERMES_HOME/plugins/`). Skipping for v1 because the directory-based discovery already covers the demand and matches the memory pattern. Adding entry_points is a non-breaking extension.
- **Hot-reload** — plugins discovered at gateway boot, no live re-scan. Matches memory plugins.
- **Sandboxing** — plugins run with full hermes process privileges. Same trust model as memory plugins; documented in the new README.
### Reference consumer
Molecule AI will ship `hermes-platform-molecule-a2a` as the first external consumer. Use case: deliver agent-to-agent inbox messages (from peer agents authenticated at the platform layer, not the Telegram-user level) into the same `_handle_message` dispatch Telegram uses, with `internal=True` events to bypass user-auth. Expected timeline: within 2 weeks of merge.
---
## Open questions for upstream maintainers
Per `CONTRIBUTING.md`, the right channel for design proposals is
**GitHub Discussions**, not Discord (Discord is for "questions,
showcasing projects, and sharing skills" — Discussions is the
documented channel for "design proposals and architecture discussions").
Open a Discussion at `NousResearch/hermes-agent/discussions` titled
"RFC: pluggable platform adapters via `plugins/platforms/`" with the
problem + proposal + open questions before filing the PR. This gives
maintainers space to weigh in on shape before code is in flight.
Open questions to put in the Discussion:
1. **Preferred naming.** `register_platform_adapter` vs `register_platform` vs `register_channel`. Consistency with memory's `register_memory_provider` argues for the long form.
2. **Enum vs string.** Is the maintainer team open to making `Platform` open-set? If not, fallback design: keep enum, add a single `Platform.PLUGIN` sentinel + a `plugin_name` field on `PlatformConfig`. Slightly uglier but smaller blast radius.
3. **Testing**: `plugins/platforms/example/` checked into the repo, or test-fixtures-only? Memory plugins are real (mem0, honcho, supermemory bundled), so a real example seems consistent.
4. **Discovery ordering**: confirm whether bundled-wins precedence is the desired behavior (matches memory) or user-can-override-bundled (which would let downstream patch a buggy in-tree adapter without forking). The current memory pattern is bundled-wins; we'll match it unless told otherwise.
---
## Effort estimate
- **Code change**: ~150 LOC across `plugins/platforms/__init__.py` (new), `gateway/config.py` (Platform refactor), `gateway/run.py` (10-line fallback in `_create_adapter`), tests (~50 LOC).
- **Docs**: ~80 LOC across `CONTRIBUTING.md` extension and new `plugins/platforms/README.md`.
- **Review cycle**: depends on maintainer responsiveness. Memory plugin system shipped in v0.50.7 era; platform plugin system would land for v0.11 if accepted.
---
## After this PR lands (Molecule-side follow-up)
1. Publish `hermes-platform-molecule-a2a` (PyPI + `~/.hermes/plugins/molecule-a2a/`).
2. Bump our hermes workspace template to declare `plugins.platforms.molecule_a2a.enabled: true`.
3. Remove the polling shim from `molecule-ai-workspace-template-hermes/adapter.py` once the plugin path is verified end-to-end.
---
## Status checklist (for our own tracking)
Per user's gating: "if the plugin works locally in our docker setup
and e2e testing works, yes [submit]". Validation prerequisites:
- [ ] Build a working `plugins/platforms/molecule_a2a/` plugin against
a forked hermes-agent with the proposed change applied
- [ ] Bake the forked hermes + plugin into a local copy of our
`molecule-ai-workspace-template-hermes` Docker image
- [ ] E2E: boot the local image, send A2A messages from a peer agent,
observe `_handle_message` dispatch + reply through A2A queue
- [ ] Confirm `Platform` enum refactor doesn't break downstream — grep
for `Platform.X` usages across hermes
- [ ] Confirm `$HERMES_HOME` is the right user-plugin root for
platforms (matches memory convention)
- [ ] Open a GitHub Discussion at
`NousResearch/hermes-agent/discussions` titled
"RFC: pluggable platform adapters via plugins/platforms/" with
design + open questions; wait for maintainer feedback
- [ ] Branch name: `feat/pluggable-platform-adapters` per
CONTRIBUTING.md branch convention
- [ ] Commit prefix: `feat(gateway): pluggable platform adapters
via plugins/platforms/` per Conventional Commits + scope `gateway`
- [ ] PR description covers what/why + how-to-test + platforms tested,
per CONTRIBUTING.md PR-description requirements
- [ ] Open PR against `NousResearch/hermes-agent` main once Discussion
lands consensus
- [ ] Track PR; bump cadence weekly; if stalled past 4 weeks, propose
fork-and-bundle as fallback for our hermes template image

View File

@ -0,0 +1,162 @@
# Runtime native-MCP push parity — status
**Goal:** every workspace runtime delivers Molecule A2A inbox messages
with the same UX as claude-code's MCP `notifications/claude/channel`
push: session continuity + queued or interrupted handling of new
messages mid-thread, no fresh subprocess per message.
Tracked across four runtime streams. Updated 2026-05-02.
---
## claude-code
**Status:** ✅ Done. Native MCP `notifications/claude/channel` push
shipped via `workspace/a2a_mcp_server.py`. Requires the host to launch
with `--dangerously-load-development-channels server:molecule`.
No further work.
---
## OpenClaw
**Status:** Scaffolded; awaiting validation + companion adapter rewrite.
**Path:** Channel-plugin SDK (`openclaw/plugin-sdk`), auto-discovered
from `~/.openclaw/plugins/<name>/` or workspace `.openclaw/`. Plugin
registers an HTTP webhook on `openclaw gateway`; Molecule workspace
adapter POSTs A2A messages to it; gateway dispatches through the same
`dispatchReplyWithBufferedBlockDispatcher` kernel call native channels
(Telegram, Lark, Slack, Discord) use.
**Artifacts landed:**
- `molecule-ai-workspace-template-openclaw/packages/openclaw-channel-plugin/`
- `package.json`, `openclaw.plugin.json` (manifest), `index.ts`
(channel + webhook handler), `README.md`, `tsconfig.json`
- Pre-release `v0.1.0-pre`. Mirrors `rabbit-lark-bot` reference
plugin shape.
**Remaining (task #84, #87):**
1. Validate against a running OpenClaw gateway. Open questions in the
plugin README: `resolveAgentRoute` peer-id shape,
`dispatchReplyWithBufferedBlockDispatcher` async semantics,
`outbound.sendText` no-op safety.
2. Rewrite Python adapter (`adapter.py`) to stop shelling out
`openclaw agent --message ...` and instead POST to the plugin's
webhook + run `/agent-reply` callback HTTP server. **Post-demo
work** (touches a working integration).
---
## hermes
**Status:** Upstream PR drafted; short-term shim deemed unnecessary.
**Path:** Open the upstream `BasePlatformAdapter` system to external
plugins. Hermes already ships a working plugin discovery system for
memory backends (`plugins/memory/`, `register(ctx)` collector pattern,
`$HERMES_HOME/plugins/<name>/` user-installed tier). The PR extends
the same shape to platforms — `register_platform_adapter(...)` on the
existing collector, new `plugins/platforms/` discovery directory,
and a short fallback in `_create_adapter()`. Symmetric, not novel.
**Artifacts landed:**
- `docs/integrations/hermes-platform-plugins-upstream-pr.md` — full
PR draft including problem, prior art, proposal, code shape,
backward compat, test plan, and four open questions to resolve in a
GitHub Discussion before submitting.
**Why no short-term polling shim:** earlier framing was wrong. Molecule
runtime already polls the inbox via `wait_for_message` per turn; each
polled message fires a fresh `execute()` on the adapter, which
proxies to hermes's stateless `/v1/chat/completions`. Adding
adapter-side polling would be duplicate work. The genuine short-term gap is
**session continuity** (hermes daemon doesn't see a single
conversation across turns because chat/completions is stateless), not
push latency. That gap is solved by the upstream PR; no
intermediate shim earns its complexity.
**Remaining (task #83):**
1. Open a GitHub Discussion on `NousResearch/hermes-agent` to validate open questions
(Platform enum-vs-string refactor, naming, example-plugin scope).
2. Submit PR to `NousResearch/hermes-agent`. **Requires user
confirmation** — opening an upstream PR is an action visible to
others.
3. Once merged: ship `hermes-platform-molecule-a2a` as the first
external consumer, bump our hermes workspace template to enable
it, remove any transitional code.
---
## Codex (OpenAI Codex CLI)
**Status:** Template structurally complete (12 files, 12/12 tests passing,
validated against real codex-cli 0.72.0). Awaiting molecule-core
registry integration + E2E.
**Path:** Persistent `codex app-server` stdio JSON-RPC client
(NDJSON-framed, v2 protocol). One app-server child per workspace
session; one `thread/start` per session; each A2A message becomes a
`turn/start` RPC; agent responses arrive as
`agent_message_delta` notifications. Per-thread serialization for
mid-turn arrivals (matches OpenClaw's per-chat sequentializer).
Optional `turn/interrupt` for "latest message wins" workspaces.
**Artifacts landed:**
- `docs/integrations/codex-app-server-adapter-design.md` — full design
including RPC sequence, executor skeleton, eight open questions.
- `molecule-ai-workspace-template-codex/` — full template repo
scaffolded:
- `app_server.py` (286 LOC) — async JSON-RPC over NDJSON stdio
- `executor.py` (~270 LOC) — thread bootstrap, turn dispatch,
notification accumulation, mid-turn serialization
- `adapter.py` — thin `BaseAdapter` shell + preflight
- `Dockerfile`, `start.sh`, `config.yaml`, `requirements.txt`,
`README.md`
- `tests/`: **12/12 tests pass** (7 vs NDJSON mock child, 5 vs
fake AppServerProcess covering executor logic)
**Validated against live `codex-cli 0.72.0`:** NDJSON framing,
`initialize` handshake, AND `thread/start` all work end-to-end.
**Schema-runtime drift caught:** real binary returns `thread.id`,
not `thread.threadId` as the JSON schema claims. Executor now
accepts both shapes; without the smoke test this would have been
a production bug.
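A minimal sketch of a both-shapes accessor (the landed template's exact code may differ):
```python
# Accept both the documented and the observed response shapes:
# the JSON schema says thread.threadId, codex-cli 0.72.0 returns thread.id.
def extract_thread_id(resp: dict) -> str:
    thread = resp.get("thread", {})
    thread_id = thread.get("threadId") or thread.get("id")
    if not thread_id:
        raise ValueError(f"thread/start response missing thread id: {resp!r}")
    return thread_id
```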
**Remaining (task #85, #86):**
1. Register `codex` in molecule-core's `manifest.json` +
`workspace-server/internal/handlers/runtime_registry.go`.
**Defer to post-demo** — touches working live registry.
2. E2E verification with a real Molecule workspace + peer A2A
traffic, per `feedback_close_on_user_visible_not_merge`.
---
## Cross-cutting (task #86)
End-to-end verification per `feedback_close_on_user_visible_not_merge`.
For each runtime, the closure criterion is not "code merged" but
"observed: real workspace boots → A2A message from peer agent →
delivered to running session → reply returned through A2A response
queue → peer agent receives". No runtime stream closes until that
chain is observed.
---
## What's blocking what
| Stream | Blocked on |
|---|---|
| claude-code | (done) |
| OpenClaw plugin | live gateway validation, then post-demo adapter rewrite |
| OpenClaw adapter rewrite | post-demo timing |
| hermes upstream PR | user confirmation to submit + GitHub Discussion pre-validation |
| hermes consumer plugin | upstream PR merging |
| codex implementation | resolve 8 open questions, then post-demo eng time |
| E2E verification | each runtime stream completing |
Three of four runtime streams are at decision points needing user
input. Pre-demo (T-4d to 2026-05-06), the safe move is to land the
remaining design + scaffolding work and defer all behavioral changes to
post-demo.

View File

@ -32,7 +32,8 @@
{"name": "deepagents", "repo": "Molecule-AI/molecule-ai-workspace-template-deepagents", "ref": "main"},
{"name": "hermes", "repo": "Molecule-AI/molecule-ai-workspace-template-hermes", "ref": "main"},
{"name": "gemini-cli", "repo": "Molecule-AI/molecule-ai-workspace-template-gemini-cli", "ref": "main"},
{"name": "openclaw", "repo": "Molecule-AI/molecule-ai-workspace-template-openclaw", "ref": "main"}
{"name": "openclaw", "repo": "Molecule-AI/molecule-ai-workspace-template-openclaw", "ref": "main"},
{"name": "codex", "repo": "Molecule-AI/molecule-ai-workspace-template-codex", "ref": "main"}
],
"org_templates": [
{"name": "molecule-dev", "repo": "Molecule-AI/molecule-ai-org-template-molecule-dev", "ref": "main"},