Replaces the synchronous /v1/chat/completions proxy with an async
plugin-path executor that gives peer agents single-session
continuity.
Behavior:
- Default: POST each A2A turn to the in-container hermes plugin's
/a2a/inbound; await the agent reply via an aiohttp callback server
inside the executor. The plugin POSTs hermes's reply back to the
callback server, correlated by message_id, which resolves the
awaiting Future and emits on the A2A queue.
- Fallback: MOLECULE_A2A_PLATFORM_ENABLED=false reverts to the
legacy /v1/chat/completions transport — same behavior as before
this commit. Lets operators flip the path off if the plugin path
misbehaves in production.
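
The await-and-resolve step in the default path can be sketched as follows. This is a minimal illustration, not the executor's actual code: `PendingReplies`, `wait_for`, and `resolve` are hypothetical names, and the aiohttp callback server and the outbound POST are left out so the correlation logic stands alone.

```python
import asyncio


class PendingReplies:
    """Hypothetical registry mapping message_id -> Future awaiting the reply."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}

    async def wait_for(self, message_id: str, timeout: float) -> str:
        # Register the Future first so a fast reply cannot arrive before it exists.
        future = asyncio.get_running_loop().create_future()
        self._pending[message_id] = future
        try:
            return await asyncio.wait_for(future, timeout)
        finally:
            # Always drop the entry, so a late reply is treated as unknown.
            self._pending.pop(message_id, None)

    def resolve(self, reply_to: str, content: str) -> bool:
        """Called from the callback handler; returns False for unknown ids."""
        future = self._pending.get(reply_to)
        if future is None or future.done():
            return False
        future.set_result(content)
        return True


async def demo() -> tuple[str, bool]:
    replies = PendingReplies()
    waiter = asyncio.create_task(replies.wait_for("msg-1", timeout=1.0))
    await asyncio.sleep(0)  # let the waiter register its Future
    replies.resolve("msg-1", "hello from hermes")
    reply = await waiter
    late = replies.resolve("msg-1", "duplicate")  # entry already removed
    return reply, late


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Removing the entry in `finally` means a timeout and a successful reply clean up the same way, and a reply that arrives afterwards is simply reported as unknown.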
Wire shape:
- Plugin's adapter.send(chat_id, content, reply_to, metadata)
becomes POST <callback_url> with the same fields.
- Correlation is by reply_to (= the inbound message_id), not by
chat_id — two in-flight messages on the same chat would race on
the latter.
- Optional MOLECULE_A2A_PLATFORM_SHARED_SECRET is sent on outbound
POSTs and required on inbound replies.
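
A rough sketch of the wire shape described above, under stated assumptions: the header name `X-Molecule-Shared-Secret` and the helper names `build_reply_body`, `is_authorized`, and `correlation_key` are hypothetical, since this message does not say how the secret is carried or how the handlers are named.

```python
import hmac
import json
from typing import Optional

# Hypothetical header name; the commit does not specify how the secret travels.
SECRET_HEADER = "X-Molecule-Shared-Secret"


def build_reply_body(chat_id: str, content: str, reply_to: str, metadata: dict) -> bytes:
    """Serialize the adapter.send() arguments as the callback POST body."""
    payload = {
        "chat_id": chat_id,
        "content": content,
        "reply_to": reply_to,
        "metadata": metadata,
    }
    return json.dumps(payload).encode("utf-8")


def is_authorized(headers: dict, expected_secret: Optional[str]) -> bool:
    """Accept everything when no secret is configured; otherwise require a match."""
    if not expected_secret:
        return True
    presented = headers.get(SECRET_HEADER, "")
    # compare_digest keeps the comparison time independent of where strings differ.
    return hmac.compare_digest(presented.encode(), expected_secret.encode())


def correlation_key(body: bytes) -> Optional[str]:
    """Return reply_to for a well-formed body, or None if it is malformed."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers JSONDecodeError and invalid UTF-8
        return None
    if not isinstance(payload, dict):
        return None
    reply_to = payload.get("reply_to")
    return reply_to if isinstance(reply_to, str) and reply_to else None
```

Keying on `reply_to` rather than `chat_id` lets two in-flight messages on the same chat each resolve their own waiter.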
Tests: 36 unit tests, 98% combined coverage on adapter.py + executor.py.
Covers lifecycle (start/stop/idempotent), happy path (round-trip
through stub plugin), error paths (POST failure, reply timeout, late
delivery for unknown message_id, malformed JSON, missing fields),
auth (shared_secret enforcement both directions), fallback (chat
completions HTTP error, unreachable port, junk response shape), and
chat_id derivation precedence.
Real-LLM E2E remains gated on docker image republish + workspace
provisioning + LLM key — the unit tests bound the wire-shape risk
and the existing scripts/e2e_real_hermes_subprocess.py in
hermes-platform-molecule-a2a covers the plugin side end-to-end against
a real `hermes gateway run` subprocess.
7 lines
57 B
Plaintext
__pycache__/
*.pyc
.coverage
.pytest_cache/
.venv/
venv/