Hongming Wang 1313694df2 feat(v2.0.0): replace provider shim with the real Nous hermes-agent
The old template was a thin OpenAI-compat multi-provider dispatcher
that shared the name "hermes" with Nous Research's hermes-agent but
had none of its actual capabilities (skills, memory, tools, learning
loop, multi-platform gateway). Customers picking "Hermes" in canvas
got a stateless chat shim instead of the agent framework they
expected.

This PR rewrites the template to run the real hermes-agent
(github.com/NousResearch/hermes-agent) inside the workspace
container:

- Dockerfile installs hermes-agent via its upstream install.sh
  (same pattern template-claude-code uses for the claude CLI).
- start.sh boots `hermes gateway` with the api_server platform
  on 127.0.0.1:8642, waits for /health, then exec's
  molecule-runtime on :8000.
- adapter.py / executor.py collapse to a thin A2A proxy that
  forwards every incoming message to /v1/chat/completions on
  the local gateway and returns the response on the A2A queue.
- providers.py + escalation.py deleted — hermes-agent owns
  provider selection (`hermes model`), its own skill/memory loop
  supersedes escalation.
- Env vars unchanged: HERMES_API_KEY, OPENROUTER_API_KEY,
  ANTHROPIC_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY, MINIMAX_API_KEY
  are all forwarded into ~/.hermes/.env at boot.

All planning + rationale lives in this repo under docs/:
- docs/PLANNING.md — why, scope, phases, risks, success criteria
- docs/ARCHITECTURE.md — port map, boot sequence, request flow,
  what the bridge deliberately does NOT do
- docs/MIGRATION.md — v1.x → v2.0.0 behaviour changes (no
  customer migration needed, v1.x was CI-canary-only)
- docs/CONFIGURATION.md — model picking, persistence, gateway
  restart, inspection, timeouts

Net -195 lines of code for a massive capability upgrade.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 10:59:20 -07:00


Rewrite Plan — v2.0.0: run the real hermes-agent

Context

The hermes runtime in Molecule AI used to be a thin OpenAI-compat provider shim that dispatched chat-completions to Nous Portal, OpenRouter, Anthropic, Gemini, and ~12 other providers. It shared the name with Nous Research's hermes-agent but had none of the agent capabilities:

| Capability | Old template-hermes v1.x | Real hermes-agent |
| --- | --- | --- |
| Terminal / file / web tools | No | Yes (native) |
| Skills with self-improvement loop | No | Yes |
| Cross-session memory / FTS search | No | Yes |
| Telegram/Discord/Slack gateway | No | Yes |
| Scheduled automations (cron) | No | Yes |
| Sub-agent spawning | No (platform-level only) | Yes |
| Provider registry | Duplicated in providers.py | Owned by hermes-agent (hermes model) |

Customers picking "Hermes" in the canvas expected the agent framework — they got a stateless chat shim. This PR replaces the shim with the real thing and removes the name collision.

Goals

  1. Drop the shim. Delete providers.py, escalation.py, and the multi-provider dispatch logic in the old executor.py. hermes-agent already owns all of that.
  2. Install the real agent. curl install.sh | bash at image-build time, same way template-claude-code pulls the real claude CLI.
  3. Preserve the A2A contract. molecule_runtime still serves :8000; the rest of the platform doesn't need to know hermes-agent exists.
  4. Own nothing hermes-agent owns. No provider selection, no fallback chains, no model alias mapping. The template is just transport glue.

Non-goals

  • ACP (acp_adapter/server.py) integration. hermes-agent exposes it, but our platform speaks A2A; adding a second protocol surface isn't worth the complexity today.
  • Streaming. Buffering a final response is fine for agent-turn latency (tool-using turns dominate wall time, not token output). Streaming upgrade path is noted in ARCHITECTURE.md.
  • Telegram/Discord/Slack gateway exposure. The platform's entry surface is canvas; those platforms are out of scope for now.
  • Migration shim for v1.x configs. v1.x had no production customers, only CI canary runs.

Scope — what changes

Added

  • start.sh — boots hermes gateway (api_server platform), waits for :8642 health, exec's molecule-runtime.
  • docs/PLANNING.md (this file)
  • docs/ARCHITECTURE.md
  • docs/MIGRATION.md
  • docs/CONFIGURATION.md
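
A minimal sketch of what start.sh could look like, assuming hypothetical flag names for `hermes gateway` and `molecule-runtime` (the shipped script may differ) and a curl probe against the gateway's /health endpoint:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of start.sh for this plan. Flag names for
# `hermes gateway` and `molecule-runtime` are assumptions, not the
# shipped script.
set -eu

GATEWAY_URL="http://127.0.0.1:8642"

# Forward provider keys from the workspace env into hermes-agent's env file.
forward_keys() {
  mkdir -p "$HOME/.hermes"
  for var in HERMES_API_KEY OPENROUTER_API_KEY ANTHROPIC_API_KEY \
             OPENAI_API_KEY GEMINI_API_KEY MINIMAX_API_KEY; do
    val="$(printenv "$var" || true)"
    if [ -n "$val" ]; then
      printf '%s=%s\n' "$var" "$val" >> "$HOME/.hermes/.env"
    fi
  done
}

# Poll /health until the gateway answers, or give up after N attempts.
wait_for_health() {
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS "$1/health" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

main() {
  forward_keys
  hermes gateway --platform api_server --host 127.0.0.1 --port 8642 &
  wait_for_health "$GATEWAY_URL" || { echo "gateway never came up" >&2; exit 1; }
  exec molecule-runtime --port 8000   # replaces this shell as PID 1
}

# Guarded so the helpers can be sourced without booting anything.
if [ "${1:-}" = "--boot" ]; then
  main
fi
```

Sourcing the file without `--boot` loads only the helpers, which keeps the health probe and key forwarding exercisable in isolation.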

Rewritten

  • Dockerfile — installs hermes-agent via upstream installer, copies bridge files, sets entrypoint to start.sh.
  • adapter.py — shrinks from multi-provider dispatch to a factory that returns HermesAgentProxyExecutor.
  • executor.py — pure HTTP proxy: A2A message in → POST /v1/chat/completions → A2A message out.
  • config.yaml — runtime: hermes, v2.0.0, cleaner model list. Canvas Config tab resolves required_env per model from the registry.
  • requirements.txt — molecule-ai-workspace-runtime + httpx. Dropped openai, anthropic, google-genai — hermes-agent owns provider SDKs.
  • __init__.py — exports HermesAgentAdapter instead of HermesAdapter.
  • README.md — reflects the new reality.
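
A sketch of what the rewritten executor.py boils down to, with hypothetical function names. Stdlib urllib stands in here for the httpx client the template actually depends on, and the response shape assumes the standard OpenAI-compatible chat-completions schema:

```python
# Hypothetical sketch of the thin proxy in executor.py: one A2A message in,
# one chat-completions call against the local gateway, one reply out.
import json
import urllib.request

GATEWAY = "http://127.0.0.1:8642"

def build_payload(text: str, model: str = "hermes") -> dict:
    """Wrap an incoming A2A message as a chat-completions request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": text}],
        "stream": False,  # streaming is an explicit non-goal for v2.0.0
    }

def extract_reply(completion: dict) -> str:
    """Pull the assistant text back out of the gateway's response."""
    return completion["choices"][0]["message"]["content"]

def forward(text: str, api_key: str) -> str:
    """One turn: POST to the gateway, return reply text for the A2A queue."""
    req = urllib.request.Request(
        f"{GATEWAY}/v1/chat/completions",
        data=json.dumps(build_payload(text)).encode(),
        headers={
            "Content-Type": "application/json",
            # API_SERVER_KEY is read from env at request time, never cached
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return extract_reply(json.load(resp))
```

An HTTP error raised here is what the bridge surfaces as an A2A error message rather than retrying or falling back — provider selection is hermes-agent's job, not the proxy's.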

Deleted

  • providers.py (replaced by hermes-agent's internal registry)
  • escalation.py (replaced by hermes-agent's skill/memory loop)

Unchanged

  • CLAUDE.md — workspace-level agent instructions, orthogonal.
  • known-issues.md — retained for historical context.
  • runbooks/ — platform-level runbooks, still applicable.
  • .molecule-ci/ — CI metadata.

Phases

Phase 1 — Merge this PR

Lands the rewrite. Next image build publishes workspace-template:hermes as the new v2.0.0 base.

Phase 2 — Validate in staging

Re-provision the staging canary workspace (.github/workflows/canary-staging.yml with E2E_RUNTIME=hermes) and verify the agent can:

  • Start (health probe green)
  • Respond to a trivial hello prompt end-to-end via A2A
  • Use at least one built-in tool (hermes tools → terminal/file)
  • Surface hermes-agent errors cleanly when provider keys are absent

Phase 3 — Persist skills + memory

The default Docker named volume the platform attaches already covers /home/agent/.hermes. Confirm that skills created in one canvas session survive a container restart. Document in CONFIGURATION.md which mount points matter.

Phase 4 — Opt into hermes-agent gateway platforms

(Out of scope for v2.0.0.) Later release can expose hermes-agent's Telegram/Discord/Slack gateway to customers by surfacing those platforms in the canvas Config tab and wiring secrets in. Tracked separately — will open an issue when Phase 3 lands.

Risks + mitigations

| Risk | Mitigation |
| --- | --- |
| hermes-agent's install.sh pulls a newer agent schema that doesn't match our bridge | Pin the upstream install commit in the Dockerfile once the bridge is battle-tested. Today we're on main intentionally — Nous moves fast and we want those features. |
| :8642 bind collides with something the user spawns | Internal loopback only, and the port is documented in ARCHITECTURE.md. |
| hermes-agent crashes mid-request | Bridge surfaces the HTTP error as an A2A message; gateway restart is a manual ops step (see CONFIGURATION.md#restart). |
| API_SERVER_KEY regenerated on each boot invalidates old clients | Clients are us; we read it from env at request time, not cached. |
| Skills/memory loss on workspace re-provision | Covered in Phase 3 — volume mount at /home/agent/.hermes. |
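
The install-pin mitigation could be as small as parameterizing the installer ref in the Dockerfile. A sketch, assuming the installer is fetched straight from the upstream repo — the raw-URL shape is an assumption, and `main` would be swapped for a real commit SHA only once the bridge is battle-tested:

```dockerfile
# Build on main today; swap HERMES_AGENT_REF for a pinned commit SHA once
# the bridge is battle-tested. The raw-URL shape is an assumption.
ARG HERMES_AGENT_REF=main
RUN curl -fsSL "https://raw.githubusercontent.com/NousResearch/hermes-agent/${HERMES_AGENT_REF}/install.sh" | bash
```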

Success criteria

  • workspace-template:hermes image built and pushed.
  • Staging canary goes green on a hermes workspace.
  • hermes runtime in canvas picks up real agent capabilities (skills visible, terminal tools usable in agent output).
  • providers.py / escalation.py no longer appear anywhere in the repo (grep comes back clean).

Open questions

  1. Should we expose hermes CLI subcommands (hermes model, hermes tools, hermes skills) through the canvas Terminal tab directly, or keep them hidden behind the A2A chat interface? Leaning Terminal tab — operators will want to configure without writing a natural-language prompt.
  2. Where does the per-workspace model selection surface live? Today it's AdapterConfig.model → payload model field. If a user switches in the canvas, do we also write it to ~/.hermes/config.yaml via hermes config set so CLI usage stays in sync? Probably yes; follow-up PR.
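
If the answer to question 2 is yes, the follow-up could be a helper along these lines — a minimal sketch, assuming a flat top-level `model:` key in ~/.hermes/config.yaml and hypothetical model names; the real implementation might simply shell out to `hermes config set` instead:

```python
# Hypothetical helper: mirror a canvas model switch into hermes-agent's
# config file so CLI usage stays in sync. Key name and file layout are
# assumptions about ~/.hermes/config.yaml, not confirmed upstream schema.
from pathlib import Path

def sync_model(config_path: Path, model: str) -> None:
    """Replace (or append) the top-level `model:` key in a flat YAML file."""
    lines = config_path.read_text().splitlines() if config_path.exists() else []
    out = []
    replaced = False
    for line in lines:
        if line.startswith("model:"):
            out.append(f"model: {model}")
            replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(f"model: {model}")
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text("\n".join(out) + "\n")
```

Line-level editing rather than a YAML round-trip keeps any comments and key order the agent wrote to its own config intact.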