Stacked follow-up on the v2.0.0 rewrite. The merged v2.0.0 template
had three latent issues that only surfaced during local E2E testing:
1) sudo → gosu (python:3.11-slim ships neither binary; only gosu is
   installed by the Dockerfile). start.sh was calling sudo, which
   would have broken every container boot.
2) PATH pointed at /home/agent/.hermes/bin, which doesn't exist —
   install.sh symlinks ~/.local/bin/hermes. The installer is also
   interactive by default; it needs --skip-setup to run in docker build.
3) start.sh wrote ~/.hermes/cli-config.yaml but hermes-agent reads
~/.hermes/config.yaml. cli-config.yaml.example is just a starter
file — install.sh copies it to config.yaml on first boot. Without
our overwrite the template inherited the example default
(anthropic/claude-opus-4.6 + provider: auto) instead of the
workspace's chosen model. We now rewrite config.yaml every boot
from HERMES_DEFAULT_MODEL + HERMES_INFERENCE_PROVIDER env.
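For reference, the rewrite step is essentially the following — a sketch, not the verbatim start.sh; `provider` is the field the docs reference, while the `id` key name is an assumption based on cli-config.yaml.example:

```bash
# Sketch of the per-boot config rewrite in start.sh.
: "${HERMES_INFERENCE_PROVIDER:=auto}"

mkdir -p /home/agent/.hermes
cat > /home/agent/.hermes/config.yaml <<EOF
model:
  id: ${HERMES_DEFAULT_MODEL}            # workspace's chosen model (illustrative key name)
  provider: ${HERMES_INFERENCE_PROVIDER} # auto unless explicitly forced
EOF
chown agent:agent /home/agent/.hermes/config.yaml
```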
Also:
- Added xz-utils + build-essential to the image (the hermes installer
  extracts a Node 22 .tar.xz, and some Python deps in the .[all] extra
  build from source).
- Forward every provider key hermes-agent knows about, not just
the 6 from v2.0.0. All ~22 providers documented in the official
website/docs/integrations/providers.md are now wired:
HERMES_API_KEY, NOUS_API_KEY, OPENROUTER_API_KEY, OPENAI_API_KEY,
ANTHROPIC_API_KEY, GEMINI_API_KEY, GOOGLE_API_KEY, DEEPSEEK_API_KEY,
GLM_API_KEY, KIMI_API_KEY, KIMI_CN_API_KEY, MINIMAX_API_KEY,
MINIMAX_CN_API_KEY, DASHSCOPE_API_KEY, XIAOMI_API_KEY,
ARCEEAI_API_KEY, NVIDIA_API_KEY, OLLAMA_API_KEY, HF_TOKEN,
AI_GATEWAY_API_KEY, KILOCODE_API_KEY, OPENCODE_ZEN_API_KEY,
OPENCODE_GO_API_KEY, COPILOT_GITHUB_TOKEN, GH_TOKEN
- config.yaml models[] list expanded to 30+ entries covering every
provider family (Hermes 3/4, Anthropic direct, OpenAI via
OpenRouter, Gemini direct, DeepSeek, GLM, Kimi, MiniMax global+CN,
Qwen/DashScope, Xiaomi MiMo, Arcee Trinity, NVIDIA NIM, Ollama
Cloud, Hugging Face catch-all, Vercel AI Gateway, OpenCode Zen+Go,
Kilo Code, OpenRouter catch-all, custom/local).
- top-level required_env: [] — hermes supports too many providers
for a single hardcoded requirement; per-model required_env in
the canvas Config tab drives the real UX. hermes-agent itself
errors loud at request time if zero providers are configured.
- HERMES_CUSTOM_BASE_URL / HERMES_CUSTOM_API_KEY env support in
  start.sh — lets operators point hermes at OpenAI direct, LM Studio,
  LiteLLM, or any other OpenAI-compatible endpoint without exec-ing
  into the container (see the invocation sketch after this list).
- HERMES_INFERENCE_PROVIDER env — forces a specific provider,
overriding hermes' auto-detection (which routes OPENAI_API_KEY
to openai-codex OAuth path → 401 Missing Authentication header).
- docs/CONFIGURATION.md rewritten with the full provider matrix,
OAuth flow, forcing a provider, auxiliary model, persistence
layout, and the common routing gotchas surfaced during testing.
- docs/ARCHITECTURE.md adds "Provider routing (how keys become
inference)" section.
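For illustration, the custom-endpoint variables above can be supplied entirely from the outside. Everything below — image name, host port, endpoint URL, and model id — is a placeholder, not a value from this repo:

```bash
# Hypothetical run: point hermes-agent at a local OpenAI-compatible server
# (LM Studio / LiteLLM). All values are placeholders.
docker run -d -p 8000:8000 \
  -e HERMES_CUSTOM_BASE_URL="http://host.docker.internal:1234/v1" \
  -e HERMES_CUSTOM_API_KEY="lm-studio" \
  -e HERMES_DEFAULT_MODEL="custom/local-model" \
  hermes-workspace:latest
```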
Proved end-to-end on local Docker:
[start.sh] hermes gateway ready on :8642 (pid 22)
Uvicorn running on http://0.0.0.0:8000
→ A2A message/send "Respond with HERMES BRIDGE WORKING END TO END"
← HERMES BRIDGE WORKING END TO END — (via OpenAI Responses API)
→ "Run uname -a && whoami && pwd using your terminal tool"
← Linux 094f72... aarch64 GNU/Linux / agent / /home/agent
(real tool call — not chat response)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Architecture — A2A bridge to hermes-agent
Port map
┌──────────────────── workspace container ───────────────────┐
│                                                             │
│  :8000 ← molecule_runtime (A2A server + adapter) ──┐        │
│                                                    │        │
│                                             proxy  │        │
│                                                    ▼        │
│  :8642 → hermes-agent gateway (OpenAI-compat API)           │
│          running as user `agent`, state in ~/.hermes        │
│                                                             │
└─────────────────────────────────────────────────────────────┘
     ▲
     │  (only :8000 exposed outside)
     │
platform + canvas
- `:8000` — A2A server, exposed to the rest of the platform. Contract is stable across all runtimes (langgraph, claude-code, hermes, etc.).
- `:8642` — hermes-agent's OpenAI-compatible HTTP API. Loopback only. Never routed outside the container.
Boot sequence
start.sh (runs as root inside the container):
- Generate a random `API_SERVER_KEY` if the env var isn't already set. This is hermes-agent's bearer token; the executor reads it from the env at request time.
- Write `/home/agent/.hermes/.env`:
  - `API_SERVER_ENABLED=true`
  - `API_SERVER_KEY=<generated>`
  - `API_SERVER_HOST=127.0.0.1`
  - `API_SERVER_PORT=8642`
  - Any provider keys present in the container env are forwarded through — `HERMES_API_KEY`, `OPENROUTER_API_KEY`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GEMINI_API_KEY`, `MINIMAX_API_KEY`, and the rest of the list in start.sh (see CONFIGURATION.md#provider-matrix).
- Rewrite `/home/agent/.hermes/config.yaml` from `HERMES_DEFAULT_MODEL` + `HERMES_INFERENCE_PROVIDER` (see "Provider routing" below).
- Launch `hermes gateway` in the background as user `agent` via `gosu agent bash -lc 'hermes gateway'`. Logs → `/var/log/hermes-gateway.log`.
- Poll `http://127.0.0.1:8642/health` up to 60×1s. Fail loud on timeout — dump the last 80 log lines to stderr so provisioning logs capture the reason.
- `exec molecule-runtime` — replaces the shell, becoming PID 1. molecule-runtime loads `Adapter = HermesAgentAdapter` from `__init__.py` and starts the A2A server on `:8000`.
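Condensed, the boot path is roughly the following. This is a sketch, not the shipped start.sh: the key-generation command, redirects, and loop structure are assumptions; the paths, ports, and probe budget come from the steps above, and provider-key forwarding plus the config.yaml rewrite are elided.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Bearer token for the gateway, generated unless the platform injected one.
export API_SERVER_KEY="${API_SERVER_KEY:-$(python3 -c 'import secrets; print(secrets.token_hex(32))')}"

# Seed hermes-agent's env file; provider keys are appended the same way.
cat > /home/agent/.hermes/.env <<EOF
API_SERVER_ENABLED=true
API_SERVER_KEY=${API_SERVER_KEY}
API_SERVER_HOST=127.0.0.1
API_SERVER_PORT=8642
EOF
chown -R agent:agent /home/agent/.hermes

# Gateway in the background as the unprivileged user (gosu, not sudo).
gosu agent bash -lc 'hermes gateway' >> /var/log/hermes-gateway.log 2>&1 &

# Health probe: up to 60 x 1s, fail loud with the log tail on timeout.
for i in $(seq 1 60); do
  curl -fs http://127.0.0.1:8642/health >/dev/null 2>&1 && break
  if [ "$i" -eq 60 ]; then
    tail -n 80 /var/log/hermes-gateway.log >&2
    exit 1
  fi
  sleep 1
done

# Hand PID 1 to the A2A server.
exec molecule-runtime
```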
Request flow
canvas ─── POST /a2a/... ───▶ molecule_runtime (:8000)
                               │
                               ▼
              HermesAgentProxyExecutor.execute()
                               │
      ┌────────────────────────┴─────────────┐
      ▼                                      ▼
extract message text      build {model, messages[], stream:false}
                               │
                               ▼
              POST 127.0.0.1:8642/v1/chat/completions
              Authorization: Bearer ${API_SERVER_KEY}
                               │
                               ▼
              hermes-agent runs the turn with its
              native tools (terminal, files, web,
              memory, skills), resolves provider
              from the `model` string, returns
              OpenAI-format response.
                               │
                               ▼
              extract choices[0].message.content
                               │
                               ▼
              event_queue.enqueue_event(
                  new_agent_text_message(...)
              )
                               │
                               ▼
              canvas receives the reply
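The executor's half of this flow can be reproduced by hand from a shell inside the container, which helps separate bridge bugs from gateway bugs. The endpoint, header, payload shape, and extracted field come from the diagram above; the model id is illustrative:

```bash
# Replay the proxy executor's request against the loopback gateway.
# Run inside the container; assumes the .env written by start.sh is readable.
set -a; source /home/agent/.hermes/.env; set +a

curl -s http://127.0.0.1:8642/v1/chat/completions \
  -H "Authorization: Bearer ${API_SERVER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"model": "Hermes-4-405B", "stream": false,
       "messages": [{"role": "user", "content": "Respond with PONG"}]}' |
  python3 -c 'import json,sys; print(json.load(sys.stdin)["choices"][0]["message"]["content"])'
```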
What the bridge is intentionally not doing
- Provider selection. The bridge sends the `model` string verbatim. hermes-agent owns the registry and picks provider + API keys. If you ever feel tempted to add fallback chains here, stop — that's a regression to v1.x.
- Tool routing. Tools are hermes-agent's job. Our bridge sees only the final assistant text.
- Streaming. `stream: false` in the request payload. A later revision can upgrade to SSE by subscribing to `GET /v1/runs/{run_id}/events` and pushing partial messages into the A2A event queue — the `AgentExecutor` contract already supports multiple `enqueue_event` calls per turn.
- Caching. hermes-agent has its own session store (`X-Hermes-Session-Id`); the bridge does not attempt to pin conversations. Molecule's A2A layer already carries session context in its message envelope.
Provider routing (how keys become inference)
Provider resolution happens inside hermes-agent, driven by:
- `~/.hermes/config.yaml` — the `model.provider` field. start.sh rewrites this file on every boot (`auto` by default, or whatever `HERMES_INFERENCE_PROVIDER` specifies).
- `~/.hermes/.env` — every provider key we forward from the container env (see start.sh for the full list; see CONFIGURATION.md#provider-matrix for the mapping).
- Auto-detection — when `provider: auto`, hermes walks its internal resolution order and picks the first provider whose credential is present. When multiple keys are set, prefer an explicit `HERMES_INFERENCE_PROVIDER` to avoid surprises.
Common routing gotcha
With only OPENAI_API_KEY set and provider: auto, hermes-agent will
route to openai-codex (Codex API, OAuth-only) and return:
401 - Missing Authentication header
The fix is to set HERMES_INFERENCE_PROVIDER=openrouter — hermes's
openrouter provider accepts OPENAI_API_KEY as alt-auth and routes
OpenAI-format Chat Completions correctly. This is documented in
CONFIGURATION.md#forcing-a-provider.
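In workspace terms the fix is a single extra variable (the key value below is a placeholder):

```bash
# Only an OpenAI key is available: skip auto-detection (openai-codex / OAuth)
# and force the OpenRouter-compatible path instead.
export OPENAI_API_KEY="sk-placeholder"
export HERMES_INFERENCE_PROVIDER=openrouter
```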
Auxiliary model
Vision, web summarization, and MoA use a separate auxiliary model —
defaults to Gemini Flash via OpenRouter. If OPENROUTER_API_KEY is
absent, these capabilities break silently (the primary path still
works). Set HERMES_AUXILIARY_PROVIDER to override.
Failure modes
| Symptom | Likely cause | Where to look |
|---|---|---|
| Provisioning fails at "health probe" | hermes gateway crashed during boot | `/var/log/hermes-gateway.log` (tail in start.sh stderr dump) |
| Every request returns `[hermes-agent error 401]` | `API_SERVER_KEY` mismatch between processes | Inspect `/home/agent/.hermes/.env` + container env |
| Every request returns `[hermes-agent error 400]` | Model string unrecognized by hermes-agent | `docker exec -u agent … hermes model` inside the container |
| `[hermes-agent unreachable]` | Gateway exited post-boot (OOM, crash) | `/var/log/hermes-gateway.log`; may need container restart |
| Skills disappear between sessions | `/home/agent/.hermes` not volume-mounted | Platform-side volume config; see CONFIGURATION.md |
| Agent ignores provider key | Key not forwarded from container env | Workspace secrets; see `runbooks/saas-secrets.md` in monorepo |
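Most rows reduce to a few commands from the host; the container name `workspace` below is a placeholder:

```bash
# Triage helpers for the failure modes above ("workspace" = your container name).
docker logs workspace | tail -n 80                            # start.sh output incl. health-probe dump
docker exec workspace tail -n 80 /var/log/hermes-gateway.log  # gateway crash or post-boot exit
docker exec workspace cat /home/agent/.hermes/.env            # API_SERVER_KEY + forwarded provider keys
docker exec -u agent workspace hermes model                   # model strings hermes-agent recognizes
```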
Future work
- Streaming: subscribe to `/v1/runs/.../events` and pipe partial assistant tokens + tool progress into the A2A queue.
- Session pinning: thread hermes-agent's `X-Hermes-Session-Id` through the A2A envelope so long conversations keep server-side context when beneficial.
- `hermes config set` sync: when the canvas Config tab changes the `model` field, also invoke `hermes model <id>` so CLI usage inside the workspace's Terminal tab stays in sync.
- Gateway platforms passthrough: let customers opt their workspace into hermes-agent's Telegram/Discord/Slack platforms without duplicating the config surface.
- Install pin: once stabilized, replace `curl install.sh | bash` in the Dockerfile with a pinned commit SHA so a bad upstream release doesn't break builds.