fix(v2.1.0): startup bugs + full provider matrix — proven end-to-end (#6)
Stacked follow-up on the v2.0.0 rewrite. The merged v2.0.0 template
had three latent issues that only surfaced during local E2E testing:
1) sudo → gosu (python:3.11-slim ships neither tool, and the
   Dockerfile installs only gosu). start.sh was calling sudo, which
   would have broken every container boot.
2) PATH pointed at /home/agent/.hermes/bin, which doesn't exist —
   install.sh symlinks ~/.local/bin/hermes. The installer is also
   interactive by default; it needs --skip-setup to run inside
   `docker build`.
3) start.sh wrote ~/.hermes/cli-config.yaml, but hermes-agent reads
   ~/.hermes/config.yaml. cli-config.yaml.example is just a starter
   file — install.sh copies it to config.yaml on first boot. Without
   our overwrite, the template inherited the example defaults
   (anthropic/claude-opus-4.6 + provider: auto) instead of the
   workspace's chosen model. We now rewrite config.yaml on every boot
   from the HERMES_DEFAULT_MODEL + HERMES_INFERENCE_PROVIDER env.
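The boot-time rewrite in fix 3 can be sketched as follows. This is an
illustrative sketch, not the actual start.sh code; the YAML layout and
the HERMES_HOME fallback are assumptions.

```shell
# Regenerate ~/.hermes/config.yaml on every boot so the workspace's
# model choice always beats the installer's example defaults.
HERMES_HOME="${HERMES_HOME:-$HOME/.hermes}"
mkdir -p "$HERMES_HOME"
MODEL="${HERMES_DEFAULT_MODEL:-nousresearch/hermes-4-70b}"
PROVIDER="${HERMES_INFERENCE_PROVIDER:-auto}"
cat > "$HERMES_HOME/config.yaml" <<EOF
model:
  default: "${MODEL}"
  provider: "${PROVIDER}"
EOF
```

Because the file is rewritten rather than merged, stale example values
can never leak through a container restart.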
Also:
- Added xz-utils + build-essential to the image (the hermes installer
  extracts a Node 22 .tar.xz, and some Python deps in the `.[all]`
  extra build from source).
- Forward every provider key hermes-agent knows about, not just the
  six from v2.0.0. All ~22 providers documented in the official
  website/docs/integrations/providers.md are now wired:
HERMES_API_KEY, NOUS_API_KEY, OPENROUTER_API_KEY, OPENAI_API_KEY,
ANTHROPIC_API_KEY, GEMINI_API_KEY, GOOGLE_API_KEY, DEEPSEEK_API_KEY,
GLM_API_KEY, KIMI_API_KEY, KIMI_CN_API_KEY, MINIMAX_API_KEY,
MINIMAX_CN_API_KEY, DASHSCOPE_API_KEY, XIAOMI_API_KEY,
ARCEEAI_API_KEY, NVIDIA_API_KEY, OLLAMA_API_KEY, HF_TOKEN,
AI_GATEWAY_API_KEY, KILOCODE_API_KEY, OPENCODE_ZEN_API_KEY,
OPENCODE_GO_API_KEY, COPILOT_GITHUB_TOKEN, GH_TOKEN
- config.yaml models[] list expanded to 30+ entries covering every
provider family (Hermes 3/4, Anthropic direct, OpenAI via
OpenRouter, Gemini direct, DeepSeek, GLM, Kimi, MiniMax global+CN,
Qwen/DashScope, Xiaomi MiMo, Arcee Trinity, NVIDIA NIM, Ollama
Cloud, Hugging Face catch-all, Vercel AI Gateway, OpenCode Zen+Go,
Kilo Code, OpenRouter catch-all, custom/local).
- Top-level required_env: [] — hermes supports too many providers for
  a single hardcoded requirement; per-model required_env in the canvas
  Config tab drives the real UX. hermes-agent itself fails loudly at
  request time if zero providers are configured.
- HERMES_CUSTOM_BASE_URL / HERMES_CUSTOM_API_KEY env support in
start.sh — lets operators point hermes at OpenAI direct, LM Studio,
LiteLLM, any OpenAI-compat endpoint without exec-ing into the
container.
- HERMES_INFERENCE_PROVIDER env — forces a specific provider,
  overriding hermes' auto-detection (which otherwise routes
  OPENAI_API_KEY to the openai-codex OAuth path → 401 Missing
  Authentication header).
- docs/CONFIGURATION.md rewritten with the full provider matrix,
OAuth flow, forcing a provider, auxiliary model, persistence
layout, and the common routing gotchas surfaced during testing.
- docs/ARCHITECTURE.md adds "Provider routing (how keys become
inference)" section.
Proved end-to-end on local Docker:
[start.sh] hermes gateway ready on :8642 (pid 22)
Uvicorn running on http://0.0.0.0:8000
→ A2A message/send "Respond with HERMES BRIDGE WORKING END TO END"
← HERMES BRIDGE WORKING END TO END — (via OpenAI Responses API)
→ "Run uname -a && whoami && pwd using your terminal tool"
← Linux 094f72... aarch64 GNU/Linux / agent / /home/agent
(real tool call — not chat response)
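The "hermes gateway ready on :8642" log line above comes from a
readiness poll before the bridge accepts traffic. A sketch of that
pattern, where the function name, retry budget, and the /v1/models
probe path are assumptions rather than the actual start.sh code:

```shell
# Poll the loopback OpenAI-compat endpoint until it answers, or give
# up after $1 attempts (default 30, one second apart).
wait_for_gateway() {
  tries="${1:-30}"
  n=0
  until curl -fsS "http://127.0.0.1:${API_SERVER_PORT:-8642}/v1/models" >/dev/null 2>&1; do
    n=$((n + 1))
    [ "$n" -ge "$tries" ] && return 1
    sleep 1
  done
  return 0
}
```

With nothing listening, `wait_for_gateway 1` fails fast after a single
probe instead of hanging the container boot.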
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 49040a1ade
commit 7e7871875c
27 Dockerfile
@@ -1,9 +1,14 @@
 FROM python:3.11-slim
 
-# System deps: curl for the hermes installer, git for the agent's file/repo
-# tools, gosu so start.sh can drop privileges, ca-certificates for TLS.
+# System deps:
+#   curl — hermes installer + loopback health probe in start.sh
+#   ca-certificates — TLS for all the outbound installs
+#   git — hermes installer clones the repo; also used by agent tools
+#   gosu — drop privileges in start.sh (single-process friendly)
+#   xz-utils — hermes installer extracts a Node 22 tarball (.tar.xz)
+#   build-essential — some python deps in hermes `.[all]` extra compile from src
 RUN apt-get update && apt-get install -y --no-install-recommends \
-    curl ca-certificates git gosu \
+    curl ca-certificates git gosu xz-utils build-essential \
     && rm -rf /var/lib/apt/lists/*
 
 # Non-root agent user. hermes-agent writes its state into ~/.hermes so
@@ -26,13 +31,19 @@ COPY start.sh /usr/local/bin/start.sh
 RUN chmod +x /usr/local/bin/start.sh
 
 # --- Install the real Nous Research hermes-agent as the agent user ---
-# The installer lives under the agent's home (~/.hermes, PATH update in
-# .bashrc). Running as root would place it in /root and break discovery.
+# The installer lives under the agent's home (~/.hermes, symlinks the
+# `hermes` entrypoint into ~/.local/bin/). Running as root would place
+# it in /root and break discovery.
+# --skip-setup → no interactive wizard (curl|bash is non-tty anyway
+#                but the installer treats this as "run anyway" by
+#                default; passing it explicitly avoids surprises).
 USER agent
 WORKDIR /home/agent
-RUN curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
-# Make `hermes` available in non-interactive shells (start.sh).
-ENV PATH="/home/agent/.local/bin:/home/agent/.hermes/bin:${PATH}"
+RUN curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh \
+    | bash -s -- --skip-setup
+# hermes installer symlinks ~/.hermes/hermes-agent/venv/bin/hermes into
+# ~/.local/bin/hermes, so ~/.local/bin is the only PATH entry we need.
+ENV PATH="/home/agent/.local/bin:${PATH}"
 
 USER root
 WORKDIR /app
 
179 config.yaml
@@ -6,79 +6,186 @@ description: >-
   behind an A2A bridge. The agent exposes its OpenAI-compatible API on
   localhost:8642 and molecule_runtime proxies messages to it on :8000.
 
-  Any model `hermes-agent` supports is selectable at runtime via
-  `hermes model` — Nous Portal, OpenRouter, OpenAI, Anthropic, Gemini,
-  xAI, MiniMax, Qwen, DeepSeek, Groq, NVIDIA NIM, Kimi, Mistral, and more.
-version: 2.0.0
+  Supports every provider hermes-agent supports — see docs/CONFIGURATION.md
+  for the full list. Any model hermes-agent accepts is selectable at
+  runtime via `hermes model`, or from the canvas Config tab.
+version: 2.1.0
 tier: 2
 
 runtime: hermes
 runtime_config:
-  # Default model. `hermes-agent` resolves provider + auth from its own
-  # config — canvas just passes the model ID through so the user can switch
-  # without leaving the UI. Override per-workspace in the Config tab.
+  # Default model. hermes-agent owns provider resolution via the
+  # HERMES_INFERENCE_PROVIDER env or its own `hermes model` wizard.
+  # Override per-workspace in the canvas Config tab.
   model: nousresearch/hermes-4-70b
 
   # Canvas surfaces this list as a Model dropdown and auto-populates
-  # Required Env Vars based on the selected entry. hermes-agent itself
-  # accepts any model string and picks the provider by prefix/scheme.
+  # Required Env Vars from the selected entry's required_env. Provider
+  # names in parentheses mirror hermes-agent's CLI provider field; see
+  # docs/CONFIGURATION.md#provider-matrix for the full mapping.
   models:
-    # --- Hermes 4 (Nous Research) ---
+    # ── Nous Research (Hermes 4) via Nous Portal ──────────────────────
     - id: nousresearch/hermes-4-70b
       name: Hermes 4 70B (Nous Portal)
       required_env: [HERMES_API_KEY]
     - id: nousresearch/hermes-4-405b
       name: Hermes 4 405B (Nous Portal)
       required_env: [HERMES_API_KEY]
-    # --- Hermes 3 (Nous Research, legacy) ---
+    - id: nousresearch/hermes-4-14b
+      name: Hermes 4 14B (Nous Portal)
+      required_env: [HERMES_API_KEY]
+
+    # ── Hermes 3 family (via OpenRouter) ─────────────────────────────
     - id: nousresearch/hermes-3-llama-3.1-70b
-      name: Hermes 3 70B (via OpenRouter)
+      name: Hermes 3 70B (OpenRouter)
       required_env: [OPENROUTER_API_KEY]
     - id: nousresearch/hermes-3-llama-3.1-405b
-      name: Hermes 3 405B (via OpenRouter)
+      name: Hermes 3 405B (OpenRouter)
       required_env: [OPENROUTER_API_KEY]
 
-    # --- Anthropic (direct, native SDK in hermes-agent) ---
+    # ── Anthropic (native SDK inside hermes-agent) ────────────────────
     - id: anthropic/claude-sonnet-4-5
       name: Claude Sonnet 4.5 (direct)
       required_env: [ANTHROPIC_API_KEY]
     - id: anthropic/claude-opus-4-1
       name: Claude Opus 4.1 (direct)
       required_env: [ANTHROPIC_API_KEY]
     - id: anthropic/claude-haiku-4-5
       name: Claude Haiku 4.5 (direct)
       required_env: [ANTHROPIC_API_KEY]
 
-    # --- OpenAI (direct) ---
+    # ── OpenAI (via OpenRouter — hermes has no direct openai provider;
+    #    openai-codex is OAuth-only for Codex models) ──────────────────
     - id: openai/gpt-5
-      name: GPT-5 (direct)
-      required_env: [OPENAI_API_KEY]
+      name: GPT-5 (via OpenRouter)
+      required_env: [OPENROUTER_API_KEY]
+    - id: openai/gpt-5-mini
+      name: GPT-5 mini (via OpenRouter)
+      required_env: [OPENROUTER_API_KEY]
+    - id: openai/gpt-4o
+      name: GPT-4o (via OpenRouter)
+      required_env: [OPENROUTER_API_KEY]
+    - id: openai/gpt-4o-mini
+      name: GPT-4o mini (via OpenRouter)
+      required_env: [OPENROUTER_API_KEY]
 
-    # --- Gemini (direct, native SDK in hermes-agent) ---
+    # ── Google Gemini (native SDK) ────────────────────────────────────
     - id: gemini/gemini-2.5-pro
       name: Gemini 2.5 Pro (direct)
       required_env: [GEMINI_API_KEY]
     - id: gemini/gemini-2.5-flash
       name: Gemini 2.5 Flash (direct)
       required_env: [GEMINI_API_KEY]
 
-    # --- MiniMax (direct) — reused from prior template version ---
-    - id: MiniMax-M2.7
-      name: MiniMax M2.7 (direct, ~197K ctx, coding-tuned)
+    # ── DeepSeek (direct) ─────────────────────────────────────────────
+    - id: deepseek/deepseek-v3.2
+      name: DeepSeek V3.2 (direct)
+      required_env: [DEEPSEEK_API_KEY]
+    - id: deepseek/deepseek-r1
+      name: DeepSeek R1 reasoning (direct)
+      required_env: [DEEPSEEK_API_KEY]
+
+    # ── z.ai / GLM ────────────────────────────────────────────────────
+    - id: zai/glm-4.6
+      name: GLM 4.6 (z.ai)
+      required_env: [GLM_API_KEY]
+
+    # ── Kimi / Moonshot ───────────────────────────────────────────────
+    - id: kimi-coding/kimi-k2
+      name: Kimi K2 (Moonshot)
+      required_env: [KIMI_API_KEY]
+
+    # ── MiniMax (global + China) ─────────────────────────────────────
+    - id: minimax/MiniMax-M2.7
+      name: MiniMax M2.7 (direct, coding-tuned)
       required_env: [MINIMAX_API_KEY]
-    - id: MiniMax-M2.7-highspeed
+    - id: minimax/MiniMax-M2.7-highspeed
       name: MiniMax M2.7 highspeed (Token Plan only)
       required_env: [MINIMAX_API_KEY]
+    - id: minimax/MiniMax-M1
+      name: MiniMax M1 (1M ctx)
+      required_env: [MINIMAX_API_KEY]
+    - id: minimax-cn/abab6.5-chat
+      name: MiniMax China (abab6.5)
+      required_env: [MINIMAX_CN_API_KEY]
 
-    # --- Any OpenRouter-exposed model (catch-all) ---
+    # ── Alibaba Cloud / Qwen (DashScope) ─────────────────────────────
+    - id: alibaba/qwen3-max
+      name: Qwen 3 Max (Alibaba Cloud)
+      required_env: [DASHSCOPE_API_KEY]
+    - id: alibaba/qwen3-coder
+      name: Qwen 3 Coder (Alibaba Cloud)
+      required_env: [DASHSCOPE_API_KEY]
+
+    # ── Xiaomi MiMo ──────────────────────────────────────────────────
+    - id: xiaomi/mimo-v1
+      name: Xiaomi MiMo v1
+      required_env: [XIAOMI_API_KEY]
+
+    # ── Arcee Trinity ────────────────────────────────────────────────
+    - id: arcee/trinity-70b
+      name: Arcee Trinity 70B
+      required_env: [ARCEEAI_API_KEY]
+
+    # ── NVIDIA NIM ────────────────────────────────────────────────────
+    - id: nvidia/nemotron-70b
+      name: Nemotron 70B (NVIDIA NIM)
+      required_env: [NVIDIA_API_KEY]
+
+    # ── Ollama Cloud ─────────────────────────────────────────────────
+    - id: ollama-cloud/llama-3.3-70b
+      name: Llama 3.3 70B (Ollama Cloud)
+      required_env: [OLLAMA_API_KEY]
+
+    # ── Hugging Face Inference ───────────────────────────────────────
+    - id: huggingface/*
+      name: Any Hugging Face model (set ID per workspace)
+      required_env: [HF_TOKEN]
+
+    # ── Vercel AI Gateway ────────────────────────────────────────────
+    - id: ai-gateway/*
+      name: Any Vercel AI Gateway model
+      required_env: [AI_GATEWAY_API_KEY]
+
+    # ── OpenCode Zen / Go ────────────────────────────────────────────
+    - id: opencode-zen/*
+      name: OpenCode Zen (set model per workspace)
+      required_env: [OPENCODE_ZEN_API_KEY]
+    - id: opencode-go/*
+      name: OpenCode Go (set model per workspace)
+      required_env: [OPENCODE_GO_API_KEY]
+
+    # ── Kilo Code ────────────────────────────────────────────────────
+    - id: kilocode/*
+      name: Kilo Code (set model per workspace)
+      required_env: [KILOCODE_API_KEY]
+
+    # ── OpenRouter catch-all ─────────────────────────────────────────
     - id: openrouter/*
-      name: Any OpenRouter model (set ID per workspace)
+      name: Any OpenRouter model (200+ available)
       required_env: [OPENROUTER_API_KEY]
 
-  # Required env is driven by the selected model above.
-  required_env:
-    - HERMES_API_KEY
+    # ── Custom endpoint (LM Studio / Ollama local / vLLM / llama.cpp) ─
+    - id: custom/*
+      name: Self-hosted OpenAI-compat endpoint (configure base_url in ~/.hermes/config)
+      required_env: []
+
+  # No single required env — hermes-agent supports 20+ providers and
+  # customers pick any one via the canvas Config tab (per-model
+  # required_env above drives the real UX). Molecule-runtime's
+  # preflight enforces AND-semantics on this list, so a non-empty
+  # value here would block workspaces using any non-default provider.
+  # hermes-agent itself errors loud at request time if zero providers
+  # are configured — that's the safety net.
+  required_env: []
 
   # 0 = no timeout; hermes-agent sessions can run long when tool-using.
   timeout: 0
 
-  # Tools hermes-agent ships natively — see .hermes/config.yaml inside the
-  # container for the full list. These are informational; hermes-agent owns
-  # tool selection. Use `hermes tools` to adjust at runtime.
+  # Tools hermes-agent ships natively. See `hermes tools` inside the
+  # container for the interactive toggle — by default all 17 built-in
+  # tool families are available (terminal, file read/write/edit,
+  # web fetch+search, memory, skills, subagent spawn, etc.).
   skills: []
 
 a2a:
@@ -86,14 +193,20 @@ a2a:
   streaming: true
   push_notifications: true
 
-# Bridge config — consumed by executor.py.
+# Bridge config — consumed by executor.py and start.sh.
 bridge:
-  # Where the in-container hermes-agent API server listens. Do not change
-  # unless you also change start.sh and API_SERVER_PORT env.
+  # Where the in-container hermes-agent API server listens. Do NOT
+  # change unless you also change start.sh / API_SERVER_PORT.
   hermes_api_base: http://127.0.0.1:8642/v1
   # Bearer token is injected by start.sh as API_SERVER_KEY and read by
   # executor.py from the env at request time — not stored in config.
   hermes_api_key_env: API_SERVER_KEY
+  # Optional: force a specific hermes provider instead of relying on
+  # hermes-agent's `auto` detection. Useful when multiple keys are
+  # set and you want deterministic routing. Set via HERMES_INFERENCE_PROVIDER
+  # env on the container OR hardcode here. Valid values: see
+  # docs/CONFIGURATION.md#provider-matrix.
+  provider: ""
 
 delegation:
   retry_attempts: 3
 
docs/ARCHITECTURE.md
@@ -92,6 +92,42 @@ canvas ─── POST /a2a/... ───▶ molecule_runtime (:8000)
   that's a regression to v1.x.
 - **Tool routing.** Tools are hermes-agent's job. Our bridge sees
   only the final assistant text.
+
+## Provider routing (how keys become inference)
+
+Provider resolution happens inside hermes-agent, driven by:
+
+1. **`~/.hermes/cli-config.yaml`** — `model.provider` field. start.sh
+   seeds this file on first boot (`auto` by default, or whatever
+   `HERMES_INFERENCE_PROVIDER` specifies).
+2. **`~/.hermes/.env`** — every provider key we forward from the
+   container env (see start.sh for the full list; see
+   `CONFIGURATION.md#provider-matrix` for the mapping).
+3. **Auto-detection** — when `provider: auto`, hermes walks its
+   internal resolution order and picks the first provider whose
+   credential is present. When multiple keys are set, prefer explicit
+   `HERMES_INFERENCE_PROVIDER` to avoid surprises.
+
+### Common routing gotcha
+
+With only `OPENAI_API_KEY` set and `provider: auto`, hermes-agent will
+route to `openai-codex` (Codex API, OAuth-only) and return:
+
+```
+401 - Missing Authentication header
+```
+
+The fix is to set `HERMES_INFERENCE_PROVIDER=openrouter` — hermes's
+openrouter provider accepts `OPENAI_API_KEY` as alt-auth and routes
+OpenAI-format Chat Completions correctly. This is documented in
+`CONFIGURATION.md#forcing-a-provider`.
+
+### Auxiliary model
+
+Vision, web summarization, and MoA use a separate auxiliary model —
+defaults to Gemini Flash via OpenRouter. If `OPENROUTER_API_KEY` is
+absent, these capabilities break silently (the primary path still
+works). Set `HERMES_AUXILIARY_PROVIDER` to override.
 - **Streaming.** `stream: false` in the request payload. A later
   revision can upgrade to SSE by subscribing to
   `GET /v1/runs/{run_id}/events` and pushing partial messages into
 
docs/CONFIGURATION.md
@@ -10,36 +10,116 @@ list in `config.yaml`. When you pick one, canvas writes the selection
 into the workspace's runtime_config; molecule_runtime constructs
 `AdapterConfig.model` from that; the bridge sends it verbatim as the
 `model` field in the OpenAI-compat request payload. hermes-agent
-resolves provider + auth from the string.
+resolves provider + auth from the string (see provider matrix below).
 
 **Via `hermes` CLI** — open the workspace's Terminal tab and run
-`hermes model`. This updates `~/.hermes/config.yaml` inside the
+`hermes model`. This updates `~/.hermes/cli-config.yaml` inside the
 container and affects any subsequent A2A request.
 
 **Which wins** — today the CLI and the bridge are independent.
-If you set the model in the canvas AND in the CLI, each request
+If you set the model in the canvas AND in the CLI, each A2A request
 uses the one the bridge sends (the canvas value). An upcoming PR
 will sync the two; see `ARCHITECTURE.md#future-work`.
 
-## Provider keys
+## Provider matrix
 
-Set one or more of these as workspace-level secrets via
-`POST /settings/secrets` (see monorepo `docs/runbooks/saas-secrets.md`).
-All are forwarded into `~/.hermes/.env` at container boot.
+hermes-agent supports every provider below. Set the corresponding env
+var as a workspace secret (`POST /settings/secrets` — see monorepo
+`docs/runbooks/saas-secrets.md`). start.sh forwards it into
+`~/.hermes/.env` at container boot.
 
-| Env var              | Activates provider                                |
-|----------------------|---------------------------------------------------|
-| `HERMES_API_KEY`     | Nous Portal (Hermes 3, Hermes 4 direct)           |
-| `OPENROUTER_API_KEY` | OpenRouter (200+ models)                          |
-| `ANTHROPIC_API_KEY`  | Claude direct via Anthropic Messages API          |
-| `OPENAI_API_KEY`     | GPT direct                                        |
-| `GEMINI_API_KEY`     | Gemini direct via `google-genai`                  |
-| `MINIMAX_API_KEY`    | MiniMax direct (sk-api-* or sk-cp-* accepted)     |
-
-You don't pick the provider yourself. hermes-agent resolves it from
-the `model` string prefix — `anthropic/` → Anthropic, `gemini/` →
-Gemini, `nousresearch/` → Nous Portal (if `HERMES_API_KEY` present)
-falling back to OpenRouter, etc.
+### OAuth-based providers
+
+These require `hermes model` to be run interactively (Terminal tab,
+non-piped). Set up once; tokens stored at `~/.hermes/auth/`.
+
+| Provider                | How to set up                                                                   |
+|-------------------------|---------------------------------------------------------------------------------|
+| **Nous Portal**         | `hermes model` → Nous Portal OAuth (subscription)                               |
+| **OpenAI Codex**        | `hermes model` → ChatGPT OAuth (uses GPT-5-Codex family)                        |
+| **GitHub Copilot**      | `hermes model` → OAuth device code, or set `COPILOT_GITHUB_TOKEN` / `GH_TOKEN`  |
+| **Anthropic (Claude Pro/Max)** | `hermes model` → Claude Code auth, or set `ANTHROPIC_API_KEY` for API-key mode |
+| **Google Gemini OAuth** | `hermes model` → "Google Gemini (OAuth)". Free tier, PKCE. See provider-routing docs for GCP-project caveats |
+
+### API-key providers
+
+Just set the env var; hermes-agent picks up the key at boot.
+
+| Provider           | Env var                | Example model IDs                                      |
+|--------------------|------------------------|-------------------------------------------------------|
+| **Nous Portal API**| `HERMES_API_KEY` (or `NOUS_API_KEY`) | `nousresearch/hermes-4-70b`, `nousresearch/hermes-4-405b`, `nousresearch/hermes-4-14b` |
+| **OpenRouter**     | `OPENROUTER_API_KEY`   | Anything on openrouter.ai (`openai/gpt-5`, `anthropic/claude-sonnet-4-5`, 200+ others) |
+| **OpenAI (via OpenRouter)** | `OPENAI_API_KEY` alt-auth on openrouter | `openai/gpt-5`, `openai/gpt-4o`, `openai/gpt-4o-mini` |
+| **Anthropic**      | `ANTHROPIC_API_KEY`    | `anthropic/claude-sonnet-4-5`, `anthropic/claude-opus-4-1`, `anthropic/claude-haiku-4-5` |
+| **Google Gemini**  | `GEMINI_API_KEY` or `GOOGLE_API_KEY` | `gemini/gemini-2.5-pro`, `gemini/gemini-2.5-flash` |
+| **DeepSeek**       | `DEEPSEEK_API_KEY`     | `deepseek/deepseek-v3.2`, `deepseek/deepseek-r1`      |
+| **z.ai / GLM**     | `GLM_API_KEY`          | `zai/glm-4.6`                                         |
+| **Kimi / Moonshot**| `KIMI_API_KEY` (global), `KIMI_CN_API_KEY` (China) | `kimi-coding/kimi-k2`       |
+| **MiniMax**        | `MINIMAX_API_KEY` (global), `MINIMAX_CN_API_KEY` (China) | `minimax/MiniMax-M2.7`, `minimax-cn/abab6.5-chat` |
+| **Alibaba / Qwen** | `DASHSCOPE_API_KEY`    | `alibaba/qwen3-max`, `alibaba/qwen3-coder`            |
+| **Xiaomi MiMo**    | `XIAOMI_API_KEY`       | `xiaomi/mimo-v1`                                      |
+| **Arcee Trinity**  | `ARCEEAI_API_KEY`      | `arcee/trinity-70b`                                   |
+| **NVIDIA NIM**     | `NVIDIA_API_KEY`       | `nvidia/nemotron-70b`                                 |
+| **Ollama Cloud**   | `OLLAMA_API_KEY`       | `ollama-cloud/llama-3.3-70b`                          |
+| **Hugging Face**   | `HF_TOKEN`             | `huggingface/*` (any HF inference model)              |
+| **Vercel AI Gateway** | `AI_GATEWAY_API_KEY` | `ai-gateway/*`                                        |
+| **Kilo Code**      | `KILOCODE_API_KEY`     | `kilocode/*`                                          |
+| **OpenCode Zen**   | `OPENCODE_ZEN_API_KEY` | `opencode-zen/*`                                      |
+| **OpenCode Go**    | `OPENCODE_GO_API_KEY`  | `opencode-go/*`                                       |
+
+### Self-hosted / local
+
+`hermes model` → "Custom endpoint" — any OpenAI-compatible HTTP API.
+Aliases for quick setup: `lmstudio`, `ollama`, `vllm`, `llamacpp`.
+
+```yaml
+# example ~/.hermes/cli-config.yaml override
+model:
+  default: "llama-3.3-70b-instruct"
+  provider: "lmstudio"
+  base_url: "http://host.docker.internal:1234/v1"
+```
+
+No API key needed — local servers typically ignore auth.
+
+## Forcing a provider
+
+By default hermes-agent's provider-selection is `auto` — it walks its
+internal resolution order and picks the first available credential.
+This can route surprising ways when multiple keys are set (e.g. an
+`OPENAI_API_KEY` will fall to `openai-codex` which is OAuth-only and
+returns 401 on API-key auth).
+
+To force a specific provider, set `HERMES_INFERENCE_PROVIDER` on the
+workspace container. start.sh writes it into `~/.hermes/cli-config.yaml`
+and `~/.hermes/.env` at boot. Valid values (from hermes-agent
+`cli-config.yaml.example`):
+
+```
+auto | openrouter | nous | nous-api | anthropic | openai-codex
+copilot | gemini | google-gemini-cli | zai | kimi-coding | kimi-coding-cn
+minimax | minimax-cn | alibaba (aliases: dashscope, qwen)
+arcee | nvidia | xiaomi | huggingface | ollama-cloud
+ai-gateway | kilocode | opencode-zen | opencode-go | deepseek | custom
+```
+
+**Most common choices when multiple keys are present:**
+- `OPENAI_API_KEY` only → `HERMES_INFERENCE_PROVIDER=openrouter` (hermes
+  openrouter accepts OPENAI_API_KEY as alt auth)
+- `ANTHROPIC_API_KEY` only → `anthropic`
+- Mixed keys → `auto` usually works
+
+## Auxiliary model (vision / MoA / summarization)
+
+hermes-agent uses a second, smaller model for vision, web page
+summarization, and mixture-of-agents tool calls. Defaults to
+**Gemini Flash via OpenRouter**. Having `OPENROUTER_API_KEY` set is
+enough; otherwise vision + web-summarize + MoA break silently.
+
+Override the auxiliary path with `HERMES_AUXILIARY_PROVIDER` env —
+start.sh forwards it. See hermes-agent
+[Auxiliary Models docs](https://github.com/NousResearch/hermes-agent/blob/main/website/docs/user-guide/configuration.md)
+for the full field set.
 
 ## Persisting skills + memory
 
@@ -47,30 +127,29 @@ falling back to OpenRouter, etc.
 
 ```
 /home/agent/.hermes/
-├── .env        ← provider keys + API_SERVER_* (regenerated per boot)
-├── config.yaml ← model, tools, gateway settings
-├── skills/     ← self-improvement loop writes here
-├── sessions/   ← conversation history (FTS5-indexed)
-├── memory/     ← long-lived user model (Honcho + custom)
+├── .env            ← provider keys + API_SERVER_* (regenerated per boot)
+├── cli-config.yaml ← model + provider selection (seeded by start.sh if absent)
+├── hermes-agent/   ← the installed project; venv, source, upstream repo
+├── auth/           ← OAuth tokens (Google Gemini OAuth, Copilot, Codex, etc.)
+├── skills/         ← self-improvement loop writes here
+├── sessions/       ← conversation history (FTS5-indexed)
+├── memory/         ← long-lived user model (Honcho + custom)
 └── logs/
 ```
 
-For these to survive a workspace container restart, the platform
-needs a Docker volume mounted at `/home/agent/.hermes`. The default
-provisioner config already handles this — verify with:
+For these to survive a container restart, mount a Docker volume at
+`/home/agent/.hermes`. The platform's default provisioner config does
+this already — verify with:
 
 ```bash
 docker inspect --format='{{json .Mounts}}' <workspace-container-id>
 ```
 
 If `/home/agent/.hermes` is not in the Mounts list, edit the
 workspace's provisioner config in the monorepo.
 
 ## Gateway platforms (advanced)
 
 `hermes-agent` ships with Telegram, Discord, Slack, WhatsApp, Signal,
-and ~10 other platform adapters. v2.0.0 of this template wires only
-the `api_server` platform (required for the A2A bridge).
+and ~10 other platform adapters. v2.x of this template wires only the
+`api_server` platform (required for the A2A bridge).
 
 To enable another platform, customize `~/.hermes/.env` in the workspace:
 
@@ -86,8 +165,8 @@ EOF
 '
 ```
 
-This is not yet surfaced in canvas. Follow the issue tracker for
-first-class gateway-platform support.
+Not yet surfaced in canvas. Follow the issue tracker for first-class
+gateway-platform support.
 
 ## Restarting the gateway
 
@@ -108,12 +187,12 @@ molecule_runtime on :8000 is unaffected.
 
 ```bash
 # What model is the CLI pinned to?
-docker exec -u agent <id> hermes model
+docker exec -u agent <id> hermes model show
 
 # What tools are enabled?
 docker exec -u agent <id> hermes tools
 
-# How is the agent doing?
+# Doctor report (warnings, missing deps, broken providers):
 docker exec -u agent <id> hermes doctor
 
 # Last 200 lines of gateway log:
@@ -122,8 +201,8 @@ docker exec <id> tail -200 /var/log/hermes-gateway.log
 
 ## Bridge timeouts
 
-`executor.py` uses a 600-second httpx timeout. If you run agent
-turns that take longer than 10 minutes (large research tasks with
-many tool calls), bump `_REQUEST_TIMEOUT` in `executor.py` and rebuild
-the image. Don't try to configure this at runtime via env — we keep
-it in code so regressions are version-controlled.
+`executor.py` uses a 600-second httpx timeout. If you run agent turns
+that take longer than 10 minutes (large research tasks with many tool
+calls), bump `_REQUEST_TIMEOUT` in `executor.py` and rebuild the
+image. Don't try to configure this at runtime via env — we keep it in
+code so regressions are version-controlled.
 
104 start.sh
@@ -12,6 +12,7 @@ set -euo pipefail
 
 HERMES_HOME="/home/agent/.hermes"
 ENV_FILE="${HERMES_HOME}/.env"
+HERMES_CONFIG="${HERMES_HOME}/config.yaml"
 LOG_FILE="/var/log/hermes-gateway.log"
 
 mkdir -p "$(dirname "$LOG_FILE")"
@@ -27,37 +28,110 @@ if [ -z "${API_SERVER_KEY:-}" ]; then
   export API_SERVER_KEY
 fi
 
+install -d -o agent -g agent "$HERMES_HOME"
+
 # --- Write hermes-agent's .env ---
-# API_SERVER_ENABLED must be true and the bearer must match. Provider
-# keys (HERMES_API_KEY / OPENROUTER_API_KEY / ANTHROPIC_API_KEY /
-# OPENAI_API_KEY / GEMINI_API_KEY / MINIMAX_API_KEY) are forwarded from
-# the container env — hermes-agent will pick the right one based on the
-# model selected via `hermes model`.
-sudo -u agent mkdir -p "$HERMES_HOME"
-sudo -u agent tee "$ENV_FILE" >/dev/null <<EOF
+# API_SERVER_ENABLED must be true and the bearer must match. Every
+# provider key hermes-agent knows about is forwarded from the container
+# env IF it's set — see docs/CONFIGURATION.md#provider-matrix for the
+# authoritative list. Adding a new key here also needs a matching
+# required_env entry in config.yaml.
+cat >"$ENV_FILE" <<EOF
 API_SERVER_ENABLED=true
 API_SERVER_KEY=${API_SERVER_KEY}
 API_SERVER_HOST=${API_SERVER_HOST:-127.0.0.1}
 API_SERVER_PORT=${API_SERVER_PORT:-8642}
+# Provider-selection override (optional; empty = hermes auto-detect).
 ${HERMES_INFERENCE_PROVIDER:+HERMES_INFERENCE_PROVIDER=${HERMES_INFERENCE_PROVIDER}}
+# Auxiliary model defaults — used by vision, web summarization, MoA.
+${HERMES_AUXILIARY_PROVIDER:+HERMES_AUXILIARY_PROVIDER=${HERMES_AUXILIARY_PROVIDER}}
+# ── Primary inference providers (keyed) ───────────────────────
 ${HERMES_API_KEY:+HERMES_API_KEY=${HERMES_API_KEY}}
+${NOUS_API_KEY:+NOUS_API_KEY=${NOUS_API_KEY}}
 ${OPENROUTER_API_KEY:+OPENROUTER_API_KEY=${OPENROUTER_API_KEY}}
-${ANTHROPIC_API_KEY:+ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}}
 ${OPENAI_API_KEY:+OPENAI_API_KEY=${OPENAI_API_KEY}}
+${ANTHROPIC_API_KEY:+ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}}
 ${GEMINI_API_KEY:+GEMINI_API_KEY=${GEMINI_API_KEY}}
+${GOOGLE_API_KEY:+GOOGLE_API_KEY=${GOOGLE_API_KEY}}
+${DEEPSEEK_API_KEY:+DEEPSEEK_API_KEY=${DEEPSEEK_API_KEY}}
+${GLM_API_KEY:+GLM_API_KEY=${GLM_API_KEY}}
+${KIMI_API_KEY:+KIMI_API_KEY=${KIMI_API_KEY}}
+${KIMI_CN_API_KEY:+KIMI_CN_API_KEY=${KIMI_CN_API_KEY}}
 ${MINIMAX_API_KEY:+MINIMAX_API_KEY=${MINIMAX_API_KEY}}
+${MINIMAX_CN_API_KEY:+MINIMAX_CN_API_KEY=${MINIMAX_CN_API_KEY}}
+${DASHSCOPE_API_KEY:+DASHSCOPE_API_KEY=${DASHSCOPE_API_KEY}}
+${XIAOMI_API_KEY:+XIAOMI_API_KEY=${XIAOMI_API_KEY}}
+${ARCEEAI_API_KEY:+ARCEEAI_API_KEY=${ARCEEAI_API_KEY}}
+${NVIDIA_API_KEY:+NVIDIA_API_KEY=${NVIDIA_API_KEY}}
+${OLLAMA_API_KEY:+OLLAMA_API_KEY=${OLLAMA_API_KEY}}
+${HF_TOKEN:+HF_TOKEN=${HF_TOKEN}}
+${AI_GATEWAY_API_KEY:+AI_GATEWAY_API_KEY=${AI_GATEWAY_API_KEY}}
+${KILOCODE_API_KEY:+KILOCODE_API_KEY=${KILOCODE_API_KEY}}
+${OPENCODE_ZEN_API_KEY:+OPENCODE_ZEN_API_KEY=${OPENCODE_ZEN_API_KEY}}
+${OPENCODE_GO_API_KEY:+OPENCODE_GO_API_KEY=${OPENCODE_GO_API_KEY}}
+# GitHub Copilot (OAuth or token)
+${COPILOT_GITHUB_TOKEN:+COPILOT_GITHUB_TOKEN=${COPILOT_GITHUB_TOKEN}}
+${GH_TOKEN:+GH_TOKEN=${GH_TOKEN}}
 EOF
 chown agent:agent "$ENV_FILE"
 chmod 600 "$ENV_FILE"
 
+# --- Seed ~/.hermes/config.yaml (rewritten on every boot) ---
+# The container image runs install.sh with --skip-setup so no config
+# is generated at build time. Without an explicit provider, hermes
+# errors at request time with "No LLM provider configured" even when
+# a provider key is present in .env — the config.yaml is the primary
+# source of truth; .env only holds keys.
+#
+# Writing an explicit provider here also avoids the auto-detect
+# falling through to openai-codex (OAuth-only) when OPENAI_API_KEY is
+# set but OPENROUTER_API_KEY isn't — source of the 401 "Missing
+# Authentication header" in early testing.
+#
+# Unconditionally overwrite — the hermes installer drops its
+# `cli-config.yaml.example` in place as `~/.hermes/config.yaml`
+# (defaulting to anthropic/claude-opus-4.6 + provider: auto), which
+# doesn't match the workspace's intended model. Our template owns
+# the selection; operators override via HERMES_INFERENCE_PROVIDER
+# + HERMES_DEFAULT_MODEL env, or by editing config.yaml at runtime
+# inside the container.
+PROVIDER="${HERMES_INFERENCE_PROVIDER:-auto}"
+DEFAULT_MODEL="${HERMES_DEFAULT_MODEL:-nousresearch/hermes-4-70b}"
+{
+  echo "# Seeded by molecule template-hermes start.sh. Customize via"
+  echo "# \`hermes config edit\` or by editing this file directly."
+  echo "# start.sh rewrites model.default + model.provider on every"
+  echo "# boot from HERMES_DEFAULT_MODEL / HERMES_INFERENCE_PROVIDER env."
+  echo "model:"
+  echo " default: \"${DEFAULT_MODEL}\""
+  echo " provider: \"${PROVIDER}\""
+  # For the custom provider (or its aliases lmstudio/ollama/vllm/llamacpp),
+  # let operators pipe the base_url and api_key through env. Useful for
+  # pointing at a non-OpenRouter OpenAI-compat endpoint (OpenAI direct,
+  # LiteLLM gateway, LM Studio, local vLLM, etc.).
+  if [ -n "${HERMES_CUSTOM_BASE_URL:-}" ]; then
+    echo " base_url: \"${HERMES_CUSTOM_BASE_URL}\""
+  fi
+  if [ -n "${HERMES_CUSTOM_API_KEY:-}" ]; then
+    echo " api_key: \"${HERMES_CUSTOM_API_KEY}\""
+  fi
+} >"$HERMES_CONFIG"
+chown agent:agent "$HERMES_CONFIG"
 
 # --- Start hermes gateway in the background ---
 # `hermes gateway` reads ~/.hermes/.env at startup. We run it as the
-# agent user so memory/skills land in the agent-owned home.
-nohup sudo -u agent -E bash -lc "hermes gateway" >>"$LOG_FILE" 2>&1 &
+# agent user via gosu so memory/skills land in the agent-owned home.
+# `bash -lc` forces a login shell so .profile / .bashrc add ~/.local/bin
+# to PATH (that's where install.sh symlinks the hermes binary).
+nohup gosu agent bash -lc "cd /home/agent && hermes gateway" \
+  >>"$LOG_FILE" 2>&1 &
 GATEWAY_PID=$!
 
 # --- Wait for :8642 readiness ---
-# Max 60s — enough for a cold gateway boot including first-time DB
-# migrations. Longer waits should surface as a provisioning failure
-# upstream rather than silently holding the container.
-for _ in $(seq 1 60); do
+# Max 120s — enough for a cold gateway boot including first-time DB
+# migrations and session-store init. Longer waits should surface as a
+# provisioning failure upstream rather than silently holding the container.
+READY_TIMEOUT=120
+for _ in $(seq 1 $READY_TIMEOUT); do
   if curl -fsS "http://127.0.0.1:${API_SERVER_PORT:-8642}/health" >/dev/null 2>&1; then
     break
   fi
@@ -70,7 +144,7 @@ for _ in $(seq 1 60); do
 done
 
 if ! curl -fsS "http://127.0.0.1:${API_SERVER_PORT:-8642}/health" >/dev/null 2>&1; then
-  echo "[start.sh] hermes gateway failed to reach /health within 60s." >&2
+  echo "[start.sh] hermes gateway failed to reach /health within ${READY_TIMEOUT}s." >&2
   tail -80 "$LOG_FILE" >&2
   exit 1
 fi
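The readiness wait in the hunk above is a standard poll-until-healthy loop. A self-contained sketch of the same pattern, factored into a function (the `/health` path and one-second cadence mirror start.sh; the function name and port argument are illustrative):

```shell
#!/usr/bin/env sh
# Poll an HTTP health endpoint once per second, up to READY_TIMEOUT
# seconds, then report failure instead of hanging the container boot.
set -eu
READY_TIMEOUT=120

wait_ready() {
  # $1 = port to probe; returns 0 on the first successful /health,
  # 1 once READY_TIMEOUT attempts have failed.
  port="$1"
  i=0
  while [ "$i" -lt "$READY_TIMEOUT" ]; do
    if curl -fsS "http://127.0.0.1:${port}/health" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

On timeout, start.sh exits non-zero and dumps the gateway log tail, so a stuck boot surfaces as a provisioning failure upstream rather than a silently hung container.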
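A note on the `${VAR:+word}` lines inside the `.env` heredoc: that expansion yields `word` only when `VAR` is set and non-empty, so unset provider keys collapse to blank lines in the generated file instead of producing empty `KEY=` assignments. A minimal standalone demonstration (the key value is made up):

```shell
#!/usr/bin/env sh
# ${VAR:+word} expands to `word` only if VAR is set and non-empty —
# this is how start.sh forwards only the provider keys that exist.
set -eu
OPENROUTER_API_KEY="sk-demo"   # pretend this key is configured...
unset GEMINI_API_KEY           # ...and this one is not
cat <<EOF
${OPENROUTER_API_KEY:+OPENROUTER_API_KEY=${OPENROUTER_API_KEY}}
${GEMINI_API_KEY:+GEMINI_API_KEY=${GEMINI_API_KEY}}
EOF
# prints "OPENROUTER_API_KEY=sk-demo" followed by a blank line
```

Because the `:+` form is defined for unset variables, this stays safe under `set -u`, which start.sh enables via `set -euo pipefail`.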