Three small but real cleanups against hermes-agent v0.12.0
(NousResearch/hermes-agent, 2026-04-30):
1. Rename HERMES_DEFAULT_MODEL -> HERMES_INFERENCE_MODEL (upstream's
actual env name). Both names are read for one release cycle so
workspace-server (which still writes the legacy name) doesn't break;
the legacy fallback gets dropped in a follow-up PR once workspace-server
is updated (see the sketch after this list).
2. Drop HERMES_API_KEY from start.sh's .env heredoc. That var only feeds
hermes-agent's TUI gateway bridge, NOT any LLM provider. Provider
credentials go through OPENROUTER_API_KEY / OPENAI_API_KEY / etc.
3. Add 12 missing provider prefixes to derive-provider.sh so model slugs
like xai/grok-4, bedrock/anthropic.claude-sonnet-4, lmstudio/local,
copilot/gpt-4o, etc., route to the correct provider instead of
falling through to "auto" (also covered by the sketch below).
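A minimal sketch of cleanups 1 and 3 as they could land in
derive-provider.sh; the read location and the provider identifiers
shown (xai, bedrock, lmstudio, copilot) are assumptions mirroring the
slug prefixes:

```sh
# Cleanup 1: prefer the upstream env name; keep the legacy name as a
# one-release-cycle fallback for workspace-server.
MODEL="${HERMES_INFERENCE_MODEL:-${HERMES_DEFAULT_MODEL:-}}"

# Cleanup 3: new prefixes route explicitly instead of falling through to auto.
case "$MODEL" in
  xai/*)      PROVIDER=xai ;;       # provider ids assumed to mirror prefixes
  bedrock/*)  PROVIDER=bedrock ;;
  lmstudio/*) PROVIDER=lmstudio ;;
  copilot/*)  PROVIDER=copilot ;;
  # ...the remaining new and pre-existing arms are unchanged...
esac
```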
New tests/test_derive_provider.sh — 26 sh-style assertions covering the
legacy fallback, the precedence rule, all 12 new providers, and a few
regression cases for adjacent prefixes (minimax vs minimax-oauth, qwen
vs qwen-oauth, alibaba vs alibaba-coding-plan).
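Shape of the assertions, sketched (the assert_provider helper is
hypothetical, not necessarily what the file ships):

```sh
# Illustrative harness shape; names in the shipped file may differ.
assert_provider() {  # usage: assert_provider <model-slug> <expected-provider>
  PROVIDER='' HERMES_INFERENCE_PROVIDER='' HERMES_DEFAULT_MODEL=''
  HERMES_INFERENCE_MODEL=$1
  . ./scripts/derive-provider.sh
  [ "$PROVIDER" = "$2" ] || { echo "FAIL: $1 -> $PROVIDER (want $2)"; exit 1; }
}

assert_provider xai/grok-4 xai                # one of the 12 new prefixes
assert_provider minimax/MiniMax-M2.7 minimax  # adjacent-prefix regression guard

# Legacy fallback: the old env name still routes when the new one is unset.
PROVIDER='' HERMES_INFERENCE_MODEL='' HERMES_DEFAULT_MODEL=xai/grok-4
. ./scripts/derive-provider.sh
[ "$PROVIDER" = xai ] || { echo 'FAIL: legacy fallback'; exit 1; }
```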
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spawns a real hermes gateway run + a stub OpenAI-compat LLM server +
the real executor's reply server, and routes a message through every
hop of the production chain except platform-side peer-message routing:
HermesAgentProxyExecutor.execute()
→ POST /a2a/inbound (hermes plugin)
→ MessageEvent dispatch through hermes pipeline
→ stub LLM /v1/chat/completions
→ plugin send() POSTs reply to executor /a2a/reply
→ execute() Future resolves → emits on event_queue
This is the highest-fidelity local approximation of staging E2E.
Caught a real KeyError in upstream hermes-agent's hermes_cli/tools_config.py
that no in-process test surfaced. Asserts the wire shape works end to
end + guards against the KeyError regression. The reply CONTENT
depends on whether the stub speaks hermes' multi-turn tool loop, so
we don't assert on it — what matters is the full pipeline routes
through the plugin and back.
Run:
/Users/hongming/.hermes/hermes-agent/venv/bin/python3 \
scripts/e2e_full_chain.py
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the gap where CP user-data (PR-3, task #197) writes
runtime_config.{model,provider} into /configs/config.yaml but start.sh
only reads HERMES_DEFAULT_MODEL / HERMES_INFERENCE_PROVIDER env vars
that CP doesn't set. Result: every CP-provisioned hermes workspace
booted with the built-in `nousresearch/hermes-4-70b` default and
500'd at first prompt with "No LLM provider configured" — visible
in the 2026-04-30 hongmingwang tenant screenshots.
New scripts/load-workspace-config.sh, sourced by start.sh before the
existing DEFAULT_MODEL/PROVIDER derivation. Reads /configs/config.yaml
via python3 + PyYAML and exports HERMES_DEFAULT_MODEL +
HERMES_INFERENCE_PROVIDER if they're not already set (see the sketch
after the lists below).
Precedence (highest to lowest):
1. HERMES_* env vars (operator override via workspace secrets)
2. /configs/config.yaml runtime_config.{model,provider} (canvas UI)
3. start.sh hard-coded fallback (nousresearch/hermes-4-70b)
Resilience:
- Missing config.yaml → silent skip (dev containers)
- Malformed YAML → silent skip (don't kill boot)
- python3 missing → silent skip
- PyYAML missing → silent skip
- Empty/non-dict runtime_config → silent skip
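A minimal sketch of the loader under those rules (variable names and
the embedded Python are illustrative, not the shipped code):

```sh
# load-workspace-config.sh: sourced by start.sh; every failure is a silent skip.
cfg=/configs/config.yaml
if [ -f "$cfg" ] && command -v python3 >/dev/null 2>&1; then
  # Prints model on line 1 and provider on line 2; prints nothing on any
  # error (malformed YAML, missing PyYAML, non-dict runtime_config).
  out=$(python3 - "$cfg" 2>/dev/null <<'PY'
import sys
try:
    import yaml
    rc = (yaml.safe_load(open(sys.argv[1])) or {}).get("runtime_config")
    if isinstance(rc, dict):
        print(rc.get("model") or "")     # print() coerces non-string scalars
        print(rc.get("provider") or "")
except Exception:
    pass
PY
)
  model=$(printf '%s\n' "$out" | sed -n 1p)
  provider=$(printf '%s\n' "$out" | sed -n 2p)
  # Precedence: env vars (operator override) win; config.yaml only fills gaps.
  if [ -z "${HERMES_DEFAULT_MODEL:-}" ] && [ -n "$model" ]; then
    export HERMES_DEFAULT_MODEL="$model"
  fi
  if [ -z "${HERMES_INFERENCE_PROVIDER:-}" ] && [ -n "$provider" ]; then
    export HERMES_INFERENCE_PROVIDER="$provider"
  fi
fi
```

Sourcing (rather than executing) keeps the exports in start.sh's
environment, so the existing hard-coded fallback still applies last.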
Tests: scripts/test-load-workspace-config.sh — 11 cases covering all
silent-skip paths + happy paths + operator override + non-string scalar
coercion. Existing scripts/test-derive-provider.sh (12 cases) re-verified.
Wires shell tests into CI via a new shell-tests job; those tests
weren't running anywhere before, so this opportunistically closes that gap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The install.sh "OpenAI bridge" block bundled two distinct concerns under
one guard:
1. Auto-fill HERMES_CUSTOM_{BASE_URL,API_KEY,API_MODE} when operator
didn't set them
2. Strip the openai/ prefix from DEFAULT_MODEL (OpenAI rejects prefixed
model IDs with 400 "invalid model ID")
Both only fired when the operator had NOT pre-configured HERMES_CUSTOM_*.
That broke molecule-core#1987: the staging E2E now pins HERMES_CUSTOM_*
explicitly (to work around derive-provider.sh's #19 fix not reaching all
tenants). The pin skipped concern (1) intentionally — but also skipped
(2) unintentionally. Result: E2E routes to api.openai.com with the wrong
model name and hits 400.
Fix: separate the two concerns.
- (A) Auto-fill block keeps its original guard — runs only when operator
didn't configure.
- (B) New independent block: strip openai/ iff the FINAL
HERMES_CUSTOM_BASE_URL matches api.openai.com. Regex is anchored
(^https?://api\.openai\.com(/|$)) so lookalike domains
(api.openai.com.evil.internal, beta.api.openai.com) do NOT match.
Idempotent on already-bare model names.
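A hedged sketch of block (B); the regex is the one quoted above, the
surrounding shell is assumed:

```sh
# (B) Strip openai/ iff the FINAL base URL is really api.openai.com.
# Anchored regex: lookalike hosts (api.openai.com.evil.internal,
# beta.api.openai.com) do not match. ${VAR#openai/} is a no-op on
# already-bare names, so the block is idempotent.
if printf '%s' "${HERMES_CUSTOM_BASE_URL:-}" \
    | grep -Eq '^https?://api\.openai\.com(/|$)'; then
  DEFAULT_MODEL="${DEFAULT_MODEL#openai/}"
fi
```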
Verified via scripts/test-install-prefix-strip.sh — 10 cases including:
A default bridge strips openai/ → gpt-4o
B operator-pinned OpenAI URL also strips → gpt-4o (#1987 path, was broken)
C vLLM URL keeps prefix → openai/my-finetune
D openrouter keeps prefix → openai/gpt-4o
E minimax untouched → minimax/MiniMax-M2.7
G lookalike domain NOT stripped → openai/gpt-4o (anti-spoofing)
H http://api.openai.com also strips → gpt-4o
I subdomain beta.api.openai.com NOT stripped → openai/gpt-4o
All 10 pass. Plus a parity check greps install.sh to ensure the inlined
logic matches what ships.
No behavioral change for any existing working config (scenarios A, C, D,
E, F above — all unchanged). Only fixes the broken scenario B.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Root cause of the 2026-04-23 E2E A2A regression: when a workspace's
model is openai/* and the tenant has an OPENROUTER_API_KEY set globally
(common on SaaS staging), derive-provider.sh was picking PROVIDER=openrouter
even when the WORKSPACE-level secret OPENAI_API_KEY was explicitly provided
for the direct-OpenAI path.
hermes then called OpenRouter with a key that was stale/empty for this
workspace, and OR returned `{"error": {"message": "Missing Authentication
header", "code": 401}}` — which surfaced in the A2A agent reply and
failed the E2E at step 8.
Fix: flip the priority. For openai/* model slugs, prefer `custom` (direct
OpenAI via install.sh's HERMES_CUSTOM_* bridge) when OPENAI_API_KEY is
present; fall through to `openrouter` only when OPENAI_API_KEY is absent.
See the sketch after the list below.
Operators who want OR for openai/* models can still:
- set HERMES_INFERENCE_PROVIDER=openrouter (wins via the explicit override at top of file), or
- use an openrouter/* model slug
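The flipped arm, sketched (surrounding case statement elided; exact
formatting in the shipped script may differ):

```sh
case "$MODEL" in
  openai/*)
    # Workspace-scoped OPENAI_API_KEY now outranks a tenant-global
    # OPENROUTER_API_KEY; an explicit HERMES_INFERENCE_PROVIDER was
    # already honored earlier in the script.
    if [ -n "${OPENAI_API_KEY:-}" ]; then
      PROVIDER=custom      # direct OpenAI via the HERMES_CUSTOM_* bridge
    else
      PROVIDER=openrouter  # no OpenAI key: OpenRouter, as before
    fi
    ;;
  # ...other arms unchanged...
esac
```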
Adds scripts/test-derive-provider.sh — 12 offline shell assertions
pinning the decision table including the exact #19 regression case.
Acceptance: E2E step 8 A2A returns a real PONG reply instead of OR's
401-shaped error from the agent response.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hermes-agent upstream has no direct "openai" provider — its own
.env.example states "All LLM calls go through OpenRouter". The
template's derive-provider.sh followed that: `openai/*` → `openrouter`
always. That left a real gap for workspaces whose sole credential is
an OpenAI API key (no OpenRouter subscription) — hermes would route
the call through OpenRouter, OpenRouter would reject the OpenAI key,
and the user got "401 Invalid API key" on the first A2A turn.
Hermes does expose a "custom" provider with base_url + api_key
overrides (already plumbed via HERMES_CUSTOM_BASE_URL /
HERMES_CUSTOM_API_KEY env vars). This PR teaches the template to
auto-populate that bridge when the operator ships only an OpenAI key:
scripts/derive-provider.sh
openai/* now maps to:
- openrouter, when OPENROUTER_API_KEY is set (unchanged path)
- custom, when only OPENAI_API_KEY is set (new path)
- openrouter, as a no-key fallback so hermes errors clearly
install.sh + start.sh (bare-host + Docker — kept in sync)
When PROVIDER="custom" and HERMES_CUSTOM_* aren't explicitly set
but OPENAI_API_KEY is, auto-set:
HERMES_CUSTOM_BASE_URL=https://api.openai.com/v1
HERMES_CUSTOM_API_KEY=$OPENAI_API_KEY
The config.yaml's model.{base_url,api_key} pick these up
verbatim so hermes calls OpenAI directly.
Existing operators with HERMES_CUSTOM_* or OPENROUTER_API_KEY set
explicitly are unchanged — the bridge only fires when *both*
HERMES_CUSTOM_BASE_URL and HERMES_CUSTOM_API_KEY are empty.
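A minimal sketch of that bridge block as it could appear in both
entrypoints (guard and URL follow the prose; export vs. plain
assignment is an assumption):

```sh
# Fires only when the operator configured neither half of the bridge.
if [ "$PROVIDER" = "custom" ] \
    && [ -z "${HERMES_CUSTOM_BASE_URL:-}" ] && [ -z "${HERMES_CUSTOM_API_KEY:-}" ] \
    && [ -n "${OPENAI_API_KEY:-}" ]; then
  export HERMES_CUSTOM_BASE_URL="https://api.openai.com/v1"
  export HERMES_CUSTOM_API_KEY="$OPENAI_API_KEY"
fi
```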
## Verified
- derive-provider shell path:
OPENAI only → custom
OR set → openrouter
neither → openrouter (fallback for clear error)
- bash -n on all three scripts passes
- Symmetric install.sh (bare-host) and start.sh (Docker) bridges
## Root-cause context
Incident 2026-04-23: E2E Staging SaaS reached step 8/11 (A2A call)
for the first time after CP #236 deployed the install.sh hook. A2A
failed with 401 because staging CI only has MOLECULE_STAGING_OPENAI_KEY
(no OR key), and the slug routed through OpenRouter. This PR closes
that gap so an OpenAI-only workspace is usable out of the box.
When HERMES_INFERENCE_PROVIDER isn't set, hermes-agent's "auto"
detection is unreliable — particularly for MiniMax Token Plan + most
non-OpenRouter direct-SDK providers. Without the right provider,
hermes gateway rejects chats with "No LLM provider configured" or
routes to the wrong backend.
This is the last piece of the MVP pipeline that makes "customer
creates workspace with model = minimax/MiniMax-M2.7-highspeed + key
= sk-cp-..., just works" a one-click flow.
Changes:
- scripts/derive-provider.sh (NEW): sourced sub-script that reads
HERMES_DEFAULT_MODEL and sets $PROVIDER to the right value for
every provider hermes-agent supports. Explicit HERMES_INFERENCE_PROVIDER
in the env wins; otherwise the case statement handles: minimax/*,
minimax-cn/*, anthropic/*, gemini/*, deepseek/*, zai/*, kimi-coding/*
(+ cn variant), alibaba/*, dashscope/*, qwen/*, xiaomi/*, mimo/*,
arcee/*, nvidia/*, nim/*, ollama-cloud/*, huggingface/*, hf/*,
ai-gateway/*, kilocode/*, opencode-zen/*, opencode-go/*, openai/*
(routes through openrouter; hermes has no direct openai provider),
nousresearch/* (nous if HERMES_API_KEY or NOUS_API_KEY is set, else
openrouter), openrouter/*, custom/*. Unknown prefix → "auto". See
the sketch after this list.
- install.sh (bare-host path) + start.sh (Docker path): both now
source scripts/derive-provider.sh to populate $PROVIDER before
writing ~/.hermes/config.yaml.
- Dockerfile: COPY scripts/ /app/scripts/ so start.sh can find the
sub-script at the same path it lives on EC2 (/opt/adapter/scripts/
via git clone).
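A condensed sketch of the script's shape (a few representative arms;
each mapping shown matches the "Tested locally" list below):

```sh
#!/bin/sh
# derive-provider.sh: sourced by install.sh and start.sh; sets $PROVIDER.
if [ -n "${HERMES_INFERENCE_PROVIDER:-}" ]; then
  PROVIDER="$HERMES_INFERENCE_PROVIDER"          # explicit override always wins
else
  case "${HERMES_DEFAULT_MODEL:-}" in
    minimax/*)      PROVIDER=minimax ;;
    anthropic/*)    PROVIDER=anthropic ;;
    gemini/*)       PROVIDER=gemini ;;
    zai/*)          PROVIDER=zai ;;
    openai/*)       PROVIDER=openrouter ;;       # no direct openai provider
    nousresearch/*)
      if [ -n "${HERMES_API_KEY:-}${NOUS_API_KEY:-}" ]; then
        PROVIDER=nous
      else
        PROVIDER=openrouter
      fi ;;
    openrouter/*)   PROVIDER=openrouter ;;
    # ...remaining prefixes as listed above...
    *)              PROVIDER=auto ;;             # unknown prefix
  esac
fi
```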
This is the cross-path pattern the workspace-backends design doc
calls for: each template's backend-specific entrypoints share
common sub-recipes, so the shared logic evolves in one place.
Tested locally:
minimax/MiniMax-M2.7-highspeed → minimax
anthropic/claude-sonnet-4-5 → anthropic
openai/gpt-5-mini → openrouter
nousresearch/hermes-4-70b (w/ key) → nous
nousresearch/hermes-4-70b (no key) → openrouter
zai/glm-4.6 → zai
gemini/gemini-2.5-pro → gemini
bogus/model-v1 → auto
HERMES_INFERENCE_PROVIDER=custom (+anything) → custom
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>