Commit Graph

7 Commits

Author SHA1 Message Date
Hongming Wang
ae3bf3196e fix(hermes): align env vars with upstream + add 12 missing providers
Three small but real cleanups against hermes-agent v0.12.0
(NousResearch/hermes-agent, 2026-04-30):

1. Rename HERMES_DEFAULT_MODEL -> HERMES_INFERENCE_MODEL (upstream's
   actual env name). Reads BOTH for one release cycle so workspace-server
   (which still writes the legacy name) doesn't break — drop the legacy
   fallback after workspace-server is updated in a follow-up PR.

2. Drop HERMES_API_KEY from start.sh's .env heredoc. That var only feeds
   hermes-agent's TUI gateway bridge, NOT any LLM provider. Provider
   credentials go through OPENROUTER_API_KEY / OPENAI_API_KEY / etc.

3. Add 12 missing provider prefixes to derive-provider.sh so model slugs
   like xai/grok-4, bedrock/anthropic.claude-sonnet-4, lmstudio/local,
   copilot/gpt-4o, etc., route to the correct provider instead of
   falling through to "auto".
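The one-cycle dual-read in (1) can be sketched as follows (a minimal illustration, not the shipped code; the helper name is mine):

```shell
# Sketch of the dual-read in (1): the new env name wins, and the legacy
# name that workspace-server still writes is the fallback for one release.
resolve_model() {
  printf '%s' "${HERMES_INFERENCE_MODEL:-${HERMES_DEFAULT_MODEL:-}}"
}
```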

New tests/test_derive_provider.sh — 26 sh-style assertions covering the
legacy fallback, the precedence rule, all 12 new providers, and a few
regression cases for adjacent prefixes (minimax vs minimax-oauth, qwen
vs qwen-oauth, alibaba vs alibaba-coding-plan).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 19:23:43 -07:00
Hongming Wang
96c25fd168 test(e2e): full-chain local validation against real hermes gateway subprocess
Spawns a real hermes gateway run + a stub OpenAI-compat LLM server +
the real executor's reply server, and routes a message through every
hop of the production chain except platform-side peer-message routing:

  HermesAgentProxyExecutor.execute()
    → POST /a2a/inbound (hermes plugin)
      → MessageEvent dispatch through hermes pipeline
        → stub LLM /v1/chat/completions
      → plugin send() POSTs reply to executor /a2a/reply
    → execute() Future resolves → emits on event_queue

This is the highest-fidelity local approximation of staging E2E.
Caught a real KeyError in upstream hermes' hermes_cli/tools_config.py
that no in-process test surfaced. Asserts the wire shape works end to
end + guards against the KeyError regression. The reply CONTENT
depends on whether the stub speaks hermes' multi-turn tool loop, so
we don't assert on it — what matters is the full pipeline routes
through the plugin and back.

Run:
  /Users/hongming/.hermes/hermes-agent/venv/bin/python3 \
      scripts/e2e_full_chain.py

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 04:08:27 -07:00
Hongming Wang
43c2569faa fix(start.sh): read provider/model from /configs/config.yaml (Option B PR-4)
Closes the gap where CP user-data (PR-3, task #197) writes
runtime_config.{model,provider} into /configs/config.yaml but start.sh
only reads HERMES_DEFAULT_MODEL / HERMES_INFERENCE_PROVIDER env vars
that CP doesn't set. Result: every CP-provisioned hermes workspace
booted with the built-in `nousresearch/hermes-4-70b` default and
500'd at first prompt with "No LLM provider configured" — visible
in the 2026-04-30 hongmingwang tenant screenshots.

New scripts/load-workspace-config.sh, sourced by start.sh before the
existing DEFAULT_MODEL/PROVIDER derivation. Reads /configs/config.yaml
via python3 + PyYAML and exports HERMES_DEFAULT_MODEL +
HERMES_INFERENCE_PROVIDER if they're not already set.

Precedence (highest to lowest):
  1. HERMES_* env vars (operator override via workspace secrets)
  2. /configs/config.yaml runtime_config.{model,provider} (canvas UI)
  3. start.sh hard-coded fallback (nousresearch/hermes-4-70b)

Resilience:
  - Missing config.yaml → silent skip (dev containers)
  - Malformed YAML → silent skip (don't kill boot)
  - python3 missing → silent skip
  - PyYAML missing → silent skip
  - Empty/non-dict runtime_config → silent skip
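The precedence chain condenses to something like this (an illustrative sketch: the real script parses YAML with python3 + PyYAML, while this uses a naive sed stand-in purely so the snippet is self-contained; the function name is mine):

```shell
# Condensed sketch of load-workspace-config.sh's precedence:
#   env var > /configs/config.yaml runtime_config.model > hard-coded fallback.
# sed stands in for the real python3 + PyYAML read.
resolve_workspace_model() {
  # $1: path to config.yaml; echoes the effective model slug.
  cfg_model=""
  if [ -f "$1" ]; then
    cfg_model="$(sed -n 's/^ *model: *//p' "$1" 2>/dev/null | head -n 1)"
  fi
  printf '%s' "${HERMES_DEFAULT_MODEL:-${cfg_model:-nousresearch/hermes-4-70b}}"
}
```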

Tests: scripts/test-load-workspace-config.sh — 11 cases covering all
silent-skip paths + happy paths + operator override + non-string scalar
coercion. Existing scripts/test-derive-provider.sh (12 cases) re-verified.

Wires shell tests into CI via a new shell-tests job; those tests
weren't running anywhere before, so this opportunistically closes that gap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 23:52:00 -07:00
Hongming Wang
dbf86a142e fix(install): decouple openai/ prefix strip from HERMES_CUSTOM_* auto-fill
The install.sh "OpenAI bridge" block bundled two distinct concerns under
one guard:

  1. Auto-fill HERMES_CUSTOM_{BASE_URL,API_KEY,API_MODE} when operator
     didn't set them
  2. Strip the openai/ prefix from DEFAULT_MODEL (OpenAI rejects prefixed
     model IDs with 400 "invalid model ID")

Both only fired when the operator had NOT pre-configured HERMES_CUSTOM_*.
That broke molecule-core#1987: the staging E2E now pins HERMES_CUSTOM_*
explicitly (to work around derive-provider.sh's #19 fix not reaching all
tenants). The pin skipped concern (1) intentionally — but also skipped
(2) unintentionally. Result: E2E routes to api.openai.com with the wrong
model name and hits 400.

Fix: separate the two concerns.

- (A) Auto-fill block keeps its original guard — runs only when operator
  didn't configure.
- (B) New independent block: strip openai/ iff the FINAL
  HERMES_CUSTOM_BASE_URL matches api.openai.com. Regex is anchored
  (^https?://api\.openai\.com(/|$)) so lookalike domains
  (api.openai.com.evil.internal, beta.api.openai.com) do NOT match.
  Idempotent on already-bare model names.
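Block (B) can be sketched like this, using the anchored regex quoted above (the function wrapper is illustrative, not the shipped inline code):

```shell
# Sketch of block (B): strip openai/ only when the FINAL base URL really
# is api.openai.com. The anchored ERE rejects lookalike hosts.
strip_openai_prefix() {
  base_url="$1" model="$2"
  if printf '%s\n' "$base_url" | grep -Eq '^https?://api\.openai\.com(/|$)'; then
    model="${model#openai/}"   # idempotent on already-bare model names
  fi
  printf '%s' "$model"
}
```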

Verified via scripts/test-install-prefix-strip.sh — 10 cases including:
  A  default bridge strips openai/           → gpt-4o
  B  operator-pinned OpenAI URL also strips  → gpt-4o   (#1987 path, was broken)
  C  vLLM URL keeps prefix                   → openai/my-finetune
  D  openrouter keeps prefix                 → openai/gpt-4o
  E  minimax untouched                       → minimax/MiniMax-M2.7
  G  lookalike domain NOT stripped           → openai/gpt-4o (anti-spoofing)
  H  http://api.openai.com also strips       → gpt-4o
  I  subdomain beta.api.openai.com NOT strip → openai/gpt-4o

All 10 pass. Plus a parity check greps install.sh to ensure the inlined
logic matches what ships.

No behavioral change for any existing working config (scenarios A, C, D,
E, F above — all unchanged). Only fixes the broken scenario B.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:27:33 -07:00
Hongming Wang
2eddb36bff fix(derive-provider): prefer custom provider when OPENAI_API_KEY set for openai/* models
Root cause of the 2026-04-23 E2E A2A regression: when a workspace's
model is openai/* and the tenant has an OPENROUTER_API_KEY set globally
(common on SaaS staging), derive-provider.sh was picking PROVIDER=openrouter
even when the WORKSPACE-level secret OPENAI_API_KEY was explicitly provided
for the direct-OpenAI path.

hermes then called OpenRouter with a key that was stale/empty for this
workspace, and OR returned `{"error": {"message": "Missing Authentication
header", "code": 401}}` — which surfaced in the A2A agent reply and
failed the E2E at step 8.

Fix: flip the priority. For openai/* model slugs, prefer `custom` (direct
OpenAI via install.sh's HERMES_CUSTOM_* bridge) when OPENAI_API_KEY is
present. Fall through to `openrouter` only when OPENAI_API_KEY is absent.

Operators who want OR for openai/* models can still:
  - set HERMES_INFERENCE_PROVIDER=openrouter (wins via the explicit override at top of file), or
  - use an openrouter/* model slug
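The flipped branch reduces to something like this sketch (function wrapper is mine; the explicit HERMES_INFERENCE_PROVIDER override still wins earlier in the real script):

```shell
# Sketch of the flipped openai/* branch: a workspace-level OPENAI_API_KEY
# now beats a tenant-global OPENROUTER_API_KEY.
derive_openai_provider() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    printf 'custom'       # direct OpenAI via the HERMES_CUSTOM_* bridge
  else
    printf 'openrouter'   # no OpenAI key: fall through to OpenRouter
  fi
}
```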

Adds scripts/test-derive-provider.sh — 12 offline shell assertions
pinning the decision table including the exact #19 regression case.

Acceptance: the E2E step 8 A2A call returns a real PONG reply instead of
OR's 401-shaped error surfacing in the agent response.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 18:27:43 -07:00
Hongming Wang
6c55ee0ed8 fix: bridge OPENAI_API_KEY → custom provider when only OpenAI key set
Hermes-agent upstream has no direct "openai" provider — its own
.env.example states "All LLM calls go through OpenRouter". The
template's derive-provider.sh followed that: `openai/*` → `openrouter`
always. That left a real gap for workspaces whose sole credential is
an OpenAI API key (no OpenRouter subscription) — hermes would route
the call through OpenRouter, OpenRouter would reject the OpenAI key,
and the user got "401 Invalid API key" on the first A2A turn.

Hermes does expose a "custom" provider with base_url + api_key
overrides (already plumbed via HERMES_CUSTOM_BASE_URL /
HERMES_CUSTOM_API_KEY env vars). This PR teaches the template to
auto-populate that bridge when the operator ships only an OpenAI key:

  scripts/derive-provider.sh
    openai/* now maps to:
      - openrouter, when OPENROUTER_API_KEY is set (unchanged path)
      - custom,     when only OPENAI_API_KEY is set  (new path)
      - openrouter, as a no-key fallback so hermes errors clearly

  install.sh + start.sh (bare-host + Docker — kept in sync)
    When PROVIDER="custom" and HERMES_CUSTOM_* aren't explicitly set
    but OPENAI_API_KEY is, auto-set:
      HERMES_CUSTOM_BASE_URL=https://api.openai.com/v1
      HERMES_CUSTOM_API_KEY=$OPENAI_API_KEY
    The config.yaml's model.{base_url,api_key} pick these up
    verbatim so hermes calls OpenAI directly.

Existing operators with HERMES_CUSTOM_* or OPENROUTER_API_KEY set
explicitly are unchanged — the bridge only fires when *both*
HERMES_CUSTOM_BASE_URL and HERMES_CUSTOM_API_KEY are empty.
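Sketched, the guard above looks roughly like this (the function wrapper is illustrative; install.sh and start.sh inline the equivalent logic):

```shell
# Sketch of the install.sh/start.sh bridge: fires only when BOTH
# HERMES_CUSTOM_* vars are empty and an OpenAI key is present.
maybe_bridge_openai() {
  if [ "${PROVIDER:-}" = "custom" ] \
     && [ -z "${HERMES_CUSTOM_BASE_URL:-}" ] \
     && [ -z "${HERMES_CUSTOM_API_KEY:-}" ] \
     && [ -n "${OPENAI_API_KEY:-}" ]; then
    export HERMES_CUSTOM_BASE_URL="https://api.openai.com/v1"
    export HERMES_CUSTOM_API_KEY="$OPENAI_API_KEY"
  fi
}
```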

## Verified
- derive-provider shell path:
    OPENAI only   → custom
    OR set        → openrouter
    neither       → openrouter (fallback for clear error)
- bash -n on all three scripts passes
- Symmetric install.sh (bare-host) and start.sh (Docker) bridges

## Root-cause context
Incident 2026-04-23: E2E Staging SaaS reached step 8/11 (A2A call)
for the first time after CP #236 deployed the install.sh hook. A2A
failed with 401 because staging CI only has MOLECULE_STAGING_OPENAI_KEY
(no OR key), and the slug routed through OpenRouter. This PR closes
that gap so an OpenAI-only workspace is usable out of the box.
2026-04-23 01:28:40 -07:00
Hongming Wang
16de6351dd feat: derive hermes provider from model slug prefix (both paths)
When HERMES_INFERENCE_PROVIDER isn't set, hermes-agent's "auto"
detection is unreliable — particularly for MiniMax Token Plan + most
non-OpenRouter direct-SDK providers. Without the right provider,
hermes gateway rejects chats with "No LLM provider configured" or
routes to the wrong backend.

This is the last piece of the MVP pipeline that makes "customer
creates workspace with model = minimax/MiniMax-M2.7-highspeed + key
= sk-cp-..., just works" a one-click flow.

Changes:

- scripts/derive-provider.sh (NEW): sourced sub-script that reads
  HERMES_DEFAULT_MODEL and sets $PROVIDER to the right value for
  every provider hermes-agent supports. Explicit HERMES_INFERENCE_PROVIDER
  in the env wins; otherwise the case statement handles: minimax/*,
  minimax-cn/*, anthropic/*, gemini/*, deepseek/*, zai/*, kimi-coding/*
  (+ cn variant), alibaba/* dashscope/* qwen/*, xiaomi/* mimo/*, arcee/*,
  nvidia/* nim/*, ollama-cloud/*, huggingface/* hf/*, ai-gateway/*,
  kilocode/*, opencode-zen/*, opencode-go/*, openai/* (routes through
  openrouter — hermes has no direct openai provider), nousresearch/*
  (nous if HERMES_API_KEY or NOUS_API_KEY is set, else openrouter),
  openrouter/*, custom/*. Unknown prefix → "auto".

- install.sh (bare-host path) + start.sh (Docker path): both now
  source scripts/derive-provider.sh to populate $PROVIDER before
  writing ~/.hermes/config.yaml.

- Dockerfile: COPY scripts/ /app/scripts/ so start.sh can find the
  sub-script at the same path it lives on EC2 (/opt/adapter/scripts/
  via git clone).

This is the cross-path pattern the workspace-backends design doc
calls for: each template's backend-specific entrypoints share
common sub-recipes, so evolving the logic lives in one place.
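A condensed sketch of the dispatch (a small subset of the full table above; the function wrapper is mine, and the real script is sourced rather than called):

```shell
# Condensed sketch of derive-provider.sh's dispatch. An explicit
# HERMES_INFERENCE_PROVIDER in the env wins; otherwise the model slug's
# prefix decides; unknown prefixes fall through to "auto".
derive_provider() {
  if [ -n "${HERMES_INFERENCE_PROVIDER:-}" ]; then
    printf '%s' "$HERMES_INFERENCE_PROVIDER"; return
  fi
  case "${HERMES_DEFAULT_MODEL:-}" in
    minimax/*|minimax-cn/*) printf 'minimax' ;;
    anthropic/*)            printf 'anthropic' ;;
    gemini/*)               printf 'gemini' ;;
    zai/*)                  printf 'zai' ;;
    openai/*)               printf 'openrouter' ;;  # no direct openai provider
    nousresearch/*)
      if [ -n "${HERMES_API_KEY:-}" ] || [ -n "${NOUS_API_KEY:-}" ]; then
        printf 'nous'
      else
        printf 'openrouter'
      fi ;;
    *)                      printf 'auto' ;;
  esac
}
```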

Tested locally:
  minimax/MiniMax-M2.7-highspeed        → minimax
  anthropic/claude-sonnet-4-5           → anthropic
  openai/gpt-5-mini                     → openrouter
  nousresearch/hermes-4-70b (w/ key)    → nous
  nousresearch/hermes-4-70b (no key)    → openrouter
  zai/glm-4.6                           → zai
  gemini/gemini-2.5-pro                 → gemini
  bogus/model-v1                        → auto
  HERMES_INFERENCE_PROVIDER=custom (+anything) → custom

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 16:16:27 -07:00