Three small but real cleanups against hermes-agent v0.12.0
(NousResearch/hermes-agent, 2026-04-30):
1. Rename HERMES_DEFAULT_MODEL -> HERMES_INFERENCE_MODEL (upstream's
actual env name). Both names are read for one release cycle so
workspace-server (which still writes the legacy name) doesn't break;
the legacy fallback gets dropped in a follow-up PR once workspace-server
is updated (see the sketch after this list).
2. Drop HERMES_API_KEY from start.sh's .env heredoc. That var only feeds
hermes-agent's TUI gateway bridge, NOT any LLM provider. Provider
credentials go through OPENROUTER_API_KEY / OPENAI_API_KEY / etc.
3. Add 12 missing provider prefixes to derive-provider.sh so model slugs
like xai/grok-4, bedrock/anthropic.claude-sonnet-4, lmstudio/local,
copilot/gpt-4o, etc., route to the correct provider instead of
falling through to "auto" (also covered by the sketch below).
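A minimal sketch of cleanups 1 and 3 as they could land in
derive-provider.sh; the read location and the provider identifiers
shown (xai, bedrock, lmstudio, copilot) are assumptions mirroring the
slug prefixes:

```sh
# Cleanup 1: prefer the upstream env name; keep the legacy name as a
# one-release-cycle fallback for workspace-server.
MODEL="${HERMES_INFERENCE_MODEL:-${HERMES_DEFAULT_MODEL:-}}"

# Cleanup 3: new prefixes route explicitly instead of falling through to auto.
case "$MODEL" in
  xai/*)      PROVIDER=xai ;;       # provider ids assumed to mirror prefixes
  bedrock/*)  PROVIDER=bedrock ;;
  lmstudio/*) PROVIDER=lmstudio ;;
  copilot/*)  PROVIDER=copilot ;;
  # ...the remaining new and pre-existing arms are unchanged...
esac
```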
New tests/test_derive_provider.sh — 26 sh-style assertions covering the
legacy fallback, the precedence rule, all 12 new providers, and a few
regression cases for adjacent prefixes (minimax vs minimax-oauth, qwen
vs qwen-oauth, alibaba vs alibaba-coding-plan).
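Shape of the assertions, sketched (the assert_provider helper is
hypothetical, not necessarily what the file ships):

```sh
# Illustrative harness shape; names in the shipped file may differ.
assert_provider() {  # usage: assert_provider <model-slug> <expected-provider>
  PROVIDER='' HERMES_INFERENCE_PROVIDER='' HERMES_DEFAULT_MODEL=''
  HERMES_INFERENCE_MODEL=$1
  . ./scripts/derive-provider.sh
  [ "$PROVIDER" = "$2" ] || { echo "FAIL: $1 -> $PROVIDER (want $2)"; exit 1; }
}

assert_provider xai/grok-4 xai                # one of the 12 new prefixes
assert_provider minimax/MiniMax-M2.7 minimax  # adjacent-prefix regression guard

# Legacy fallback: the old env name still routes when the new one is unset.
PROVIDER='' HERMES_INFERENCE_MODEL='' HERMES_DEFAULT_MODEL=xai/grok-4
. ./scripts/derive-provider.sh
[ "$PROVIDER" = xai ] || { echo 'FAIL: legacy fallback'; exit 1; }
```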
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spawns a real hermes gateway run + a stub OpenAI-compat LLM server +
the real executor's reply server, and routes a message through every
hop of the production chain except platform-side peer-message routing:
HermesAgentProxyExecutor.execute()
→ POST /a2a/inbound (hermes plugin)
→ MessageEvent dispatch through hermes pipeline
→ stub LLM /v1/chat/completions
→ plugin send() POSTs reply to executor /a2a/reply
→ execute() Future resolves → emits on event_queue
This is the highest-fidelity local approximation of staging E2E.
Caught a real KeyError in upstream hermes-agent's hermes_cli/tools_config.py
that no in-process test surfaced. Asserts the wire shape works end to
end + guards against the KeyError regression. The reply CONTENT
depends on whether the stub speaks hermes' multi-turn tool loop, so
we don't assert on it — what matters is the full pipeline routes
through the plugin and back.
Run:
/Users/hongming/.hermes/hermes-agent/venv/bin/python3 \
scripts/e2e_full_chain.py
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the gap where CP user-data (PR-3, task #197) writes
runtime_config.{model,provider} into /configs/config.yaml but start.sh
only reads HERMES_DEFAULT_MODEL / HERMES_INFERENCE_PROVIDER env vars
that CP doesn't set. Result: every CP-provisioned hermes workspace
booted with the built-in `nousresearch/hermes-4-70b` default and
500'd at first prompt with "No LLM provider configured" — visible
in the 2026-04-30 hongmingwang tenant screenshots.
New scripts/load-workspace-config.sh, sourced by start.sh before the
existing DEFAULT_MODEL/PROVIDER derivation. Reads /configs/config.yaml
via python3 + PyYAML and exports HERMES_DEFAULT_MODEL +
HERMES_INFERENCE_PROVIDER if they're not already set (see the sketch
after the lists below).
Precedence (highest to lowest):
1. HERMES_* env vars (operator override via workspace secrets)
2. /configs/config.yaml runtime_config.{model,provider} (canvas UI)
3. start.sh hard-coded fallback (nousresearch/hermes-4-70b)
Resilience:
- Missing config.yaml → silent skip (dev containers)
- Malformed YAML → silent skip (don't kill boot)
- python3 missing → silent skip
- PyYAML missing → silent skip
- Empty/non-dict runtime_config → silent skip
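A minimal sketch of the loader under those rules (variable names and
the embedded Python are illustrative, not the shipped code):

```sh
# load-workspace-config.sh: sourced by start.sh; every failure is a silent skip.
cfg=/configs/config.yaml
if [ -f "$cfg" ] && command -v python3 >/dev/null 2>&1; then
  # Prints model on line 1 and provider on line 2; prints nothing on any
  # error (malformed YAML, missing PyYAML, non-dict runtime_config).
  out=$(python3 - "$cfg" 2>/dev/null <<'PY'
import sys
try:
    import yaml
    rc = (yaml.safe_load(open(sys.argv[1])) or {}).get("runtime_config")
    if isinstance(rc, dict):
        print(rc.get("model") or "")     # print() coerces non-string scalars
        print(rc.get("provider") or "")
except Exception:
    pass
PY
)
  model=$(printf '%s\n' "$out" | sed -n 1p)
  provider=$(printf '%s\n' "$out" | sed -n 2p)
  # Precedence: env vars (operator override) win; config.yaml only fills gaps.
  if [ -z "${HERMES_DEFAULT_MODEL:-}" ] && [ -n "$model" ]; then
    export HERMES_DEFAULT_MODEL="$model"
  fi
  if [ -z "${HERMES_INFERENCE_PROVIDER:-}" ] && [ -n "$provider" ]; then
    export HERMES_INFERENCE_PROVIDER="$provider"
  fi
fi
```

Sourcing (rather than executing) keeps the exports in start.sh's
environment, so the existing hard-coded fallback still applies last.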
Tests: scripts/test-load-workspace-config.sh — 11 cases covering all
silent-skip paths + happy paths + operator override + non-string scalar
coercion. Existing scripts/test-derive-provider.sh (12 cases) re-verified.
Wires shell tests into CI via a new shell-tests job; those tests
weren't running anywhere before, so this opportunistically closes that gap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The install.sh "OpenAI bridge" block bundled two distinct concerns under
one guard:
1. Auto-fill HERMES_CUSTOM_{BASE_URL,API_KEY,API_MODE} when operator
didn't set them
2. Strip the openai/ prefix from DEFAULT_MODEL (OpenAI rejects prefixed
model IDs with 400 "invalid model ID")
Both only fired when the operator had NOT pre-configured HERMES_CUSTOM_*.
That broke molecule-core#1987: the staging E2E now pins HERMES_CUSTOM_*
explicitly (to work around derive-provider.sh's #19 fix not reaching all
tenants). The pin skipped concern (1) intentionally — but also skipped
(2) unintentionally. Result: E2E routes to api.openai.com with the wrong
model name and hits 400.
Fix: separate the two concerns.
- (A) Auto-fill block keeps its original guard — runs only when operator
didn't configure.
- (B) New independent block: strip openai/ iff the FINAL
HERMES_CUSTOM_BASE_URL matches api.openai.com. Regex is anchored
(^https?://api\.openai\.com(/|$)) so lookalike domains
(api.openai.com.evil.internal, beta.api.openai.com) do NOT match.
Idempotent on already-bare model names.
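A hedged sketch of block (B); the regex is the one quoted above, the
surrounding shell is assumed:

```sh
# (B) Strip openai/ iff the FINAL base URL is really api.openai.com.
# Anchored regex: lookalike hosts (api.openai.com.evil.internal,
# beta.api.openai.com) do not match. ${VAR#openai/} is a no-op on
# already-bare names, so the block is idempotent.
if printf '%s' "${HERMES_CUSTOM_BASE_URL:-}" \
    | grep -Eq '^https?://api\.openai\.com(/|$)'; then
  DEFAULT_MODEL="${DEFAULT_MODEL#openai/}"
fi
```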
Verified via scripts/test-install-prefix-strip.sh — 10 cases including:
A default bridge strips openai/ → gpt-4o
B operator-pinned OpenAI URL also strips → gpt-4o (#1987 path, was broken)
C vLLM URL keeps prefix → openai/my-finetune
D openrouter keeps prefix → openai/gpt-4o
E minimax untouched → minimax/MiniMax-M2.7
G lookalike domain NOT stripped → openai/gpt-4o (anti-spoofing)
H http://api.openai.com also strips → gpt-4o
I subdomain beta.api.openai.com NOT stripped → openai/gpt-4o
All 10 pass. Plus a parity check greps install.sh to ensure the inlined
logic matches what ships.
No behavioral change for any existing working config (scenarios A, C, D,
E, F above — all unchanged). Only fixes the broken scenario B.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Root cause of the 2026-04-23 E2E A2A regression: when a workspace's
model is openai/* and the tenant has an OPENROUTER_API_KEY set globally
(common on SaaS staging), derive-provider.sh was picking PROVIDER=openrouter
even when the WORKSPACE-level secret OPENAI_API_KEY was explicitly provided
for the direct-OpenAI path.
hermes then called OpenRouter with a key that was stale/empty for this
workspace, and OR returned `{"error": {"message": "Missing Authentication
header", "code": 401}}` — which surfaced in the A2A agent reply and
failed the E2E at step 8.
Fix: flip the priority. For openai/* model slugs, prefer `custom` (direct
OpenAI via install.sh's HERMES_CUSTOM_* bridge) when OPENAI_API_KEY is
present; fall through to `openrouter` only when OPENAI_API_KEY is absent.
See the sketch after the list below.
Operators who want OR for openai/* models can still:
- set HERMES_INFERENCE_PROVIDER=openrouter (wins via the explicit override at top of file), or
- use an openrouter/* model slug
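The flipped arm, sketched (surrounding case statement elided; exact
formatting in the shipped script may differ):

```sh
case "$MODEL" in
  openai/*)
    # Workspace-scoped OPENAI_API_KEY now outranks a tenant-global
    # OPENROUTER_API_KEY; an explicit HERMES_INFERENCE_PROVIDER was
    # already honored earlier in the script.
    if [ -n "${OPENAI_API_KEY:-}" ]; then
      PROVIDER=custom      # direct OpenAI via the HERMES_CUSTOM_* bridge
    else
      PROVIDER=openrouter  # no OpenAI key: OpenRouter, as before
    fi
    ;;
  # ...other arms unchanged...
esac
```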
Adds scripts/test-derive-provider.sh — 12 offline shell assertions
pinning the decision table including the exact #19 regression case.
Acceptance: E2E step 8 A2A returns a real PONG reply instead of OR's
401-shaped error from the agent response.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hermes-agent upstream has no direct "openai" provider — its own
.env.example states "All LLM calls go through OpenRouter". The
template's derive-provider.sh followed that: `openai/*` → `openrouter`
always. That left a real gap for workspaces whose sole credential is
an OpenAI API key (no OpenRouter subscription) — hermes would route
the call through OpenRouter, OpenRouter would reject the OpenAI key,
and the user got "401 Invalid API key" on the first A2A turn.
Hermes does expose a "custom" provider with base_url + api_key
overrides (already plumbed via HERMES_CUSTOM_BASE_URL /
HERMES_CUSTOM_API_KEY env vars). This PR teaches the template to
auto-populate that bridge when the operator ships only an OpenAI key:
scripts/derive-provider.sh
openai/* now maps to:
- openrouter, when OPENROUTER_API_KEY is set (unchanged path)
- custom, when only OPENAI_API_KEY is set (new path)
- openrouter, as a no-key fallback so hermes errors clearly
install.sh + start.sh (bare-host + Docker — kept in sync)
When PROVIDER="custom" and HERMES_CUSTOM_* aren't explicitly set
but OPENAI_API_KEY is, auto-set:
HERMES_CUSTOM_BASE_URL=https://api.openai.com/v1
HERMES_CUSTOM_API_KEY=$OPENAI_API_KEY
The config.yaml's model.{base_url,api_key} pick these up
verbatim so hermes calls OpenAI directly.
Existing operators with HERMES_CUSTOM_* or OPENROUTER_API_KEY set
explicitly are unchanged — the bridge only fires when *both*
HERMES_CUSTOM_BASE_URL and HERMES_CUSTOM_API_KEY are empty.
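A minimal sketch of that bridge block as it could appear in both
entrypoints (guard and URL follow the prose; export vs. plain
assignment is an assumption):

```sh
# Fires only when the operator configured neither half of the bridge.
if [ "$PROVIDER" = "custom" ] \
    && [ -z "${HERMES_CUSTOM_BASE_URL:-}" ] && [ -z "${HERMES_CUSTOM_API_KEY:-}" ] \
    && [ -n "${OPENAI_API_KEY:-}" ]; then
  export HERMES_CUSTOM_BASE_URL="https://api.openai.com/v1"
  export HERMES_CUSTOM_API_KEY="$OPENAI_API_KEY"
fi
```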
## Verified
- derive-provider shell path:
OPENAI only → custom
OR set → openrouter
neither → openrouter (fallback for clear error)
- bash -n on all three scripts passes
- Symmetric install.sh (bare-host) and start.sh (Docker) bridges
## Root-cause context
Incident 2026-04-23: E2E Staging SaaS reached step 8/11 (A2A call)
for the first time after CP #236 deployed the install.sh hook. A2A
failed with 401 because staging CI only has MOLECULE_STAGING_OPENAI_KEY
(no OR key), and the slug routed through OpenRouter. This PR closes
that gap so an OpenAI-only workspace is usable out of the box.
When HERMES_INFERENCE_PROVIDER isn't set, hermes-agent's "auto"
detection is unreliable — particularly for MiniMax Token Plan + most
non-OpenRouter direct-SDK providers. Without the right provider,
hermes gateway rejects chats with "No LLM provider configured" or
routes to the wrong backend.
This is the last piece of the MVP pipeline that makes "customer
creates workspace with model = minimax/MiniMax-M2.7-highspeed + key
= sk-cp-..., just works" a one-click flow.
Changes:
- scripts/derive-provider.sh (NEW): sourced sub-script that reads
HERMES_DEFAULT_MODEL and sets $PROVIDER to the right value for
every provider hermes-agent supports. Explicit HERMES_INFERENCE_PROVIDER
in the env wins; otherwise the case statement handles: minimax/*,
minimax-cn/*, anthropic/*, gemini/*, deepseek/*, zai/*, kimi-coding/*
(+ cn variant), alibaba/*, dashscope/*, qwen/*, xiaomi/*, mimo/*,
arcee/*, nvidia/*, nim/*, ollama-cloud/*, huggingface/*, hf/*,
ai-gateway/*, kilocode/*, opencode-zen/*, opencode-go/*, openai/*
(routes through openrouter; hermes has no direct openai provider),
nousresearch/* (nous if HERMES_API_KEY or NOUS_API_KEY is set, else
openrouter), openrouter/*, custom/*. Unknown prefix → "auto". See
the sketch after this list.
- install.sh (bare-host path) + start.sh (Docker path): both now
source scripts/derive-provider.sh to populate $PROVIDER before
writing ~/.hermes/config.yaml.
- Dockerfile: COPY scripts/ /app/scripts/ so start.sh can find the
sub-script at the same path it lives on EC2 (/opt/adapter/scripts/
via git clone).
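A condensed sketch of the script's shape (a few representative arms;
each mapping shown matches the "Tested locally" list below):

```sh
#!/bin/sh
# derive-provider.sh: sourced by install.sh and start.sh; sets $PROVIDER.
if [ -n "${HERMES_INFERENCE_PROVIDER:-}" ]; then
  PROVIDER="$HERMES_INFERENCE_PROVIDER"          # explicit override always wins
else
  case "${HERMES_DEFAULT_MODEL:-}" in
    minimax/*)      PROVIDER=minimax ;;
    anthropic/*)    PROVIDER=anthropic ;;
    gemini/*)       PROVIDER=gemini ;;
    zai/*)          PROVIDER=zai ;;
    openai/*)       PROVIDER=openrouter ;;       # no direct openai provider
    nousresearch/*)
      if [ -n "${HERMES_API_KEY:-}${NOUS_API_KEY:-}" ]; then
        PROVIDER=nous
      else
        PROVIDER=openrouter
      fi ;;
    openrouter/*)   PROVIDER=openrouter ;;
    # ...remaining prefixes as listed above...
    *)              PROVIDER=auto ;;             # unknown prefix
  esac
fi
```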
This is the cross-path pattern the workspace-backends design doc
calls for: each template's backend-specific entrypoints share
common sub-recipes, so the shared logic evolves in one place.
Tested locally:
minimax/MiniMax-M2.7-highspeed → minimax
anthropic/claude-sonnet-4-5 → anthropic
openai/gpt-5-mini → openrouter
nousresearch/hermes-4-70b (w/ key) → nous
nousresearch/hermes-4-70b (no key) → openrouter
zai/glm-4.6 → zai
gemini/gemini-2.5-pro → gemini
bogus/model-v1 → auto
HERMES_INFERENCE_PROVIDER=custom (+anything) → custom
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>