molecule-ai-workspace-templ.../install.sh
Hongming Wang ae3bf3196e fix(hermes): align env vars with upstream + add 12 missing providers
Three small but real cleanups against hermes-agent v0.12.0
(NousResearch/hermes-agent, 2026-04-30):

1. Rename HERMES_DEFAULT_MODEL -> HERMES_INFERENCE_MODEL (upstream's
   actual env name). Reads BOTH for one release cycle so workspace-server
   (which still writes the legacy name) doesn't break — drop the legacy
   fallback after workspace-server is updated in a follow-up PR.

2. Drop HERMES_API_KEY from start.sh's .env heredoc. That var only feeds
   hermes-agent's TUI gateway bridge, NOT any LLM provider. Provider
   credentials go through OPENROUTER_API_KEY / OPENAI_API_KEY / etc.

3. Add 12 missing provider prefixes to derive-provider.sh so model slugs
   like xai/grok-4, bedrock/anthropic.claude-sonnet-4, lmstudio/local,
   copilot/gpt-4o, etc., route to the correct provider instead of
   falling through to "auto".
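
The fallback in item 1 comes down to a nested parameter expansion. A minimal sketch, assuming the same built-in default slug that install.sh uses; `resolve_model` is a hypothetical helper name for illustration and does not exist in the repo:

```shell
#!/usr/bin/env bash
# Hypothetical helper illustrating the precedence rule:
#   1. HERMES_INFERENCE_MODEL (upstream's name) wins when set and non-empty
#   2. otherwise the legacy HERMES_DEFAULT_MODEL is used
#   3. otherwise a built-in default slug applies
resolve_model() {
  printf '%s\n' "${HERMES_INFERENCE_MODEL:-${HERMES_DEFAULT_MODEL:-nousresearch/hermes-4-70b}}"
}
```

Because `:-` treats an empty value like an unset one, a workspace-server that exports an empty HERMES_INFERENCE_MODEL would still fall back to the legacy variable.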

New tests/test_derive_provider.sh — 26 sh-style assertions covering the
legacy fallback, the precedence rule, all 12 new providers, and a few
regression cases for adjacent prefixes (minimax vs minimax-oauth, qwen
vs qwen-oauth, alibaba vs alibaba-coding-plan).
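
For readers unfamiliar with sh-style assertions, the sketch below shows roughly what such a test can look like. It is not the repo's tests/test_derive_provider.sh or derive-provider.sh: `derive_provider`, `assert_eq`, and the provider names on the right-hand side of each case arm are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Illustrative stand-in for prefix-based provider routing.
# The provider names echoed here are assumed, not taken from hermes-agent.
derive_provider() {
  case "$1" in
    minimax-oauth/*) echo "minimax-oauth" ;;
    minimax/*)       echo "minimax" ;;
    xai/*)           echo "xai" ;;
    *)               echo "auto" ;;   # unknown prefixes fall through
  esac
}

FAILURES=0

# assert_eq <label> <expected> <actual>
assert_eq() {
  if [ "$2" = "$3" ]; then
    echo "ok   - $1"
  else
    echo "FAIL - $1 (expected '$2', got '$3')"
    FAILURES=$((FAILURES + 1))
  fi
}

assert_eq "xai prefix"            "xai"           "$(derive_provider xai/grok-4)"
assert_eq "minimax prefix"        "minimax"       "$(derive_provider minimax/some-model)"
assert_eq "minimax-oauth prefix"  "minimax-oauth" "$(derive_provider minimax-oauth/some-model)"
assert_eq "unknown falls through" "auto"          "$(derive_provider unknown/model)"
```

Including the trailing `/` in each pattern is what keeps adjacent prefixes such as minimax and minimax-oauth from shadowing each other, whatever order the arms appear in.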

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 19:23:43 -07:00


#!/usr/bin/env bash
# install.sh — set up hermes-agent on a bare-host workspace (EC2 /
# bare-metal / any OS-level install). Runs as the workspace's runtime
# user (typically `ubuntu` on EC2) AFTER molecule-ai-workspace-runtime
# has been pip-installed and this repo's *.py adapter files have been
# copied into site-packages, BEFORE molecule-runtime is started.
#
# This is the symmetric twin of start.sh. start.sh is the entrypoint
# of the Docker image used for local dev (`docker compose up`).
# install.sh is what the SaaS EC2 provisioner calls on the host.
#
# Both do the same high-level work:
# 1. Install the real hermes-agent from NousResearch/hermes-agent
# 2. Seed ~/.hermes/.env (provider keys, API_SERVER_*)
# 3. Seed ~/.hermes/config.yaml (default model + provider)
# 4. Start `hermes gateway` in the background
# 5. Wait until :8642 /health returns 200
#
# Architectural context: each workspace template ships both recipes
# because the control plane picks different code paths depending on
# backend. See internal/product/designs/workspace-backends.md for the
# manifest-driven backend-selection design that subsumes this dual
# setup.
#
# Idempotent: safe to re-run. Kills any prior gateway process for
# this user before starting a fresh one.
set -euo pipefail
HERMES_HOME="$HOME/.hermes"
LOG_FILE="/var/log/hermes-gateway.log"
API_SERVER_PORT="${API_SERVER_PORT:-8642}"
API_SERVER_HOST="${API_SERVER_HOST:-127.0.0.1}"
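# USER may be unset in non-login contexts (e.g. cloud-init); with
# `set -u` an unset USER would abort the script, so fall back to id.
USER="${USER:-$(id -un)}"
echo "[install.sh] hermes bare-host setup starting (user=$USER, home=$HOME)"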
# --- System deps (idempotent) ---
# hermes-agent installer pulls a Node 22 .tar.xz and builds some
# Python deps from source. Ubuntu EC2 AMI ships without xz or gcc.
if ! command -v xz >/dev/null 2>&1 || ! command -v gcc >/dev/null 2>&1; then
  echo "[install.sh] installing system deps (xz-utils + build-essential)..."
  sudo apt-get update -qq
  sudo DEBIAN_FRONTEND=noninteractive apt-get install -y -qq --no-install-recommends \
    curl ca-certificates git xz-utils build-essential
fi
# --- Install hermes-agent (only if not already present) ---
# Installer places `hermes` at ~/.local/bin/hermes (symlink to
# ~/.hermes/hermes-agent/venv/bin/hermes). --skip-setup avoids the
# interactive wizard.
if ! command -v hermes >/dev/null 2>&1 && [ ! -x "$HOME/.local/bin/hermes" ]; then
  echo "[install.sh] installing hermes-agent from NousResearch..."
  curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh \
    | bash -s -- --skip-setup
fi
export PATH="$HOME/.local/bin:$PATH"
# --- Ensure hermes home exists ---
mkdir -p "$HERMES_HOME"
# --- Generate API_SERVER_KEY if not already set in env ---
# hermes-agent requires a bearer for the api-server platform. The
# molecule_runtime adapter (executor.py) reads this same var at
# request time to auth against the gateway.
if [ -z "${API_SERVER_KEY:-}" ]; then
  API_SERVER_KEY="$(head -c 32 /dev/urandom | base64 | tr -d '/+=' | head -c 40)"
  export API_SERVER_KEY
fi
# --- FIX #12: export API_SERVER_KEY into /etc/environment so ---
# --- molecule-runtime (started by user-data sudo before this ---
# --- install.sh ran) can read it on its next restart, AND so ---
# --- any future login shell / systemd service inherits it. ---
#
# Root cause: molecule-runtime's executor.py does
# `os.environ.get("API_SERVER_KEY")` — if empty, no Authorization
# header is sent to the gateway → gateway returns 401 → the
# confusing `[hermes-agent error 401] Invalid API key` surfaces
# in A2A responses.
#
# We need BOTH:
# 1. /etc/environment for the systemd-launched runtime (no shell)
# 2. /etc/profile.d/ for any future sudo -i or login shell
#
# The runtime is already running at this point (started by
# user-data). We signal a restart at the end of install.sh so it
# re-reads the env.
if [ -w /etc/environment ] || sudo test -w /etc/environment 2>/dev/null; then
  # Remove any prior entry (idempotent) then append fresh.
  sudo sed -i '/^API_SERVER_KEY=/d' /etc/environment
  echo "API_SERVER_KEY=${API_SERVER_KEY}" | sudo tee -a /etc/environment >/dev/null
fi
sudo tee /etc/profile.d/hermes-api-key.sh >/dev/null <<EOF
# Generated by template-hermes install.sh. Makes API_SERVER_KEY
# available to sudo/login shells so molecule-runtime (which reads
# os.environ) can auth against the local hermes gateway.
export API_SERVER_KEY="${API_SERVER_KEY}"
EOF
sudo chmod 644 /etc/profile.d/hermes-api-key.sh
echo "[install.sh] exported API_SERVER_KEY to /etc/environment + /etc/profile.d/"
# --- Write hermes-agent .env ---
# Every provider key the workspace's process env carries is forwarded.
# CP's provisioner injects these from Secrets Manager + the per-tenant
# shared secret bundle before this script runs.
cat >"$HERMES_HOME/.env" <<EOF
API_SERVER_ENABLED=true
API_SERVER_KEY=${API_SERVER_KEY}
API_SERVER_HOST=${API_SERVER_HOST}
API_SERVER_PORT=${API_SERVER_PORT}
${HERMES_INFERENCE_PROVIDER:+HERMES_INFERENCE_PROVIDER=${HERMES_INFERENCE_PROVIDER}}
${HERMES_AUXILIARY_PROVIDER:+HERMES_AUXILIARY_PROVIDER=${HERMES_AUXILIARY_PROVIDER}}
${HERMES_API_KEY:+HERMES_API_KEY=${HERMES_API_KEY}}
${NOUS_API_KEY:+NOUS_API_KEY=${NOUS_API_KEY}}
${OPENROUTER_API_KEY:+OPENROUTER_API_KEY=${OPENROUTER_API_KEY}}
${OPENAI_API_KEY:+OPENAI_API_KEY=${OPENAI_API_KEY}}
${ANTHROPIC_API_KEY:+ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}}
${GEMINI_API_KEY:+GEMINI_API_KEY=${GEMINI_API_KEY}}
${GOOGLE_API_KEY:+GOOGLE_API_KEY=${GOOGLE_API_KEY}}
${DEEPSEEK_API_KEY:+DEEPSEEK_API_KEY=${DEEPSEEK_API_KEY}}
${GLM_API_KEY:+GLM_API_KEY=${GLM_API_KEY}}
${KIMI_API_KEY:+KIMI_API_KEY=${KIMI_API_KEY}}
${KIMI_CN_API_KEY:+KIMI_CN_API_KEY=${KIMI_CN_API_KEY}}
${MINIMAX_API_KEY:+MINIMAX_API_KEY=${MINIMAX_API_KEY}}
${MINIMAX_CN_API_KEY:+MINIMAX_CN_API_KEY=${MINIMAX_CN_API_KEY}}
${DASHSCOPE_API_KEY:+DASHSCOPE_API_KEY=${DASHSCOPE_API_KEY}}
${XIAOMI_API_KEY:+XIAOMI_API_KEY=${XIAOMI_API_KEY}}
${ARCEEAI_API_KEY:+ARCEEAI_API_KEY=${ARCEEAI_API_KEY}}
${NVIDIA_API_KEY:+NVIDIA_API_KEY=${NVIDIA_API_KEY}}
${OLLAMA_API_KEY:+OLLAMA_API_KEY=${OLLAMA_API_KEY}}
${HF_TOKEN:+HF_TOKEN=${HF_TOKEN}}
${AI_GATEWAY_API_KEY:+AI_GATEWAY_API_KEY=${AI_GATEWAY_API_KEY}}
${KILOCODE_API_KEY:+KILOCODE_API_KEY=${KILOCODE_API_KEY}}
${OPENCODE_ZEN_API_KEY:+OPENCODE_ZEN_API_KEY=${OPENCODE_ZEN_API_KEY}}
${OPENCODE_GO_API_KEY:+OPENCODE_GO_API_KEY=${OPENCODE_GO_API_KEY}}
${COPILOT_GITHUB_TOKEN:+COPILOT_GITHUB_TOKEN=${COPILOT_GITHUB_TOKEN}}
${GH_TOKEN:+GH_TOKEN=${GH_TOKEN}}
EOF
chmod 600 "$HERMES_HOME/.env"
# --- Write hermes-agent config.yaml ---
# Unconditional overwrite — the hermes installer drops its
# cli-config.yaml.example here as config.yaml which defaults to
# anthropic/claude-opus-4.6 + provider:auto. Our bridge needs
# deterministic routing.
#
# Provider derivation: scripts/derive-provider.sh looks at the model
# slug prefix and sets $PROVIDER accordingly (minimax/* → minimax,
# anthropic/* → anthropic, openai/* → openrouter (hermes has no
# direct openai provider), nousresearch/* → nous-or-openrouter based
# on keys present, etc.). Explicit HERMES_INFERENCE_PROVIDER in the
# env always wins.
# Read BOTH HERMES_INFERENCE_MODEL (upstream's actual env var, see
# NousResearch/hermes-agent website/docs/reference/environment-variables.md)
# AND HERMES_DEFAULT_MODEL (legacy name we invented before 2026-05).
# Workspace-server still writes the legacy name during the migration
# window — accepting both keeps existing provisions green.
DEFAULT_MODEL="${HERMES_INFERENCE_MODEL:-${HERMES_DEFAULT_MODEL:-nousresearch/hermes-4-70b}}"
HERMES_INFERENCE_MODEL="${DEFAULT_MODEL}" \
  . "$(dirname "$0")/scripts/derive-provider.sh"
# --- OpenAI bridge: PROVIDER=custom + chat_completions api_mode ---
#
# hermes-agent does NOT have a native "openai" provider in its registry
# (valid list includes: ai-gateway, anthropic, openrouter, nous, custom,
# minimax, kimi, etc — but NOT "openai"). See `hermes doctor` output for
# the authoritative list. Therefore the OpenAI bridge uses `custom`
# provider pointed at api.openai.com.
#
# Two independent concerns:
#
# (A) Auto-fill HERMES_CUSTOM_{BASE_URL,API_KEY,API_MODE} when the
# operator hasn't — so the common case (just OPENAI_API_KEY) Just
# Works. Operators who configure HERMES_CUSTOM_* themselves (vLLM,
# LM Studio, a custom OpenAI-compat gateway) skip this step and
# keep their own values.
#
# (B) Strip the `openai/` provider prefix from DEFAULT_MODEL when the
# final request target is api.openai.com, regardless of who set
# HERMES_CUSTOM_BASE_URL. OpenAI rejects `openai/gpt-4o` with 400
# "invalid model ID" — it expects bare `gpt-4o`. This concern is
# independent of who configured the routing: if requests land at
# api.openai.com, the prefix must go.
#
# These were bundled in a single guard before 2026-04-24, which caused:
# operators who pinned HERMES_CUSTOM_{BASE_URL,API_KEY,API_MODE} to
# api.openai.com (e.g. the molecule-core E2E test after PR #1987) got
# the right routing but not the prefix strip, and hit OpenAI 400.
# Separating (A) and (B) makes both paths correct.
# (A) auto-fill defaults when operator hasn't configured custom
if [ "${PROVIDER}" = "custom" ] && [ -n "${OPENAI_API_KEY:-}" ] && [ -z "${HERMES_CUSTOM_BASE_URL:-}" ] && [ -z "${HERMES_CUSTOM_API_KEY:-}" ]; then
  export HERMES_CUSTOM_BASE_URL="https://api.openai.com/v1"
  export HERMES_CUSTOM_API_KEY="${OPENAI_API_KEY}"
  export HERMES_CUSTOM_API_MODE="chat_completions"
  echo "[install.sh] bridged OPENAI_API_KEY → custom provider @ api.openai.com (api_mode=chat_completions)"
fi
# (B) strip the openai/ prefix ONLY when the final URL is api.openai.com.
# The regex intentionally anchors to start + requires a `/` or end-of-string
# after `api.openai.com` so lookalike domains (api.openai.com.evil.internal)
# do not match. Idempotent when there's no prefix.
#   https://api.openai.com/v1             → match
#   https://api.openai.com                → match
#   https://api.openai.com.evil.internal  → no match
#   https://api.openai.com:443/v1         → no match (explicit port is not handled)
if [[ "${HERMES_CUSTOM_BASE_URL:-}" =~ ^https?://api\.openai\.com(/|$) ]]; then
  BEFORE="${DEFAULT_MODEL}"
  DEFAULT_MODEL="${DEFAULT_MODEL#openai/}"
  if [ "${BEFORE}" != "${DEFAULT_MODEL}" ]; then
    echo "[install.sh] stripped openai/ prefix → model=${DEFAULT_MODEL} (routing to api.openai.com)"
  fi
fi
{
  echo "# Seeded by template-hermes install.sh on $(date -u -Iseconds)"
  echo "# Rewritten each boot from HERMES_INFERENCE_MODEL (or legacy HERMES_DEFAULT_MODEL) + HERMES_INFERENCE_PROVIDER env."
  echo "model:"
  echo " default: \"${DEFAULT_MODEL}\""
  echo " provider: \"${PROVIDER}\""
  if [ -n "${HERMES_CUSTOM_BASE_URL:-}" ]; then
    echo " base_url: \"${HERMES_CUSTOM_BASE_URL}\""
  fi
  if [ -n "${HERMES_CUSTOM_API_KEY:-}" ]; then
    echo " api_key: \"${HERMES_CUSTOM_API_KEY}\""
  fi
  # api_mode gates hermes's custom-provider request shape:
  #   chat_completions → POST /v1/chat/completions (OpenAI-compat, default)
  #   codex_responses  → POST /v1/responses + include=[encrypted_content]
  # Emit the field only when explicitly set; absent = hermes auto-detect.
  if [ -n "${HERMES_CUSTOM_API_MODE:-}" ]; then
    echo " api_mode: \"${HERMES_CUSTOM_API_MODE}\""
  fi
} >"$HERMES_HOME/config.yaml"
# --- Prepare gateway log ---
# /var/log needs root to create the file; chown to runtime user so
# the gateway (running as that user) can append to it.
sudo touch "$LOG_FILE"
sudo chown "$USER:$USER" "$LOG_FILE"
# --- Kill prior gateway, start fresh ---
if pgrep -u "$USER" -f "hermes gateway" >/dev/null 2>&1; then
  echo "[install.sh] killing prior hermes gateway process(es) for $USER..."
  pkill -u "$USER" -f "hermes gateway" 2>/dev/null || true
  sleep 2
fi
echo "[install.sh] starting hermes gateway in background..."
# `bash -lc` forces a login shell so the user's profile files are read
# and any PATH additions made there (e.g. ~/.local/bin) apply when the
# `hermes` binary is looked up; PATH is also exported above as a
# fallback. `cd "$HOME"` is defensive: hermes writes relative state if
# CWD is unusual.
nohup bash -lc "cd \"$HOME\" && exec hermes gateway" >>"$LOG_FILE" 2>&1 &
GATEWAY_PID=$!
# --- FIX #12 follow-up: signal molecule-runtime to restart ---
# The runtime was started by user-data BEFORE install.sh exported
# API_SERVER_KEY. Kill it so user-data's retry / systemd respawn
# picks it up from /etc/environment on next launch. Safe no-op if
# the runtime hasn't been started yet.
if pgrep -u "$USER" -f molecule-runtime >/dev/null 2>&1; then
  echo "[install.sh] signalling molecule-runtime to restart (pick up new API_SERVER_KEY)..."
  pkill -u "$USER" -f molecule-runtime 2>/dev/null || true
fi
# --- Wait for :8642 readiness (max 120s) ---
READY_TIMEOUT=120
for _ in $(seq 1 $READY_TIMEOUT); do
  if curl -fsS "http://${API_SERVER_HOST}:${API_SERVER_PORT}/health" >/dev/null 2>&1; then
    echo "[install.sh] hermes gateway ready on :${API_SERVER_PORT} (pid ${GATEWAY_PID})"
    exit 0
  fi
  if ! kill -0 "$GATEWAY_PID" 2>/dev/null; then
    echo "[install.sh] hermes gateway exited during boot. Last log lines:" >&2
    tail -40 "$LOG_FILE" >&2 || true
    exit 1
  fi
  sleep 1
done
echo "[install.sh] hermes gateway failed to reach /health within ${READY_TIMEOUT}s." >&2
tail -80 "$LOG_FILE" >&2 || true
exit 1