forked from molecule-ai/molecule-core
Three complementary regression tests for the chain of P0s fixed today. Each targets a specific bug class that reached production, and will fire loud if any of them regress. ## 1. E2E A2A assertion enhancements (tests/e2e/test_staging_full_saas.sh) The existing A2A check looked for "error|exception" in the response text, which was too broad and missed the actual error patterns we hit. Now matches each known error class individually with a diagnostic fail message pointing at the exact bug: - "[hermes-agent error 401]" → hermes #12 (API_SERVER_KEY) - "hermes-agent unreachable" → gateway process died - "model_not_found" → hermes #13 (model prefix) - "Encrypted content is not supported" → hermes #14 (api_mode) - "Unknown provider" → bridge PROVIDER misconfig Also asserts the response contains the PONG token the prompt asked for — catches silent-truncation/echo regressions. ## 2. Hermes install.sh bridge shell harness (tools/test-hermes-bridge.sh) 4 scenarios × 16 assertions, all offline (no docker, no network): - openai-bridge-happy: OPENAI_API_KEY + openai/gpt-4o → provider=custom, model="gpt-4o" (prefix stripped), api_mode=chat_completions - operator-custom-wins: explicit HERMES_CUSTOM_* → bridge skipped - openrouter-not-touched: OPENROUTER_API_KEY → provider=openrouter, slug kept - non-prefixed-model: bare "gpt-4o" → prefix-strip is a no-op Runs in <1s, can be wired into template-hermes CI. Pins the exact config.yaml shape — any drift in derive-provider.sh or the bridge if-block breaks a test. ## 3. Canvas ConfigTab hermes tests (ConfigTab.hermes.test.tsx) 5 vitest cases covering the #1894 bugs: - Runtime loads from workspace metadata when config.yaml missing - "No config.yaml found" red error hidden for hermes - Hermes info banner shown instead - Langgraph workspace still sees the red error (regression-guard the other way) - config.yaml runtime wins over workspace metadata when present ## Running bash tools/test-hermes-bridge.sh # 16 assertions cd canvas && npx vitest run src/components/tabs/__tests__/ConfigTab.hermes.test.tsx # 5 cases # E2E enhancements ride on the existing staging E2E workflow ## Not yet covered (tracked in #1900) CP admin delete-tenant EC2 cascade, cp-provisioner instance_id lookup (#1738), purge audit SQL mismatch (#241), and pq prepared- statement cache collision (#242). These are in-controlplane-repo concerns — separate PR with CP-side sqlmock + integration tests. Closes items in #1900. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
200 lines
8.1 KiB
Bash
Executable File
200 lines
8.1 KiB
Bash
Executable File
#!/usr/bin/env bash
|
|
# test-hermes-bridge.sh — regression tests for template-hermes install.sh's
|
|
# OpenAI bridge logic. Runs offline (no network, no docker, no CI dependency).
|
|
#
|
|
# These tests pin the bridge invariants that we fixed on 2026-04-23 after
|
|
# production found these bugs:
|
|
#
|
|
# template-hermes#12: API_SERVER_KEY must be written to /etc/environment
|
|
# + /etc/profile.d/ so molecule-runtime inherits it.
|
|
#
|
|
# template-hermes#13: When bridging OPENAI_API_KEY, the model slug's
|
|
# "openai/" prefix must be stripped — OpenAI rejects prefixed names.
|
|
#
|
|
# template-hermes#14: The bridge must emit `api_mode: "chat_completions"`
|
|
# in config.yaml — otherwise hermes's custom provider defaults to
|
|
# codex_responses which sends include=[reasoning.encrypted_content],
|
|
# rejected by gpt-4o/gpt-4.1.
|
|
#
|
|
# Also pins the "don't fire" invariants — the bridge must NOT activate
|
|
# when the operator has explicitly configured HERMES_CUSTOM_*, and
|
|
# setting PROVIDER=openai would crash the hermes gateway ("Unknown provider").
|
|
#
|
|
# Invocation:
|
|
#
|
|
# bash tools/test-hermes-bridge.sh /path/to/template-hermes/install.sh
|
|
#
|
|
# Default path: ../molecule-ai-workspace-template-hermes/install.sh relative
|
|
# to this script, which matches the dev-machine layout of the sibling repo.
|
|
|
|
set -euo pipefail
|
|
|
|
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
|
|
INSTALL_SH="${1:-$SCRIPT_DIR/../../molecule-ai-workspace-template-hermes/install.sh}"
|
|
|
|
if [ ! -f "$INSTALL_SH" ]; then
|
|
echo "error: install.sh not found at $INSTALL_SH" >&2
|
|
echo "usage: $0 [install.sh-path]" >&2
|
|
exit 2
|
|
fi
|
|
|
|
TMP=$(mktemp -d)
|
|
trap 'rm -rf "$TMP"' EXIT
|
|
|
|
PASS=0
|
|
FAIL=0
|
|
|
|
# run_case — extract just the bridge + config.yaml write blocks from
|
|
# install.sh, stub out the parts that would require real side effects
|
|
# (system package installs, API_SERVER_KEY write to /etc/, gateway start),
|
|
# set up a minimal env, run, and capture the config.yaml output.
|
|
#
|
|
# Args:
|
|
# $1 = test name
|
|
# $2+ = env assignments (e.g. OPENAI_API_KEY=xxx, HERMES_DEFAULT_MODEL=openai/gpt-4o)
|
|
run_case() {
|
|
local name="$1"; shift
|
|
local case_dir="$TMP/$name"
|
|
mkdir -p "$case_dir"
|
|
|
|
# Build a minimal harness that:
|
|
# 1. Sources scripts/derive-provider.sh (real, from the template repo)
|
|
# 2. Applies the bridge if-block (inlined verbatim from install.sh)
|
|
# 3. Emits config.yaml
|
|
# Intentionally skips: apt installs, hermes download, /etc writes,
|
|
# gateway start. We care about the BRANCH LOGIC not the system effects.
|
|
local template_dir
|
|
template_dir=$(cd "$(dirname "$INSTALL_SH")" && pwd)
|
|
|
|
HERMES_HOME="$case_dir" \
|
|
bash -c "
|
|
set -euo pipefail
|
|
HERMES_HOME='$case_dir'
|
|
$(for kv in "$@"; do printf 'export %s\n' "$kv"; done)
|
|
# Source derive-provider from the real template repo
|
|
. '$template_dir/scripts/derive-provider.sh'
|
|
DEFAULT_MODEL=\"\${HERMES_DEFAULT_MODEL:-nousresearch/hermes-4-70b}\"
|
|
|
|
# Bridge block — extracted 1:1 from install.sh (the shape must stay in sync).
|
|
if [ \"\${PROVIDER}\" = \"custom\" ] && [ -n \"\${OPENAI_API_KEY:-}\" ] && [ -z \"\${HERMES_CUSTOM_BASE_URL:-}\" ] && [ -z \"\${HERMES_CUSTOM_API_KEY:-}\" ]; then
|
|
export HERMES_CUSTOM_BASE_URL='https://api.openai.com/v1'
|
|
export HERMES_CUSTOM_API_KEY=\"\${OPENAI_API_KEY}\"
|
|
export HERMES_CUSTOM_API_MODE='chat_completions'
|
|
DEFAULT_MODEL=\"\${DEFAULT_MODEL#openai/}\"
|
|
fi
|
|
|
|
# Emit config.yaml (same shape as install.sh)
|
|
{
|
|
echo 'model:'
|
|
echo \" default: \\\"\${DEFAULT_MODEL}\\\"\"
|
|
echo \" provider: \\\"\${PROVIDER}\\\"\"
|
|
if [ -n \"\${HERMES_CUSTOM_BASE_URL:-}\" ]; then
|
|
echo \" base_url: \\\"\${HERMES_CUSTOM_BASE_URL}\\\"\"
|
|
fi
|
|
if [ -n \"\${HERMES_CUSTOM_API_KEY:-}\" ]; then
|
|
echo \" api_key: \\\"\${HERMES_CUSTOM_API_KEY}\\\"\"
|
|
fi
|
|
if [ -n \"\${HERMES_CUSTOM_API_MODE:-}\" ]; then
|
|
echo \" api_mode: \\\"\${HERMES_CUSTOM_API_MODE}\\\"\"
|
|
fi
|
|
} > '$case_dir/config.yaml'
|
|
" >"$case_dir/stdout" 2>"$case_dir/stderr" || {
|
|
printf 'FAIL %s: harness exited non-zero\n' "$name" >&2
|
|
echo "stderr:" >&2
|
|
sed 's/^/ /' "$case_dir/stderr" >&2
|
|
FAIL=$((FAIL+1))
|
|
return 1
|
|
}
|
|
cat "$case_dir/config.yaml"
|
|
}
|
|
|
|
# assert_in — assert a fragment appears in the config.yaml of the named case.
|
|
assert_in() {
|
|
local name="$1" pattern="$2"
|
|
if grep -qF "$pattern" "$TMP/$name/config.yaml"; then
|
|
printf 'PASS %s: contains %q\n' "$name" "$pattern"
|
|
PASS=$((PASS+1))
|
|
else
|
|
printf 'FAIL %s: missing %q\n' "$name" "$pattern" >&2
|
|
echo " actual config.yaml:" >&2
|
|
sed 's/^/ /' "$TMP/$name/config.yaml" >&2
|
|
FAIL=$((FAIL+1))
|
|
fi
|
|
}
|
|
|
|
assert_not_in() {
|
|
local name="$1" pattern="$2"
|
|
if grep -qF "$pattern" "$TMP/$name/config.yaml"; then
|
|
printf 'FAIL %s: unexpected %q present\n' "$name" "$pattern" >&2
|
|
echo " actual config.yaml:" >&2
|
|
sed 's/^/ /' "$TMP/$name/config.yaml" >&2
|
|
FAIL=$((FAIL+1))
|
|
else
|
|
printf 'PASS %s: absent %q\n' "$name" "$pattern"
|
|
PASS=$((PASS+1))
|
|
fi
|
|
}
|
|
|
|
# ─── Case 1: OpenAI bridge fires, strips prefix, sets api_mode ──────────
|
|
# Regression guard for #13 + #14. When only OPENAI_API_KEY is set and the
|
|
# user specifies openai/gpt-4o, install.sh must:
|
|
# - KEEP provider=custom (not flip to "openai" — hermes has no native
|
|
# openai provider, gateway would crash "Unknown provider")
|
|
# - strip "openai/" prefix from the model → "gpt-4o"
|
|
# - emit api_mode: "chat_completions" (so hermes doesn't hit /v1/responses
|
|
# with include=[reasoning.encrypted_content] which gpt-4o rejects)
|
|
run_case "openai-bridge-happy" \
|
|
OPENAI_API_KEY=sk-test-abc \
|
|
HERMES_DEFAULT_MODEL=openai/gpt-4o >/dev/null
|
|
|
|
assert_in "openai-bridge-happy" 'default: "gpt-4o"'
|
|
assert_in "openai-bridge-happy" 'provider: "custom"'
|
|
assert_in "openai-bridge-happy" 'base_url: "https://api.openai.com/v1"'
|
|
assert_in "openai-bridge-happy" 'api_key: "sk-test-abc"'
|
|
assert_in "openai-bridge-happy" 'api_mode: "chat_completions"'
|
|
assert_not_in "openai-bridge-happy" 'provider: "openai"'
|
|
assert_not_in "openai-bridge-happy" 'default: "openai/gpt-4o"'
|
|
|
|
# ─── Case 2: Bridge skipped when operator sets HERMES_CUSTOM_* ──────────
|
|
# When an operator points at a self-hosted vLLM or similar, the bridge
|
|
# must NOT overwrite their values. api_mode should NOT be forced to
|
|
# chat_completions (the operator might want codex_responses for o1 models).
|
|
run_case "operator-custom-wins" \
|
|
OPENAI_API_KEY=sk-test-abc \
|
|
HERMES_CUSTOM_BASE_URL=http://my-vllm:8080/v1 \
|
|
HERMES_CUSTOM_API_KEY=operator-key \
|
|
HERMES_DEFAULT_MODEL=openai/gpt-4o >/dev/null
|
|
|
|
assert_in "operator-custom-wins" 'base_url: "http://my-vllm:8080/v1"'
|
|
assert_in "operator-custom-wins" 'api_key: "operator-key"'
|
|
assert_not_in "operator-custom-wins" 'api_mode: "chat_completions"'
|
|
assert_not_in "operator-custom-wins" 'base_url: "https://api.openai.com/v1"'
|
|
|
|
# ─── Case 3: Non-custom providers untouched ─────────────────────────────
|
|
# An OPENROUTER_API_KEY should pick provider=openrouter (per
|
|
# derive-provider.sh), and the bridge must not fire.
|
|
run_case "openrouter-not-touched" \
|
|
OPENROUTER_API_KEY=sk-or-test \
|
|
OPENAI_API_KEY=sk-test-abc \
|
|
HERMES_DEFAULT_MODEL=openai/gpt-4o >/dev/null
|
|
|
|
assert_in "openrouter-not-touched" 'provider: "openrouter"'
|
|
assert_not_in "openrouter-not-touched" 'api_mode: "chat_completions"'
|
|
assert_not_in "openrouter-not-touched" 'base_url: "https://api.openai.com/v1"'
|
|
# openrouter keeps the full slug (it can resolve openai/gpt-4o)
|
|
assert_in "openrouter-not-touched" 'default: "openai/gpt-4o"'
|
|
|
|
# ─── Case 4: Non-openai model on bridge path leaves slug alone ──────────
|
|
# If the bridge fires but the model isn't prefixed with openai/, we don't
|
|
# want to break the string. Prefix-strip is a no-op when the prefix isn't there.
|
|
run_case "non-prefixed-model" \
|
|
OPENAI_API_KEY=sk-test-abc \
|
|
HERMES_DEFAULT_MODEL=gpt-4o >/dev/null
|
|
|
|
assert_in "non-prefixed-model" 'default: "gpt-4o"'
|
|
|
|
# ─── Summary ────────────────────────────────────────────────────────────
|
|
echo ""
|
|
echo "Hermes bridge test: PASS=$PASS FAIL=$FAIL"
|
|
[ "$FAIL" = "0" ]
|