fix(start.sh): read provider/model from /configs/config.yaml (Option B PR-4)
Closes the gap where CP user-data (PR-3, task #197) writes runtime_config.{model,provider} into /configs/config.yaml but start.sh only reads HERMES_DEFAULT_MODEL / HERMES_INFERENCE_PROVIDER env vars that CP doesn't set. Result: every CP-provisioned hermes workspace booted with the built-in `nousresearch/hermes-4-70b` default and 500'd at first prompt with "No LLM provider configured" — visible in the 2026-04-30 hongmingwang tenant screenshots. New scripts/load-workspace-config.sh, sourced by start.sh before the existing DEFAULT_MODEL/PROVIDER derivation. Reads /configs/config.yaml via python3 + PyYAML and exports HERMES_DEFAULT_MODEL + HERMES_INFERENCE_PROVIDER if they're not already set. Precedence (highest to lowest): 1. HERMES_* env vars (operator override via workspace secrets) 2. /configs/config.yaml runtime_config.{model,provider} (canvas UI) 3. start.sh hard-coded fallback (nousresearch/hermes-4-70b) Resilience: - Missing config.yaml → silent skip (dev containers) - Malformed YAML → silent skip (don't kill boot) - python3 missing → silent skip - PyYAML missing → silent skip - Empty/non-dict runtime_config → silent skip Tests: scripts/test-load-workspace-config.sh — 11 cases covering all silent-skip paths + happy paths + operator override + non-string scalar coercion. Existing scripts/test-derive-provider.sh (12 cases) re-verified. Wires shell tests into CI via a new shell-tests job — those tests weren't running anywhere before, opportunistically closes that gap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
6906a3fe91
commit
43c2569faa
22
.github/workflows/ci.yml
vendored
22
.github/workflows/ci.yml
vendored
@ -3,3 +3,25 @@ on: [push, pull_request]
|
||||
jobs:
|
||||
validate:
|
||||
uses: Molecule-AI/molecule-ci/.github/workflows/validate-workspace-template.yml@main
|
||||
|
||||
shell-tests:
|
||||
# Shell-level unit tests for scripts/*.sh. Pre-existing
|
||||
# test-derive-provider.sh + test-install-prefix-strip.sh weren't wired
|
||||
# to CI before; landing the new test-load-workspace-config.sh fix
|
||||
# opportunistically closes that gap. Runs in <5s on a fresh runner.
|
||||
name: Shell unit tests
|
||||
runs-on: ubuntu-latest
|
||||
timeout-minutes: 3
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- uses: actions/setup-python@v5
|
||||
with:
|
||||
python-version: "3.11"
|
||||
# PyYAML is required by load-workspace-config.sh's python helper.
|
||||
# In production it's transitive via molecule-ai-workspace-runtime;
|
||||
# in this minimal CI env we install it explicitly so the YAML path
|
||||
# is exercised instead of the script's silent ImportError fallback.
|
||||
- run: pip install -q pyyaml
|
||||
- run: bash scripts/test-derive-provider.sh
|
||||
- run: bash scripts/test-install-prefix-strip.sh
|
||||
- run: bash scripts/test-load-workspace-config.sh
|
||||
|
||||
118
scripts/load-workspace-config.sh
Executable file
118
scripts/load-workspace-config.sh
Executable file
@ -0,0 +1,118 @@
|
||||
#!/usr/bin/env bash
|
||||
# load-workspace-config.sh — bridge the workspace-level /configs/config.yaml
|
||||
# (written by molecule-controlplane user-data per task #197) into the
|
||||
# hermes-specific HERMES_DEFAULT_MODEL / HERMES_INFERENCE_PROVIDER env
|
||||
# vars that start.sh + derive-provider.sh consume.
|
||||
#
|
||||
# Why this exists: PR-3 in the Option B series taught CP to write
|
||||
# `runtime_config.model` and `runtime_config.provider` into
|
||||
# /configs/config.yaml at provision time so the canvas Config tab can
|
||||
# round-trip the operator's pick. start.sh used to only read the
|
||||
# HERMES_* env vars, which CP doesn't set, so the config.yaml fields
|
||||
# were silently ignored — every workspace booted with the built-in
|
||||
# `nousresearch/hermes-4-70b` default and 500'd at first prompt with
|
||||
# "No LLM provider configured" (visible in the 2026-04-30 hongmingwang
|
||||
# tenant screenshots).
|
||||
#
|
||||
# Precedence (highest to lowest):
|
||||
# 1. HERMES_DEFAULT_MODEL / HERMES_INFERENCE_PROVIDER env vars
|
||||
# (operator override via workspace secrets — they win)
|
||||
# 2. /configs/config.yaml runtime_config.{model,provider}
|
||||
# (canvas Config tab — set via UI, written by CP user-data)
|
||||
# 3. start.sh's hard-coded fallback (nousresearch/hermes-4-70b)
|
||||
#
|
||||
# Contract:
|
||||
# Reads: /configs/config.yaml (or $MOLECULE_CONFIG_PATH/config.yaml)
|
||||
# $HERMES_DEFAULT_MODEL, $HERMES_INFERENCE_PROVIDER
|
||||
# Writes: HERMES_DEFAULT_MODEL (only if unset and config.yaml has it)
|
||||
# HERMES_INFERENCE_PROVIDER (only if unset and config.yaml has it)
|
||||
#
|
||||
# Failure modes (silent — never blocks boot):
|
||||
# - /configs/config.yaml doesn't exist → no-op
|
||||
# - python3 not on PATH → no-op (start.sh's fallback still works)
|
||||
# - PyYAML not importable → no-op
|
||||
# - Malformed YAML → no-op
|
||||
# - runtime_config absent or not a dict → no-op
|
||||
#
|
||||
# Resilience over completeness — same philosophy as the claude-code
|
||||
# adapter's _load_providers fallback. A workspace with a missing or
|
||||
# malformed config.yaml should still boot and fall through to the
|
||||
# env-var/built-in defaults instead of dying at this step.
|
||||
|
||||
# Source-only safety: don't `set -e` here — this script is `.`-sourced
|
||||
# by start.sh which already has its own set -euo pipefail. Errors here
|
||||
# would otherwise kill the parent shell.
|
||||
|
||||
_lwc_config_path="${MOLECULE_CONFIG_PATH:-/configs}/config.yaml"
|
||||
|
||||
# Skip silently if the file isn't there. Workspaces booted before PR-3
|
||||
# rolled out, or non-CP-provisioned dev containers, won't have it.
|
||||
if [ ! -f "$_lwc_config_path" ]; then
|
||||
unset _lwc_config_path
|
||||
return 0 2>/dev/null || true
|
||||
fi
|
||||
|
||||
# Skip if python3 is missing — start.sh's existing logic still works.
|
||||
if ! command -v python3 >/dev/null 2>&1; then
|
||||
unset _lwc_config_path
|
||||
return 0 2>/dev/null || true
|
||||
fi
|
||||
|
||||
# Single python invocation extracts both fields and prints `key=value`
|
||||
# lines. Importing yaml inside try/except so a runtime missing PyYAML
|
||||
# (shouldn't happen in production — molecule-ai-workspace-runtime
|
||||
# brings it transitively — but defensive against dev images) yields
|
||||
# zero output instead of an error. Any exception → empty output → the
|
||||
# read loop below sets nothing and start.sh keeps its fallbacks.
|
||||
_lwc_extracted=$(MOLECULE_CONFIG_FILE="$_lwc_config_path" python3 - <<'PYEOF' 2>/dev/null
|
||||
import os, sys
|
||||
try:
|
||||
import yaml
|
||||
except ImportError:
|
||||
sys.exit(0)
|
||||
try:
|
||||
with open(os.environ["MOLECULE_CONFIG_FILE"]) as f:
|
||||
data = yaml.safe_load(f) or {}
|
||||
except Exception:
|
||||
sys.exit(0)
|
||||
rc = data.get("runtime_config") or {}
|
||||
if not isinstance(rc, dict):
|
||||
sys.exit(0)
|
||||
# Print one key=value per line; empty values omitted so the bash side
|
||||
# can use [ -n "$value" ] without ambiguity. Values are stringified so
|
||||
# non-string YAML scalars (e.g. integers) don't break the read loop.
|
||||
for env_name, yaml_key in (
|
||||
("HERMES_DEFAULT_MODEL", "model"),
|
||||
("HERMES_INFERENCE_PROVIDER", "provider"),
|
||||
):
|
||||
v = rc.get(yaml_key)
|
||||
if v is None:
|
||||
continue
|
||||
s = str(v).strip()
|
||||
if s:
|
||||
print(f"{env_name}={s}")
|
||||
PYEOF
|
||||
)
|
||||
|
||||
# Apply the extracted values, but only when the corresponding env var
|
||||
# isn't already set — operator override (env-var pre-set, e.g. via
|
||||
# workspace secrets) must beat the YAML.
|
||||
while IFS='=' read -r _lwc_key _lwc_value; do
|
||||
[ -z "$_lwc_key" ] && continue
|
||||
case "$_lwc_key" in
|
||||
HERMES_DEFAULT_MODEL)
|
||||
if [ -z "${HERMES_DEFAULT_MODEL:-}" ] && [ -n "$_lwc_value" ]; then
|
||||
export HERMES_DEFAULT_MODEL="$_lwc_value"
|
||||
echo "[load-workspace-config] HERMES_DEFAULT_MODEL=$_lwc_value (from $_lwc_config_path)" >&2
|
||||
fi
|
||||
;;
|
||||
HERMES_INFERENCE_PROVIDER)
|
||||
if [ -z "${HERMES_INFERENCE_PROVIDER:-}" ] && [ -n "$_lwc_value" ]; then
|
||||
export HERMES_INFERENCE_PROVIDER="$_lwc_value"
|
||||
echo "[load-workspace-config] HERMES_INFERENCE_PROVIDER=$_lwc_value (from $_lwc_config_path)" >&2
|
||||
fi
|
||||
;;
|
||||
esac
|
||||
done <<<"$_lwc_extracted"
|
||||
|
||||
unset _lwc_config_path _lwc_extracted _lwc_key _lwc_value
|
||||
150
scripts/test-load-workspace-config.sh
Executable file
150
scripts/test-load-workspace-config.sh
Executable file
@ -0,0 +1,150 @@
|
||||
#!/usr/bin/env bash
|
||||
# test-load-workspace-config.sh — offline unit tests for load-workspace-config.sh.
|
||||
#
|
||||
# load-workspace-config.sh has one external dep (python3 + pyyaml) but
|
||||
# both are present on every workspace image and any CI runner with
|
||||
# Python — making this cheap to exercise as a shell-level test:
|
||||
# write a fixture YAML, point MOLECULE_CONFIG_PATH at it, source the
|
||||
# script, check $HERMES_DEFAULT_MODEL / $HERMES_INFERENCE_PROVIDER.
|
||||
#
|
||||
# Pins the Option B PR-4 contract that fixed the 2026-04-30 hongmingwang
|
||||
# tenant 500: hermes start.sh must consume runtime_config.{model,provider}
|
||||
# from /configs/config.yaml so canvas Config-tab picks reach the gateway.
|
||||
#
|
||||
# Covered cases:
|
||||
# - YAML missing → no-op (silent skip)
|
||||
# - Malformed YAML → no-op (silent skip, no exit)
|
||||
# - runtime_config absent → no-op
|
||||
# - runtime_config.{model,provider} present → exported
|
||||
# - Pre-set env vars win over YAML values (operator override)
|
||||
# - Only model present (provider stays unset)
|
||||
# - Only provider present (model stays unset)
|
||||
# - Non-string scalar values (int) coerce to string
|
||||
|
||||
set -u
|
||||
|
||||
HERE="$(cd "$(dirname "$0")" && pwd)"
|
||||
SCRIPT="$HERE/load-workspace-config.sh"
|
||||
|
||||
if [ ! -f "$SCRIPT" ]; then
|
||||
echo "FAIL: load-workspace-config.sh not found at $SCRIPT"
|
||||
exit 2
|
||||
fi
|
||||
|
||||
if ! command -v python3 >/dev/null 2>&1; then
|
||||
echo "SKIP: python3 not on PATH — load-workspace-config.sh requires it"
|
||||
exit 0
|
||||
fi
|
||||
if ! python3 -c "import yaml" 2>/dev/null; then
|
||||
echo "SKIP: pyyaml not importable — install with: pip install pyyaml"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
PASS=0
|
||||
FAIL=0
|
||||
TMPDIR=$(mktemp -d -t lwc-test.XXXXXX)
|
||||
trap 'rm -rf "$TMPDIR"' EXIT
|
||||
|
||||
# Run the script in a fresh subshell with given env + a fixture
|
||||
# config.yaml, then echo the resulting env-var values for comparison.
|
||||
# Subshell isolation prevents env mutations from leaking between cases.
|
||||
run_case() {
|
||||
local label="$1" yaml_content="$2" pre_env="$3" expected_model="$4" expected_provider="$5"
|
||||
|
||||
local case_dir="$TMPDIR/$$.$RANDOM"
|
||||
mkdir -p "$case_dir"
|
||||
if [ "$yaml_content" != "<missing>" ]; then
|
||||
printf '%s' "$yaml_content" > "$case_dir/config.yaml"
|
||||
fi
|
||||
|
||||
local actual
|
||||
actual=$(bash -c "
|
||||
set +e
|
||||
unset HERMES_DEFAULT_MODEL HERMES_INFERENCE_PROVIDER
|
||||
$pre_env
|
||||
export MOLECULE_CONFIG_PATH='$case_dir'
|
||||
. '$SCRIPT' 2>/dev/null
|
||||
echo \"MODEL=\${HERMES_DEFAULT_MODEL:-}\"
|
||||
echo \"PROVIDER=\${HERMES_INFERENCE_PROVIDER:-}\"
|
||||
")
|
||||
|
||||
local actual_model actual_provider
|
||||
actual_model=$(echo "$actual" | grep -E '^MODEL=' | sed 's/^MODEL=//')
|
||||
actual_provider=$(echo "$actual" | grep -E '^PROVIDER=' | sed 's/^PROVIDER=//')
|
||||
|
||||
if [ "$actual_model" = "$expected_model" ] && [ "$actual_provider" = "$expected_provider" ]; then
|
||||
echo " PASS $label (model='$actual_model' provider='$actual_provider')"
|
||||
PASS=$((PASS+1))
|
||||
else
|
||||
echo " FAIL $label"
|
||||
echo " got model='$actual_model' provider='$actual_provider'"
|
||||
echo " expected model='$expected_model' provider='$expected_provider'"
|
||||
FAIL=$((FAIL+1))
|
||||
fi
|
||||
|
||||
rm -rf "$case_dir"
|
||||
}
|
||||
|
||||
echo "== load-workspace-config.sh =="
|
||||
|
||||
# --- File missing → no-op ---
|
||||
run_case "missing config.yaml" "<missing>" "" "" ""
|
||||
|
||||
# --- Malformed YAML → no-op (silent skip) ---
|
||||
run_case "malformed YAML" "providers: [not valid: {{{" "" "" ""
|
||||
|
||||
# --- runtime_config absent → no-op ---
|
||||
run_case "no runtime_config block" "name: foo
|
||||
runtime: hermes
|
||||
" "" "" ""
|
||||
|
||||
# --- runtime_config not a dict → no-op ---
|
||||
run_case "runtime_config is a string" "runtime_config: \"not-a-dict\"
|
||||
" "" "" ""
|
||||
|
||||
# --- Both fields present → both exported ---
|
||||
run_case "both fields" "runtime_config:
|
||||
model: nousresearch/hermes-4-70b
|
||||
provider: openrouter
|
||||
" "" "nousresearch/hermes-4-70b" "openrouter"
|
||||
|
||||
# --- Operator env var beats YAML for both ---
|
||||
run_case "env override beats both" "runtime_config:
|
||||
model: nousresearch/hermes-4-70b
|
||||
provider: openrouter
|
||||
" "export HERMES_DEFAULT_MODEL=anthropic/claude-sonnet-4-6
|
||||
export HERMES_INFERENCE_PROVIDER=anthropic" "anthropic/claude-sonnet-4-6" "anthropic"
|
||||
|
||||
# --- Only model present (provider stays unset) ---
|
||||
run_case "model only" "runtime_config:
|
||||
model: deepseek/deepseek-v4-pro
|
||||
" "" "deepseek/deepseek-v4-pro" ""
|
||||
|
||||
# --- Only provider present (model stays unset) ---
|
||||
run_case "provider only" "runtime_config:
|
||||
provider: zai
|
||||
" "" "" "zai"
|
||||
|
||||
# --- Empty string values are skipped ---
|
||||
run_case "empty-string fields skipped" "runtime_config:
|
||||
model: \"\"
|
||||
provider: \"\"
|
||||
" "" "" ""
|
||||
|
||||
# --- Non-string scalar (int) coerces to string ---
|
||||
# Not a real-world case but pins the str() coercion in the python helper
|
||||
# so a future YAML schema with numeric fields doesn't crash the boot.
|
||||
run_case "non-string scalar coerces" "runtime_config:
|
||||
model: 42
|
||||
provider: 7
|
||||
" "" "42" "7"
|
||||
|
||||
# --- Mixed: env var set for one, YAML provides the other ---
|
||||
run_case "partial env override" "runtime_config:
|
||||
model: minimax/MiniMax-M2.7-highspeed
|
||||
provider: minimax
|
||||
" "export HERMES_INFERENCE_PROVIDER=anthropic" "minimax/MiniMax-M2.7-highspeed" "anthropic"
|
||||
|
||||
echo
|
||||
echo "Total: PASS=$PASS FAIL=$FAIL"
|
||||
[ "$FAIL" -eq 0 ]
|
||||
21
start.sh
21
start.sh
@ -107,6 +107,21 @@ chmod 600 "$ENV_FILE"
|
||||
# the selection; operators override via HERMES_INFERENCE_PROVIDER
|
||||
# + HERMES_DEFAULT_MODEL env, or by editing config.yaml at runtime
|
||||
# inside the container.
|
||||
# Pull HERMES_DEFAULT_MODEL + HERMES_INFERENCE_PROVIDER out of
|
||||
# /configs/config.yaml (canvas Config tab values, written by CP
|
||||
# user-data per task #197). Env-var overrides still win — the helper
|
||||
# only sets vars that aren't already set. Sourced for env mutation.
|
||||
# Dockerfile COPYs scripts/ to /app/scripts; fall back to /scripts
|
||||
# for dev environments running start.sh with a different WORKDIR.
|
||||
#
|
||||
# Runs BEFORE the API-key-based auto-selection block below so a
|
||||
# canvas-set provider/model wins over a key-presence guess. Operators
|
||||
# who explicitly picked GLM-4.6 in the UI shouldn't get bumped to
|
||||
# anthropic/* just because ANTHROPIC_API_KEY happens to be in env too.
|
||||
LOAD_CONFIG_SCRIPT="/app/scripts/load-workspace-config.sh"
|
||||
[ -f "$LOAD_CONFIG_SCRIPT" ] || LOAD_CONFIG_SCRIPT="/scripts/load-workspace-config.sh"
|
||||
[ -f "$LOAD_CONFIG_SCRIPT" ] && . "$LOAD_CONFIG_SCRIPT"
|
||||
|
||||
# Pick a default model. The fallback used to be `nousresearch/hermes-4-70b`
|
||||
# unconditionally, which derives PROVIDER=openrouter when no Nous key is
|
||||
# present — and if OPENROUTER_API_KEY isn't set either, hermes-agent boots
|
||||
@ -119,9 +134,9 @@ chmod 600 "$ENV_FILE"
|
||||
# Fix: when HERMES_DEFAULT_MODEL is unset and HERMES_INFERENCE_PROVIDER
|
||||
# is unset, pick the default model based on which API key is actually
|
||||
# present in env. Keeps the behaviour-when-everything-is-set unchanged
|
||||
# (operator-supplied HERMES_DEFAULT_MODEL still wins). Order below is
|
||||
# rough preference (direct providers preferred over OR routing for the
|
||||
# same model family).
|
||||
# (operator-supplied HERMES_DEFAULT_MODEL still wins, including the
|
||||
# config.yaml-sourced one above). Order below is rough preference
|
||||
# (direct providers preferred over OR routing for the same model family).
|
||||
if [ -z "${HERMES_DEFAULT_MODEL:-}" ] && [ -z "${HERMES_INFERENCE_PROVIDER:-}" ]; then
|
||||
if [ -n "${HERMES_API_KEY:-}" ] || [ -n "${NOUS_API_KEY:-}" ]; then
|
||||
HERMES_DEFAULT_MODEL="nousresearch/hermes-4-70b"
|
||||
|
||||
Loading…
Reference in New Issue
Block a user