Commit Graph

1 Commits

Author SHA1 Message Date
Hongming Wang
5ebe6ccb33 test: regression guards for 2026-04-23 hermes + CP bug wave
Three complementary regression tests for the chain of P0s fixed today.
Each targets a specific bug class that reached production, and will
fire loud if any of them regress.

## 1. E2E A2A assertion enhancements (tests/e2e/test_staging_full_saas.sh)

The existing A2A check looked for "error|exception" in the response text,
which was too broad and missed the actual error patterns we hit. Now
matches each known error class individually with a diagnostic fail
message pointing at the exact bug:

  - "[hermes-agent error 401]"        → hermes #12 (API_SERVER_KEY)
  - "hermes-agent unreachable"        → gateway process died
  - "model_not_found"                 → hermes #13 (model prefix)
  - "Encrypted content is not supported" → hermes #14 (api_mode)
  - "Unknown provider"                → bridge PROVIDER misconfig

Also asserts the response contains the PONG token the prompt asked for —
catches silent-truncation/echo regressions.

## 2. Hermes install.sh bridge shell harness (tools/test-hermes-bridge.sh)

4 scenarios × 16 assertions, all offline (no docker, no network):

  - openai-bridge-happy: OPENAI_API_KEY + openai/gpt-4o →
    provider=custom, model="gpt-4o" (prefix stripped),
    api_mode=chat_completions
  - operator-custom-wins: explicit HERMES_CUSTOM_* → bridge skipped
  - openrouter-not-touched: OPENROUTER_API_KEY → provider=openrouter,
    slug kept
  - non-prefixed-model: bare "gpt-4o" → prefix-strip is a no-op

Runs in <1s, can be wired into template-hermes CI. Pins the exact
config.yaml shape — any drift in derive-provider.sh or the bridge
if-block breaks a test.

## 3. Canvas ConfigTab hermes tests (ConfigTab.hermes.test.tsx)

5 vitest cases covering the #1894 bugs:

  - Runtime loads from workspace metadata when config.yaml missing
  - "No config.yaml found" red error hidden for hermes
  - Hermes info banner shown instead
  - Langgraph workspace still sees the red error (regression-guard the
    other way)
  - config.yaml runtime wins over workspace metadata when present

## Running

  bash tools/test-hermes-bridge.sh                # 16 assertions
  cd canvas && npx vitest run src/components/tabs/__tests__/ConfigTab.hermes.test.tsx  # 5 cases
  # E2E enhancements ride on the existing staging E2E workflow

## Not yet covered (tracked in #1900)

CP admin delete-tenant EC2 cascade, cp-provisioner instance_id
lookup (#1738), purge audit SQL mismatch (#241), and pq prepared-
statement cache collision (#242). These are in-controlplane-repo
concerns — separate PR with CP-side sqlmock + integration tests.

Closes items in #1900.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 17:45:13 -07:00