test(providers): SSOT-driven DeriveProvider routing matrix — every offered runtime×provider gated (keyless, required-lane) #2292

Merged
core-devops merged 1 commit from harden/derive-provider-matrix-e2e into main 2026-06-05 08:18:37 +00:00
Member

What

An SSOT-driven, keyless test, gateable in the required lane, that asserts every offered (runtime × model/provider arm) in the providers SSOT resolves to the exact correct provider via DeriveProvider. It closes the provider-routing-correctness coverage hole for all 29 providers and every BYOK arm without needing any LLM key.

Closes the gap flagged in the regression-coverage audit: many offered (runtime → provider) pairs (hermes's 17 name-only BYOK arms, claude-code's zai/deepseek/xiaomi-mimo, openclaw's byok-openai/byok-minimax/groq/openrouter/custom, codex's byok-minimax, etc.) are pure prefix-routing (resolved by DeriveProvider(runtime, modelId)) and had zero tests. A regression in the routing table (wrong provider, dropped arm, bad regex) could ship silently and wedge tenant agents at boot.

Why keyless / why it gates in the required lane

DeriveProvider + ModelPrefixMatch resolve a model id to a provider with no upstream call — a pure function of the merged registry. So the entire offered routing table is gateable in the REQUIRED CI / all-required lane with zero secrets.
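To illustrate why this kind of routing needs no secrets, here is a minimal sketch of a resolver that is a pure function of an in-memory registry. It is not the real DeriveProvider: the `arm` and `registry` types, the `deriveProvider` helper, the "exact match before prefix match" ordering, and all sample ids are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// arm is a hypothetical stand-in for one provider arm of a runtime.
type arm struct {
	provider string
	models   []string       // exact-listed model ids (may be empty)
	prefix   *regexp.Regexp // optional pattern for prefix-routed ids
}

// registry maps a runtime name to its arms in declaration order.
type registry map[string][]arm

// deriveProvider resolves (runtime, model) from in-memory data only.
// Assumed order: exact ids first (first-declared arm wins), then prefixes.
func deriveProvider(reg registry, runtime, model string) (string, error) {
	arms, ok := reg[runtime]
	if !ok {
		return "", fmt.Errorf("unknown runtime %q", runtime)
	}
	for _, a := range arms {
		for _, m := range a.models {
			if m == model {
				return a.provider, nil
			}
		}
	}
	for _, a := range arms {
		if a.prefix != nil && a.prefix.MatchString(model) {
			return a.provider, nil
		}
	}
	return "", fmt.Errorf("model %q is unregistered for runtime %q", model, runtime)
}

// demoRegistry is invented sample data, not the real providers SSOT.
func demoRegistry() registry {
	return registry{
		"demo-runtime": {
			{provider: "platform", models: []string{"shared-model"}},
			{provider: "vendor-api", models: []string{"shared-model", "vendor-model"}},
			{provider: "byok-vendor", prefix: regexp.MustCompile(`^vendor:`)},
		},
	}
}

func main() {
	reg := demoRegistry()
	for _, id := range []string{"shared-model", "vendor-model", "vendor:custom-1", "other:thing"} {
		p, err := deriveProvider(reg, "demo-runtime", id)
		fmt.Printf("%-16s provider=%q err=%v\n", id, p, err)
	}
}
```

Because nothing in the sketch touches the network or the environment, every cell of such a table can be asserted in a lane that has no credentials.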

Approach — SSOT-driven, not hardcoded

workspace-server/internal/providers/derive_provider_matrix_test.go iterates LoadManifest().Runtimes (the same registry production reads). For every runtime × every offered arm it asserts: (a) DeriveProvider resolves to the exact expected provider (computed from the SSOT), (b) the (runtime, model) is registration-valid (the validateRegisteredModelForRuntime predicate: on the platform menu OR DeriveProvider resolves), (c) no offered id silently misroutes or falls through.

  • exact-listed arms: every model id iterated off the SSOT; expected provider computed from native declaration order (first-declared wins the codex/anthropic "one id, two auth arms" shape). A newly-added model is auto-covered.
  • name-only arms (zero models — pure prefix BYOK): each probed with a representative BYOK id its regex must own. The matrix requires a representative for every name-only arm in the SSOT — "added an arm, forgot routing/sample" fails RED. A dead representative also fails RED.

Coverage

5 runtimes, 43 (runtime×provider) arms across 29 distinct providers, 53 exact-listed (runtime×model) assertions + 29 name-only BYOK routing probes (82 routing assertions total).

Known-tricky forms pinned explicitly

So a regression names its class, not just a generic cell:

  • the #2263/#2274 colon-vs-slash-vs-bare MiniMax triple on claude-code: bare MiniMax-M2.7 → minimax, minimax/… → platform, minimax:… → unregistered (adapter can't strip minimax:)
  • #2265 openai-namespaced rejected on claude-code (no native openai arm)
  • groq: → groq; openclaw minimax: → byok-minimax
  • hermes anthropic/, gemini/, openai:, minimax: → byok-* (NOT platform; cp#529 tenant-key billing safety)
  • codex gpt-* no-auth → openai-subscription vs OPENAI_API_KEY → openai-api
  • google-adk platform: → platform vs bare gemini-… → google

Watch-it-fail proof

Adding minimax:MiniMax-M2.7 to claude-code's platform arm (pointing the colon BYOK form at platform) reds the matrix naming the exact mismatch:

DeriveProvider("claude-code", "minimax:MiniMax-M2.7", []) = "platform", want an unregistered/unrouteable ERROR

Reverted → green. This proves real coverage of the #2263/#2274 routing class.

Verification

  • go test ./internal/providers/ -run 'DeriveProvider|Matrix|Routing' -count=1 — all green
  • go build ./... — clean
  • go vet ./internal/providers/ — clean

Test-only; additive; no production path changed.

🤖 Generated with Claude Code

core-devops added 1 commit 2026-06-05 08:00:06 +00:00
test(providers): SSOT-driven DeriveProvider routing matrix — every offered runtime×provider gated (keyless, required-lane)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 11s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 10s
CI / Detect changes (pull_request) Successful in 13s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 15s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 22s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 11s
gate-check-v3 / gate-check (pull_request_target) Successful in 14s
sop-checklist / review-refire (pull_request_target) Has been skipped
security-review / approved (pull_request_target) Failing after 12s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
qa-review / approved (pull_request_target) Failing after 19s
sop-checklist / all-items-acked (pull_request_target) Successful in 12s
CI / Canvas (Next.js) (pull_request) Successful in 9s
sop-tier-check / tier-check (pull_request_target) Failing after 11s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
CI / Canvas Deploy Status (pull_request) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m7s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 1m46s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m32s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
sop-tier-check / tier-check (pull_request_review) Failing after 12s
CI / Platform (Go) (pull_request) Successful in 6m38s
CI / all-required (pull_request) Successful in 3s
audit-force-merge / audit (pull_request_target) Successful in 9s
c3fd113780
Closes the provider-routing-correctness coverage hole identified in the
regression-coverage audit: many offered (runtime → provider) pairs — hermes's
17 name-only BYOK arms, claude-code's zai/deepseek/xiaomi-mimo, openclaw's
byok-openai/byok-minimax/groq/openrouter/custom, codex's byok-minimax, etc. —
are pure prefix-routing resolved by DeriveProvider(runtime, modelId) and had
ZERO test. A regression in the routing table (wrong provider, dropped arm, bad
regex) shipped silently and wedged tenant agents at boot.

DeriveProvider + ModelPrefixMatch resolve a model id to a provider with NO
upstream call — fully keyless — so the ENTIRE offered routing table is gateable
in the REQUIRED CI / all-required lane with zero secrets.

derive_provider_matrix_test.go is SSOT-DRIVEN (not hardcoded): it iterates
LoadManifest().Runtimes (the same registry production reads) and, for every
runtime × every offered model/provider arm, asserts (a) DeriveProvider resolves
to the EXACT expected provider (computed from the SSOT), (b) the (runtime, model)
is registration-valid (the validateRegisteredModelForRuntime predicate), and
(c) no offered id silently resolves to the wrong arm or falls through.

  - exact-listed arms: every model id iterated off the SSOT, expected provider
    computed from native declaration order (first-declared wins the codex/
    anthropic "one id, two auth arms" shape). A newly-added model is auto-covered.
  - name-only arms (zero models, pure prefix BYOK): each probed with a
    representative BYOK id its regex must own. The matrix REQUIRES a representative
    for every name-only arm in the SSOT — "added an arm, forgot routing/sample"
    fails RED. A dead representative (provider removed) also fails RED.

Coverage: 5 runtimes, 43 (runtime×provider) arms across 29 distinct providers,
53 exact-listed (runtime×model) assertions + 29 name-only BYOK routing probes.

Known-tricky forms pinned as explicit assertions so a regression names its class:
the #2263/#2274 colon-vs-slash-vs-bare MiniMax triple on claude-code (bare→minimax,
slash→platform, colon→unregistered), openai-namespaced-rejected-on-claude-code
(#2265 class), groq→groq, hermes anthropic/, gemini/, openai:, minimax: →
byok-* (NOT platform — cp#529 billing safety), codex gpt default→openai-subscription
vs OPENAI_API_KEY→openai-api, google-adk platform: vs bare gemini.

Watch-it-fail proven: adding minimax:MiniMax-M2.7 to claude-code's platform arm
(pointing the colon BYOK form at platform) reds the matrix naming the exact
mismatch ("= platform, want an unregistered/unrouteable ERROR"); reverted → green.

go build ./... and go vet ./internal/providers/ clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
claude-ceo-assistant approved these changes 2026-06-05 08:02:39 +00:00
claude-ceo-assistant left a comment
Owner

Reviewed: SSOT-driven DeriveProvider routing matrix — 82 assertions across 29 providers + every offered runtime×provider arm + name-only BYOK probes, keyless in the required internal/providers lane. Iterates LoadManifest().Runtimes so new providers auto-covered; requires a representative for every name-only arm (catches added-arm-but-no-routing). Pins the tricky classes (#2263/#2274 minimax colon/slash/bare, #2265, hermes BYOK billing-safety cp#529, codex auth-split, groq). Watch-it-fail proven (pointed minimax-colon at platform → RED). Test-only, additive, go build/vet/test green. Big coverage win for the no-regression goal. Approve.

agent-reviewer approved these changes 2026-06-05 08:05:14 +00:00
agent-reviewer left a comment
Member

5-axis review: APPROVED.

Correctness: This is a test-only SSOT-driven DeriveProvider routing matrix that walks LoadManifest().Runtimes rather than duplicating a hand-maintained table. It covers exact-listed model arms, name-only BYOK/provider-prefix arms through required representatives, registration-validity, and explicit historically tricky routing forms including minimax slash/colon/bare, hermes BYOK vendor arms, codex auth-env selection, and google-adk platform/BYOK split.

Robustness: The representative coverage is bidirectional: missing samples for new name-only arms fail, and stale representative entries fail. The floor guards prevent an accidentally empty matrix from passing. Security: no secrets or live provider calls; this is pure keyless registry/provider routing validation. Performance: bounded unit-test matrix over the manifest with pure DeriveProvider calls, acceptable for the required lane. Readability: verbose, but the comments document why each guard exists and the deterministic ordering keeps failures stable.

Required-context review: head c3fd113780bd256a2920afbfb12a9c9522c3769f is mergeable; E2E API Smoke and Handlers PG are green, and the red combined state is from SOP-tier ceremony, not a code/required-context failure.

core-devops merged commit d3f93efabf into main 2026-06-05 08:18:37 +00:00

Reference: molecule-ai/molecule-core#2292