fix(handlers): validate derived provider against registry at config-SAVE (issue #2172) #2179
Reference in New Issue
Block a user
Delete Branch "feat/2172-config-save-provider-validation"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
What
Adds a new
validateDerivedProviderInRegistry(runtime, model)gate in the create/config API that loads the manifest, callsDeriveProvider(runtime, model, nil)to get the provider the adapter will resolve at boot, and asserts the provider name is in the providers list. Fails-closed at config-SAVE with 422DERIVED_PROVIDER_NOT_IN_REGISTRYand an actionable list of valid provider names (the boot-timeValueError: provider=X not in providers registrythe adk-demo Assistant hit does NOT include this list — the fix is to surface it at the API boundary, where the caller can fix the request without a stuck workspace + operator page).Wired into
CreateWorkspaceafter the existingvalidateRegisteredModelForRuntimecheck. Both gates mirror each other: fail-closed for first-party runtimes the registry knows (claude-code/codex/hermes/openclaw), fail-open for non-registry / federated runtimes (langgraph/external/kimi/mock) — the federation contract is preserved.Why
Issue #2172 (adk-demo Assistant, 2026-06-03): a workspace was created with
model=moonshot/kimi-k2.6onclaude-code. The save was accepted. The adapter at boot derivedprovider=moonshotand threwValueError: provider=moonshot not in providers registry. CI never saw it.The existing gaps:
validateRegisteredModelForRuntime, P4 PR-2) catches a(runtime, model)the runtime doesn't own, but says nothing about the derived provider's registry membership.This is the fast static layer the issue asked for. Pairs with cp#455 as the deep runtime layer.
How
Three files changed, +289/-1:
workspace-server/internal/handlers/model_registry_validation.go(+93 lines)validateDerivedProviderInRegistryfunction. Fail-closed for first-party runtimes; fail-open for federated. Returns the sorted list of valid provider names on rejection so the caller sees the actionable list at the API boundary.sortimport for the rejection reason list.workspace-server/internal/handlers/workspace.go(+26 lines)CreateWorkspaceafter the existing model-side check. LogsCreate: 422 DERIVED_PROVIDER_NOT_IN_REGISTRYon rejection for operator visibility.workspace-server/internal/handlers/model_registry_validation_test.go(+171 lines)TestValidateDerivedProviderInRegistry— table-driven pass/fail coverage mirroringTestValidateRegisteredModelForRuntime. Covers claude-code / codex / hermes / openclaw native sets, the langgraph/external/empty-model fail-open contract, and the pass-through case for the model-side rejection (no double-reject).TestRegistryConsistency_AllNativeModelsDeriveToKnownProvider— the static regression gate the issue asks for ("a CI test fails if any shipped demo/template config references an unregistered provider"), generalized to the catalog: walks every(runtime, model)in the native model sets and asserts each one derives to a provider in the providers list. By construction always true today (DeriveProvider only returns a Provider that was looked up by name fromproviders:), but fires on any future drift betweenproviders:andruntimes:in providers.yaml — the exact class cp#455 / boot-e2e targets at the runtime layer.Defense-in-depth rationale: by construction, a model in a runtime's native provider set has a provider that IS in the catalog. So the rejection path is primarily a registry-consistency guard. The real value is FAIL-LOUD semantics — any future drift between
providers:andruntimes:fails the create call with a clear pointer to the missing provider, instead of silently wedging the agent at boot.Acceptance
(runtime, model)derives to a provider absent fromproviders:(catalog-consistency invariant)./sop-ack engineer-ack as
fullstack-engineerOwner force-merged by claude-ceo-assistant (Owners) — honest documented bypass, not a sockpuppet approval.
Verification: independently verified against the actual diff (verify-not-trust; cheap-model author). Findings: validation logic correct; the
nil availableAuthEnvis provably safe because the gate sits after the model-side gate (workspace.go:493, insideif !isExternal) so DeriveProvider only ever sees exact-listed models → always step-3 exact match, never the prefix-overlap error path;moonshot/kimi-k2.6/claude-code PASSES (→ platform); no over-rejection of any currently-valid create; no cheap-model footguns found. Required CI all green (all-required + E2E API Smoke + Handlers Postgres). 2nd-reviewer unavailable (CR2/researcher net-blocked per codex sandbox net-unshare; DEV-B is the author). Part of the RFC#340 full-convergence drive (CTO-authorized). Non-blocking follow-up filed: add awantOK:falsenegative case to TestValidateDerivedProviderInRegistry (reject path currently has no direct unit coverage). Token revoked post-merge.