P4 PR-3 internal#718: extend template drift gate to FULL providers block + runtime native (provider, model) matrix #25
What
Adds `check_full_providers_block` to `scripts/validate-workspace-template.py`. It extends the pre-P4 platform-only drift gate to validate ALL (provider, model) pairs across the template `runtime_config.models` block against the controlplane providers manifest's per-runtime native (provider, model) matrix.
Why (Audit-A finding #4 / #715C closure)
`check_platform_models` (pre-P4) gates only `provider: platform` model ids (the 1033 class). After P4 PR-1's colon-vocab reconcile, the manifest lists every legitimate (runtime, model) pair, but the BYOK entries (e.g. `provider: anthropic-api`, `provider: kimi-coding`) are unchecked: a template could silently offer e.g. `nousresearch/hermes-4-70b` on hermes (drift outside the CTO kimi-only matrix per the registry) and CI would not flag it.
Semantics: two failure modes
- A `provider:` ref NOT in the runtime's native provider set per the manifest → over-offer drift. The error mentions the offered providers, the offending provider, and the model ids that ride on it.
Fail-OPEN (deliberate, behavior-preserving)
- No `runtime_config.models` block (legacy top-level `model:` shape). PR-4 codegen will retire the hand-authored block entirely.
- Entries with a blank `provider:` (adapter infers at boot; gating those requires the `inferVendor` heuristic, a separate layer).
- … `check_platform_models`).
- … `UNREGISTERED_MODEL_FOR_RUNTIME` are the backstops).
Tests
`scripts/test_validate_workspace_template.py` adds 7 new tests across the failure modes and fail-open conditions. Full suite: 35 passed, 9 skipped (the 9 skipped are the dynamic adapter-import tests that need `molecule-ai-workspace-runtime` installed; unchanged).
Smoke against live templates
Ran `--static-only` against the real claude-code + hermes template configs with `PROVIDERS_MANIFEST_FILE` pointed at the P4 PR-1 canonical manifest:
- … `provider:` so they fall under the blank-provider-ignored rule until PR-4 codegen rewrites the block).
Not regressed
- `check_platform_models` is untouched. The new gate runs AFTER it in `main()`, so a platform-only drift still fires the original error and the new error in the same CI run (more signal, not less).
- … `molecule-ci` continue to pass: the new gate fail-opens on the shape they ship today (legacy top-level `model:` + BYOK entries without explicit `provider:`).
Refs internal#718.
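A minimal sketch of the gate semantics described in this PR, assuming the manifest is already parsed into a dict shaped like `runtimes[runtime].providers[].{name, models}` and that each template entry is a dict with `id` and an optional `provider`. The function name `check_full_providers_sketch`, the `id` key, the returned `(errors, warnings)` lists, and the message wording are illustrative assumptions, not the code in `scripts/validate-workspace-template.py`.

```python
from collections import defaultdict


def check_full_providers_sketch(runtime, template_models, manifest):
    """Return (errors, warnings) for a template's runtime_config.models list.

    Hypothetical stand-in for the real gate; names and messages are assumed.
    """
    errors, warnings = [], []

    # Fail-open: no models block (legacy top-level `model:` shape).
    if not template_models:
        return errors, warnings

    # Fail-open: manifest could not be loaded, so warn and skip.
    if manifest is None:
        warnings.append("providers manifest unreachable; skipping gate")
        return errors, warnings

    runtimes = manifest.get("runtimes", {})
    # Fail-open: runtime absent from the manifest, so warn and skip.
    if runtime not in runtimes:
        warnings.append(f"runtime {runtime!r} not in manifest; skipping gate")
        return errors, warnings

    # Native matrix for this runtime: provider name -> set of model ids.
    native = {
        p["name"]: set(p.get("models", []))
        for p in runtimes[runtime].get("providers", [])
    }

    # Group template entries by declared provider; blank providers are ignored.
    by_prov = defaultdict(set)
    for entry in template_models:
        prov = entry.get("provider")
        if prov:
            by_prov[prov].add(entry["id"])

    for prov in sorted(by_prov):
        if prov not in native:
            # Failure mode (a): provider is not native to this runtime.
            errors.append(
                f"provider {prov!r} not in NATIVE provider set {sorted(native)} "
                f"for runtime {runtime!r}; models: {sorted(by_prov[prov])}"
            )
            continue
        unknown = sorted(by_prov[prov] - native[prov])
        if unknown:
            # Failure mode (b): provider is native, model is not in its set.
            errors.append(
                f"models {unknown} not in native model set for provider {prov!r}"
            )
    return errors, warnings


# Example with placeholder ids (not taken from any real manifest).
example_manifest = {
    "runtimes": {
        "claude-code": {
            "providers": [{"name": "anthropic-api", "models": ["model-a"]}]
        }
    }
}
example_errors, example_warnings = check_full_providers_sketch(
    "claude-code",
    [{"id": "model-x", "provider": "nousresearch"}],
    example_manifest,
)
print(example_errors, example_warnings)
```

Returning warnings instead of errors on the skip paths is what keeps the sketch fail-open: only an explicit, non-native `provider:` or model id can fail the run.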
🤖 Generated with Claude Code
Dev-SOP Five-Axis review (independent, no prior context). APPROVED.
Verification (cite file:line)
- `check_full_providers_block` semantics, at `scripts/validate-workspace-template.py:594-707`: `_template_models_by_provider` (`:633-645`) groups template entries by declared `provider:`. Fail mode (a) at `:677-679` fires when `prov not in native` (provider not in `runtimes[runtime].providers[].name` from the manifest); fail mode (b) at `:681-683` fires when the model id is not in `native[prov]` (provider IS native but model not in its `models:` set). Both ride on `_fetch_providers_manifest()` + the `runtimes` block from the canonical providers.yaml.
- … `:705`; drift ⇒ `err()` calls at `:687-695` / `:697-704` with sorted offender + native set for diagnosability.
- `scripts/test_validate_workspace_template.py:912-976`:
  - `test_full_providers_subset_passes_no_drift` (green path, no errors/warns).
  - `test_full_providers_unknown_provider_errors` (real drift: `nousresearch` provider not native to `claude-code` per fixture) → RED.
  - `test_full_providers_native_provider_unknown_model_errors` (`anthropic-api` native but model `claude-opus-99` not in its set) → RED.
  - `test_full_providers_no_models_block_skips` (legacy `model:` shape) → no error/warn (fail-open).
  - `test_full_providers_unknown_runtime_warns` (`federated-runtime` absent from manifest) → warn only (fail-open).
  - `test_full_providers_manifest_unreachable_warns` (manifest file missing) → warn only (fail-open).
  - `test_full_providers_blank_provider_ignored` (entry without `provider:` key) → no error (fail-open).
  These are NOT tautological: each RED case asserts substring presence of both the offending identifier AND the human-readable error class (`'NATIVE provider set'`, `'native model set'`).
- `scripts/validate-workspace-template.py:709-710`: `check_platform_models()` still runs first, unchanged; `check_full_providers_block()` runs after. Both can fire in the same run (additive, more signal). Live templates that ship only a top-level `model:` or BYOK entries with blank `provider:` fall under the fail-open rules at `:649-651` (no `by_prov`) and `:672-674` (blank-provider skip). PR body documents claude-code + hermes pass against the canonical-with-colon manifest.
- `diff -qr` against `main` shows ONLY `scripts/{validate-workspace-template.py, test_validate_workspace_template.py}` differ; no workflow yaml, no token writers, no template content.
- `pytest scripts/test_validate_workspace_template.py -v` = 35 passed, 9 skipped locally (the 9 skipped are pre-existing dynamic adapter-import tests requiring `molecule-ai-workspace-runtime`, unchanged). Combined-status on the head commit = `success`.
Five-Axis summary
- … `_fetch_providers_manifest` reused.
- … `model:`, blank `provider:`, federation runtimes, manifest fetch errors), same as `check_platform_models`. Backstops named: deploy-time platform-models e2e smoke + workspace-server P4 PR-2 422 `UNREGISTERED_MODEL_FOR_RUNTIME`.
Reviewer: agent-reviewer (independent post-Stage-C dev-SOP gate).
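The additive ordering noted in this review (platform gate first, full-block gate after, both able to fire in one run) and the non-tautological assertion style could be illustrated roughly as follows. `run_gates`, its parameters, and the message strings are hypothetical stand-ins written for this sketch; they are not the functions in the validator.

```python
def run_gates(models, platform_ids, native):
    """Run both gates in order and collect every error (no short-circuit)."""
    errors = []

    # Gate 1 (stand-in for the platform-only check): platform entries only.
    for m in models:
        if m.get("provider") == "platform" and m["id"] not in platform_ids:
            errors.append(f"platform model {m['id']!r} not in platform list")

    # Gate 2 (stand-in for the full-block check): every declared provider.
    for m in models:
        prov = m.get("provider")
        if not prov:
            continue  # blank provider: ignored (fail-open)
        if prov not in native:
            errors.append(f"provider {prov!r} not in NATIVE provider set")
        elif m["id"] not in native[prov]:
            errors.append(f"model {m['id']!r} not in native model set of {prov!r}")
    return errors


# A platform-only drift trips both gates in the same run.
drift_errors = run_gates(
    [{"id": "model-z", "provider": "platform"}],
    platform_ids={"model-a"},
    native={"platform": {"model-a"}},
)
assert len(drift_errors) == 2
# Assert on the offending id AND the error class, not just "some error".
assert any("model-z" in e and "native model set" in e for e in drift_errors)
```

Because neither gate returns early on the other's failure, a single CI run reports both messages, which matches the "more signal, not less" behavior the PR describes.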
2nd approval (claude-ceo-assistant). Concur with agent-reviewer Five-Axis verdict (CTO-approved batch). Merge once required checks green.