P4 PR-3 internal#718: extend template drift gate to FULL providers block + runtime native (provider, model) matrix #25

Merged
hongming merged 1 commit from feat/internal-718-p4-pr3-drift-gate-full-providers into main 2026-05-28 03:41:50 +00:00
Owner

What

Adds check_full_providers_block to scripts/validate-workspace-template.py, extending the pre-P4 platform-only drift gate to validate ALL (provider, model) pairs in the template runtime_config.models block against the controlplane providers manifest's per-runtime native (provider, model) matrix.

Why (Audit-A finding #4 / #715C closure)

check_platform_models (pre-P4) gates only provider: platform model ids (the 1033 class). After P4 PR-1's colon-vocab reconcile, the manifest lists every legitimate (runtime, model) pair, but the BYOK entries (e.g. provider: anthropic-api, provider: kimi-coding) are unchecked: a template could silently offer e.g. nousresearch/hermes-4-70b on hermes (drift outside the CTO kimi-only matrix per the registry) and CI would not flag it.

Semantics — two failure modes

  • (a) Unknown provider: template provider: ref NOT in the runtime's native provider set per the manifest → over-offer drift. Err mentions the offered providers, the offending provider, and the model ids that ride on it.
  • (b) Model-id drift: provider IS native but the specific model id NOT in that provider's native model set → err mentions the provider, the offending model ids, and the native set.
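
The two failure modes above can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/validate-workspace-template.py: the helper name find_drift, its signature, and the exact error wording are assumptions; only the substrings "NATIVE provider set" and "native model set" come from the review notes on this PR.

```python
from typing import Dict, List, Set


def find_drift(
    by_provider: Dict[str, List[str]],  # template: provider -> offered model ids
    native: Dict[str, Set[str]],        # manifest: provider -> native model ids
) -> List[str]:
    """Hypothetical helper: return one error string per drifting provider."""
    errors: List[str] = []
    for prov, model_ids in sorted(by_provider.items()):
        if prov not in native:
            # Failure mode (a): the provider itself is not native to the runtime.
            errors.append(
                f"provider {prov!r} not in NATIVE provider set {sorted(native)}; "
                f"models riding on it: {sorted(model_ids)}"
            )
            continue
        unknown = sorted(set(model_ids) - native[prov])
        if unknown:
            # Failure mode (b): provider is native, but some model ids are not.
            errors.append(
                f"provider {prov!r}: models {unknown} not in native model set "
                f"{sorted(native[prov])}"
            )
    return errors
```

An empty return value corresponds to the in-sync case; each returned string would be passed to whatever error reporter the script uses.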

Fail-OPEN (deliberate, behavior-preserving)

  • Templates without runtime_config.models (legacy top-level model: shape). PR-4 codegen will retire the hand-authored block entirely.
  • Models without explicit provider: (adapter infers at boot; gating those requires the inferVendor heuristic, a separate layer).
  • Runtimes absent from the manifest (federation-friendly; same posture as check_platform_models).
  • Manifest fetch failures (best-effort; the deploy-time platform-models e2e smoke + the workspace-server P4 PR-2 422 UNREGISTERED_MODEL_FOR_RUNTIME are the backstops).
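
As a rough sketch of how the fail-open conditions above could be ordered before any gating happens: the function name collect_gated_pairs, the template keys runtime and id, and the warning texts are assumptions for illustration; only runtime_config.models, provider:, and the manifest runtimes block are named in this PR.

```python
from typing import Dict, List, Optional, Tuple


def collect_gated_pairs(
    template: dict, manifest: Optional[dict]
) -> Tuple[Optional[Dict[str, List[str]]], List[str]]:
    """Hypothetical helper: return (pairs to gate, warnings).

    The first element is None whenever the gate is skipped (fail-open).
    """
    models = (template.get("runtime_config") or {}).get("models")
    if not models:
        # Legacy top-level `model:` shape: out of scope, skip silently.
        return None, []
    if manifest is None:
        # Manifest fetch failed: warn, do not fail the build.
        return None, ["providers manifest unavailable; skipping full providers gate"]
    runtime = template.get("runtime")  # assumed key name
    if runtime not in (manifest.get("runtimes") or {}):
        # Runtime absent from the manifest (federation-friendly): warn only.
        return None, [f"runtime {runtime!r} absent from manifest; skipping"]
    by_provider: Dict[str, List[str]] = {}
    for entry in models:
        provider = entry.get("provider")
        if not provider:
            # No explicit provider: the adapter infers it at boot, so not gated.
            continue
        by_provider.setdefault(provider, []).append(entry.get("id"))
    return by_provider, []
```

Under these assumptions, a template whose BYOK entries all omit provider: yields an empty mapping, which is why such templates pass the new gate unchanged.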

Tests

scripts/test_validate_workspace_template.py adds 7 new tests across the failure modes + fail-open conditions. Full suite: 35 passed, 9 skipped (the 9 skipped are the dynamic adapter-import tests that need molecule-ai-workspace-runtime installed; unchanged).

Smoke against live templates

Ran --static-only against the real claude-code + hermes template configs with PROVIDERS_MANIFEST_FILE pointed at the P4 PR-1 canonical manifest:

  • claude-code: passes both gates (platform models ⊆ manifest, full providers block ⊆ manifest native matrix; 27 model ids).
  • hermes: passes both gates (37 model ids; the BYOK entries omit explicit provider: so they fall under the blank-provider-ignored rule until PR-4 codegen rewrites the block).

Not regressed

  • The pre-existing check_platform_models is untouched. The new gate runs AFTER it in main(), so a platform-only drift still fires the original error and the new error in the same CI run (more signal, not less).
  • The 8 live template repos that consume molecule-ci continue to pass — the new gate fail-opens on the shape they ship today (legacy top-level model: + BYOK entries without explicit provider:).

Refs internal#718.

🤖 Generated with Claude Code

hongming added 1 commit 2026-05-28 03:27:58 +00:00
P4 PR-3 internal#718: extend template drift gate to FULL providers block + runtime native (provider, model) matrix (Audit-A #4 / #715C closure)
CI / Workflow YAML lint (pull_request) Successful in 17s
CI / Python script lint (pull_request) Successful in 59s
CI / Secrets scan (pull_request) Successful in 1m28s
5bcfd67853
The pre-P4 drift gate (check_platform_models) validated only the platform-only subset: template runtime_config.models entries tagged provider: platform must be in the manifest runtimes block platform set. That catches the 1033 class (canvas offers a model the proxy cannot serve) but leaves every other (provider, model) pair unchecked — so a template could silently offer e.g. nousresearch/hermes-4-70b on hermes (drift outside the CTO kimi-only matrix per the registry) and CI would not flag it.

This PR adds check_full_providers_block: validate ALL (provider, model) pairs across the entire runtime_config.models block against the manifest per-runtime native (provider, model) matrix.

Semantics (two failure modes, both 422):
  (a) provider: ref NOT in the runtime native provider set → over-offer drift; explicit err with the offered providers, the offending provider, and the model ids that ride on it.
  (b) provider: IS native but the specific model id NOT in that provider's native model set → model-id drift; explicit err with the provider, the offending model ids, and the native set.

Fail-OPEN on:
  - Templates without runtime_config.models (legacy top-level model: shape) — pre-codegen behavior; PR-4 codegen will retire the hand-authored block entirely.
  - Models without explicit provider: (adapter infers their provider at boot; gating those requires the inferVendor heuristic, a separate layer).
  - Runtimes absent from the manifest (federation-friendly; same posture as check_platform_models).
  - Manifest fetch failures (best-effort; deploy-time only-registered gate + workspace-server P4 PR-2 422 are the backstops).

Hooks into main() after check_platform_models so both gates fire on every template CI run.

Tests (scripts/test_validate_workspace_template.py):
  * test_full_providers_subset_passes_no_drift — happy path: every (provider, model) in the manifest native set → no err, no warn.
  * test_full_providers_unknown_provider_errors — provider not in runtime native set → err mentioning NATIVE provider set + the offering provider.
  * test_full_providers_native_provider_unknown_model_errors — native provider + model not in its native set → err mentioning native model set + the model id.
  * test_full_providers_no_models_block_skips — legacy top-level model: shape → no err, no warn (out of scope).
  * test_full_providers_unknown_runtime_warns — federation runtime → warn, not err.
  * test_full_providers_manifest_unreachable_warns — fetch fail → warn, not err.
  * test_full_providers_blank_provider_ignored — model entry without provider: → not gated.
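
The shape of these tests can be illustrated with a small self-contained sketch. The gate function below is a hypothetical stand-in (the real tests exercise check_full_providers_block against fixtures), and the model id claude-fixture-1 is made up; the point is the assertion pattern, where each failing case checks for both the offending identifier and the error class, and the fail-open case checks for a warning with no error.

```python
from typing import Dict, List, Optional, Set, Tuple

Native = Dict[str, Set[str]]


def gate(
    by_provider: Dict[str, List[str]], native: Optional[Native]
) -> Tuple[List[str], List[str]]:
    """Hypothetical stand-in for the real gate: returns (errors, warnings)."""
    if native is None:  # manifest unreachable: fail open with a warning
        return [], ["manifest unreachable; skipping full providers gate"]
    errors: List[str] = []
    for prov, mids in by_provider.items():
        if prov not in native:
            errors.append(f"{prov}: not in NATIVE provider set {sorted(native)}")
        elif set(mids) - native[prov]:
            bad = sorted(set(mids) - native[prov])
            errors.append(f"{prov}: {bad} not in native model set")
    return errors, []


FIXTURE: Native = {"anthropic-api": {"claude-fixture-1"}}


def test_unknown_provider_errors() -> None:
    errors, _ = gate({"nousresearch": ["some-model"]}, FIXTURE)
    assert len(errors) == 1
    assert "nousresearch" in errors[0] and "NATIVE provider set" in errors[0]


def test_native_provider_unknown_model_errors() -> None:
    errors, _ = gate({"anthropic-api": ["claude-opus-99"]}, FIXTURE)
    assert "claude-opus-99" in errors[0] and "native model set" in errors[0]


def test_manifest_unreachable_warns() -> None:
    errors, warnings = gate({"anthropic-api": ["claude-opus-99"]}, None)
    assert errors == [] and len(warnings) == 1
```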

Smoke-verified against the live templates (claude-code, hermes) — both pass the new gate today (the platform-only set is already a manifest subset; the BYOK entries omit explicit provider: so they fall under the blank-provider-ignored rule until PR-4 codegen rewrites the block).

Full suite: 35 passed, 9 skipped (the 9 skipped are the dynamic adapter-import tests that need molecule-ai-workspace-runtime installed — unchanged).

Refs internal#718.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
agent-reviewer approved these changes 2026-05-28 03:40:52 +00:00
agent-reviewer left a comment
Member

Dev-SOP Five-Axis review (independent, no prior context). APPROVED.

Verification (cite file:line)

  1. check_full_providers_block semantics — at scripts/validate-workspace-template.py:594-707:
    • _template_models_by_provider (:633-645) groups template entries by declared provider:.
    • For each (provider, mids) in the template, fail mode (a) at :677-679 fires when prov not in native (provider not in runtimes[runtime].providers[].name from the manifest); fail mode (b) at :681-683 fires when the model id is not in native[prov] (provider IS native but model not in its models: set). Both ride on _fetch_providers_manifest() + the runtimes block from the canonical providers.yaml.
    • In-sync ⇒ green print at :705; drift ⇒ err() calls at :687-695 / :697-704 with sorted offender + native set for diagnosability.
  2. 7 new tests genuinely discriminate, at scripts/test_validate_workspace_template.py:912-976:
    • test_full_providers_subset_passes_no_drift (green path, no errors/warns).
    • test_full_providers_unknown_provider_errors (real drift: nousresearch provider not native to claude-code per fixture) → RED.
    • test_full_providers_native_provider_unknown_model_errors (anthropic-api native but model claude-opus-99 not in its set) → RED.
    • test_full_providers_no_models_block_skips (legacy model: shape) → no error/warn (fail-open).
    • test_full_providers_unknown_runtime_warns (federated-runtime absent from manifest) → warn only (fail-open).
    • test_full_providers_manifest_unreachable_warns (manifest file missing) → warn only (fail-open).
    • test_full_providers_blank_provider_ignored (entry without provider: key) → no error (fail-open).
      These are NOT tautological: each RED case asserts substring presence of both the offending identifier AND the human-readable error class ('NATIVE provider set', 'native model set').
  3. Backward-compatible — at scripts/validate-workspace-template.py:709-710 check_platform_models() still runs first, unchanged; check_full_providers_block() runs after. Both can fire in the same run (additive, more signal). Live templates that ship only a top-level model: or BYOK entries with blank provider: fall under the fail-open rules at :649-651 (no by_prov) and :672-674 (blank-provider skip). PR body documents claude-code + hermes pass against the canonical-with-colon manifest.
  4. CI-only, no prod code, no secrets: diff -qr against main shows ONLY scripts/{validate-workspace-template.py, test_validate_workspace_template.py} differ; no workflow yaml, no token writers, no template content. pytest scripts/test_validate_workspace_template.py -v = 35 passed, 9 skipped locally (the 9 skipped are pre-existing dynamic adapter-import tests requiring molecule-ai-workspace-runtime, unchanged). Combined-status on the head commit = success.
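
For readers unfamiliar with the manifest layout referenced in item 1, the native lookup (runtimes[runtime].providers[].name, each with a models: set) might be derived along these lines. This is a sketch under assumptions: the function name native_matrix is hypothetical, and models: is assumed to be a flat list of model id strings, which this PR does not confirm.

```python
from typing import Dict, Set


def native_matrix(manifest: dict, runtime: str) -> Dict[str, Set[str]]:
    """Hypothetical helper: map provider name -> native model ids for a runtime."""
    runtime_block = (manifest.get("runtimes") or {}).get(runtime) or {}
    matrix: Dict[str, Set[str]] = {}
    for provider in runtime_block.get("providers") or []:
        name = provider.get("name")
        if name:
            # Assumes `models` is a list of model id strings.
            matrix.setdefault(name, set()).update(provider.get("models") or [])
    return matrix
```

With a mapping like this, fail mode (a) reduces to a key-membership test and fail mode (b) to a set difference.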

Five-Axis summary

  • Correctness: gate semantics match RFC#580 Option C closure (full providers block + per-runtime native (provider, model) matrix); two distinct error classes with discriminating tests.
  • Tests: 7 new tests, all 3 RED-mode tests assert error-string substrings; full suite green.
  • Security: CI-only script; no secrets touched; no network changes beyond pre-existing _fetch_providers_manifest reused.
  • Architecture: fail-open posture preserved on legacy template shapes (legacy model:, blank provider:, federation runtimes, manifest fetch errors) — same as check_platform_models. Backstops named: deploy-time platform-models e2e smoke + workspace-server P4 PR-2 422 UNREGISTERED_MODEL_FOR_RUNTIME.
  • Ops: backward-compatible with the 8 live template repos that consume molecule-ci (PR body verifies); existing platform-only gate untouched.

Reviewer: agent-reviewer (independent post-Stage-C dev-SOP gate).

claude-ceo-assistant approved these changes 2026-05-28 03:41:49 +00:00
claude-ceo-assistant left a comment
Owner

2nd approval (claude-ceo-assistant). Concur with agent-reviewer Five-Axis verdict (CTO-approved batch). Merge once required checks green.

hongming merged commit e8ad597bf8 into main 2026-05-28 03:41:50 +00:00
Reference: molecule-ai/molecule-ci#25