P2-B internal#718: billing/credential derives from provider + only-registered validation (BEHAVIOR-AFFECTING; supersedes #1966) #1971
Reference in New Issue
Block a user
Delete Branch "feat/internal-718-p2b-billing-derives-from-provider"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
internal#718 P2-B — billing/credential decision DERIVES the provider (BEHAVIOR-AFFECTING)
Stacked on #1970 (P2-A) — base is the P2-A branch (the registry/loader it consumes). Re-target to
mainafter #1970 merges.Re-points the platform-vs-BYOK billing/credential decision to DERIVE the provider from
(runtime, model)via the registry SSOT, per the CTO directive (internal#718 comment 2026-05-27): "the billing read must DERIVE the provider, not read a stored LLM_PROVIDER", "remove LLM_PROVIDER entirely as a billing source", "retire organizations.llm_billing_mode as a billing source".⚠ BEHAVIOR DELTA (tested explicitly)
What changed
ResolveLLMBillingModeDerived(ctx, ws, runtime, model, authEnv)— new SSOT resolver: explicitworkspaces.llm_billing_modeoverride (precedence 1, the only stored billing signal that survives — operator escape hatch) → elseDeriveProvider+IsPlatform→ else default-closedplatform_managed.ResolveLLMBillingMode(ctx, ws, orgMode)legacy signature kept for callers without (runtime, model) (admin route, secrets remote-pull): reads stored runtime + MODEL + auth-env from DB, delegates to the derived resolver.orgModeRETIRED/ignored — org rung gone.applyPlatformManagedLLMEnvderives directly (has runtime+model+env). Feeds the merged #1963 strip + fail-closed the correct DERIVED signal.validateRegisteredModelForRuntime+WorkspaceHandler.Create): an unregistered(runtime, model)on a registry-known runtime is flagged (X-Molecule-Model-Unregisteredheader + warning log). P2 = WARN mode, not hard-422 — the legacy colon-namespaced model vocab ("anthropic:claude-opus-4-7") is still live across the create/import/template corpus and isn't yet reconciled into the registry; hard-reject is a one-line flip gated on P3/P4 vocabulary convergence. Fails OPEN for non-registry runtimes (langgraph/external/kimi/mock/federated).Tests (TDD)
llm_billing_mode_derived_test.go— platform/non-platform/unset/override/unregistered/auth-env table + DB-error default-closed + empty-id.workspace_provision_shared_test.go— DERIVED platform→unchanged, non-platform→byok+strip+fail-closed (the FIX), unset→platform default, through the realapplyPlatformManagedLLMEnvpath. Existing #1963 override-byok tests unchanged.model_registry_validation_test.go+workspace_test.go— warn / no-warn / fail-open.Build/CI
go build ./...(+-tags=integration) green; fullgo test ./...(43 pkgs) green incl.-raceon handlers; vet clean; changed files gofmt-clean.BEHAVIOR-AFFECTING. Supersedes #1966. Cross-link internal#718 (P2-B). Do NOT merge.
Five-Axis review — agent-reviewer (independent; author hongming ≠ reviewer)
Reviewed against the #1971-branch diff vs its base (#1970 @71c68e4); head
3165b98. Built + ran the fullinternal/handlers+internal/providerssuites with-race, plus integration-tag build, vet, gofmt — all green. Confirmed the behavior-delta tests are LOAD-BEARING via targeted source mutations (each mutant → RED).VERDICT: APPROVED. Behavior delta is correct, default-platform-on-unset holds (no fleet fail-close), the #1966 stored-LLM_PROVIDER billing-read is gone, only-registered ships in WARN mode, and #1963/cp#362 are intact. One NON-BLOCKING coordination follow-up below.
Axis 1 — Behavior delta (load-bearing). PASS
All three cases verified through the REAL
applyPlatformManagedLLMEnvpath (not a mock), registry data confirmed via a throwaway DeriveProvider probe:anthropic/claude-opus-4-7→platform) AND unset→default ⇒platform_managed⇒ CP proxy creds injected — UNCHANGED. (workspace_provision_shared_test.goDERIVED_PlatformModelKeepsPlatformCreds / DERIVED_UnsetModelPlatformDefault)kimi-for-coding→kimi-coding,gpt-5.5→openai) ⇒byok⇒ #1963 strips the platform scope:global OAuth token +HasUsableLLMCred=false⇒ caller fail-closes (workspace_provision_shared.go:217). This is the Reno-leak fix. (DERIVED_NonPlatformModelStripsAndFailsClosed)Axis 2 — No silent breakage of working workspaces (fleet-safety). PASS
Derive-failure (no model / unknown runtime / unregistered / ambiguous / registry-unavailable / DB error) ALWAYS defaults closed to
platform_managed(llm_billing_mode.goResolveLLMBillingModeDerived precedence-3 + the manifest-nil + override-DB-error branches). A wrong default here would fail-close the whole platform fleet; the mutation flipping this to byok went RED across unset/unregistered/unknown-runtime, so the guard is real.DeriveProvidererrors (never silently returns a non-platform provider) on empty model — verified in the registry source.Axis 3 — #1966 stored-read fully removed. PASS
Billing decision derives from
(runtime, model, authEnv)only. No production path reads storedLLM_PROVIDERto decide billing.normalizeOrgDefaultdeleted (no callers);BillingModeSourceOrgDefaultkept as a wire-compat const but NEVER returned as a Source in prod;organizations.llm_billing_modenever queried.workspaces.llm_billing_modesurvives only as precedence-1 explicit operator override. The remainingLLM_PROVIDERreads (GetProvider/SetProvider/setProviderSecret/deriveProviderFromModelSlug) are the orthogonal CP user-data provider-HINT path (controlplane #364), not billing — internal#718 schedules their demote-to-write-through-cache for a later phase, correctly out of P2-B scope.Axis 4 — only-registered is WARN, not hard-reject. PASS
workspace.goCreate setsX-Molecule-Model-Unregistered: true+ a warning log and PROCEEDS (201) for an unregistered (runtime, model) on a registry-known runtime; the hard-422 is commented/gated on P3/P4 vocab convergence. Fails OPEN for non-registry runtimes (langgraph/external/mock/kimi/federated). MUTATION PROOF: flipping WARN→422 turned the warns-but-proceeds test RED (expected 201). Does not 422 the legacy colon vocab.Axis 5 — #1963 reuse + cp#362/P1 untouched. PASS
stripGlobalOriginLLMCreds/hasAnyPlatformManagedLLMKeybodies are byte-identical vs base (only the call-site reference appears in the diff). #1971 touches ONLYworkspace-server/internal/handlers— no proxyResolveUpstream, no controlplane, no cp#362. Existing #1963 byok override tests pass unchanged.Axis 6 — Registry usage. PASS
Uses the #1970 core registry
DeriveProvider/IsPlatform/ModelsForRuntimecorrectly.IsPlatformis the closedName=="platform"predicate. The authEnv fed to DeriveProvider is the REAL workspace env: on the provision path viaavailableAuthEnvNames(envVars), and on the legacy-shim path viareadWorkspaceDeriveInputsscanning recognized auth-env secret KEYS (never values) — verifiedopus+CLAUDE_CODE_OAUTH_TOKEN→anthropic-oauth(byok), so oauth-vs-api disambiguation fires identically to the canvas.FOLLOW-UP (non-blocking, coordination — does not affect #1971 correctness)
Three open PRs concurrently retire the org rung from the SAME resolver (
llm_billing_mode.go):refactor/drop-org-tier-llm-billing-mode(base=main) makes resolutionworkspaces.llm_billing_mode ?? platform_managed— i.e. the stored column stays the PRIMARY signal.#1971 instead makes the DERIVED provider primary and demotes the column to an override. #1930 and #1971 are on a collision course: if #1930 lands AFTER #1971 it would revert derive-from-provider back to a stored-read. Recommend, before #1971 re-targets/merges to main: close #1966 and reconcile #1930/#1931 (close, or rebase the canvas drop on #1971's override-only model). Flagging per the SOP duplicate-fix-revert hazard — not a defect in #1971.
Static review only. Nothing merged/pushed/touched-live; build+test were on a throwaway worktree at the PR head.
2nd approval (claude-ceo-assistant). Concur with agent-reviewer Five-Axis verdict (CTO-approved batch). Merge once required checks green.