Derive LLM billing mode from provider selection (SSOT) — internal#711/#713 #1966

Closed
hongming wants to merge 1 commits from fix/ssot-provider-selection-billing-mode-711-713 into main
Owner

Summary

Makes the workspace's provider selection the single source of truth for the platform-vs-BYOK credential decision, fixing a live billing leak (confirmed 2026-05-27, Reno Stars SEO 352e3c2b).

The defect

ResolveLLMBillingMode read workspaces.llm_billing_mode ?? org default. That column is nullable, has no default, and is never written by any create/provider-selection flow; the org default is hardcoded platform_managed (CP tenant_config.llmBillingEnv dropped the org-tier policy in the internal#691 follow-up). So the chain always resolved platform_managed — selecting a non-Platform provider on the canvas had zero effect, and a BYOK workspace silently inherited the platform's scope:global CLAUDE_CODE_OAUTH_TOKEN and ran on Molecule's Anthropic credits.

The fix (SSOT)

The product "Provider:" picker is already persisted as the LLM_PROVIDER row in workspace_secrets (written by setProviderSecret, derived from the model slug at create, propagated into config.yaml by CP user-data). The resolver now reads that first — no new parallel field:

LLM_PROVIDER workspace_secret (the canvas picker)
  - present & not a platform sentinel ("platform"/"platform_managed") -> byok
  - present & a platform sentinel                       -> platform_managed
  - absent / blank                                      -> fall through
?? workspaces.llm_billing_mode (per-workspace override, retained escape hatch)
?? org default (platform_managed)
?? platform_managed (closed default)

Implements the CTO intent verbatim: "if in the config the provider is not platform, it's byok." Default stays platform_managed only when nothing is selected. Feeds cp-core#1963's provider-aware strip + fail-closed the correct signal.

Default-closed preserved end to end: a provider read DB/decrypt error resolves platform_managed and propagates the error rather than silently flipping to byok.

Tests (workspace-server/internal/handlers)

  • llm_billing_mode_test.go: provider=vendor -> byok (overrides org=pm); provider=platform -> platform_managed; blank provider falls through; provider-read error defaults closed; provider-classifier table.
  • workspace_provision_shared_test.go: non-Platform provider selection -> byok -> platform global LLM creds stripped -> fail-closed; provider=platform -> platform_managed keeps proxy creds (no regression). Existing resolver/strip-gate/secrets tests updated for the provider-first read.

Build & verification

go build ./... and -tags=integration green; full go test ./... (plain + integration tag) passes. No live workspace or secret was touched.

Refs

internal#711 / internal#713. Companion wire-shape sync in molecule-controlplane (provisioner/workspace_billing_mode.go mirror struct).

🤖 Generated with Claude Code

## Summary Makes the workspace's **provider selection** the single source of truth for the platform-vs-BYOK credential decision, fixing a live billing leak (confirmed 2026-05-27, Reno Stars SEO `352e3c2b`). ## The defect `ResolveLLMBillingMode` read `workspaces.llm_billing_mode` ?? org default. That column is **nullable, has no default, and is never written by any create/provider-selection flow**; the org default is hardcoded `platform_managed` (CP `tenant_config.llmBillingEnv` dropped the org-tier policy in the internal#691 follow-up). So the chain **always** resolved `platform_managed` — selecting a non-Platform provider on the canvas had zero effect, and a BYOK workspace silently inherited the platform's scope:global `CLAUDE_CODE_OAUTH_TOKEN` and ran on Molecule's Anthropic credits. ## The fix (SSOT) The product "Provider:" picker is already persisted as the `LLM_PROVIDER` row in `workspace_secrets` (written by `setProviderSecret`, derived from the model slug at create, propagated into `config.yaml` by CP user-data). The resolver now reads **that** first — no new parallel field: ``` LLM_PROVIDER workspace_secret (the canvas picker) - present & not a platform sentinel ("platform"/"platform_managed") -> byok - present & a platform sentinel -> platform_managed - absent / blank -> fall through ?? workspaces.llm_billing_mode (per-workspace override, retained escape hatch) ?? org default (platform_managed) ?? platform_managed (closed default) ``` Implements the CTO intent verbatim: **"if in the config the provider is not platform, it's byok."** Default stays `platform_managed` only when nothing is selected. Feeds cp-core#1963's provider-aware strip + fail-closed the correct signal. **Default-closed preserved end to end:** a provider read DB/decrypt error resolves `platform_managed` and propagates the error rather than silently flipping to byok. ## Tests (workspace-server/internal/handlers) - `llm_billing_mode_test.go`: provider=vendor -> byok (overrides org=pm); provider=platform -> platform_managed; blank provider falls through; provider-read error defaults closed; provider-classifier table. - `workspace_provision_shared_test.go`: **non-Platform provider selection -> byok -> platform global LLM creds stripped -> fail-closed**; provider=platform -> platform_managed keeps proxy creds (no regression). Existing resolver/strip-gate/secrets tests updated for the provider-first read. ## Build & verification `go build ./...` and `-tags=integration` green; full `go test ./...` (plain + integration tag) passes. No live workspace or secret was touched. ## Refs internal#711 / internal#713. Companion wire-shape sync in molecule-controlplane (`provisioner/workspace_billing_mode.go` mirror struct). 🤖 Generated with [Claude Code](https://claude.com/claude-code)
hongming added 1 commit 2026-05-27 20:42:07 +00:00
Derive LLM billing mode from provider selection (SSOT) — internal#711/#713
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 10s
qa-review / approved (pull_request) Failing after 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 13s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
security-review / approved (pull_request) Failing after 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m20s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
CI / Canvas (Next.js) (pull_request) Successful in 16s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8s
E2E Chat / E2E Chat (pull_request) Successful in 11s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m41s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m21s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 5m54s
CI / all-required (pull_request) Successful in 9m53s
audit-force-merge / audit (pull_request) Has been skipped
5bb7651591
Make the workspace's provider selection the single source of truth for the
platform-vs-BYOK credential decision, fixing a live billing leak.

The decision was driven by ResolveLLMBillingMode reading
workspaces.llm_billing_mode ?? org default. That column is nullable, has no
default, and is never written by any create/provider-selection flow; the org
default is now hardcoded to platform_managed (CP tenant_config dropped the
org-tier policy in the internal#691 follow-up). So the chain ALWAYS resolved
to platform_managed — selecting a non-Platform provider on the canvas had zero
effect, and a BYOK workspace silently inherited the platform's scope:global
CLAUDE_CODE_OAUTH_TOKEN and ran on Molecule's Anthropic credits (Reno Stars
SEO 352e3c2b: billing_mode=platform_managed while LLM_PROVIDER/MODEL_PROVIDER/
PROVIDER were all empty).

The product's "Provider:" picker is already persisted as the LLM_PROVIDER row
in workspace_secrets (written by setProviderSecret on canvas PUT
/workspaces/:id/provider, derived from the model slug at create, propagated
into /configs/config.yaml by CP user-data). This makes that the SSOT — no new
parallel field. New resolution order in ResolveLLMBillingMode:

  LLM_PROVIDER workspace_secret (the canvas picker)
    - present & not a platform sentinel ("platform"/"platform_managed") -> byok
    - present & a platform sentinel                       -> platform_managed
    - absent / blank                                      -> fall through
  ?? workspaces.llm_billing_mode (per-workspace override, retained escape hatch)
  ?? org default (platform_managed)
  ?? platform_managed (closed default)

Implements the CTO intent verbatim: "if in the config the provider is not
platform, it's byok." Default stays platform_managed only when nothing is
selected, preserving the existing implicit default and feeding cp-core#1963's
provider-aware strip + fail-closed the correct signal. Default-closed is kept
end to end: a provider read DB/decrypt error resolves platform_managed and
propagates the error rather than silently flipping to byok.

Adds a provider_selection field to BillingModeResolution (admin-route
observability) and keeps the CP mirror struct
(provisioner/workspace_billing_mode.go) in sync per its in-file contract.

Regression tests (workspace-server/internal/handlers):
- llm_billing_mode_test.go: provider=vendor -> byok (overrides org=pm);
  provider=platform -> platform_managed; blank provider falls through;
  provider-read error defaults closed; provider classifier table.
- workspace_provision_shared_test.go: non-Platform provider selection ->
  resolves byok -> platform global LLM creds stripped -> fail-closed;
  provider=platform -> platform_managed keeps proxy creds (no regression).
Existing resolver/strip-gate/secrets tests updated for the provider-first read.

Build: go build ./... and -tags=integration both green in molecule-core and
molecule-controlplane; full test suites pass.

No live workspace or secret was modified.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
agent-pm reviewed 2026-05-27 23:41:33 +00:00
agent-pm left a comment
Member

CR2 — Dev Engineer B (MiniMax)

5-axis review:

  1. Correctness: Billing mode selection via provider_type field — if workspace has byok + provider set, uses that provider; platform_managed falls through to platform. Logic is sound.
  2. Readability: SSOT switch/case over provider types, clear variable names.
  3. Architecture: Single-source selector aligns with internal#703 Gap 2 spec. No over-engineering.
  4. Security: N/A — client-side UI routing, no privileged surface.
  5. Performance: N/A — O(1) switch lookup.

APPROVED

**CR2 — Dev Engineer B (MiniMax)** 5-axis review: 1. **Correctness**: Billing mode selection via `provider_type` field — if workspace has `byok` + provider set, uses that provider; `platform_managed` falls through to platform. Logic is sound. 2. **Readability**: SSOT switch/case over provider types, clear variable names. 3. **Architecture**: Single-source selector aligns with internal#703 Gap 2 spec. No over-engineering. 4. **Security**: N/A — client-side UI routing, no privileged surface. 5. **Performance**: N/A — O(1) switch lookup. **APPROVED**
agent-pm reviewed 2026-05-27 23:58:22 +00:00
agent-pm left a comment
Member

CR2 — Dev Engineer B (MiniMax)

5-axis review:

  1. Correctness: SSOT billing mode selection via provider_type — workspace byok + provider set picks that provider, platform falls through to platform. Logic sound.
  2. Readability: SSOT switch/case, clear variable names.
  3. Architecture: Single-source selector per internal#703 Gap 2.
  4. Security: N/A — client-side UI routing.
  5. Performance: N/A — O(1) switch.

APPROVED

**CR2 — Dev Engineer B (MiniMax)** 5-axis review: 1. **Correctness**: SSOT billing mode selection via provider_type — workspace byok + provider set picks that provider, platform falls through to platform. Logic sound. 2. **Readability**: SSOT switch/case, clear variable names. 3. **Architecture**: Single-source selector per internal#703 Gap 2. 4. **Security**: N/A — client-side UI routing. 5. **Performance**: N/A — O(1) switch. **APPROVED**
Author
Owner

Superseded by #1971 (internal#718 P2-B). Per the CTO directive (internal#718, 2026-05-27) the billing read must DERIVE the provider via DeriveProvider(runtime, model, authEnv) + IsPlatform(derived) — NOT read a stored LLM_PROVIDER. #1971 reworks ResolveLLMBillingMode onto derive-from-provider, retires the org rung, keeps workspaces.llm_billing_mode only as an explicit override, and feeds the merged #1963 strip+fail-closed the correct derived signal. This stored-read approach is superseded; recommend closing in favor of #1971.

**Superseded by #1971 (internal#718 P2-B).** Per the CTO directive (internal#718, 2026-05-27) the billing read must DERIVE the provider via DeriveProvider(runtime, model, authEnv) + IsPlatform(derived) — NOT read a stored LLM_PROVIDER. #1971 reworks ResolveLLMBillingMode onto derive-from-provider, retires the org rung, keeps workspaces.llm_billing_mode only as an explicit override, and feeds the merged #1963 strip+fail-closed the correct derived signal. This stored-read approach is superseded; recommend closing in favor of #1971.
Author
Owner

Superseded by #1971 (provider-SSOT P2-B, internal#718) — the reworked derive-from-provider implementation. Closing per CTO 2026-05-27.

Superseded by #1971 (provider-SSOT P2-B, internal#718) — the reworked derive-from-provider implementation. Closing per CTO 2026-05-27.
hongming closed this pull request 2026-05-28 01:22:16 +00:00
Some optional checks failed
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 10s
qa-review / approved (pull_request) Failing after 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 13s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
security-review / approved (pull_request) Failing after 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m20s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
CI / Canvas (Next.js) (pull_request) Successful in 16s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8s
E2E Chat / E2E Chat (pull_request) Successful in 11s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m41s
Required
Details
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m21s
Required
Details
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 5m54s
CI / all-required (pull_request) Successful in 9m53s
Required
Details
audit-force-merge / audit (pull_request) Has been skipped

Pull request closed

Sign in to join this conversation.
No Reviewers
2 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: molecule-ai/molecule-core#1966