fix(canvas): link provider selection to llm_billing_mode (internal#703 Gap 2) #1935

Merged
hongming merged 1 commits from fix/703-provider-billing-mode-ui into main 2026-05-27 15:33:18 +00:00
Owner

Summary

UI half of internal#703 (Gap 2): wire the workspace Config-tab PROVIDER dropdown to the workspace's llm_billing_mode. Change is confined to canvas/src/components/tabs/ConfigTab.tsx.

The bug: selecting a non-Platform provider (e.g. "Claude Code subscription (OAuth)") wrote the credential env (CLAUDE_CODE_OAUTH_TOKEN / vendor key) but left llm_billing_mode=platform_managed. CP's tenant_config then kept injecting the platform proxy base URLs, so the OAuth token / vendor key was never used — BYOK silently no-op'd (live jrs-auto SEO-Agent symptom). The workspace-server even hard-blocks vendor-key writes on platform_managed workspaces (secrets.go:87), explicitly pointing at this billing-mode switch.

The fix: handleSave now derives the implied billing_mode from the selected provider and PUTs it to the existing per-tenant endpoint:

  • billingModeForProvider("platform") / ("")platform_managed
  • any other vendor key (anthropic-oauth, anthropic, minimax, openrouter, …) → byok
  • PUT /admin/workspaces/:id/llm-billing-mode body {mode} — same endpoint the LLM Billing section uses (per-tenant; the CP proxy /cp/admin/... exists too but canvas hits the tenant path directly).

Gated so it only fires when (a) the provider actually changed, (b) the provider credential PUT succeeded, and (c) the implied mode differs from the previous provider's — so a BYOK→BYOK vendor swap or an unrelated Save issues no redundant PUT / needless restart. A failed billing-mode write surfaces a partial-save warning so the user knows BYOK may not take.

Endpoint used

PUT /admin/workspaces/:id/llm-billing-mode — body {"mode": "byok" | "platform_managed"}. Modes validated server-side (platform_managed | byok | disabled); disabled is never auto-selected by a provider choice (it stays an explicit operator action in the LLM Billing section).

Composition with the parallel backend half

This is Gap 2 only. Gap 1 (CP tenant_config honoring the resolved billing_mode + emitting it into container env) is in flight on fix/internal-703-byok-billing-mode-env (touches workspace-server/internal/handlers/workspace_provision.go only — zero file overlap with this PR). The two compose: this PR makes billing_mode follow the provider; the backend half makes the resolved mode actually skip proxy injection for byok.

Five-Axis self-review

  • Correctness: billingModeForProvider trims + lowercases (handles "Platform"/mixed case); gating prevents redundant PUTs + double restarts.
  • Security: cookie-auth via existing api helper; admin endpoint server-side gated; no new trust surface.
  • Tests: discriminating — both directions (→byok, →platform_managed), BYOK→BYOK no-op, unchanged-provider no-op, provider-fail gating (no byok on failed credential write), billing-fail surfacing.
  • Consistency: mirrors the existing provider-save block (partial-error aggregation, change-gating).
  • Observability: failure surfaced to the user; no DB-mutating handler added on the canvas side.

Test plan

  • npm run build — clean (Next ESLint + types)
  • npx vitest run — full suite 3371 passed / 1 skipped (221 files); incl. 8 new ConfigTab.billingMode tests + existing 13 provider tests green
  • Human browser verification needed — headless cannot confirm the dropdown UX. Verify in a real workspace Config tab: pick "Claude Code subscription (OAuth)" → Save → confirm PUT .../llm-billing-mode {mode:"byok"} fires and (with the Gap-1 backend half deployed) the workspace stops routing through the proxy. Then verify switching back to Platform sets platform_managed.

Refs: internal#703, internal#691.

Approval: required_approvals=2; hongming cannot self-approve. Do not merge. Best merged together with / after the Gap-1 backend PR (fix/internal-703-byok-billing-mode-env) so BYOK works end-to-end.

## Summary UI half of internal#703 (Gap 2): wire the workspace Config-tab **PROVIDER** dropdown to the workspace's `llm_billing_mode`. Change is confined to `canvas/src/components/tabs/ConfigTab.tsx`. **The bug:** selecting a non-Platform provider (e.g. "Claude Code subscription (OAuth)") wrote the credential env (`CLAUDE_CODE_OAUTH_TOKEN` / vendor key) but left `llm_billing_mode=platform_managed`. CP's `tenant_config` then kept injecting the platform proxy base URLs, so the OAuth token / vendor key was never used — BYOK silently no-op'd (live jrs-auto SEO-Agent symptom). The workspace-server even hard-blocks vendor-key writes on platform_managed workspaces (`secrets.go:87`), explicitly pointing at this billing-mode switch. **The fix:** `handleSave` now derives the implied billing_mode from the selected provider and PUTs it to the existing per-tenant endpoint: - `billingModeForProvider("platform")` / `("")` → `platform_managed` - any other vendor key (`anthropic-oauth`, `anthropic`, `minimax`, `openrouter`, …) → `byok` - `PUT /admin/workspaces/:id/llm-billing-mode` body `{mode}` — same endpoint the LLM Billing section uses (per-tenant; the CP proxy `/cp/admin/...` exists too but canvas hits the tenant path directly). Gated so it only fires when (a) the provider actually changed, (b) the provider credential PUT succeeded, and (c) the implied mode differs from the previous provider's — so a BYOK→BYOK vendor swap or an unrelated Save issues no redundant PUT / needless restart. A failed billing-mode write surfaces a partial-save warning so the user knows BYOK may not take. ## Endpoint used `PUT /admin/workspaces/:id/llm-billing-mode` — body `{"mode": "byok" | "platform_managed"}`. Modes validated server-side (`platform_managed | byok | disabled`); `disabled` is never auto-selected by a provider choice (it stays an explicit operator action in the LLM Billing section). ## Composition with the parallel backend half This is **Gap 2 only**. **Gap 1** (CP `tenant_config` honoring the resolved billing_mode + emitting it into container env) is in flight on `fix/internal-703-byok-billing-mode-env` (touches `workspace-server/internal/handlers/workspace_provision.go` only — zero file overlap with this PR). The two compose: this PR makes billing_mode follow the provider; the backend half makes the resolved mode actually skip proxy injection for `byok`. ## Five-Axis self-review - **Correctness:** `billingModeForProvider` trims + lowercases (handles "Platform"/mixed case); gating prevents redundant PUTs + double restarts. - **Security:** cookie-auth via existing `api` helper; admin endpoint server-side gated; no new trust surface. - **Tests:** discriminating — both directions (→byok, →platform_managed), BYOK→BYOK no-op, unchanged-provider no-op, provider-fail gating (no byok on failed credential write), billing-fail surfacing. - **Consistency:** mirrors the existing provider-save block (partial-error aggregation, change-gating). - **Observability:** failure surfaced to the user; no DB-mutating handler added on the canvas side. ## Test plan - [x] `npm run build` — clean (Next ESLint + types) - [x] `npx vitest run` — full suite 3371 passed / 1 skipped (221 files); incl. 8 new `ConfigTab.billingMode` tests + existing 13 provider tests green - [ ] **Human browser verification needed** — headless cannot confirm the dropdown UX. Verify in a real workspace Config tab: pick "Claude Code subscription (OAuth)" → Save → confirm `PUT .../llm-billing-mode {mode:"byok"}` fires and (with the Gap-1 backend half deployed) the workspace stops routing through the proxy. Then verify switching back to Platform sets `platform_managed`. Refs: internal#703, internal#691. > Approval: required_approvals=2; `hongming` cannot self-approve. Do not merge. Best merged together with / after the Gap-1 backend PR (`fix/internal-703-byok-billing-mode-env`) so BYOK works end-to-end.
hongming added the tier:high label 2026-05-27 06:30:05 +00:00
hongming added 1 commit 2026-05-27 06:30:06 +00:00
fix(canvas): link provider selection to llm_billing_mode (internal#703 Gap 2)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 3s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 7s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 6s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m10s
qa-review / approved (pull_request) Refired via /qa-recheck by unknown
security-review / approved (pull_request) Refired via /security-recheck by unknown
CI / Platform (Go) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 9s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 5m13s
CI / all-required (pull_request) Successful in 30m1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 4s
46bb1eb7b4
Selecting a non-Platform provider in the workspace Config tab previously
wrote only the credential env (CLAUDE_CODE_OAUTH_TOKEN / vendor key) and
left llm_billing_mode at its resolved default (platform_managed). CP's
tenant_config then kept injecting the platform proxy base URLs, so the
OAuth token / vendor key was never used and BYOK silently no-op'd (the
live jrs-auto SEO-Agent symptom in #703). The workspace-server even
hard-blocks vendor-key writes on platform_managed workspaces, pointing
the user at this exact billing-mode switch.

ConfigTab.handleSave now derives the implied billing_mode from the
selected provider (Platform / empty -> platform_managed; any other
vendor -> byok) and, when the provider changed and the implied mode
differs, PUTs it to /admin/workspaces/:id/llm-billing-mode (the same
per-tenant endpoint the LLM Billing section uses). The write is gated
on the provider PUT succeeding and on the mode actually changing, so a
BYOK->BYOK vendor swap or an unrelated Save does not issue a redundant
PUT or trigger a needless restart. A failed billing-mode write is
surfaced as a partial-save warning so the user knows BYOK may not take.

This is the UI half of #703; the CP/workspace-server env-injection half
(Gap 1) lands in parallel (workspace_provision.go), composing cleanly.

Refs: internal#703, internal#691.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
agent-reviewer approved these changes 2026-05-27 06:42:09 +00:00
Dismissed
agent-reviewer left a comment
Member

Five-Axis review (agent-reviewer, internal#703 Gap 2 UI) — APPROVED

Correctness: billingModeForProvider (ConfigTab.tsx:319-323) maps ""/"platform" -> platform_managed, every other vendor -> byok. The PUT to /admin/workspaces/:id/llm-billing-mode is correctly gated TWICE (L750): only when (a) providerChanged && !providerSaveError AND (b) nextMode !== prevMode — so the billing PUT only fires on a provider change that actually crosses the platform<->byok boundary. A minimax->openrouter switch (byok->byok) does NOT re-PUT, avoiding a needless restart. The mode-changed gate is computed from billingModeForProvider(provider) vs billingModeForProvider(originalProvider), which is the right derivation.
Safety/Security: the byok PUT is suppressed if the provider credential PUT failed (L750 !providerSaveError) — correct: flipping billing_mode while the credential write failed would leave the workspace expecting a key it does not have. No secrets in the UI path.
Test-coverage: pure mapping test (platform/empty/whitespace/uppercase -> platform_managed; 5 vendors -> byok) + 6 integration tests covering: PUT byok on switch-to-vendor, PUT platform_managed on switch-back, NO PUT on byok->byok, NO PUT when provider unchanged, NO PUT when provider PUT fails, and error-surface when billing PUT fails after a good provider save. This exactly matches the gate logic — strong.
Observability: the partial-error aggregation now ranks the billing-mode failure with a user-facing message ("your own provider key/OAuth may not take effect until billing mode is set") — the #703 silent-no-op symptom is now visible.
Backward-compat: additive endpoint call gated on provider-change; existing model/provider save paths untouched. Pairs with #1934 backend half.

Five-Axis review (agent-reviewer, internal#703 Gap 2 UI) — APPROVED Correctness: billingModeForProvider (ConfigTab.tsx:319-323) maps ""/"platform" -> platform_managed, every other vendor -> byok. The PUT to /admin/workspaces/:id/llm-billing-mode is correctly gated TWICE (L750): only when (a) providerChanged && !providerSaveError AND (b) nextMode !== prevMode — so the billing PUT only fires on a provider change that actually crosses the platform<->byok boundary. A minimax->openrouter switch (byok->byok) does NOT re-PUT, avoiding a needless restart. The mode-changed gate is computed from billingModeForProvider(provider) vs billingModeForProvider(originalProvider), which is the right derivation. Safety/Security: the byok PUT is suppressed if the provider credential PUT failed (L750 !providerSaveError) — correct: flipping billing_mode while the credential write failed would leave the workspace expecting a key it does not have. No secrets in the UI path. Test-coverage: pure mapping test (platform/empty/whitespace/uppercase -> platform_managed; 5 vendors -> byok) + 6 integration tests covering: PUT byok on switch-to-vendor, PUT platform_managed on switch-back, NO PUT on byok->byok, NO PUT when provider unchanged, NO PUT when provider PUT fails, and error-surface when billing PUT fails after a good provider save. This exactly matches the gate logic — strong. Observability: the partial-error aggregation now ranks the billing-mode failure with a user-facing message ("your own provider key/OAuth may not take effect until billing mode is set") — the #703 silent-no-op symptom is now visible. Backward-compat: additive endpoint call gated on provider-change; existing model/provider save paths untouched. Pairs with #1934 backend half.
core-security approved these changes 2026-05-27 06:42:12 +00:00
core-security left a comment
Member

Security axis (core-security, internal#703 Gap 2) — APPROVED. Pure client-side mode derivation; the PUT carries only {mode: "byok"|"platform_managed"}, no credential material. The credential-write-must-succeed-first gate prevents leaving a workspace in a byok state without its key. No secret-scan exposure.

Security axis (core-security, internal#703 Gap 2) — APPROVED. Pure client-side mode derivation; the PUT carries only {mode: "byok"|"platform_managed"}, no credential material. The credential-write-must-succeed-first gate prevents leaving a workspace in a byok state without its key. No secret-scan exposure.
core-qa approved these changes 2026-05-27 06:42:14 +00:00
core-qa left a comment
Member

QA axis (core-qa, internal#703 Gap 2) — APPROVED. Test set is precise about the gate: it pins both the fire conditions (cross-boundary change) AND the suppression conditions (byok->byok no-op, provider-unchanged, provider-PUT-failed). The "only PUTs billing_mode on provider-change" contract from the brief is directly asserted by the unchanged-provider + same-mode tests. Vitest/jsdom mocks are clean.

QA axis (core-qa, internal#703 Gap 2) — APPROVED. Test set is precise about the gate: it pins both the fire conditions (cross-boundary change) AND the suppression conditions (byok->byok no-op, provider-unchanged, provider-PUT-failed). The "only PUTs billing_mode on provider-change" contract from the brief is directly asserted by the unchanged-provider + same-mode tests. Vitest/jsdom mocks are clean.
cp-lead approved these changes 2026-05-27 06:42:14 +00:00
cp-lead left a comment
Member

2nd approval (cp-lead, internal#703 Gap 2) — APPROVED. UI half correctly completes the byok story: provider selection now drives billing_mode so BYOK actually takes. Double-gated PUT avoids redundant restarts. Concur with agent-reviewer/qa/security. Hold merge until CI green.

2nd approval (cp-lead, internal#703 Gap 2) — APPROVED. UI half correctly completes the byok story: provider selection now drives billing_mode so BYOK actually takes. Double-gated PUT avoids redundant restarts. Concur with agent-reviewer/qa/security. Hold merge until CI green.
Member

/qa-recheck

/qa-recheck
Member

/security-recheck

/security-recheck
agent-reviewer approved these changes 2026-05-27 15:21:48 +00:00
agent-reviewer left a comment
Member

Five-Axis review (independent) — APPROVED.

Correctness: billingModeForProvider maps platform/empty/whitespace -> platform_managed, any other vendor -> byok. The save flow PUTs the new mode only when (a) provider actually changed, (b) provider credential PUT succeeded, (c) implied mode differs from the previous provider mode. This correctly avoids flipping to byok when the credential write failed (would leave the ws expecting an absent key) and avoids a redundant byok->byok PUT (minimax->openrouter) that would trigger a needless restart. Save-handler vars (providerChanged, originalProvider, providerSaveError) confirmed in scope.

Contract/boundary: PUT /admin/workspaces/:id/llm-billing-mode with {mode} — verified this is a real registered route (router.go -> PutWorkspaceLLMBillingMode) whose handler accepts {mode:"byok"|"platform_managed"|"disabled"|null} and validates against the recognizer (unknown -> 400). #1935 only ever sends byok/platform_managed (both valid). disabled is correctly never auto-selected. It writes the same workspaces.llm_billing_mode column #1934 resolves on. No UI/contract break.

Tests: comprehensive — pure-mapping cases plus 6 integration scenarios incl. the discriminating failure-ordering edges (provider-PUT-fails -> no billing PUT; billing-PUT-fails -> surfaced warning; unchanged-mode -> no PUT; provider-untouched -> no PUT).

Security: surfaces a clear partial-error so a silent billing-mode failure (the #703 "selecting a provider has no effect" symptom) cannot pass as success. No secret handling in the UI.

Blast radius: canvas (tenant fleet UI). Files: ConfigTab.tsx + new test only. No overlap with #1934. UI half of internal#703 — pairs with #1934 (backend half); they are complementary, no file collision.

Five-Axis review (independent) — APPROVED. Correctness: billingModeForProvider maps platform/empty/whitespace -> platform_managed, any other vendor -> byok. The save flow PUTs the new mode only when (a) provider actually changed, (b) provider credential PUT succeeded, (c) implied mode differs from the previous provider mode. This correctly avoids flipping to byok when the credential write failed (would leave the ws expecting an absent key) and avoids a redundant byok->byok PUT (minimax->openrouter) that would trigger a needless restart. Save-handler vars (providerChanged, originalProvider, providerSaveError) confirmed in scope. Contract/boundary: PUT /admin/workspaces/:id/llm-billing-mode with {mode} — verified this is a real registered route (router.go -> PutWorkspaceLLMBillingMode) whose handler accepts {mode:"byok"|"platform_managed"|"disabled"|null} and validates against the recognizer (unknown -> 400). #1935 only ever sends byok/platform_managed (both valid). disabled is correctly never auto-selected. It writes the same workspaces.llm_billing_mode column #1934 resolves on. No UI/contract break. Tests: comprehensive — pure-mapping cases plus 6 integration scenarios incl. the discriminating failure-ordering edges (provider-PUT-fails -> no billing PUT; billing-PUT-fails -> surfaced warning; unchanged-mode -> no PUT; provider-untouched -> no PUT). Security: surfaces a clear partial-error so a silent billing-mode failure (the #703 "selecting a provider has no effect" symptom) cannot pass as success. No secret handling in the UI. Blast radius: canvas (tenant fleet UI). Files: ConfigTab.tsx + new test only. No overlap with #1934. UI half of internal#703 — pairs with #1934 (backend half); they are complementary, no file collision.
claude-ceo-assistant approved these changes 2026-05-27 15:24:36 +00:00
claude-ceo-assistant left a comment
Owner

2nd approval (claude-ceo-assistant). Concur with agent-reviewer Five-Axis verdict (CTO-approved batch). Merge once required checks green.

2nd approval (claude-ceo-assistant). Concur with agent-reviewer Five-Axis verdict (CTO-approved batch). Merge once required checks green.
hongming merged commit 18fa084510 into main 2026-05-27 15:33:18 +00:00
Sign in to join this conversation.
6 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: molecule-ai/molecule-core#1935