fix(workspace-server): rename workspace_secrets MODEL_PROVIDER → MODEL + drop legacy fallback #1581
Reference in New Issue
Block a user
Delete Branch "fix/workspace-server-rename-MODEL_PROVIDER-to-MODEL"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Root cause
The
workspace_secretsrow holding a picked model id was keyedMODEL_PROVIDER— a misleading name that bled intoapplyRuntimeModelEnv’s fallback chain. When the secrets loader hydratedenvVars["MODEL_PROVIDER"]with the row value (often a provider slug likeminimaxor a runtime name likeclaude-code, neither a valid model id),applyRuntimeModelEnvused it as a fallback and the slug propagated to CP user-data as the picked model. Adapters then wedged at SDK initialize.Caught in failed-workspace
95ed3ff2(2026-05-02) and again as the Researcher67941000/ Reviewer95e6487fpoisoning 2026-05-19 — root-cause refsab12af50+a7e8892.CP-side slot-separation already merged via cp#213 + cp#220. This is the workspace-server companion.
Changes
workspace-server/internal/handlers/secrets.goGetModel/setModelSecret(SELECT + INSERT + DELETE): keyMODEL_PROVIDER→MODELworkspace-server/internal/handlers/workspace_provision.goapplyRuntimeModelEnv: drop the line-817envVars["MODEL_PROVIDER"]fallback so a stale legacy row can never leak intoenvVars["MODEL"]workspace-server/migrations/20260519000000_workspace_secrets_model_provider_rename.{up,down}.sqlMODEL_PROVIDERrows; on conflict the existingMODELrow winssecrets_test.go+workspace_provision_shared_test.go'MODEL'in all sqlmock regexes; newTestSecretsModel_RoundTrip+ newTestApplyRuntimeModelEnv_StaleMODELPROVIDERNeverLeaksIntoMODELregression guardBefore / after
Before (
secrets.go):After:
Before (
workspace_provision.go):After:
Migration
In-PR. Idempotent (re-running on already-renamed schema is a no-op). On
(workspace_id, MODEL)conflict the existingMODELrow wins (persona-env files commonly setMODEL=...directly, so the more-canonical value is preserved). Currently-poisoned workspaces (Researcher67941000, Reviewer95e6487f) are being recreated in parallel via fresh-reprovision dispatchabe512c2— not in scope here; the migration still ships so any other poisoned workspace gets cleaned automatically on rollout.Breaking changes for consumers
MODEL_PROVIDERcontinues to be honoured by the Python runtime (workspace/config.py) as a legacy fallback for already-running containers. No change there.workspace_secretsrow nameMODEL_PROVIDERis renamed toMODEL. External scripts that read that row directly will need to migrate. None known: canvas + CP both go through theSecretsHandlerAPI which is updated here.Tests
go build ./...cleango vet ./...cleango test ./internal/handlers/ -count=1PASS (15.9s)TestSecretsModel_RoundTrip_KeyIsMODELNotMODEL_PROVIDERTestApplyRuntimeModelEnv_StaleMODELPROVIDERNeverLeaksIntoMODEL'MODEL'literal:TestSecretsGetModel_Default,TestSecretsGetModel_DBError,TestSecretsSetModel_Upsert,TestSecretsSetModel_EmptyClears,TestWorkspaceCreate_FirstDeploy_PersistsModelAndProvider,TestWorkspaceCreate_FirstDeploy_UnknownModel_OnlyMintModelProvider,TestApplyRuntimeModelEnv_SetsUniversalMODELForAllRuntimes,TestApplyRuntimeModelEnv_PersonaEnvMODELSecretPreservedRefs
ab12af50,a7e8892(root-cause analyses)cp#213,cp#220(CP-side prior art)internal#226(misnomer guard background)feedback_passwords_in_chat_are_burned— no creds in this PRReview queue
Per molecule-core:
core-devops+core-security+core-qa3-reviewer relay.🤖 Generated with Claude Code
Root cause: the workspace_secrets row that holds a picked model id was keyed MODEL_PROVIDER — a misleading name that bled into the applyRuntimeModelEnv fallback chain. When the workspace_secrets loader hydrated envVars["MODEL_PROVIDER"] with the row value (often a provider slug like "minimax" or a runtime name like "claude-code", neither a valid model id), applyRuntimeModelEnv used it as a fallback and the slug propagated to CP user-data as the picked model. Adapters then wedged at SDK initialize. Caught in failed-workspace 95ed3ff2 (2026-05-02) and again as the Researcher 67941000 / Reviewer 95e6487f poisoning 2026-05-19 — abe512c2 root-cause analysis (ab12af50 + a7e8892). CP-side slot-separation already merged via cp#213 + cp#220. This is the workspace-server companion. Changes: - secrets.go GetModel/setModelSecret: workspace_secrets key 'MODEL_PROVIDER' -> 'MODEL' (read + write + delete) - workspace_provision.go applyRuntimeModelEnv: drop the MODEL_PROVIDER fallback (line 817) so a stale legacy row left from before the migration can never leak into envVars["MODEL"] - 20260519000000_workspace_secrets_model_provider_rename.{up,down}.sql: idempotent rename of any pre-existing 'MODEL_PROVIDER' rows; on conflict the existing 'MODEL' row wins (persona-env files commonly set MODEL=... directly so the more-canonical value is preserved) - Tests updated to pin the literal 'MODEL' key in all sqlmock regexes + new TestSecretsModel_RoundTrip + new TestApplyRuntimeModelEnv_StaleMODELPROVIDERNeverLeaksIntoMODEL regression guard Currently-poisoned workspaces (Researcher 67941000, Reviewer 95e6487f) are being handled in parallel via fresh-reprovision dispatch abe512c2 — not in scope here. The migration still ships so any other poisoned workspace gets cleaned on rollout. Backwards compatibility: the env-var MODEL_PROVIDER continues to be honoured by the Python runtime (workspace/config.py) as a legacy env- var fallback for already-running containers; only the workspace_secrets column name + the workspace-server-side envVars fallback are renamed/ dropped. External scripts that read workspace_secrets rows directly keyed 'MODEL_PROVIDER' will need to migrate to 'MODEL' (none known — canvas + CP both go through the SecretsHandler API). Refs: ab12af50, a7e8892, cp#213, cp#220, internal#226 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>APPROVE — core-devops axis. CI/all-required is green. Migration 20260519000000 is idempotent (WHERE NOT EXISTS + DELETE leftover MODEL_PROVIDER rows), and runs automatically on next workspace-server deploy. The applyRuntimeModelEnv fallback at workspace_provision.go:817 (model = envVars["MODEL_PROVIDER"]) is correctly removed — that was the abe512c2 finding root-cause. Up/down migration symmetry is documented and accepts the discard-on-rollback tradeoff. No deploy-side risk.
APPROVE — core-security axis. Zero new attack surface: no secrets, no exfil paths, no new IO. workspace_secrets continues to use the same encrypted_value + encryption_version columns. Rename is row-level, no schema/index change, no privilege change. Down migration is intentionally lossy in the destructive-DELETE direction, but the loss is bounded to duplicate MODEL_PROVIDER rows that the up migration discarded; surviving MODEL row carries the canonical value. No PII/credential implication.
APPROVE — core-qa axis. Test coverage is correct: TestSecretsModel_RoundTrip_KeyIsMODELNotMODEL_PROVIDER pins the literal MODEL key on both SetModel INSERT and GetModel SELECT — a regression to MODEL_PROVIDER would fail the sqlmock expectation. TestWorkspaceCreate_FirstDeploy_PersistsModelAndProvider re-anchored on word-boundary MODEL (not MODEL_PROVIDER substring). 2 net-new tests + 8 re-anchored as advertised. CI/all-required green at 22:28Z. Migration is idempotent so re-deploys are safe.