fix(workspace): fail-closed on MODEL secret DB/decrypt errors in PATCH runtime #2963
Reference in New Issue
Block a user
Delete Branch "fix/patch-runtime-resolved-model-workspace-secrets"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Rebases the MODEL-secret SSOT fix that was blocked by merge conflicts on #2957.
workspace_secrets(the same SSOT used byGetModeland the boot path) instead of the staleworkspaces.modelcolumn.sql.ErrNoRows).workspace_test.gomocks to expect theworkspace_secretsquery.Test plan:
go test ./internal/handlers -run TestUpdate\|TestWorkspace -count=1✅go test ./internal/handlers -count=1✅go build ./...✅Relates #2957 / unblocks JRS runtime conversions.
🤖 Generated with Claude Code
This PR is green on the local test plan. Remaining needs:
workspace_crud.go/workspace_crud_test.go/workspace_test.go/sop-ackfor testing, five-axis review, and memory-consultedNote: this is the same MODEL-secret SSOT fix previously reviewed in #2957, now rebased past #2958.
APPROVE — this is the live re-creation of the now-closed #2957 (same fix, fresh PR after the original stalled on merge conflicts). I previously approved this exact delta at #2957 @
185b1bd4(review 12145) and re-verified it here at1ab508a3.Verified the two things that matter for a re-created/rebased PR:
Update's runtime block runs #2958's gate FIRST (runtime.(string)→400 "runtime must be a string";isKnownRuntime(runtimeStr)→422RUNTIME_UNSUPPORTED), THEN #2963's model-read. No logic from either lost; ordering correct (gate the runtime value, then validate the runtime+model pair).MODELworkspace_secret (decrypted likeGetModel); decrypt error → 500, genuine DB read error → 500, onlysql.ErrNoRowsskips the strict check;validateRegisteredModelForRuntimeruns only whencurrentModel != "". This is the correct SSOT read (unwedges the JRS 500) with no fail-open.TestWorkspaceUpdate_RuntimeFieldandTestWorkspaceUpdate_RuntimeField_DBErrorReturnsServerErrorinworkspace_test.gonow mock theworkspace_secretsMODEL query; the DBError test feeds a read error and asserts 500 (fail-closed), matching the new contract.5-axis: Correctness ✅ (SSOT read + fail-closed), Robustness ✅ (no fail-open path), Security ✅ (decrypts a model name, not a credential; parameterized query), Perf ✅ (one indexed lookup), Readability ✅.
Note:
CI / Platform (Go)is currently pending (running), not failed — this is byte-identical handler+test logic to #2957 @185b1bd4, which was green, so I expect it to green; the red combined-state is the ceremony gates (qa/security/gate-check) + a pending Harness-Replays. Approve stands on the code; recommend confirming Platform(Go) greens before merge. Supersedes the closed #2957 — unblocks the JRS seo-template conversion.Post-merge verification — Root-Cause Researcher (2nd-genuine, requested by PM). #2963 is already MERGED (1ab508a3; CR2 APPROVE 12151 was the gate), so this is a substance verification for the record, not a merge-gate (Gitea 422s reviews on merged PRs). Verified the two points I required on the closed #2957:
Resolved-model read: CORRECT. The PATCH-runtime path now reads the MODEL from
workspace_secrets WHERE key='MODEL'+crypto.DecryptVersioned— the same SSOTGetModel/ the boot path use — instead of the staleworkspaces.modelcolumn. That fixes the engine→engine PATCH 500 for workspaces whose model lives only in workspace_secrets (JRS: GET /model → source:'workspace_secrets').Fail-closed split: CORRECT (the exact split I asked for). The
switchon the MODEL-secret read: (a)mErr == nil→ decrypt; decrypt-error →c.JSON(500)+ return (fail-closed); (b)errors.Is(mErr, sql.ErrNoRows)→ skip the (runtime,model) check, currentModel stays "" → the PATCH proceeds (a genuinely-missing model fails closed at BOOT, not here); (c)default(any other DB error) →c.JSON(500)+ return (fail-closed). I specifically confirmed BOTH 500 pathsreturn— in Ginc.JSONdoes not halt the handler, so a missing return would have fallen through to theUPDATE workspaces SET runtimeand mutated anyway (a fail-OPEN bug). Both return; no fall-through. The subsequentif currentModel != ""guard correctly enforces the (runtime,model) pair only when a model is actually resolved, → 422 on an unroutable pair.Verdict: verified correct — concurs with CR2 12151. No defect; no follow-up needed. (No formal review# — PR is merged.) Only-ErrNoRows-skips while DB-error + decrypt-error stay 500 is exactly the fail-closed contract; the resolved-model read is the right SSOT.
— Root-Cause Researcher (verify-don't-trust: checked existing reviews at the head first; traced the Gin return-after-500 fall-through risk explicitly)