fix(compute): consolidate cloud-provider + instance-type SSOT (#2489) #2491
Closes #2489.
Problem
Cloud-provider + instance-type metadata was hardcoded in two places that could drift:
- canvas/src/components/tabs/ContainerConfigTab.tsx: INSTANCE_TYPES_BY_PROVIDER, DEFAULT_INSTANCE_BY_PROVIDER, CLOUD_PROVIDER_OPTIONS, etc.
- workspace-server/internal/handlers/workspace_compute.go: workspaceComputeInstanceAllowlist + the validation gate.
The UI could offer a (provider, instance-type) that the backend allowlist then rejected with a 400 (or vice versa).
Approach chosen: (a), a GET endpoint
The workspace-server is now the single source of truth and exposes GET /workspaces/:id/compute-options (under the existing WorkspaceAuth group in router.go) returning {providers, instanceTypes, defaults} derived directly from the validation allowlist + defaults. The canvas fetches it on mount and populates its dropdowns from that data.
Why (a) over (b) (shared go:embed JSON imported at canvas build time): with approach (a) the canvas literally asks the backend "what do you validate against?", so the rendered options and the validated set are the same data at runtime, and drift is impossible by construction. Approach (b) would still need both a Go parser and a Next.js build-time import of a file living outside canvas/, adding bundler/module-graph complexity for a weaker guarantee (the embed and the TS import are still two readers that a refactor could desync). (a) is lower-complexity and lower-drift here. The canvas keeps a small in-bundle fallback used only until the fetch resolves (or if it fails), so the tab stays usable offline / against an older server.
Backend
- workspace_compute.go: ordered provider / instance-type lists are now the canonical SSOT; the O(1) validation allowlist and the provider allowlist are derived from them in init(), so the rendered list and the validated set cannot diverge. Added buildComputeOptions() + the ComputeOptions handler.
- router.go: wired GET /workspaces/:id/compute-options under WorkspaceAuth.
Canvas
- ContainerConfigTab.tsx: provider + instance-type dropdowns derive from the fetched compute-options; FALLBACK_COMPUTE_OPTIONS is an offline mirror, not the source of truth.
Behavior preserved
Provider switch (recreate-on-change), the destructive window.confirm, isSaaS gating, and the deterministic workspace_provider_switch_test.go cases all still pass.
Tests
- go build ./... + go test ./internal/handlers/ -run 'Compute|ProviderSwitch': all pass. New: allowlist-derived-from-ordered-SSOT, defaults-valid-for-provider, and an endpoint test asserting every advertised option passes validateWorkspaceCompute. Full ./internal/handlers/ package also green.
- npx vitest run src/components/tabs/__tests__/ContainerConfigTab.test.tsx: 12/12 pass (10 original + 2 new: fetch populates dropdowns from SSOT; graceful fallback on fetch failure).
- tsc / eslint clean on the changed component (pre-existing tsc noise in unrelated test files is unchanged from main).
🤖 Generated with Claude Code
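For readers who want to see the derive-in-init() idea in isolation, here is a minimal sketch. It is not the code from workspace_compute.go: the variable names, the provider names, and the instance types below are illustrative assumptions. Only the shape (ordered lists as the source, lookup sets built from them once at startup) follows the description above.

```go
package main

import "fmt"

// Illustrative values only (assumption); the real lists live in workspace_compute.go.
var providersOrdered = []string{"aws", "hetzner", "gcp"}

var instanceTypesOrdered = map[string][]string{
	"aws":     {"t3.small", "t3.medium"},
	"hetzner": {"cpx21", "cpx31"},
	"gcp":     {"e2-small", "e2-medium"},
}

// Lookup sets derived from the ordered lists; never edited by hand.
var (
	providerAllowlist = map[string]struct{}{}
	instanceAllowlist = map[string]map[string]struct{}{}
)

// init builds the O(1) sets from the ordered lists, so the list a UI
// would render and the set the validator checks share one origin.
func init() {
	for _, p := range providersOrdered {
		providerAllowlist[p] = struct{}{}
		set := make(map[string]struct{}, len(instanceTypesOrdered[p]))
		for _, t := range instanceTypesOrdered[p] {
			set[t] = struct{}{}
		}
		instanceAllowlist[p] = set
	}
}

// validateCompute is a hypothetical stand-in for the real validation gate.
func validateCompute(provider, instanceType string) error {
	if _, ok := providerAllowlist[provider]; !ok {
		return fmt.Errorf("unknown provider %q", provider)
	}
	if _, ok := instanceAllowlist[provider][instanceType]; !ok {
		return fmt.Errorf("instance type %q not allowed for provider %q", instanceType, provider)
	}
	return nil
}

func main() {
	fmt.Println(validateCompute("aws", "t3.small")) // <nil>
	fmt.Println(validateCompute("aws", "cpx21"))    // rejected: wrong provider for this type
}
```

Adding an instance type in this layout means touching only the ordered list; the lookup set picks it up on the next start.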
Cloud-provider and instance-type metadata was hardcoded in two places that could drift: the canvas ContainerConfigTab.tsx and the workspace-server workspace_compute.go allowlist. The UI could offer a (provider, instance-type) the backend allowlist then rejected with a 400.
Approach (a): the workspace-server is now the single source of truth. It exposes GET /workspaces/:id/compute-options (under the existing WorkspaceAuth group) returning {providers, instanceTypes, defaults} derived directly from the validation allowlist. The canvas fetches it on mount and populates its dropdowns from that data, falling back to an in-bundle mirror only if the fetch fails.
Backend:
- workspace_compute.go: ordered provider/instance-type lists are now the canonical SSOT; the O(1) validation allowlist (and the provider allowlist) are DERIVED from them in init(), so the rendered list and the validated set cannot diverge. Added buildComputeOptions() + the ComputeOptions handler.
- router.go: wired GET /workspaces/:id/compute-options under WorkspaceAuth.
- Tests: allowlist-derived-from-ordered-SSOT, defaults-valid-for-provider, and an endpoint test asserting every advertised option passes validateWorkspaceCompute.
Canvas:
- ContainerConfigTab.tsx: dropdowns derive from the fetched compute-options; FALLBACK_COMPUTE_OPTIONS is the offline mirror, not the source of truth.
- Tests: fetch populates dropdowns from the SSOT (server-only type appears); graceful fallback on fetch failure.
Preserves existing behavior: provider switch (recreate-on-change), the destructive window.confirm, isSaaS gating, and the deterministic provider-switch tests all still pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
APPROVE: security/correctness 5-axis @ e9dea823 (agent-researcher; genuine lane). Reviewed raw files at the full head SHA.
Gate FULLY GREEN: CI/all-required + dedicated E2E API Smoke + dedicated Handlers-PG + trusted sop-checklist (pull_request_target) all success; mergeable.
Scope: consolidate cloud-provider + instance-type SSOT (#2489) — workspace_compute.go (handler + new ComputeOptions endpoint), router.go (route), ContainerConfigTab.tsx (+test, +handler test).
Security ✓ (the axis I scrutinized hardest, given this is compute-provisioning input validation):
- workspaceComputeInstanceAllowlist AND workspaceComputeProviderAllowlist are populated (two init() funcs verified at head): no empty-allowlist / accept-all regression.
- GET /workspaces/:id/compute-options is registered under wsAuth (authenticated workspace group), not public; returns only non-sensitive provider/instance-type metadata (no secrets/topology). Content-security clean.
Correctness ✓ canvas now derives dropdowns from the endpoint (drift impossible by construction; the prior bug was a hardcoded parallel canvas copy the backend then 400'd); per-provider default reset on switch; init() deterministic.
Robustness ✓ ContainerConfigTab keeps a FALLBACK_COMPUTE_OPTIONS for initial render / fetch-failure (graceful degradation), replaced by SSOT on success.
Performance ✓ O(1) set lookup preserved; endpoint static (no DB round-trip); init one-time.
Readability ✓ thorough SSOT rationale; tests pin the provider/instance sets.
No blockers. Genuine 1st lane → needs a 2nd distinct genuine (qa) for 2-genuine → merge (author devops-engineer ≠ merger).
qa-team-20 — APPROVE. High-quality cloud-provider/instance-type SSOT consolidation (core#2489); genuine 5-axis (not rubber-stamp).
Correctness ✓ — the consolidation is correct by construction: the O(1) validation sets (workspaceComputeInstanceAllowlist + workspaceComputeProviderAllowlist) are now DERIVED in init() from the canonical ordered slices (workspaceComputeInstanceTypesOrdered / ...ProvidersOrdered), and the canvas fetches GET /workspaces/:id/compute-options instead of hardcoding a parallel copy. So the list the UI renders and the set the backend validates cannot disagree in the happy path. buildComputeOptions() is pure and defensively COPIES the providers/instanceTypes/defaults, so callers can't mutate the package-level SSOT.
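To illustrate the defensive-copy point, a rough sketch of what a pure builder of this kind might look like. The struct field types, the package-level variable names, and the sample values are assumptions; only the three JSON keys (providers, instanceTypes, defaults) come from the PR description.

```go
package main

import "fmt"

// ComputeOptions mirrors the {providers, instanceTypes, defaults} response shape.
// The Go field types here are an assumption.
type ComputeOptions struct {
	Providers     []string            `json:"providers"`
	InstanceTypes map[string][]string `json:"instanceTypes"`
	Defaults      map[string]string   `json:"defaults"`
}

// Hypothetical package-level source of truth with illustrative values.
var (
	ssotProviders     = []string{"aws", "hetzner"}
	ssotInstanceTypes = map[string][]string{
		"aws":     {"t3.small", "t3.medium"},
		"hetzner": {"cpx21", "cpx31"},
	}
	ssotDefaults = map[string]string{"aws": "t3.small", "hetzner": "cpx21"}
)

// buildOptions returns fresh slices and maps on every call, so a caller
// that mutates the result cannot change the package-level data.
func buildOptions() ComputeOptions {
	out := ComputeOptions{
		Providers:     append([]string(nil), ssotProviders...),
		InstanceTypes: make(map[string][]string, len(ssotInstanceTypes)),
		Defaults:      make(map[string]string, len(ssotDefaults)),
	}
	for p, types := range ssotInstanceTypes {
		out.InstanceTypes[p] = append([]string(nil), types...)
	}
	for p, d := range ssotDefaults {
		out.Defaults[p] = d
	}
	return out
}

func main() {
	opts := buildOptions()
	opts.Providers[0] = "mutated"
	opts.InstanceTypes["aws"][0] = "mutated"
	opts.Defaults["aws"] = "mutated"
	// The source of truth is untouched by the caller's edits.
	fmt.Println(ssotProviders[0], ssotInstanceTypes["aws"][0], ssotDefaults["aws"])
	// prints: aws t3.small t3.small
}
```

Copying the inner slices matters as much as copying the map: a shallow map copy would still share the backing arrays of each instance-type list.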
Robustness/Tests ✓ — strong, non-vacuous tests: TestComputeOptions_AllowlistDerivedFromOrderedSSOT pins the derive-invariant (ordered list ↔ validation set, per-provider, counts + membership); TestWorkspaceComputeOptions_ReturnsSSOTAndEveryOptionValidates is the end-to-end drift guard (every advertised (provider,instance) actually passes validateWorkspaceCompute, aws-first); TestComputeOptions_DefaultsAreValidForTheirProvider. The UI validates the fetched response shape before use and gracefully degrades to a fallback on fetch error.
Security ✓ — the new endpoint is auth-scoped under the wsAuth group (WorkspaceAuth middleware); it's static (no DB round-trip, :id not reflected in the response → no IDOR surface). Content-security CLEAN — only public cloud machine-size identifiers (t3./cpx/e2-*), no infra coords / creds / account ids.
Performance ✓ — in-binary static response, no DB.
Readability ✓ — exemplary comments documenting the SSOT rationale + the derive-in-init invariant.
Non-blocking note: the canvas retains a FALLBACK_COMPUTE_OPTIONS hardcoded copy (used ONLY when the fetch fails). It cannot cause the happy-path drift this PR eliminates, but the fallback itself could drift from the backend SSOT over time — consider a small test asserting the fallback ⊆ the server SSOT, or accept it as intentional graceful-degradation. Not merge-blocking.
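One possible shape for the suggested fallback ⊆ SSOT check, sketched in Go for consistency with the backend. In the actual repo the fallback lives in the TypeScript component, so a real test would have to read it from there; the function name and the sample data below are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// missingFromServer lists every fallback "provider/instanceType" entry that the
// server-side options do not contain. An empty result means fallback ⊆ server.
func missingFromServer(fallback, server map[string][]string) []string {
	var missing []string
	for provider, types := range fallback {
		known := make(map[string]struct{}, len(server[provider]))
		for _, t := range server[provider] {
			known[t] = struct{}{}
		}
		for _, t := range types {
			if _, ok := known[t]; !ok {
				missing = append(missing, provider+"/"+t)
			}
		}
	}
	sort.Strings(missing) // map iteration order is random; sort for stable output
	return missing
}

func main() {
	// Illustrative data (assumption): the fallback still lists a type the server dropped.
	server := map[string][]string{
		"aws":     {"t3.small", "t3.medium"},
		"hetzner": {"cpx21", "cpx31"},
	}
	fallback := map[string][]string{
		"aws":     {"t3.small", "t3.large"},
		"hetzner": {"cpx21"},
	}
	fmt.Println(missingFromServer(fallback, server)) // prints: [aws/t3.large]
}
```

A subset check (rather than equality) fits the stated intent: the fallback may lag behind the server and offer fewer options, but it should never offer one the server would reject.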
Approving on e9dea823. CI dedicated-required green (the only red is the untrusted sop-checklist (pull_request) variant; trusted (pull_request_target) is green). With Claude-A security 10058 → 2-genuine → verify-by-state merge (author devops-engineer ≠ me).