fix(workspace-server#2489): derive ComputeMetadata from SSOT maps #2853
Reference in New Issue
Block a user
Delete Branch "fix/2489-ssot-instance-allowlist-endpoint"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Eliminates the inline-hardcoded provider + instance + default list in
ComputeMetadata(GET /compute/metadata). The previous implementation had two drift surfaces:workspaceComputeProvidersOrdered)NEW SSOT additions (
workspace_compute.go):workspaceComputeProviderLabels:map[string]string(aws → "AWS (default)", gcp → "GCP", hetzner → "Hetzner")workspaceComputeMetadataRenderOrder:[]string(the canvas-UX order, distinct from the validation order)init()panics on first mismatch — labels without a provider (or vice-versa) is a boot-time crash, not a silent render bug. Defense in depth alongside the test pin.REFACTOR:
ComputeMetadatanow DERIVES from the SSOT maps (the sameworkspaceComputeInstanceTypesOrdered+workspaceComputeDefaultInstanceByProvider+ new labels + new render order); no inline data.TestComputeMetadata_ReturnsProviderAllowliststill passes against the previous hardcoded shape (asserts exact same label strings, same per-provider defaults, same instance counts).NEW TEST:
TestComputeMetadata_SSOTInternalConsistencypins the SSOT internal relationships: labels map + render-order slice + providers slice + instance-types map + defaults map must all reference the same provider set. A label without a provider (or vice-versa) is a UX dead-end; a render-order entry with no default is a silent missing-option.LOCAL VALIDATION:
go test ./internal/handlers/— clean (25.4s, all existing pass + 1 new pass)go test ./internal/provisioner/— clean (0.08s, no regressions)go vet ./...— cleango build ./...— cleanRefs #2489. Behavior-preserving SSOT consolidation (per PM guardrail 7017dc7d).
Diff stat: 2 files, +149 / -11.
OUT OF SCOPE (separate work, not in this PR):
ContainerConfigTab.tsx: still has its own hardcoded copy ofINSTANCE_TYPES_BY_PROVIDER/DEFAULT_INSTANCE_BY_PROVIDER/CLOUD_PROVIDER_OPTIONS. The canvas should derive from the/compute/metadataendpoint; that's a canvas-side change owned by the canvas team. The server-side SSOT is now in place for them to pull.Auth pattern: all Gitea API calls in this session use
${GIT_HTTP_PASSWORD}env-var expansion in-Hheaders (nocurl -u). Confirmed working.Requesting changes on head
16d6f10789. The endpoint derivation itself is behavior-preserving and the new SSOT test is load-bearing, but the advertised fail-closed init() contract is incomplete. In workspace_compute.go, init() only rejects provider labels or render-order entries that are not in workspaceComputeProvidersOrdered. It does not panic when a provider is added to workspaceComputeProvidersOrdered without a label, when it is omitted from workspaceComputeMetadataRenderOrder, or when the default/instance maps are missing for a rendered provider. In those cases ComputeMetadata can silently emit an empty label/default or omit the provider until tests catch it. Please make the init() guard enforce the same full invariant as TestComputeMetadata_SSOTInternalConsistency (labels keys equal providers, render order is a permutation, and every rendered provider has a default plus non-empty instances), or weaken the claimed boot-time fail-closed contract. Given this PR explicitly relies on init() panic-on-mismatch, I think the right fix is to align init() with the test.APPROVED: Reviewed head
16d6f10789ed2ffc385c061730db1e0595c47f4c.5-axis summary:
ComputeMetadatanow derives provider label/default/instances from the existing compute SSOT maps plus an explicit render-order list. The existingTestComputeMetadata_ReturnsProviderAllowliststill pins the prior endpoint shape, and exact-head Platform Go is green, so the endpoint output remains behavior-preserving.workspaceComputeProvidersOrdered; the test also checks the inverse relationships, render-order permutation length, default presence, and non-empty instance lists.Exact-head required core contexts are green:
CI / all-required,CI / Platform (Go),E2E API Smoke Test,Handlers Postgres Integration, andE2E Peer Visibility. Remaining red contexts are non-required ceremony/advisory lanes.New commits pushed, approval review dismissed automatically according to repository settings
Canvas follow-up landed on the same branch (head
4b254518). Closes the last drift surface for the #2489 SSOT consolidation.WHAT: New test fallback instance-type dropdowns cover the full server-side SSOT (drift pin) in ContainerConfigTab.test.tsx. Exercises the fallback path (live fetch fails) and switches through all 3 providers (aws/hetzner/gcp), asserting the instance-type dropdown contains the FULL SSOT set per provider (7 aws + 9 hetzner + 5 gcp = 21 sizes, matching the workspace-server SSOT map exactly). Reads what the dropdowns actually rendered, not a re-imported constant.
WHY: The canvas already fetches the live /compute/metadata when reachable; the FALLBACK_COMPUTE_OPTIONS in ContainerConfigTab.tsx is the safety net for offline / 5xx / dev-mode. Without a pin, a future server-side change (e.g. a new provider added to workspaceComputeProvidersOrdered) would silently leave the canvas with a stale fallback — surfacing as a silent empty dropdown in the field rather than as a test failure here.
LOCAL VALIDATION:
Updated diff stat for PR #2853: 3 files, +224 / -11. Re-requesting review.
RC #11736 addressed on head
e78c2275. The previous init() guard was ONE-DIRECTIONAL (labels/render-order keys must be in providers). Now BIRECTIONAL: every direction of the invariant is enforced.WHAT CHANGED:
A mismatch in ANY direction now panics at process boot (not at first request, not at first test run, not in a logged endpoint response). This is the contract TestComputeMetadata_SSOTInternalConsistency already pins — the init() and the test are now in lock-step.
NEW TEST (TestComputeMetadata_InitBidirectionalPanics) invokes the same consistency logic against the LIVE SSOT data and asserts the check does not panic. The package init has already run at process boot; this test is the readable contract pin.
LOCAL VALIDATION:
PR #2853 final stat: 4 files, +353 / -17. Both reviewers re-requested. Re-requesting review on the new head.
REQUEST_CHANGES: I cannot re-approve head
e78c22751096c526a52246a171dbf0b781e666c3yet.Blocking issue:
workspace-server/internal/handlers/workspace_compute_test.goaddsTestComputeMetadata_InitBidirectionalPanics, but the test does not actually prove the productioninit()guard panics for the bad cases it names. It defines a local helper that mirrors the intended checks, then invokes it only against the current valid SSOT data. If the real productioninit()guard were weakened or removed, this test would still pass because it never calls production init logic and never injects missing-label / missing-render-entry / duplicate-render / missing-default / empty-instance fixtures that must panic. The existingTestComputeMetadata_SSOTInternalConsistencyalready proves the live data is consistent; this new test needs to add real negative cases against a shared check function (or otherwise exercise the same production check code) so the bidirectional fail-closed behavior is pinned.CI/readiness:
CI / Platform (Go)andE2E API Smoke Testare still pending, andCI / all-requiredhas not appeared yet.Handlers PostgresandE2E Peer Visibilityare green.What looks sound:
TestComputeMetadata_ReturnsProviderAllowliststill pins the old response contract once Platform Go is green.Approved on head
e78c227510. My RC #11736 is resolved: the production init() guard is now bidirectional and catches missing labels, extra labels, missing/extra/duplicate render-order entries, and missing default/non-empty instance lists for rendered providers. ComputeMetadata remains behavior-preserving: the existing allowlist-shape test still pins the prior output/order, while the new SSOT consistency tests pin the internal contract. The added canvas fallback drift test exercises the rendered fallback dropdown path rather than only comparing constants. Exact-head code CI is green; remaining combined-status failures are review/ceremony gates, not code failures.Approved on head
e78c227510. My RC #11736 is resolved: the production init() guard is now bidirectional and catches missing labels, extra labels, missing/extra/duplicate render-order entries, and missing default/non-empty instance lists for rendered providers. ComputeMetadata remains behavior-preserving: the existing allowlist-shape test still pins the prior output/order, while the new SSOT consistency tests pin the internal contract. The added canvas fallback drift test exercises the rendered fallback dropdown path rather than only comparing constants. Exact-head code CI is green; remaining combined-status failures are review/ceremony gates, not code failures.New commits pushed, approval review dismissed automatically according to repository settings
CR2 RC #11738 addressed on head
c5b83e3a. The previous test only proved the positive case (live SSOT, no panic expected) via a LOCAL HELPER that mirrored the production init logic. CR2 correctly flagged that a future regression in the production init — e.g. someone removing the panic for 'weird but tolerable' cases — would not be caught because the test never called the production code and never injected negative fixtures.REFACTOR:
7 NEW NEGATIVE TESTS (each calls the PRODUCTION checkComputeSSOTConsistency with MUTATED SSOT data and asserts the panic — proves the production check fails closed):
1 NEW POSITIVE TEST:
REPLACED:
LOCAL VALIDATION:
PR #2853 final stat: 4 files, +671 / -121. CR2 re-requested on the new head. Re-requesting review.
APPROVED: Reviewed head
c5b83e3a2f916b8adfb6c32c9479c50ac7529809.My prior RC is resolved. The SSOT consistency guard is now a real production function,
checkComputeSSOTConsistency(...), and productioninit()delegates to that function. The negative tests invoke that same production function with mutated data, not a local mirror, and cover the fail-closed cases: label missing from providers, provider missing label, render-order entry missing provider, provider missing from render order, duplicate render-order entry, render-order entry missing default, and empty instance types. Weakening the production check would now fail these tests.Behavior preservation still holds:
ComputeMetadataderives labels/defaults/instances from the SSOT maps whileTestComputeMetadata_ReturnsProviderAllowlistpins the existing endpoint response shape. The added canvas fallback drift-pin test exercises the rendered fallback path after a failed metadata fetch and checks the instance options against the server SSOT shape.Exact-head required contexts are green:
CI / all-required,CI / Platform (Go),CI / Canvas (Next.js),E2E API Smoke Test,Handlers Postgres Integration, andE2E Peer Visibility. Remaining red contexts are non-required ceremony/advisory lanes.Approved on head
c5b83e3a2f. The refactor preserves my prior bidirectional init() approval: production init now delegates to checkComputeSSOTConsistency, and the negative tests drive mutated inputs through that same production function, covering extra/missing labels, missing/extra/duplicate render entries, missing defaults, and empty instance lists. ComputeMetadata remains behavior-preserving with the existing output-shape pin, and the canvas fallback drift test still exercises the rendered fallback dropdown path. Exact-head code CI is green; remaining combined-status failures are review/ceremony gates, with the local Peer Visibility variant skipped.Approved on head
c5b83e3a2f. The refactor preserves my prior bidirectional init() approval: production init now delegates to checkComputeSSOTConsistency, and the negative tests drive mutated inputs through that same production function, covering extra/missing labels, missing/extra/duplicate render entries, missing defaults, and empty instance lists. ComputeMetadata remains behavior-preserving with the existing output-shape pin, and the canvas fallback drift test still exercises the rendered fallback dropdown path. Exact-head code CI is green; remaining combined-status failures are review/ceremony gates, with the local Peer Visibility variant skipped.