fix(workspace-server): derive image-refresh runtime allowlist from providers SSOT (google-adk drift) (#578) #2348
Summary
Fixes #578: google-adk runtime allowlist DRIFT between controlplane (accepts `google-adk` for pin-promote/redeploy) and the molecule-core tenant image-refresh endpoint, which hardcoded `AllRuntimes = {claude-code, codex, hermes, openclaw}` (no google-adk). A google-adk pin was accepted CP-side, then `POST /admin/workspace-images/refresh?runtime=google-adk` returned 400 "unknown runtime" at the tenant, so google-adk image fixes never deployed.

Fix: unified, not just patched
Rather than append `google-adk` (which would drift again), `AllRuntimes` is now DERIVED at package init from `providers.LoadManifest().Runtimes`: the same `internal/providers/providers.yaml` `runtimes:` SSOT (mirrored from CP's providers.yaml) the rest of the platform already routes against (`DeriveProvider`, `ModelsForRuntime`, `templates_registry.go`, `llm_billing_mode.go`). The CP pin-promote allowlist and the tenant refresh allowlist are now provably the same set.

`internal/providers/gen/registry_gen.go` is explicitly "no production path imports this package yet", so the runtime-embedded `providers.LoadManifest()` (already imported in `handlers`, fail-closed, no network) is the correct reusable SSOT.

`imageRefreshFallbackRuntimes` (now including google-adk) is used only if the embedded manifest fails to load, preserving endpoint availability under a manifest regression. A drift-guard test pins it to the SSOT too.

`imagewatch.New()` copies `handlers.AllRuntimes` at construction, so the auto-refresh watcher now also tracks google-adk; no extra change needed.

Tests (all green)
- `TestAllRuntimes_IncludesGoogleADK`: google-adk is now in the allowlist (direct #578 regression).
- `TestAllRuntimes_MatchesProvidersSSOT`: derived list == providers SSOT runtime keys (drift guard: CP/tenant can't diverge again).
- `TestImageRefreshFallbackMatchesSSOT`: the static fallback is pinned to the SSOT.
- `TestRefresh_RejectsUnknownRuntime`: reject guard intact (genuinely-unknown runtime still 400s) AND the 400 `known_runtimes` body advertises google-adk.

`go build ./...`, `go vet ./internal/handlers/`, and `go test ./internal/handlers/ ./internal/imagewatch/ ./internal/providers/...` all pass.

Note on the other runtime registry
There is a separate `runtime_registry.go` `knownRuntimes` (built from `manifest.json` `workspace_templates`) governing workspace provisioning, a different concern from image pull/recreate. This PR intentionally does not touch it.

🤖 Generated with Claude Code
Fresh 5-axis approval on current head d61d9af761.
Correctness: AllRuntimes now derives from providers.LoadManifest().Runtimes, so the tenant image-refresh allowlist follows the providers.yaml runtime SSOT instead of a hardcoded slice; google-adk is covered without adding another drift point.
Robustness: deterministic sorting plus a static fallback preserves endpoint availability if manifest loading fails, and the fallback is itself pinned by a drift-guard test.
Security: no auth or secret handling changes; the fix removes a deployment propagation fail-closed gap where CP accepted a runtime the tenant rejected.
Performance: package-init manifest load is local/embedded and the runtime list is computed once.
Readability/tests: the regression and SSOT/fallback tests are meaningful, including the handler 400 known_runtimes assertion.
CI/all-required is green and mergeable=true.
Note: qa-review/security-review/sop-tier-check still show failing pull_request_target statuses in the raw status list, but they are not blocking CI/all-required on this head.

Official APPROVED for the code change on head d61d9af7. Independent pass: tenant workspace-image refresh runtimes are now derived from providers.LoadManifest().Runtimes; google-adk is covered through the SSOT instead of a hand-maintained allowlist; fallback and handler tests catch CP↔tenant drift and unknown-runtime behavior remains a 400. Note: current combined CI is still red on governance contexts, so merge remains blocked until those gates are refreshed/green.