fix(provision): platform-managed workspace must fail-closed when CP proxy env absent (#2162) #2164

Merged
core-be merged 3 commits from fix/2162-platform-managed-fail-closed-missing-proxy into main 2026-06-03 06:21:10 +00:00
Member

Fixes #2162.

Root cause

applyPlatformManagedLLMEnv returned HasUsableLLMCred: true when MOLECULE_LLM_BASE_URL + MOLECULE_LLM_USAGE_TOKEN were empty, causing claude-code workspaces to boot credential-less and hit the 600s provision-timeout sweep (adk-demo dark-wedge class, ws 29b95be9).

Fix

  • Empty-proxy-env path returns HasUsableLLMCred: false (was true).
  • Caller aborts with MISSING_PLATFORM_PROXY, symmetric to the BYOK MISSING_BYOK_CREDENTIAL hard-fail (#711).
  • User-facing error message explains the boot-race and retry path.

Regression tests

  • TestApplyPlatformManagedLLMEnv_MissingProxyEnvFailClosed: asserts HasUsableLLMCred=false when proxy env absent.
  • TestApplyPlatformManagedLLMEnv_ProxyEnvPresentInjectsCredential: asserts ANTHROPIC_API_KEY + ANTHROPIC_BASE_URL injected when proxy env present.

Refs: #2162, #711, #1994

Fixes #2162. ## Root cause `applyPlatformManagedLLMEnv` returned `HasUsableLLMCred: true` when `MOLECULE_LLM_BASE_URL` + `MOLECULE_LLM_USAGE_TOKEN` were empty, causing claude-code workspaces to boot credential-less and hit the 600s provision-timeout sweep (adk-demo dark-wedge class, ws `29b95be9`). ## Fix - Empty-proxy-env path returns `HasUsableLLMCred: false` (was `true`). - Caller aborts with `MISSING_PLATFORM_PROXY`, symmetric to the BYOK `MISSING_BYOK_CREDENTIAL` hard-fail (#711). - User-facing error message explains the boot-race and retry path. ## Regression tests - `TestApplyPlatformManagedLLMEnv_MissingProxyEnvFailClosed`: asserts `HasUsableLLMCred=false` when proxy env absent. - `TestApplyPlatformManagedLLMEnv_ProxyEnvPresentInjectsCredential`: asserts `ANTHROPIC_API_KEY` + `ANTHROPIC_BASE_URL` injected when proxy env present. Refs: #2162, #711, #1994
core-be added 1 commit 2026-06-03 02:04:58 +00:00
fix(provision): platform-managed workspace must fail-closed when CP proxy env absent (#2162)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 10s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Failing after 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
Harness Replays / detect-changes (pull_request) Successful in 10s
CI / Detect changes (pull_request) Successful in 16s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 17s
qa-review / approved (pull_request_target) Failing after 3s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 28s
E2E API Smoke Test / detect-changes (pull_request) Successful in 30s
E2E Chat / detect-changes (pull_request) Successful in 30s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 16s
sop-checklist / review-refire (pull_request_target) Has been skipped
security-review / approved (pull_request_target) Failing after 3s
sop-tier-check / tier-check (pull_request_target) Successful in 4s
gate-check-v3 / gate-check (pull_request_target) Successful in 18s
Harness Replays / Harness Replays (pull_request) Successful in 2s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 11s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 8s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 11s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m23s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m56s
CI / Platform (Go) (pull_request) Failing after 4m37s
CI / all-required (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 8m39s
55e201157a
applyPlatformManagedLLMEnv falsely reported HasUsableLLMCred:true when
MOLECULE_LLM_BASE_URL + MOLECULE_LLM_USAGE_TOKEN were empty, causing
claude-code workspaces to boot credential-less and hit the 600s
provision-timeout sweep (adk-demo dark-wedge class).

Fix:
- Empty-proxy-env path returns HasUsableLLMCred:false (was true).
- Caller aborts with MISSING_PLATFORM_PROXY, symmetric to the BYOK
  MISSING_BYOK_CREDENTIAL hard-fail.
- User-facing error message explains the boot-race and retry path.

Regression tests:
- TestApplyPlatformManagedLLMEnv_MissingProxyEnvFailClosed: asserts
  HasUsableLLMCred=false when proxy env absent.
- TestApplyPlatformManagedLLMEnv_ProxyEnvPresentInjectsCredential:
  asserts ANTHROPIC_API_KEY + ANTHROPIC_BASE_URL injected when proxy
  env present.

Refs: #2162, #711 (BYOK fail-closed pattern), #1994
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
core-be added 1 commit 2026-06-03 05:41:44 +00:00
test(provision): supply CP proxy env in tests that hit platform-managed default
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
E2E Chat / detect-changes (pull_request) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Failing after 1s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
gate-check-v3 / gate-check (pull_request_target) Successful in 3s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 12s
qa-review / approved (pull_request_target) Failing after 3s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
security-review / approved (pull_request_target) Failing after 4s
sop-tier-check / tier-check (pull_request_target) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 16s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 15s
CI / Detect changes (pull_request) Successful in 27s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 26s
E2E API Smoke Test / detect-changes (pull_request) Successful in 27s
Harness Replays / Harness Replays (pull_request) Successful in 6s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 28s
CI / Canvas (Next.js) (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 59s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m57s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 4m19s
CI / Platform (Go) (pull_request) Failing after 4m15s
CI / all-required (pull_request) Has been skipped
334d485efc
The #2162 fix adds a MISSING_PLATFORM_PROXY abort when a platform-managed
workspace has no CP proxy env. Five existing tests call prepareProvisionContext
or provisionWorkspaceCP with a payload that resolves to platform_managed but
do not set MOLECULE_LLM_BASE_URL / MOLECULE_LLM_USAGE_TOKEN, causing them to
abort early and fail their assertions.

Add the proxy env to:
- TestPrepareProvisionContext_ParentIDInjected
- TestPrepareProvisionContext_InjectsGitHTTPCredsFromPersonaToken
- TestPrepareProvisionContext_WorkspaceSecretWinsOverPersonaToken
- TestProvisionWorkspaceCP_NoInternalErrorsInBroadcast
- TestProvisionWorkspaceCP_ConcurrentBurst_NoSilentDrop

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
core-be added 1 commit 2026-06-03 05:54:30 +00:00
test(provision): supply CP proxy env in auto-routing tests (#2162)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 2s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Failing after 2s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request_target) Successful in 3s
qa-review / approved (pull_request_target) Failing after 3s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 3s
sop-tier-check / tier-check (pull_request_target) Successful in 3s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 22s
E2E Chat / detect-changes (pull_request) Successful in 23s
Harness Replays / detect-changes (pull_request) Successful in 21s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 21s
CI / Detect changes (pull_request) Successful in 25s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 24s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
CI / Canvas (Next.js) (pull_request) Successful in 8s
Harness Replays / Harness Replays (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
E2E Chat / E2E Chat (pull_request) Successful in 9s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m0s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 51s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m1s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 3m58s
CI / Platform (Go) (pull_request) Successful in 4m41s
CI / all-required (pull_request) Successful in 2s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
sop-tier-check / tier-check (pull_request_review) Successful in 6s
security-review / approved (pull_request_target) Refired via /security-recheck by unknown
audit-force-merge / audit (pull_request_target) Successful in 9s
9a28c88682
Three auto-routing tests (TestProvisionWorkspaceAuto_RoutesToCPWhenSet,
TestRestartWorkspaceAuto_RoutesToCPWhenSet,
TestProvisionWorkspaceAutoSync_RoutesToCPWhenSet) use
models.CreateWorkspacePayload with Runtime="claude-code" and empty Model.
This now derives to platform_managed billing mode, which fails closed
with MISSING_PLATFORM_PROXY when the CP proxy env is absent.

Supply the proxy env via t.Setenv so the tests reach the CP provisioner
stub instead of aborting early.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
core-qa approved these changes 2026-06-03 06:14:20 +00:00
core-qa left a comment
Member

core-qa official-approve — #2162 fail-closed. Verified by CTO: empty-proxy path returns HasUsableLLMCred=false + prepareProvisionContext aborts MISSING_PLATFORM_PROXY (symmetric to BYOK), with the watch-fail regression test. CI/Platform(Go)+all-required green. qa APPROVE.

core-qa official-approve — #2162 fail-closed. Verified by CTO: empty-proxy path returns HasUsableLLMCred=false + prepareProvisionContext aborts MISSING_PLATFORM_PROXY (symmetric to BYOK), with the watch-fail regression test. CI/Platform(Go)+all-required green. qa APPROVE.
core-security approved these changes 2026-06-03 06:14:40 +00:00
core-security left a comment
Member

core-security official-approve — #2162 fail-closed platform-managed provision. No credential-less boot; aborts MISSING_PLATFORM_PROXY. security APPROVE.

core-security official-approve — #2162 fail-closed platform-managed provision. No credential-less boot; aborts MISSING_PLATFORM_PROXY. security APPROVE.
Member

/qa-recheck

/qa-recheck
Member

/security-recheck

/security-recheck
core-be merged commit 9aafcf7ad3 into main 2026-06-03 06:21:10 +00:00
Sign in to join this conversation.
4 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: molecule-ai/molecule-core#2164