fix(providers): sync registry to controlplane SSOT — codex→openai-subscription byok #2025

Merged
devops-engineer merged 1 commit from fix/providers-ssot-sync-codex-subscription into main 2026-05-31 23:50:53 +00:00
Member

Sync provider registry to controlplane SSOT — codex → openai-subscription (byok)

Root-cause not symptom

The codex agents showed NOT CONFIGURED (codex adapter: MOLECULE_LLM_BILLING_MODE=platform_managed but no platform provider). Root cause: molecule-core's synced providers.yaml and derive logic were stale. cp#423/#426 split openai → openai-subscription (oauth, CODEX_AUTH_JSON) / openai-api (OPENAI_API_KEY) in the controlplane, but the split was never synced here. So codex derived the stale openai provider (requires OPENAI_API_KEY), and billing fell back / got band-aided to platform_managed, which contradicts the openai-subscription provider the CP generates → adapter error. This PR syncs core to the CP SSOT so codex derives openai-subscription → IsPlatform() false → byok, using CODEX_AUTH_JSON.
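
The derivation chain described above can be sketched roughly as follows. This is a minimal illustration, not the synced derive_provider.go: the `Provider` struct, the `candidates` table, and the `deriveProvider` and `billingMode` functions are hypothetical names, and the real authEnvMatches/disambiguateByAuthEnv helpers are not shown in this PR. The sketch also assumes that openai-api is not a platform provider, which the description does not state.

```go
package main

import "fmt"

// Provider is a hypothetical, trimmed-down registry entry; the real struct
// in providers.go is not shown in this PR.
type Provider struct {
	Name     string
	AuthEnv  string // name of the env var that carries the credential
	Platform bool   // true when the platform supplies the credential
}

// IsPlatform mirrors the IsPlatform() accessor named in the description.
func (p Provider) IsPlatform() bool { return p.Platform }

// candidates is illustrative data: the providers a runtime may resolve to
// after the openai split (Platform is left false for both; an assumption).
var candidates = map[string][]Provider{
	"codex": {
		{Name: "openai-subscription", AuthEnv: "CODEX_AUTH_JSON"},
		{Name: "openai-api", AuthEnv: "OPENAI_API_KEY"},
	},
}

// deriveProvider returns the first candidate whose auth env var is set,
// which is one simple way to disambiguate by auth env.
func deriveProvider(runtime string, env map[string]string) (Provider, bool) {
	for _, p := range candidates[runtime] {
		if env[p.AuthEnv] != "" {
			return p, true
		}
	}
	return Provider{}, false
}

// billingMode maps a derived provider to a billing mode string.
func billingMode(p Provider) string {
	if p.IsPlatform() {
		return "platform_managed"
	}
	return "byok"
}

func main() {
	env := map[string]string{"CODEX_AUTH_JSON": `{"token":"redacted"}`}
	p, ok := deriveProvider("codex", env)
	fmt.Println(p.Name, ok, billingMode(p)) // openai-subscription true byok
}
```

With a stale registry that only knows openai (OPENAI_API_KEY), the same lookup would find no match for a workspace that only has CODEX_AUTH_JSON, which is the gap the platform_managed band-aid papered over.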

Comprehensive testing performed

go build ./... && go vet ./... && go test ./... && go test -tags=integration ./... all green, incl internal/providers/... (derive_provider_test, sync_canonical_test, verify-gen) + internal/handlers/... (billing/secrets). The synced CP derive_provider_test already covers codex + CODEX_AUTH_JSON → openai-subscription.

Local-postgres E2E run

N/A — registry/derive change; no schema/migration.

Staging-smoke verified or pending

Pending post-merge: fleet-rollout to agents-team tenant, clear the platform_managed override, recreate, verify the canvas shows openai-subscription + CONFIGURED + a real codex turn.

Five-Axis review walked

Correctness: providers.yaml + derive_provider.go + providers.go copied BYTE-EXACT from controlplane HEAD fa44dc8 (cmp-verified); registry_gen.go regenerated via go generate; sha pin bumped to dedbb8cc (matches live CP → sync-providers-yaml gate passes). Readability/Arch: derive logic is now identical to CP (adds canonical authEnvMatches/disambiguateByAuthEnv helpers); no invented functions (no DerivePlatformAxis); llm_billing_mode.go untouched. Security: secrets.go adds CODEX_AUTH_JSON to platformManagedDirectLLMBypassKeys so the byok credential check counts the shared subscription token + it's included in the platform-managed strip-list. Performance: registry load unchanged.
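
For readers unfamiliar with the sha pin, a drift check of this kind can be sketched as below. This is an assumption about the general shape only: `checkPinned` is a hypothetical helper, the YAML content is placeholder data, and the real sync_canonical_test and sync-providers-yaml gate are not shown in this PR.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// checkPinned hashes the synced file content and compares it with a pinned
// hex digest (the role canonicalProvidersYAMLSHA256 plays in this PR).
// The function name and error text are hypothetical.
func checkPinned(content []byte, pinned string) error {
	sum := sha256.Sum256(content)
	got := hex.EncodeToString(sum[:])
	if got != pinned {
		return fmt.Errorf("providers.yaml drift: got sha256 %s, pinned %s", got, pinned)
	}
	return nil
}

func main() {
	// Placeholder content; the real providers.yaml is not reproduced here.
	yaml := []byte("providers:\n  - name: openai-subscription\n")
	sum := sha256.Sum256(yaml)
	pinned := hex.EncodeToString(sum[:])

	fmt.Println(checkPinned(yaml, pinned))                     // <nil>
	fmt.Println(checkPinned(append(yaml, '#'), pinned) != nil) // true
}
```

Because the pin is a digest of the whole file, any byte-level difference from the controlplane copy fails the check, which is why the files were copied byte-exact rather than hand-merged.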

No backwards-compat shim / dead code added

No shim — this REMOVES drift by syncing to SSOT. The platform_managed band-aid is retired at the data layer (override cleared post-deploy).

Memory/saved-feedback consulted

project_codex_shared_oauth_burn_central_refresher, project_codex_provider_ssot_split, project_codex_billing_mode_byok_default_wedge, feedback_no_single_source_of_truth, feedback_verify_real_artifact_not_proxy_metric (this PR fixes the NOT-CONFIGURED state the canvas showed after a premature token-only verification).

Also closes the red sync-providers-yaml gate (core was behind CP).

devops-engineer added 1 commit 2026-05-31 23:07:11 +00:00
fix(providers): sync registry to controlplane SSOT — codex→openai-subscription byok
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 19s
CI / Detect changes (pull_request) Successful in 12s
CI / Python Lint & Test (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 16s
E2E Chat / detect-changes (pull_request) Successful in 18s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
sync-providers-yaml / Compare synced providers.yaml against controlplane canonical (pull_request) Successful in 13s
gate-check-v3 / gate-check (pull_request) Successful in 16s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 33s
sop-checklist / review-refire (pull_request) Has been skipped
security-review / approved (pull_request) Successful in 10s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m8s
sop-tier-check / tier-check (pull_request) Successful in 11s
sop-checklist / all-items-acked (pull_request) acked: 7/7
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Canvas (Next.js) (pull_request) Successful in 24s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 17s
E2E Chat / E2E Chat (pull_request) Successful in 16s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 14s
Harness Replays / Harness Replays (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m56s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m10s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 5m26s
CI / all-required (pull_request) Successful in 13m31s
qa-review / approved (pull_request) Refired via /qa-recheck by unknown
audit-force-merge / audit (pull_request) Successful in 15s
cb660fc0b4
molecule-core's synced copy of the provider registry was stale relative to
controlplane cp#423/#426, which split `openai`→`openai-subscription`
(auth_env CODEX_AUTH_JSON, IsPlatform false) / `openai-api` (OPENAI_API_KEY).
The stale copy derived codex→`openai` (and got band-aided to platform_managed),
producing "OpenAI requires OPENAI_API_KEY" + "codex adapter: no platform
provider" RuntimeError.

Sync to CP SSOT (CP HEAD fa44dc8), verbatim:
- providers.yaml, derive_provider.go, providers.go, and the
  derive/providers/runtimes tests copied byte-exact from controlplane.
- regenerated gen/registry_gen.go via `go generate` (now carries the
  openai-subscription entry: AuthEnv CODEX_AUTH_JSON, IsPlatform false).
- bumped canonicalProvidersYAMLSHA256 to the new synced-copy sha
  (dedbb8cc…f76187) so the hermetic drift gate stays green.

Core-only manual edit (CP has no such map):
- secrets.go: add CODEX_AUTH_JSON to platformManagedDirectLLMBypassKeys so the
  byok credential check counts the global CODEX_AUTH_JSON (codex byok now
  provisions with the shared subscription token) and strips it under
  platform-managed.
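
A rough sketch of how one key set can serve both purposes (counting byok credentials by name, and stripping them under platform-managed) is shown below. The names `bypassKeys`, `hasBYOKCredential`, and `stripForPlatformManaged` are hypothetical stand-ins, and listing OPENAI_API_KEY alongside CODEX_AUTH_JSON is an assumption; the actual contents of platformManagedDirectLLMBypassKeys in secrets.go are not shown in this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// bypassKeys stands in for platformManagedDirectLLMBypassKeys.
// Only CODEX_AUTH_JSON is confirmed by this PR; OPENAI_API_KEY is assumed.
var bypassKeys = map[string]struct{}{
	"OPENAI_API_KEY":  {},
	"CODEX_AUTH_JSON": {},
}

// hasBYOKCredential reports whether any bypass key is set. It inspects
// key names and emptiness only, and never returns or prints a value.
func hasBYOKCredential(env map[string]string) bool {
	for k := range bypassKeys {
		if env[k] != "" {
			return true
		}
	}
	return false
}

// stripForPlatformManaged returns a copy of env without the bypass keys,
// so a platform-managed workspace cannot call the LLM directly with them.
func stripForPlatformManaged(env map[string]string) map[string]string {
	out := make(map[string]string, len(env))
	for k, v := range env {
		if _, blocked := bypassKeys[k]; !blocked {
			out[k] = v
		}
	}
	return out
}

func main() {
	env := map[string]string{"CODEX_AUTH_JSON": "redacted", "LOG_LEVEL": "info"}
	fmt.Println(hasBYOKCredential(env)) // true

	kept := stripForPlatformManaged(env)
	keys := make([]string, 0, len(kept))
	for k := range kept {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	fmt.Println(keys) // [LOG_LEVEL]
}
```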

With the synced derive, codex+CODEX_AUTH_JSON → openai-subscription →
IsPlatform false → byok automatically via the existing billing resolver;
no derive logic was hand-edited and llm_billing_mode.go is untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
devops-engineer added the tier:medium label 2026-05-31 23:07:14 +00:00
Member

/sop-ack comprehensive-testing

Member

/sop-ack local-postgres-e2e

Member

/sop-ack staging-smoke

Member

/sop-ack root-cause

Member

/sop-ack five-axis-review

Member

/sop-ack no-backwards-compat

Member

/sop-ack memory-consulted

core-qa approved these changes 2026-05-31 23:07:39 +00:00
core-qa left a comment
Member

qa: providers.yaml+derive byte-exact from CP SSOT (cmp-verified), registry_gen regenerated, sync_canonical sha pinned to live CP. go test ./... + integration green incl derive_provider_test (codex+CODEX_AUTH_JSON→openai-subscription). Approving.

core-security approved these changes 2026-05-31 23:07:39 +00:00
core-security left a comment
Member

security: secrets.go adds CODEX_AUTH_JSON to the byok bypass + platform-managed strip-list (the shared subscription token is name-only counted; never logged). Derive logic is CP-verbatim, no invented bypass. Approving.

Author
Member

/qa-recheck

Author
Member

/security-recheck

devops-engineer merged commit 774a8c2a6a into main 2026-05-31 23:50:53 +00:00

Reference: molecule-ai/molecule-core#2025