fix(adapter): honor MODEL/MODEL_PROVIDER env (persona-env convention) #9

claude-ceo-assistant · 2026-05-08T21:12:08Z


Root cause

The persona env files (~/.molecule-ai/personas/<name>/env, sourced into each workspace at provision time) declare TWO env vars with distinct semantics:

- MODEL: model id (e.g. MiniMax-M2.7-highspeed, opus)
- MODEL_PROVIDER: provider slug (e.g. minimax, claude-code)

The runtime wheel's legacy workspace/config.py interprets the MODEL_PROVIDER env var as the model id, a naming choice that predates the separate MODEL env var. With both set per the persona convention, the legacy code reads MODEL_PROVIDER="minimax" into runtime_config.model. The literal string minimax matches no registry prefix (minimax- requires a hyphen suffix), so resolution falls through to providers[0] (anthropic-oauth). That provider demands CLAUDE_CODE_OAUTH_TOKEN, which is unset on non-leads; the claude CLI launches anyway, and the SDK's query.initialize() call hits its 60s control timeout.
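
To make the fallthrough concrete, here is a minimal sketch of prefix-based provider resolution. The `Provider` dataclass, the registry contents, and `resolve_provider_by_prefix` are hypothetical stand-ins, not the adapter's actual code. Only the behaviour is taken from the description above: a model string that matches no prefix falls back to the first registered provider. Case-insensitive prefix matching is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    model_prefixes: tuple[str, ...]
    auth_env: str


# Hypothetical registry; the real one may hold more providers and prefixes.
PROVIDERS = (
    Provider("anthropic-oauth", ("claude-",), "CLAUDE_CODE_OAUTH_TOKEN"),
    Provider("minimax", ("minimax-",), "MINIMAX_API_KEY"),
)


def resolve_provider_by_prefix(model: str) -> Provider:
    """Return the first provider whose prefix matches, else PROVIDERS[0]."""
    lowered = model.lower()
    for provider in PROVIDERS:
        if any(lowered.startswith(prefix) for prefix in provider.model_prefixes):
            return provider
    # No prefix matched: silent fallback to the first registered provider.
    return PROVIDERS[0]


# A real model id carries the hyphen and resolves to the minimax provider.
good = resolve_provider_by_prefix("MiniMax-M2.7-highspeed")
# The provider slug misread as a model id has no hyphen, so it falls through.
bad = resolve_provider_by_prefix("minimax")
print(good.name, bad.name, bad.auth_env)
```

Under these assumptions the slug `minimax` lands on `anthropic-oauth` and asks for `CLAUDE_CODE_OAUTH_TOKEN`, which is the mismatch visible in the log line below.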

Live evidence from a wedged container ws-3286381d-422:

None of CLAUDE_CODE_OAUTH_TOKEN set for model=minimax (provider=anthropic-oauth)
...
Exception: Control request timeout: initialize

The brief hypothesised that claude_sdk_executor.py lacked dispatch logic and always called the Anthropic SDK regardless of MODEL_PROVIDER. The brief was wrong on the location: dispatch logic ALREADY exists in adapter.py::setup() (model → provider → base_url + auth_env via _resolve_provider, the #180 fix). The bug is upstream: the legacy meaning of the MODEL_PROVIDER env collides with the persona-env naming convention and silently corrupts the picked model BEFORE adapter.py sees it.

Fix

New helper _resolve_model_and_provider_from_env reconciles env vars against YAML inside adapter.setup() and adapter.create_executor():

1. MODEL env → picked_model (authoritative when set).
2. MODEL_PROVIDER env → explicit_provider IFF the value matches a registered provider name. Backward-compat: if MODEL is unset and MODEL_PROVIDER doesn't match a registered slug, treat it as a legacy model id (canvas Save+Restart pre-this-fix).
3. YAML runtime_config.{model,provider} fills any field env didn't supply.

Contained in this repo per the brief's scope guidance — does NOT touch the runtime wheel's workspace/config.py (would need a separate molecule-core PR), does NOT refactor sibling template repos, does NOT change the persona-env dispatch policy.
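
The reconciliation rules listed under Fix can be sketched roughly as follows. This is an illustration under stated assumptions, not the helper's actual body: the function name `resolve_model_and_provider`, its signature, the `REGISTERED_PROVIDERS` set, and the choice to ignore an unregistered MODEL_PROVIDER when MODEL is also set are all hypothetical.

```python
from __future__ import annotations

from typing import Mapping

# Hypothetical set of registered provider names; the real registry may differ.
REGISTERED_PROVIDERS = frozenset({"anthropic-oauth", "minimax"})


def _clean(value: str | None) -> str | None:
    """Strip whitespace and treat empty strings as unset."""
    stripped = (value or "").strip()
    return stripped or None


def resolve_model_and_provider(
    env: Mapping[str, str],
    yaml_model: str | None = None,
    yaml_provider: str | None = None,
) -> tuple[str | None, str | None]:
    """Reconcile MODEL / MODEL_PROVIDER env vars against YAML values."""
    model = _clean(env.get("MODEL"))  # authoritative when set
    raw_provider = _clean(env.get("MODEL_PROVIDER"))
    provider = None

    if raw_provider is not None:
        if raw_provider.lower() in REGISTERED_PROVIDERS:
            # A registered slug is an explicit provider choice.
            provider = raw_provider.lower()
        elif model is None:
            # Backward-compat: legacy MODEL_PROVIDER-as-model-id shape.
            model = raw_provider
        # Assumption: an unregistered value alongside MODEL is ignored.

    # YAML fills whichever field env did not supply.
    return model or yaml_model, provider or yaml_provider


persona = {"MODEL": "MiniMax-M2.7-highspeed", "MODEL_PROVIDER": "minimax"}
print(resolve_model_and_provider(persona, yaml_model="opus"))
```

With the persona-env shape, the env values win over YAML and the slug is kept as the provider instead of being read as a model id.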

Tests

Eleven new cases in tests/test_env_model_provider_dispatch.py:

- persona-env shape (minimax, GLM, lead claude-code) → correct model + slug
- legacy MODEL_PROVIDER-as-model-id shape still works
- env wins over YAML
- YAML fallback when env unset
- whitespace/empty defensive handling
- case-insensitive provider slug matching

Full adapter suite: 76/76 pass.

Verification path

After merge + image rebuild + workspace re-provision, the ws-* containers will boot with provider=minimax (not anthropic-oauth), ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic, MINIMAX_API_KEY projected onto ANTHROPIC_AUTH_TOKEN, and the SDK init handshake succeeding.
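
As a rough illustration of what "projected onto" means here, the sketch below builds the environment a minimax workspace would be expected to hand to the CLI. The function name `project_minimax_env` and the error handling are hypothetical; only the base URL and the MINIMAX_API_KEY → ANTHROPIC_AUTH_TOKEN mapping come from the description above.

```python
from __future__ import annotations

# Base URL as stated in the verification path.
MINIMAX_BASE_URL = "https://api.minimax.io/anthropic"


def project_minimax_env(env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env with the minimax credentials projected
    onto the Anthropic-style variables."""
    api_key = env.get("MINIMAX_API_KEY", "").strip()
    if not api_key:
        # Hypothetical behaviour: fail fast instead of launching the CLI
        # and waiting for the initialize handshake to time out.
        raise KeyError("MINIMAX_API_KEY is not set")
    projected = dict(env)  # leave the caller's mapping untouched
    projected["ANTHROPIC_BASE_URL"] = MINIMAX_BASE_URL
    projected["ANTHROPIC_AUTH_TOKEN"] = api_key
    return projected


child_env = project_minimax_env({"MINIMAX_API_KEY": "placeholder-key"})
print(child_env["ANTHROPIC_BASE_URL"])
```

Checking these two variables inside a re-provisioned ws-* container is one way to confirm the provider resolved to minimax before sending a chat message.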

Refs: task #181, brief 2026-05-08, related #180 (#7 in this repo).


claude-ceo-assistant added 1 commit 2026-05-08 21:12:10 +00:00

fix(adapter): honor MODEL/MODEL_PROVIDER env (persona-env convention)

CI / Adapter unit tests (push) Successful in 1m40s


Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 19s


CI / Adapter unit tests (pull_request) Failing after 52s


CI / validate (push) Failing after 2m17s


CI / validate (pull_request) Successful in 13m19s


1742b60e62

Fixes the 2026-05-08 dev-tree wedge: 22/27 non-lead workspaces (minimax tier)
stuck in degraded after /org/import, every chat hanging on
`Control request timeout: initialize`.

Root cause
----------
The persona env files (`~/.molecule-ai/personas/<name>/env`) declare a TWO-
variable convention:
  - MODEL          = model id   ("MiniMax-M2.7-highspeed")
  - MODEL_PROVIDER = provider slug ("minimax")

The runtime wheel's legacy `workspace/config.py` interprets MODEL_PROVIDER
as the *model id* — a name chosen long before there was a separate MODEL
env. With both set, the legacy code reads MODEL_PROVIDER="minimax" into
runtime_config.model. The literal string "minimax" doesn't match any
registry prefix (`minimax-` requires a hyphen suffix), so resolution falls
through to providers[0] (anthropic-oauth). The auth check then fails on
the absent CLAUDE_CODE_OAUTH_TOKEN, but the claude CLI launches anyway,
and the SDK's `query.initialize()` call hits its 60s control timeout.

The brief hypothesised `claude_sdk_executor.py` lacked dispatch logic.
Phase 1 evidence: dispatch ALREADY exists in adapter.py — model -> provider
-> base_url + auth_env routing was correctly built for #180. The bug was
upstream: MODEL_PROVIDER's name collision with the persona-env convention
silently corrupted the picked model BEFORE adapter.py saw it.

Fix
---
New helper `_resolve_model_and_provider_from_env` reconciles env vars
against YAML inside adapter.setup() and create_executor():

  1. MODEL env -> picked_model (authoritative when set).
  2. MODEL_PROVIDER env -> explicit_provider IFF the value matches a
     registered provider name. Backward-compat: if MODEL is unset and
     MODEL_PROVIDER doesn't match a registered slug, treat it as a
     legacy model id (canvas Save+Restart pre-this-fix).
  3. YAML runtime_config.{model,provider} fills any field env didn't
     supply.

Contained in the template repo per the brief's scope guidance — does NOT
touch the runtime wheel's workspace/config.py (which would need a separate
molecule-core PR), and does NOT change the persona-env dispatch policy
(Phase 2 mapping 2026-05-08).

Tests
-----
Eleven new cases in tests/test_env_model_provider_dispatch.py covering:
  - persona-env shape (minimax, GLM, lead claude-code) -> correct model + slug
  - legacy MODEL_PROVIDER-as-model-id shape still works
  - env wins over YAML
  - YAML fallback when env unset
  - whitespace/empty defensive handling
  - case-insensitive provider slug matching

Full adapter test suite: 76/76 pass.

Verification path
-----------------
After image rebuild + workspace re-provision, ws-* containers will boot
with provider=minimax (not anthropic-oauth), ANTHROPIC_BASE_URL set to
https://api.minimax.io/anthropic, MINIMAX_API_KEY projected onto
ANTHROPIC_AUTH_TOKEN, and the SDK init handshake succeeding.

Refs: task #181, brief 2026-05-08, related #180 (#7 in this repo)