fix(adapter): honor MODEL/MODEL_PROVIDER env (persona-env convention) #9
Root cause
The persona env files (`~/.molecule-ai/personas/<name>/env`, sourced into each workspace at provision time) declare TWO env vars with distinct semantics:

- `MODEL`: model id (e.g. `MiniMax-M2.7-highspeed`, `opus`)
- `MODEL_PROVIDER`: provider slug (e.g. `minimax`, `claude-code`)

The runtime wheel's legacy `workspace/config.py` interprets the `MODEL_PROVIDER` env as the model id, a naming choice predating the introduction of a separate `MODEL` env. With both set per the persona convention, the legacy code reads `MODEL_PROVIDER="minimax"` into `runtime_config.model`. The literal string `minimax` fails to match any registry prefix (`minimax-` requires a hyphen suffix), so resolution falls through to `providers[0]` (`anthropic-oauth`), which demands `CLAUDE_CODE_OAUTH_TOKEN`. That token is unset on non-leads, the claude CLI launches anyway, and the SDK's `query.initialize()` 60s control timeout fires.

Live evidence from a wedged container `ws-3286381d-422`:

The brief hypothesised that `claude_sdk_executor.py` lacked dispatch logic and always called the Anthropic SDK regardless of `MODEL_PROVIDER`. The brief was wrong on the location: dispatch logic ALREADY exists in `adapter.py::setup()` (model → provider → base_url + auth_env via `_resolve_provider`, the #180 fix). The bug is upstream: the naming collision between the `MODEL_PROVIDER` env and the persona-env convention silently corrupts the picked model BEFORE `adapter.py` sees it.

Fix
New helper `_resolve_model_and_provider_from_env` reconciles env vars against YAML inside `adapter.setup()` and `adapter.create_executor()`:

1. `MODEL` env → picked_model (authoritative when set).
2. `MODEL_PROVIDER` env → explicit_provider IFF the value matches a registered provider name. Backward-compat: if `MODEL` is unset and `MODEL_PROVIDER` doesn't match a registered slug, treat it as a legacy model id (canvas Save+Restart pre-this-fix).
3. `runtime_config.{model,provider}` fills any field env didn't supply.

Contained in this repo per the brief's scope guidance: does NOT touch the runtime wheel's `workspace/config.py` (would need a separate molecule-core PR), does NOT refactor sibling template repos, does NOT change the persona-env dispatch policy.

Tests
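Eleven new cases in `tests/test_env_model_provider_dispatch.py`:

- persona-env shape (minimax, GLM, lead claude-code) → correct model + slug
- legacy `MODEL_PROVIDER`-as-model-id shape still works
- env wins over YAML
- YAML fallback when env unset
- whitespace/empty defensive handling
- case-insensitive provider slug matching

Full adapter suite: 76/76 pass.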
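For reviewers who want the resolution order at a glance, the rules described under Fix can be sketched roughly as below. This is an illustration, not the code in this PR: the function name `resolve_model_and_provider`, the `REGISTERED_PROVIDERS` tuple, and the lowercase normalisation of the returned slug are assumptions, and the sketch simply ignores an unrecognised `MODEL_PROVIDER` when `MODEL` is set, a case the description does not spell out.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical registry of provider slugs; the real list lives in the adapter.
REGISTERED_PROVIDERS = ("anthropic-oauth", "minimax")


def resolve_model_and_provider(
    env: Mapping[str, str],
    yaml_model: Optional[str] = None,
    yaml_provider: Optional[str] = None,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (picked_model, explicit_provider): env first, YAML second."""
    # Blank or whitespace-only values are treated as unset.
    env_model = (env.get("MODEL") or "").strip() or None
    env_provider = (env.get("MODEL_PROVIDER") or "").strip() or None

    # Rule 1: MODEL is authoritative when set.
    picked_model = env_model
    explicit_provider = None

    if env_provider is not None:
        if env_provider.lower() in REGISTERED_PROVIDERS:
            # Rule 2: persona-env shape, MODEL_PROVIDER names a provider.
            explicit_provider = env_provider.lower()
        elif picked_model is None:
            # Backward-compat: legacy shape, MODEL_PROVIDER is a model id.
            picked_model = env_provider
        # Assumption: otherwise the unrecognised value is ignored.

    # Rule 3: YAML fills any field env did not supply.
    if picked_model is None:
        picked_model = yaml_model
    if explicit_provider is None:
        explicit_provider = yaml_provider
    return picked_model, explicit_provider
```

With the persona-env shape (`MODEL=MiniMax-M2.7-highspeed`, `MODEL_PROVIDER=minimax`) this yields the model id and the `minimax` slug as separate values, while a lone `MODEL_PROVIDER=opus` is still read as a model id.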
Verification path
After merge + image rebuild + workspace re-provision, the ws-* containers will boot with `provider=minimax` (not `anthropic-oauth`), `ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic`, `MINIMAX_API_KEY` projected onto `ANTHROPIC_AUTH_TOKEN`, and the SDK init handshake succeeding.

Refs: task #181, brief 2026-05-08, related #180 (#7 in this repo).
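To make the before/after concrete, here is a minimal sketch of prefix-based provider lookup and the env projection that the verification path expects. Everything structural here is assumed for illustration: the `Provider` dataclass, the `PROVIDERS` tuple and its `claude-` prefix, the helper names, and the case-insensitive prefix match. Only the `minimax-` prefix, the fallback to `providers[0]`, the base URL, and the env var names come from the description above.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Optional, Tuple


@dataclass(frozen=True)
class Provider:
    name: str
    model_prefixes: Tuple[str, ...]
    base_url: Optional[str]
    auth_env: str


# Hypothetical registry; order matters because index 0 is the fallback.
PROVIDERS = (
    Provider("anthropic-oauth", ("claude-",), None, "CLAUDE_CODE_OAUTH_TOKEN"),
    Provider("minimax", ("minimax-",),
             "https://api.minimax.io/anthropic", "MINIMAX_API_KEY"),
)


def provider_for(model: str, explicit: Optional[str] = None) -> Provider:
    """Pick a provider by explicit slug, then by model prefix, else index 0."""
    if explicit is not None:
        for provider in PROVIDERS:
            if provider.name == explicit:
                return provider
    lowered = model.lower()
    for provider in PROVIDERS:
        if any(lowered.startswith(p) for p in provider.model_prefixes):
            return provider
    # No prefix matched: fall through to the first registry entry.
    return PROVIDERS[0]


def projected_env(provider: Provider, env: Mapping[str, str]) -> Dict[str, str]:
    """Build the ANTHROPIC_* overrides for providers with a custom base URL."""
    out: Dict[str, str] = {}
    if provider.base_url:
        out["ANTHROPIC_BASE_URL"] = provider.base_url
        key = env.get(provider.auth_env)
        if key:
            # Project the provider's API key onto the token the SDK reads.
            out["ANTHROPIC_AUTH_TOKEN"] = key
    return out
```

In this sketch `provider_for("minimax")` lands on `anthropic-oauth`, which is the pre-fix wedge, whereas `provider_for("MiniMax-M2.7-highspeed", "minimax")` resolves to `minimax` and produces the two `ANTHROPIC_*` values listed in the verification path.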
Fixes the 2026-05-08 dev-tree wedge: 22/27 non-lead workspaces (minimax
tier) stuck in degraded after /org/import, every chat hanging on
`Control request timeout: initialize`.

Root cause
----------
The persona env files (`~/.molecule-ai/personas/<name>/env`) declare a
TWO-variable convention:

- MODEL = model id ("MiniMax-M2.7-highspeed")
- MODEL_PROVIDER = provider slug ("minimax")

The runtime wheel's legacy `workspace/config.py` interprets
MODEL_PROVIDER as the *model id*, a name chosen long before there was a
separate MODEL env. With both set, the legacy code reads
MODEL_PROVIDER="minimax" into runtime_config.model. The literal string
"minimax" doesn't match any registry prefix (`minimax-` requires a
hyphen suffix), falls through to providers[0] (anthropic-oauth), the
auth check fails on the absent CLAUDE_CODE_OAUTH_TOKEN, the claude CLI
launches anyway, and the SDK's `query.initialize()` 60s control timeout
fires.

The brief hypothesised `claude_sdk_executor.py` lacked dispatch logic.
Phase 1 evidence: dispatch ALREADY exists in adapter.py. The
model -> provider -> base_url + auth_env routing was correctly built
for #180. The bug was upstream: MODEL_PROVIDER's name collision with
the persona-env convention silently corrupted the picked model BEFORE
adapter.py saw it.

Fix
---
New helper `_resolve_model_and_provider_from_env` reconciles env vars
against YAML inside adapter.setup() and create_executor():

1. MODEL env -> picked_model (authoritative when set).
2. MODEL_PROVIDER env -> explicit_provider IFF the value matches a
   registered provider name. Backward-compat: if MODEL is unset and
   MODEL_PROVIDER doesn't match a registered slug, treat it as a legacy
   model id (canvas Save+Restart pre-this-fix).
3. YAML runtime_config.{model,provider} fills any field env didn't
   supply.

Contained in the template repo per the brief's scope guidance: does NOT
touch the runtime wheel's workspace/config.py (which would need a
separate molecule-core PR), and does NOT change the persona-env
dispatch policy (Phase 2 mapping 2026-05-08).

Tests
-----
Eleven new cases in tests/test_env_model_provider_dispatch.py covering:

- persona-env shape (minimax, GLM, lead claude-code) -> correct model +
  slug
- legacy MODEL_PROVIDER-as-model-id shape still works
- env wins over YAML
- YAML fallback when env unset
- whitespace/empty defensive handling
- case-insensitive provider slug matching

Full adapter test suite: 76/76 pass.

Verification path
-----------------
After image rebuild + workspace re-provision, ws-* containers will boot
with provider=minimax (not anthropic-oauth), ANTHROPIC_BASE_URL set to
https://api.minimax.io/anthropic, MINIMAX_API_KEY projected onto
ANTHROPIC_AUTH_TOKEN, and the SDK init handshake succeeding.

Refs: task #181, brief 2026-05-08, related #180 (#7 in this repo)