molecule-ai-workspace-template-claude-code

molecule-ai/molecule-ai-workspace-template-claude-code

Author	SHA1	Message	Date
claude-ceo-assistant	8adc3576fd	fix(adapter): map persona-friendly slugs (claude-code, anthropic) to registry names Some checks failed CI / Adapter unit tests (push) Successful in 1m46s Details Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 50s Details CI / Adapter unit tests (pull_request) Successful in 2m16s Details CI / validate (pull_request) Successful in 6m18s Details CI / validate (push) Failing after 18m56s Details Phase 4 verification surfaced a follow-up edge case the initial fix missed: the persona env files use friendlier slugs than the registry's canonical names: * MODEL_PROVIDER=claude-code -> anthropic-oauth (Claude Code subscription) * MODEL_PROVIDER=anthropic -> anthropic-api (direct Anthropic API key) Without an alias map, a lead workspace's MODEL_PROVIDER=claude-code env fell through the slug-detection path; when the YAML didn't pin a provider, the model-prefix matcher saw MODEL=MiniMax-M2.7 and routed the lead to MiniMax — even though CLAUDE_CODE_OAUTH_TOKEN was clearly the intended auth path. Add _PROVIDER_SLUG_ALIASES with the two operator-facing slugs that don't match registry names verbatim. The alias map is consulted before the slug-vs-legacy detection, so claude-code now resolves to anthropic-oauth and the lead boots through OAuth as intended. Tests ----- + test_persona_env_lead_with_minimax_model_routes_via_oauth — lock in the alias-map behavior so a future contributor can't silently re-introduce the lead-mis-routed-to-MiniMax bug. + test_anthropic_alias_resolves_to_anthropic_api — covers the second alias path. Updated test_persona_env_lead_claude_code_resolves_correctly to assert the new (correct) behavior: provider == 'anthropic-oauth', not None. Full adapter suite: 78/78 pass.	2026-05-08 14:23:59 -07:00
claude-ceo-assistant	1742b60e62	fix(adapter): honor MODEL/MODEL_PROVIDER env (persona-env convention) Some checks failed CI / Adapter unit tests (push) Successful in 1m40s Details Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 19s Details CI / Adapter unit tests (pull_request) Failing after 52s Details CI / validate (push) Failing after 2m17s Details CI / validate (pull_request) Successful in 13m19s Details Fixes the 2026-05-08 dev-tree wedge: 22/27 non-lead workspaces (minimax tier) stuck in degraded after /org/import, every chat hanging on `Control request timeout: initialize`. Root cause ---------- The persona env files (`~/.molecule-ai/personas/<name>/env`) declare a TWO- variable convention: - MODEL = model id ("MiniMax-M2.7-highspeed") - MODEL_PROVIDER = provider slug ("minimax") The runtime wheel's legacy `workspace/config.py` interprets MODEL_PROVIDER as the model id — a name chosen long before there was a separate MODEL env. With both set, the legacy code reads MODEL_PROVIDER="minimax" into runtime_config.model. The literal string "minimax" doesn't match any registry prefix (`minimax-` requires a hyphen suffix), falls through to providers[0] (anthropic-oauth), the auth check fails on the absent CLAUDE_CODE_OAUTH_TOKEN, the claude CLI launches anyway, and the SDK's `query.initialize()` 60s control timeout fires. The brief hypothesised `claude_sdk_executor.py` lacked dispatch logic. Phase 1 evidence: dispatch ALREADY exists in adapter.py — model -> provider -> base_url + auth_env routing was correctly built for #180. The bug was upstream: MODEL_PROVIDER's name collision with the persona-env convention silently corrupted the picked model BEFORE adapter.py saw it. Fix --- New helper `_resolve_model_and_provider_from_env` reconciles env vars against YAML inside adapter.setup() and create_executor(): 1. MODEL env -> picked_model (authoritative when set). 2. MODEL_PROVIDER env -> explicit_provider IFF the value matches a registered provider name. Backward-compat: if MODEL is unset and MODEL_PROVIDER doesn't match a registered slug, treat it as a legacy model id (canvas Save+Restart pre-this-fix). 3. YAML runtime_config.{model,provider} fills any field env didn't supply. Contained in the template repo per the brief's scope guidance — does NOT touch the runtime wheel's workspace/config.py (which would need a separate molecule-core PR), and does NOT change the persona-env dispatch policy (Phase 2 mapping 2026-05-08). Tests ----- Eleven new cases in tests/test_env_model_provider_dispatch.py covering: - persona-env shape (minimax, GLM, lead claude-code) -> correct model + slug - legacy MODEL_PROVIDER-as-model-id shape still works - env wins over YAML - YAML fallback when env unset - whitespace/empty defensive handling - case-insensitive provider slug matching Full adapter test suite: 76/76 pass. Verification path ----------------- After image rebuild + workspace re-provision, ws-* containers will boot with provider=minimax (not anthropic-oauth), ANTHROPIC_BASE_URL set to https://api.minimax.io/anthropic, MINIMAX_API_KEY projected onto ANTHROPIC_AUTH_TOKEN, and the SDK init handshake succeeding. Refs: task #181, brief 2026-05-08, related #180 (#7 in this repo)	2026-05-08 14:11:42 -07:00
dev-lead	291f356dab	fix(adapter,tests): isolate _load_providers tests from multi-path lookup All checks were successful Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s Details CI / Adapter unit tests (push) Successful in 1m1s Details CI / Adapter unit tests (pull_request) Successful in 1m2s Details CI / validate (push) Successful in 3m23s Details CI / validate (pull_request) Successful in 3m22s Details The 5 _load_providers tests were single-path-only: they wrote a config.yaml to tmp_path and called _load_providers(str(tmp_path)), expecting the lookup to read tmp_path/config.yaml. After the multi-path fix in #7, _load_providers also checks: 1. _CANONICAL_ADAPTER_DIR/config.yaml (= /opt/adapter/config.yaml) 2. _TEMPLATE_DIR/config.yaml (= dirname(__file__)/config.yaml) 3. ${config_path}/config.yaml (the test's tmp_path) Path 2 finds the repo's bundled config.yaml on the test runner's disk before path 3 — the tests then see the bundled providers list instead of the test's expected behavior. Two surface changes: 1. adapter.py — extract `os.path.dirname(os.path.abspath(__file__))` into a module-level `_TEMPLATE_DIR` constant, mirroring `_CANONICAL_ADAPTER_DIR`. Production behavior identical (resolved once at import). Tests can monkeypatch the module attribute to redirect the path-2 lookup. 2. tests/test_adapter_prevalidate.py — 5 _load_providers tests monkeypatch `_CANONICAL_ADAPTER_DIR` and `_TEMPLATE_DIR` to a non-existent tmp subdir, isolating the test to the workspace config_path branch they always meant to test. The 6th _load_providers test (`test_load_providers_parses_yaml_and_normalizes`) already passed because path 2 returns 7 providers and that's what that test expects — left unchanged. Verification: pytest tests/ 65/65 PASS pytest tests/test_adapter_prevalidate.py -k load_providers 6/6 PASS Closes molecule-core#129 follow-up — the unit tests were the last red on the template repo's CI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 13:27:56 -07:00
dev-lead	b96a6d2569	fix(adapter): restore multi-path _load_providers (canonical + template + workspace) Some checks failed Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s Details CI / Adapter unit tests (pull_request) Failing after 59s Details CI / Adapter unit tests (push) Failing after 1m6s Details CI / validate (pull_request) Successful in 3m22s Details CI / validate (push) Successful in 3m21s Details The template's _load_providers had only ONE lookup path (${config_path}/config.yaml = /configs/config.yaml) — which is the per-workspace override, NOT the template's bundled provider registry. Every MiniMax/GLM/Kimi/DeepSeek model resolved to anthropic-oauth and crashed at first LLM call: None of CLAUDE_CODE_OAUTH_TOKEN set for model=MiniMax-M2.7-highspeed (provider=anthropic-oauth) — the adapter will fail on the first LLM call with AuthenticationError ... probed_cli_error='Not logged in · Please run /login' Canary chronic red 38h+ on 2026-05-07/08 traced to this. The fix that the May-4 image already had bundled — a 4-path lookup with canonical /opt/adapter/config.yaml + __file__-adjacent + workspace override + builtins fallback — was never on Gitea main, so post- suspension rebuilds dropped it. Restoring here. Resolution order: 1. /opt/adapter/config.yaml (canonical, provisioner-contracted) 2. dirname(__file__)/config.yaml (covers /app/config.yaml from Dockerfile #6 as well as dev/test imports) 3. ${config_path}/config.yaml (per-workspace override) 4. _BUILTIN_PROVIDERS (oauth + anthropic-api fallback) Verified locally: ps=_load_providers('/nonexistent') returns the 7 providers from /tmp/cctmpl/config.yaml via path 2 (the __file__-adjacent lookup). Without the fix, returns 2 (builtins). Closes molecule-core#129 failure mode #1 (the original "Agent error (Exception)" 38h chronic red). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 13:12:24 -07:00
claude-ceo-assistant (Claude Opus 4.7 on Hongming's MacBook)	a2c7bf3d3b	fix(adapter): honor explicit provider config — fail fast when not in registry (#180 ) Workspace operators set 'provider: minimax' in /configs/config.yaml expecting the adapter to route to MiniMax. Pre-fix behavior: adapter ignored 'provider:' entirely, _resolve_provider model-matched against _BUILTIN_PROVIDERS (anthropic-oauth + anthropic-api only), no model_prefix matched 'MiniMax-M2.7-highspeed', silent fallback to providers[0] (anthropic-oauth) — SDK kept using CLAUDE_CODE_OAUTH_TOKEN, hit OAuth quota under a name the operator never asked for. Fix: _resolve_provider now takes an explicit_provider arg. setup() reads it from runtime_config.provider OR top-level config.yaml provider:. Explicit name in registry → returned. Not in registry → ValueError with the two paths to fix (add provider entry, or switch runtime template). 10 new tests cover: explicit-in-registry returns match, case-insensitive, not-in-registry raises with actionable message, defense-in-depth against silent fallback regression, custom-registry lookup, empty/None treated as no-explicit (back-compat). Closes #180. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 10:58:51 -07:00
Hongming Wang	b9a1fa1b1f	feat: per-vendor env routing for third-party providers (task #244 ) Some checks failed CI / validate (push) Failing after 0s Details CI / Adapter unit tests (push) Failing after 6s Details Third-party Anthropic-compat providers (MiniMax, GLM, Kimi, DeepSeek) all reuse the Anthropic SDK's wire format, which means the claude CLI and claude-code-sdk read the bearer token from ANTHROPIC_AUTH_TOKEN no matter which vendor is being talked to. Pre-#244: * Canvas surfaced the vendor-specific name (MINIMAX_API_KEY, etc.) to the user — so a user who saved only MINIMAX_API_KEY hit a silent 401 on first call. * The boot audit said `MINIMAX_API_KEY=set`, making it look like an SDK bug rather than a routing gap. * A user with multiple vendor keys could only run one workspace at a time because they all fought over the shared ANTHROPIC_AUTH_TOKEN slot. Diagnostic-only audit logging shipped earlier (#32) but the actual routing was never written — task #244 was mismarked complete. Changes: * config.yaml: third-party model `required_env` now references the per-vendor name (MINIMAX_API_KEY, GLM_API_KEY, KIMI_API_KEY, DEEPSEEK_API_KEY) so canvas asks the user for the right key. First-party Anthropic models still use ANTHROPIC_AUTH_TOKEN / CLAUDE_CODE_OAUTH_TOKEN. * config.yaml: each third-party provider's `auth_env` lists the vendor name FIRST (priority order) so projection picks the vendor key over a stale ANTHROPIC_AUTH_TOKEN. * adapter.py: new `_project_vendor_auth(provider)` helper, called from `setup()` right after `_resolve_provider`. Idempotent — only projects when ANTHROPIC_AUTH_TOKEN is unset (operator override always wins). Logs the projection by NAME, never by VALUE (mirrors `_audit_auth_env_presence`). * tests/test_provider_routing.py: 6 new tests pin the contract — vendor-key-set projects, AUTH_TOKEN-already-set is never clobbered, first-party providers skip projection, secret value never leaks into a log record, empty-string vendor env doesn't trigger projection, and the same routing fires for GLM / Kimi / DeepSeek. Mirrors the parallel hermes-side fix from task #249 / hermes PR #38; keeps the two runtimes' multi-vendor UX in lockstep. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 22:20:03 -07:00
Hongming Wang	78ae139609	feat(adapter,entrypoint): boot env audit + crash-loop diagnosis logging Adds two operator-visible boot diagnostics that close the diagnosis gap exposed by the 2026-05-02 MiniMax E2E crash-loop. The universal canvas-picked-model fix (Bug B) and per-model required_env (Bug D) live in molecule-core PR #2538 — this PR adds the per-template visibility that complements them so operators can answer "is the key missing or is routing wrong?" from `docker logs` alone. Changes ------- adapter.py: - _AUTH_ENV_AUDIT tuple of 8 vendor env names (CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_API_KEY/AUTH_TOKEN/BASE_URL, MINIMAX/GLM/KIMI/DEEPSEEK_API_KEY). - _audit_auth_env_presence() helper — single INFO line of NAME=set/unset pairs. NEVER logs values; the test pins this with a "fake-secret-MUST- NOT-LEAK" sentinel that must never appear in the log message. - One call site at the end of setup()'s boot banner so every workspace start emits both "which provider got picked" and "which envs are present" in adjacent log lines. entrypoint.sh: - log_boot_context() function fired once before the gosu drop (as root) and once after (as agent) so an operator can spot env values lost across the privilege drop. Emits uid/gid/user/hostname/workspace_id/ platform_url/configs_dir/workspace_dir + the same 8 env names as NAME=set/unset. Mirror of _AUTH_ENV_AUDIT — list pinned in sync by a new AST-style test (test_audit_env_list_matches_entrypoint_sh) that parses entrypoint.sh and asserts set-equality with adapter.py's tuple. tests/test_adapter_logging.py (new): - 4 tests covering the audit contract: every name appears, all-unset scenario, empty-string treated as unset (matches routing semantics), and the cross-file sync gate against entrypoint.sh's for-loop. - Stubs molecule_runtime + a2a so the helpers can be imported without the real wheel installed in CI (mirrors test_adapter_prevalidate.py's scaffolding pattern). Why this complements molecule-core PR #2538 ------------------------------------------- - PR #2538 makes Bug B (canvas-picked model silently dropped) impossible by resolving model centrally in workspace/config.py:load_config — every adapter (claude-code, hermes, codex, future ones) gets the passthrough for free. - PR #2538 makes Bug D (preflight rejects valid auth for non-default models) impossible by REPLACE-not-union per-entry required_env. - This template PR is the per-template observability layer: when one of those universal fixes regresses (or when an operator misconfigs a vendor key), the boot logs say exactly which env was present at each tier. Validated end-to-end on workspace be27badd-00a7-4cef-91e8-af428175c76f (clean boot, MINIMAX_API_KEY=set audited, no crash-loop). Closes part of molecule-monorepo task #248. Sibling of #2538 for molecule-core. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 21:41:05 -07:00
Hongming Wang	a78626ced4	fix(adapter): keep setup() routing — strip prefix only at CLI invocation Live-test revealed a regression in PR #24's setup() strip: the wheel- default `anthropic:claude-opus-4-7` paired with an OAuth workspace (CLAUDE_CODE_OAUTH_TOKEN set, no ANTHROPIC_API_KEY) is the realistic production shape. Stripping in setup() routes those users into the `anthropic-api` provider entry, after which the CLI hangs at `initialize` because no API key env is set. Caught on workspace dd40faf8 on 2026-05-01 — banner went `provider=anthropic-api` and A2A wedged on Control request timeout. Pre-fix routing (let prefixed strings fall through to providers[0] = anthropic-oauth) is actually correct for this combo. The strip is only needed at the CLI invocation site (create_executor) where claude's `--model` arg must be a bare id. Tests: replace `test_setup_strip_routes_prefixed_anthropic_to_anthropic_api` with `test_setup_keeps_prefix_routing_oauth_for_anthropic_prefix`, which pins the inverse — prefixed model + OAuth env stays on oauth and emits no API-key warning. The 5 unit cases on `_strip_provider_prefix` plus the `create_executor` strip pins remain unchanged. 36/36 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 16:46:54 -07:00
Hongming Wang	6ba4cc6a01	fix(adapter): strip LangChain-style provider prefix before CLI invocation The molecule-runtime wheel's config.py defaults model to `anthropic:claude-opus-4-7` so langchain/crewai consumers get a uniform provider:model string out of the box. The claude CLI's --model arg expects the bare model id and silently exits 1 (no stderr) on prefixed strings — root cause of the 2026-05-01 "Agent error (Exception)" mid-A2A bug. Diagnosed via strace on a live workspace: the CLI received `--model anthropic:claude-opus-4-7` and exit_group(1)'d before any non-fatal output. Add `_strip_provider_prefix` and call it in both setup() (so _resolve_provider routes anthropic:claude-X correctly to anthropic-api instead of falling back to oauth) and create_executor() (so the bare id reaches the CLI). Only known-Claude prefixes are stripped; unknown ones (openai:, bedrock:) pass through so the CLI fails loudly instead of being silently mangled. Coverage: 8 new tests — unit tests for the helper across all branches, end-to-end `create_executor` strip on dict + dataclass shapes, and a caplog-based setup() test that pins provider=anthropic-api routing after the strip (the silent-fallback failure mode this fix eliminates). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 16:06:53 -07:00
Hongming Wang	25e86963f3	fix(adapter): drop dead _normalize_provider({}) fallback in _resolve_provider The empty-providers fallback in `_resolve_provider` was load-bearing when `_load_providers` could return an empty tuple, but after PR #22's per-entry hardening every return path yields a non-empty registry (builtins on parse failure, the parsed list otherwise). The leftover `_normalize_provider({})` branch became dead and outright broken: with the stricter `_normalize_provider` rejecting nameless entries, the fallback now returns None and would crash setup() on `provider["auth_mode"]` the moment anything called `_resolve_provider` with an empty tuple. Replace the dead branch with an explicit ValueError + pre-condition docstring. Defensive — no production caller can hit this — but turns a future silent NoneType crash into an actionable error. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 00:05:16 -07:00
Hongming Wang	2b9b4306eb	fix(adapter): per-entry isolation in _load_providers + tighten _normalize_provider Two correctness issues spotted in self-review of `c6f4912`: 1. String-as-prefix typo split into character tuple. ``model_prefixes: mimo-`` (operator forgot brackets) used to iterate over characters → ``('m','i','m','o','-')``, silently routing every model id starting with 'm', 'i', or '-' through the entry. Now: non-list values coerce to empty tuple (entry survives but matches nothing — operator notices in boot banner, not via misrouted requests). 2. Single bad provider entry nuked the whole registry. _load_providers built the registry via a generator inside tuple(...). One AttributeError mid-comprehension (e.g. ``[mimo-, 123]`` — int's missing .lower()) propagated out, broad except caught it, registry silently fell back to _BUILTIN_PROVIDERS (oauth + anthropic-api only). Every third-party model would then route to anthropic-oauth — exactly the silent-fallback failure mode this PR was meant to eliminate. Now: per-entry try/except drops the bad entry with a warning, rest survives. Also: entries without a string ``name`` field are now dropped with a warning instead of silently using the placeholder ``<unnamed>`` — operator typos surface in boot logs. Tests: 28 passing (3 new regression tests covering both issues plus the no-name path). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 23:58:24 -07:00
Hongming Wang	c6f4912d09	feat(adapter): data-driven provider registry in config.yaml Move the model→endpoint→auth-env mapping out of hardcoded constants in adapter.py + entrypoint.sh into a single `providers:` list at the top of config.yaml. The adapter loads it at boot via _load_providers; canvas Config tab will read the same YAML for its Provider dropdown so UI and adapter never disagree on what's available. Adding a new provider becomes a one-line YAML edit — no Python or shell changes. Includes 5 third-party providers ready out of the box (Anthropic-compat endpoints, Bearer-style ANTHROPIC_AUTH_TOKEN OR ANTHROPIC_API_KEY auth): xiaomi-mimo https://api.xiaomimimo.com/anthropic minimax https://api.minimax.io/anthropic zai https://api.z.ai/api/anthropic (NEW) moonshot https://api.moonshot.ai/anthropic (NEW) deepseek https://api.deepseek.com/anthropic (NEW) Plus 7 new model entries in runtime_config.models (mimo-v2.5, MiniMax-M2, MiniMax-M2.7, GLM-4.6, GLM-4.5, kimi-k2.5, kimi-k2, deepseek-v4-pro, deepseek-v4-flash) so they show up in the Canvas Config dropdown. Operator override unchanged: ANTHROPIC_BASE_URL set as a workspace secret still wins over the registry default — the escape hatch for regional endpoints (Xiaomi token-plan-sgp, MiniMax api.minimaxi.com). entrypoint.sh: drops the `mimo-*` case mapping (adapter handles routing now). _BUILTIN_PROVIDERS retained as malformed-YAML fallback so a bare-bones workspace still boots with oauth + anthropic-api defaults. Tests: 25 passing. New coverage: - YAML parses + normalizes to expected shape - Malformed YAML falls back to builtins (warning, not raise) - Each new provider routes its model id to the right base_url - ANTHROPIC_AUTH_TOKEN alone satisfies third-party auth check - Operator-set ANTHROPIC_BASE_URL overrides registry default - Case-insensitive prefix match (MiniMax-M2 / minimax-m2.7 / GLM-4.6) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 23:29:40 -07:00
Hongming Wang	c646b8cebe	feat(adapter): raise on third-party model without ANTHROPIC_BASE_URL Aligns setup()'s third-party-model-without-URL handling with create_executor()'s pre-validate (#19) — both unrecoverable misconfigurations now raise ValueError at boot instead of one warning and one raising. Why: a third-party (mimo-) model selected without ANTHROPIC_BASE_URL sends every LLM request to api.anthropic.com with a non-Anthropic key, 401-ing every prompt. Workspace boots, looks "online" via heartbeat, but is structurally broken on the user-facing path. The previous warning-only path produced the same end-user symptom as the 2026-04-30 incident (workspace looks alive, every interaction fails) just via a different misconfig shape. Symmetry: create_executor raises when ANTHROPIC_BASE_URL is set to a non-Anthropic host but no model is picked. setup() now raises when a third-party model is picked but no URL is set. Together they catch both halves of the misconfig surface at boot, before the workspace enters "online" status. Adds 4 setup() tests: - raises on third-party + no URL - passes on third-party + URL - passes on OAuth alias (sonnet) + no URL - passes on Anthropic API id (claude-) + no URL Stubs molecule_runtime.plugins.load_plugins as a no-op so the pass-path tests run cleanly without the runtime installed. Test count: 11 (7 create_executor + 4 setup). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 22:50:25 -07:00
Hongming Wang	a4d83cb356	chore(adapter): drop redundant urlparse imports + dead ternary Self-review follow-up to #19. Two cosmetic cleanups: - urlparse is now imported at module-top (added in #17 alongside the auth-mode classification) so the two inline `from urllib.parse import urlparse` statements inside conditional branches are redundant. - The log-format ternary " (custom upstream)" if base_url else "" lives inside `if base_url:` — base_url is unconditionally truthy there, so the else branch was dead code. No behavior change. Tests still 7/7 green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 22:45:04 -07:00
Hongming Wang	61f935674f	Merge branch 'main' into feat/adapter-prevalidate	2026-04-30 22:38:53 -07:00
Hongming Wang	0d95b5098a	feat(adapter): pre-validate ANTHROPIC_BASE_URL + missing model combo The 2026-04-30 staging incident traced back to workspaces booting with ANTHROPIC_BASE_URL pointing at a non-Anthropic shim (MiniMax / OpenAI gateway) but no explicit model configured. The adapter silently fell back to "sonnet" — an Anthropic-native alias the upstream didn't recognize — and the SDK --print probe hung 30s before timing out. Platform's phantom-busy sweep then nuked the workspace at 10min, producing "every workspace dead" with the root cause buried in a 30s subprocess hang. Pre-validate the combo at adapter boot: when ANTHROPIC_BASE_URL host is non-Anthropic AND no explicit model is set, raise ValueError with an actionable message pointing to MODEL_PROVIDER / runtime_config.model. Also log the resolved model + base_url_host every boot so future failures explain themselves in the workspace logs without digging into the SDK subprocess. Tests live under tests/ with their own pytest.ini that anchors rootdir there — keeps pytest from importing the package __init__.py (which does the runtime-discovery relative import that requires molecule_runtime installed). 7 tests cover: misconfig raises with the right message, Anthropic-native passes, no-base-url passes, custom-url + explicit model passes, dataclass + dict shapes, unparseable URL no-crash. CI runs them on every push/PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 22:35:49 -07:00
Hongming Wang	824bc4a176	adapter: warn for the right env var per auth mode + log boot banner The pre-multi-provider warning hardcoded CLAUDE_CODE_OAUTH_TOKEN — it fired even when an operator legitimately picked claude-sonnet-4-6 (API key) or mimo-v2-flash (third-party) and set ANTHROPIC_API_KEY instead. Misleading. Now classifies the picked model into oauth / anthropic_api / third_party_anthropic_compat and warns about the env var that auth path actually needs. Adds a single-line boot banner so workspace logs surface which provider was selected and (for third-party) which base-URL host took effect — host-only, never full URL. Adds an additional warning when a third-party model is selected but ANTHROPIC_BASE_URL is unset, since the symptom otherwise is silent fall-through to api.anthropic.com with a third-party key (401). Functional tests against 14 model-id cases (oauth aliases, claude-* versioned, all 4 mimo-* variants, case-insensitivity, empty/None, unknown id fallback) all pass — see commit's pre-push validation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 03:15:21 -07:00
Molecule AI Plugin-Dev	c930626f82	fix(adapter): warn at startup if CLAUDE_CODE_OAUTH_TOKEN is absent (KI-001) adapter.py:setup() now emits a logger.warning() if CLAUDE_CODE_OAUTH_TOKEN is absent, so operators see the problem immediately instead of getting a silent AuthenticationError on the first LLM call. known-issues.md updated to mark KI-001 as resolved.	2026-04-29 01:57:16 -07:00
Hongming Wang	e7dea39df2	fix: qualify all bare imports of runtime modules Five `from <runtime_module> import` statements in adapter.py + claude_sdk_executor.py were never qualified when the template was extracted to its own repo (#87). They worked when the runtime was bundled into workspace/ where bare imports resolved against sibling files; in the template repo they explode at startup with ModuleNotFoundError as soon as Python reaches the import. Caught by manual provision after pipeline-3 wire-real E2E. The plugins import was the first one tripped because it sits in adapter.setup() — earlier bare imports inside claude_sdk_executor.py are deferred until the executor is constructed. Pattern: any `from <X> import Y` where X is a workspace/ module -> `from molecule_runtime.X import Y`. Fixes: - adapter.py:97 plugins - claude_sdk_executor.py executor_helpers, heartbeat, a2a_client, platform_auth Same class of bug as the runtime's TOP_LEVEL_MODULES drift but inverted — instead of forgetting to rewrite imports IN the wheel, the template authors forgot to qualify imports IN the template code (the build script's rewriter only runs on workspace/ -> wheel). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 05:20:24 -07:00
Hongming Wang	280e89c50b	fix: export Adapter alias so runtime adapter discovery works `workspace/adapters/__init__.py:get_adapter()` does `getattr(mod, "Adapter")` after importing ADAPTER_MODULE. Without the alias the runtime's preflight check fails with: [FAIL] Runtime: ADAPTER_MODULE='adapter' imported, but no `Adapter` class is exported. Add `Adapter = YourAdapterClass` at module scope Symptom: workspace container restarts forever, never reaches `online`. This contract was added (or hardened) in #123's adapter-discovery refactor. Hermes's adapter.py already has `Adapter = HermesAgentAdapter` at module scope; claude-code missed the migration. gemini-cli template has the same bug — file separately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 04:55:30 -07:00
Hongming Wang	c9ca671cd5	feat(adapter): declare provides_native_session + idle_timeout_override Wires this template into the platform's capability-primitive layer (molecule-core task #117). Two declarations: 1. RuntimeCapabilities(provides_native_session=True) — the claude-agent-sdk maintains a long-lived streaming session with its own client state. The platform's a2a_queue would double-buffer that in-flight state if it didn't know the SDK owned it. Once primitive #5 lands in molecule-core, the platform's enqueue path will skip workspaces declaring this and dispatch directly. 2. idle_timeout_override() returning 900 (15 min) — Opus + multi-step tool use legitimately runs 8-10 min between broadcaster events. The pre-capability bug (molecule-core PR #2128) hit this: the platform's 5min idle timer cancelled mid-flight during long packaging steps. The override moves the per-workspace ceiling up without leaving genuinely-wedged runs hanging too long. Consumed by molecule-core PR #2139 in a2a_proxy.dispatchA2A. Other capability flags stay False — see inline docstring for the per-flag rationale (notably native_status_mgmt is partially adapter- driven via runtime_state="wedged" but the recovery path stays platform- owned, so we don't claim it yet). Requires molecule-ai-workspace-runtime with RuntimeCapabilities (PR #2137 in molecule-core, merged 2026-04-27). The current requirements.txt pin (>=0.1.0) will pick up the latest released version on next image rebuild. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:07:28 -07:00
Hongming Wang	7f9b2b4189	feat: add adapter code + Dockerfile for standalone deployment Adapters extracted from molecule-monorepo/workspace-template. Uses molecule-ai-workspace-runtime PyPI package for shared infrastructure. - adapter.py — runtime-specific adapter class - requirements.txt — runtime-specific deps + molecule-ai-workspace-runtime - Dockerfile — FROM python:3.11-slim, pip install, COPY adapter, molecule-runtime entrypoint - ADAPTER_MODULE=adapter tells the runtime to load this repo's Adapter class Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 04:27:22 -07:00

22 Commits