Commit Graph

7 Commits

Author SHA1 Message Date
Hongming Wang
a78626ced4 fix(adapter): keep setup() routing — strip prefix only at CLI invocation
Live-test revealed a regression in PR #24's setup() strip: the wheel-
default `anthropic:claude-opus-4-7` paired with an OAuth workspace
(CLAUDE_CODE_OAUTH_TOKEN set, no ANTHROPIC_API_KEY) is the realistic
production shape. Stripping in setup() routes those users into the
`anthropic-api` provider entry, after which the CLI hangs at
`initialize` because no API key env is set. Caught on workspace
dd40faf8 on 2026-05-01 — banner went `provider=anthropic-api` and
A2A wedged on Control request timeout.

Pre-fix routing (let prefixed strings fall through to providers[0] =
anthropic-oauth) is actually correct for this combo. The strip is only
needed at the CLI invocation site (create_executor) where claude's
`--model` arg must be a bare id.

Tests: replace `test_setup_strip_routes_prefixed_anthropic_to_anthropic_api`
with `test_setup_keeps_prefix_routing_oauth_for_anthropic_prefix`,
which pins the inverse — prefixed model + OAuth env stays on oauth and
emits no API-key warning. The 5 unit cases on `_strip_provider_prefix`
plus the `create_executor` strip pins remain unchanged. 36/36 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:46:54 -07:00
Hongming Wang
e14f33a670 feat(executor): forward --dangerously-load-development-channels to claude CLI
The wheel-side push UX gates (capability + instructions, molecule-core
PR #2463) only matter if the host claude CLI is willing to register a
non-allowlisted experimental channel. During the channels research
preview the CLI requires --dangerously-load-development-channels to
bypass its allowlist; without it, every notifications/claude/channel
fired by the inbox bridge arrives at the host and is silently dropped.

claude-agent-sdk forwards arbitrary CLI flags to the spawned subprocess
via ClaudeAgentOptions.extra_args (claude_agent_sdk/_internal/transport/
subprocess_cli.py:340). Wire the flag in unconditionally — the flag is
harmless on builds that already allowlist the capability and required
on builds during the research preview, so there is no version skew to
guard. Remove the line once channels graduate to the default allowlist.

Test pins the wiring with a stubbed ClaudeAgentOptions recorder; runs
in CI without claude_agent_sdk / a2a / molecule_runtime installed via
the same _ensure_module/_ensure_attr pattern as the existing adapter
prevalidate test, but tolerates real packages being present locally.

Verified by injection: removing the extra_args line makes the test
fail with a message naming the missing flag and citing the SDK file
that consumes it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:26:58 -07:00
Hongming Wang
6ba4cc6a01 fix(adapter): strip LangChain-style provider prefix before CLI invocation
The molecule-runtime wheel's config.py defaults model to
`anthropic:claude-opus-4-7` so langchain/crewai consumers get a uniform
provider:model string out of the box. The claude CLI's --model arg
expects the bare model id and silently exits 1 (no stderr) on prefixed
strings — root cause of the 2026-05-01 "Agent error (Exception)" mid-A2A
bug. Diagnosed via strace on a live workspace: the CLI received
`--model anthropic:claude-opus-4-7` and exit_group(1)'d before any
non-fatal output.

Add `_strip_provider_prefix` and call it in both setup() (so
_resolve_provider routes anthropic:claude-X correctly to anthropic-api
instead of falling back to oauth) and create_executor() (so the bare
id reaches the CLI). Only known-Claude prefixes are stripped; unknown
ones (openai:, bedrock:) pass through so the CLI fails loudly instead
of being silently mangled.

Coverage: 8 new tests — unit tests for the helper across all branches,
end-to-end `create_executor` strip on dict + dataclass shapes, and a
caplog-based setup() test that pins provider=anthropic-api routing
after the strip (the silent-fallback failure mode this fix eliminates).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:06:53 -07:00
Hongming Wang
2b9b4306eb fix(adapter): per-entry isolation in _load_providers + tighten _normalize_provider
Two correctness issues spotted in self-review of c6f4912:

1. String-as-prefix typo split into character tuple. ``model_prefixes:
   mimo-`` (operator forgot brackets) used to iterate over characters
   → ``('m','i','m','o','-')``, silently routing every model id starting
   with 'm', 'i', or '-' through the entry. Now: non-list values coerce
   to empty tuple (entry survives but matches nothing — operator notices
   in boot banner, not via misrouted requests).

2. Single bad provider entry nuked the whole registry. _load_providers
   built the registry via a generator inside tuple(...). One AttributeError
   mid-comprehension (e.g. ``[mimo-, 123]`` — int's missing .lower())
   propagated out, broad except caught it, registry silently fell back
   to _BUILTIN_PROVIDERS (oauth + anthropic-api only). Every third-party
   model would then route to anthropic-oauth — exactly the silent-fallback
   failure mode this PR was meant to eliminate. Now: per-entry try/except
   drops the bad entry with a warning, rest survives.

Also: entries without a string ``name`` field are now dropped with a
warning instead of silently using the placeholder ``<unnamed>`` —
operator typos surface in boot logs.

Tests: 28 passing (3 new regression tests covering both issues plus
the no-name path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 23:58:24 -07:00
Hongming Wang
c6f4912d09 feat(adapter): data-driven provider registry in config.yaml
Move the model→endpoint→auth-env mapping out of hardcoded constants
in adapter.py + entrypoint.sh into a single `providers:` list at the
top of config.yaml. The adapter loads it at boot via _load_providers;
canvas Config tab will read the same YAML for its Provider dropdown so
UI and adapter never disagree on what's available. Adding a new
provider becomes a one-line YAML edit — no Python or shell changes.

Includes 5 third-party providers ready out of the box (Anthropic-compat
endpoints, Bearer-style ANTHROPIC_AUTH_TOKEN OR ANTHROPIC_API_KEY auth):

  xiaomi-mimo  https://api.xiaomimimo.com/anthropic
  minimax      https://api.minimax.io/anthropic
  zai          https://api.z.ai/api/anthropic           (NEW)
  moonshot     https://api.moonshot.ai/anthropic        (NEW)
  deepseek     https://api.deepseek.com/anthropic       (NEW)

Plus 7 new model entries in runtime_config.models (mimo-v2.5, MiniMax-M2,
MiniMax-M2.7, GLM-4.6, GLM-4.5, kimi-k2.5, kimi-k2, deepseek-v4-pro,
deepseek-v4-flash) so they show up in the Canvas Config dropdown.

Operator override unchanged: ANTHROPIC_BASE_URL set as a workspace
secret still wins over the registry default — the escape hatch for
regional endpoints (Xiaomi token-plan-sgp, MiniMax api.minimaxi.com).

entrypoint.sh: drops the `mimo-*` case mapping (adapter handles routing
now). _BUILTIN_PROVIDERS retained as malformed-YAML fallback so a
bare-bones workspace still boots with oauth + anthropic-api defaults.

Tests: 25 passing. New coverage:
  - YAML parses + normalizes to expected shape
  - Malformed YAML falls back to builtins (warning, not raise)
  - Each new provider routes its model id to the right base_url
  - ANTHROPIC_AUTH_TOKEN alone satisfies third-party auth check
  - Operator-set ANTHROPIC_BASE_URL overrides registry default
  - Case-insensitive prefix match (MiniMax-M2 / minimax-m2.7 / GLM-4.6)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 23:29:40 -07:00
Hongming Wang
c646b8cebe feat(adapter): raise on third-party model without ANTHROPIC_BASE_URL
Aligns setup()'s third-party-model-without-URL handling with
create_executor()'s pre-validate (#19) — both unrecoverable
misconfigurations now raise ValueError at boot instead of one warning
and one raising.

Why: a third-party (mimo-*) model selected without ANTHROPIC_BASE_URL
sends every LLM request to api.anthropic.com with a non-Anthropic key,
401-ing every prompt. Workspace boots, looks "online" via heartbeat,
but is structurally broken on the user-facing path. The previous
warning-only path produced the same end-user symptom as the
2026-04-30 incident (workspace looks alive, every interaction fails)
just via a different misconfig shape.

Symmetry: create_executor raises when ANTHROPIC_BASE_URL is set to a
non-Anthropic host but no model is picked. setup() now raises when a
third-party model is picked but no URL is set. Together they catch
both halves of the misconfig surface at boot, before the workspace
enters "online" status.

Adds 4 setup() tests:
- raises on third-party + no URL
- passes on third-party + URL
- passes on OAuth alias (sonnet) + no URL
- passes on Anthropic API id (claude-*) + no URL

Stubs molecule_runtime.plugins.load_plugins as a no-op so the pass-path
tests run cleanly without the runtime installed. Test count: 11 (7
create_executor + 4 setup).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 22:50:25 -07:00
Hongming Wang
0d95b5098a feat(adapter): pre-validate ANTHROPIC_BASE_URL + missing model combo
The 2026-04-30 staging incident traced back to workspaces booting with
ANTHROPIC_BASE_URL pointing at a non-Anthropic shim (MiniMax / OpenAI
gateway) but no explicit model configured. The adapter silently fell
back to "sonnet" — an Anthropic-native alias the upstream didn't
recognize — and the SDK --print probe hung 30s before timing out.
Platform's phantom-busy sweep then nuked the workspace at 10min,
producing "every workspace dead" with the root cause buried in a
30s subprocess hang.

Pre-validate the combo at adapter boot: when ANTHROPIC_BASE_URL host
is non-Anthropic AND no explicit model is set, raise ValueError with
an actionable message pointing to MODEL_PROVIDER / runtime_config.model.
Also log the resolved model + base_url_host every boot so future
failures explain themselves in the workspace logs without digging
into the SDK subprocess.

Tests live under tests/ with their own pytest.ini that anchors rootdir
there — keeps pytest from importing the package __init__.py (which
does the runtime-discovery relative import that requires
molecule_runtime installed). 7 tests cover: misconfig raises with the
right message, Anthropic-native passes, no-base-url passes, custom-url
+ explicit model passes, dataclass + dict shapes, unparseable URL
no-crash. CI runs them on every push/PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 22:35:49 -07:00