molecule-ai-workspace-runtime

molecule-ai/molecule-ai-workspace-runtime

Commit Graph

Author

SHA1

Message

Date

rabbitblood

4bafea58ae

fix(llm_auth): tighten base-URL hostname match + strip whitespace + no token in logs

Self-review findings on #38:

1. **Token substring leak**: the "unknown prefix" warning included the
   first 12 chars of the token in the log message. Logs get shipped to
   Langfuse / CloudWatch / slack-firehose — 12 bytes of a secret in a
   log is still 12 bytes too many. Warning no longer references the
   token value at all.

2. **Base-URL substring match was too loose**: `"anthropic.com" not in
   base` would accept `https://proxy.anthropic.com.evil.example/` as
   "looks like Anthropic, keep the URL." Replaced with an allowlist of
   exact hostnames parsed via urllib.parse.urlparse.

3. **Whitespace in pasted tokens**: operators frequently paste tokens
   from terminals with a trailing newline. The token would flow through
   startswith() detection but then fail downstream auth with a
   confusing "malformed token" error. Strip and persist the cleaned
   value.

4. **Malformed base URL crash guard**: if someone sets ANTHROPIC_BASE_URL
   to something urlparse can't handle, don't crash — fall through to
   clearing it, which is the safe choice in OAuth mode.
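
Items 2 and 4 above can be sketched roughly as follows. This is an illustration only, not the code in llm_auth.py: the helper name `is_anthropic_base_url` and the contents of `ALLOWED_ANTHROPIC_HOSTS` are assumptions, since the commit message does not list the allowlisted hostnames.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; the real set of hostnames is not given in the commit.
ALLOWED_ANTHROPIC_HOSTS = frozenset({"api.anthropic.com"})


def is_anthropic_base_url(base_url: str) -> bool:
    """Return True only when the parsed hostname is exactly on the allowlist."""
    try:
        host = urlparse(base_url.strip()).hostname
    except ValueError:
        # urlparse raises ValueError on some malformed inputs,
        # for example an unterminated IPv6 literal such as "http://[::1".
        return False
    # hostname is None when the string has no network location at all.
    return host is not None and host in ALLOWED_ANTHROPIC_HOSTS


# An exact match is accepted; a look-alike host that merely contains
# "anthropic.com" is rejected, unlike with a substring check.
print(is_anthropic_base_url("https://api.anthropic.com/v1"))
print(is_anthropic_base_url("https://proxy.anthropic.com.evil.example/"))
```

A caller in OAuth mode could treat a `False` result as "clear ANTHROPIC_BASE_URL", which matches the fall-through behaviour described in item 4.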

Added 5 new tests covering each of the above. 16/16 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-23 10:46:07 -07:00

rabbitblood

0a0f11b41f

feat(runtime): auto-detect LLM token type, normalise env on boot

Platform stores per-workspace LLM credentials under a single key
(ANTHROPIC_AUTH_TOKEN in workspace_secrets). But downstream tools
expect different env var names depending on the token type:

  sk-ant-oat01-*  → CLAUDE_CODE_OAUTH_TOKEN  (Claude Code OAuth session)
  sk-ant-api03-*  → ANTHROPIC_API_KEY        (direct Anthropic API)
  sk-cp-*         → ANTHROPIC_AUTH_TOKEN     (proxy: MiniMax, gateways)

Without normalisation, an OAuth token under ANTHROPIC_AUTH_TOKEN gets
sent as a bearer to api.anthropic.com, which responds:

    401 authentication_error: OAuth authentication is currently not
    supported.

This was a platform-wide footgun: anyone rotating LLM keys had to
know the exact env var for each token type, AND make sure stale
overrides were cleared, AND set ANTHROPIC_BASE_URL correctly for
proxies (or NOT set for native Claude). Nothing downstream could
help — the SDK just saw the wrong var.
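
For illustration, the prefix table above can be expressed as a lookup. The prefixes and env var names are taken from this message; the function name `target_env_var` is hypothetical and is not claimed to exist in the runtime.

```python
from typing import Optional

# Token prefix -> env var that downstream tools expect, as listed above.
PREFIX_TO_ENV_VAR = {
    "sk-ant-oat01-": "CLAUDE_CODE_OAUTH_TOKEN",  # Claude Code OAuth session
    "sk-ant-api03-": "ANTHROPIC_API_KEY",        # direct Anthropic API
    "sk-cp-": "ANTHROPIC_AUTH_TOKEN",            # proxy: MiniMax, gateways
}


def target_env_var(token: str) -> Optional[str]:
    """Return the env var a token belongs under, or None for an unknown prefix."""
    cleaned = token.strip()
    for prefix, env_var in PREFIX_TO_ENV_VAR.items():
        if cleaned.startswith(prefix):
            return env_var
    return None
```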

Fix:

- New molecule_runtime/llm_auth.py — normalise_llm_env() mutates
  os.environ (or any dict) to the correct shape based on token
  prefix. Returns a NormalisationResult for logging.
- main.py calls it as step 0, before any adapter/executor import.
  Every adapter (claude-code, langgraph, crewai, autogen, hermes,
  …) benefits automatically — no per-adapter branching needed.
- 11 unit tests covering all prefix paths, edge cases, and the
  "operator deliberately set CLAUDE_CODE_OAUTH_TOKEN" precedence
  rule.
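
A minimal sketch of the normalisation step, under stated assumptions. The names `normalise_env_sketch` and `SketchResult` are hypothetical stand-ins; the real signature of normalise_llm_env() and the fields of NormalisationResult are not shown in this message. The precedence handling (an already-set target variable is left untouched) is likewise an assumption about what the rule means.

```python
from dataclasses import dataclass
from typing import MutableMapping, Optional

SOURCE_VAR = "ANTHROPIC_AUTH_TOKEN"
PREFIX_TO_ENV_VAR = {
    "sk-ant-oat01-": "CLAUDE_CODE_OAUTH_TOKEN",
    "sk-ant-api03-": "ANTHROPIC_API_KEY",
    "sk-cp-": "ANTHROPIC_AUTH_TOKEN",
}


@dataclass
class SketchResult:
    target: Optional[str]  # env var the token belongs under, None if unknown
    changed: bool          # whether the mapping was mutated


def normalise_env_sketch(env: MutableMapping[str, str]) -> SketchResult:
    """Move the token stored under SOURCE_VAR to the env var its prefix implies."""
    raw = env.get(SOURCE_VAR, "")
    token = raw.strip()
    target = next(
        (var for prefix, var in PREFIX_TO_ENV_VAR.items() if token.startswith(prefix)),
        None,
    )
    if target is None:
        # Missing token or unknown prefix: leave the mapping alone.
        return SketchResult(None, False)
    if target == SOURCE_VAR:
        # Proxy token already sits in the right slot; persist the stripped value.
        env[SOURCE_VAR] = token
        return SketchResult(target, raw != token)
    if env.get(target):
        # Assumed precedence rule: a value the operator set directly wins.
        return SketchResult(target, False)
    env[target] = token
    del env[SOURCE_VAR]
    return SketchResult(target, True)
```

Because the function accepts any mutable mapping, it can be exercised in tests with a plain dict, while a boot-time caller could pass `os.environ`.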

Operationally: this means operators can keep using one
ANTHROPIC_AUTH_TOKEN slot in platform settings and just paste
whatever token the agent needs. No env-var-name awareness required.

Tested locally: 11/11 new tests pass. 83 other tests unchanged
(pre-existing failures on staging are all unrelated:
test_workspace_id_validation, test_a2a_mcp_server RBAC, the
test_imports.main module-walker — same signature as on staging
HEAD before this PR).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-23 10:41:47 -07:00

2 Commits