Commit Graph

1 Commits

Author SHA1 Message Date
Molecule AI Backend Engineer
b7c8f18ab2 feat(hermes): expose reasoning mode for Hermes 4 via OpenAI-compat API (#496)
Hermes 4 is a hybrid-reasoning model trained on <think> tags; without asking
for thinking we pay flagship $/tok but get non-reasoning quality. This adds a
dedicated HermesA2AExecutor that dispatches to any OpenAI-compat endpoint
(OpenRouter, Nous Portal) and enables native reasoning for Hermes 4 models.

Key decisions:
- ProviderConfig + _reasoning_supported() detect Hermes 4 by model slug
  substring ("hermes-4", "hermes4") — case-insensitive, no config needed
- extra_body={"reasoning": {"enabled": True}} sent only to Hermes 4 entries;
  Hermes 3 path unchanged (no extra_body, no regressions)
- choices[0].message.reasoning + reasoning_details extracted and written to
  an OTEL span (hermes.reasoning) — deliberately NOT echoed in the A2A reply
  so the reasoning trace never contaminates the agent's next-turn context
- API key / base URL default to OPENAI_API_KEY / OPENAI_BASE_URL env vars
  with openrouter.ai/api/v1 as the fallback endpoint
- _client injection parameter for unit tests (no live API calls needed)
- Error sanitization: only exception class name surfaces to user (mirrors
  sanitize_agent_error() convention from cli_executor.py)

Test coverage: 35 tests, 100% coverage on all new code paths including:
  - _reasoning_supported() — Hermes 4/3/unknown/empty/uppercase
  - ProviderConfig — field assignment and capability flags
  - extra_body presence for Hermes 4, absence for Hermes 3
  - reasoning not in A2A reply; _log_reasoning called when trace present
  - reasoning_details forwarded; span attributes set correctly
  - Telemetry failure swallowed (never blocks response)
  - API error → sanitized class-name-only reply
  - cancel() → TaskStatusUpdateEvent(state=canceled)

Full suite: 990 passed, 0 failed (no regressions).

Resolves #496

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 20:38:45 +00:00