HermesA2AExecutor now supports sending system context as ordered, separate
role=system messages instead of a single concatenated string — the model
format recommended by NousResearch.
Changes:
- HermesA2AExecutor.__init__: new system_blocks kwarg (list[str|None]|None)
stored as an independent copy; None blocks and empty strings silently skipped
- _build_messages(): when system_blocks is not None, emits each non-empty
block as a separate {"role": "system"} entry in Hermes-recommended order
(persona → tools context → reasoning policy); falls through to legacy
system_prompt path when system_blocks is None (backward compatible)
Backward compatibility: existing callers that pass a single system_prompt
string continue to work identically — no changes required.
Tests (12 new, 47 total):
- system_blocks stored as independent copy (mutation safe)
- three-block stacked ordering preserved
- empty / None blocks silently skipped
- all-empty list → zero system messages
- system_blocks overrides system_prompt when both provided
- legacy system_prompt path unchanged
- stacked blocks appear in the live API call kwargs
Closes#499
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Both PRs restructured the same chat.completions.create() call to use a
create_kwargs dict. Resolved by keeping both __init__ params and both
conditionals in the create_kwargs block.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds response_format support to HermesA2AExecutor so callers can request
structured JSON output via the OpenAI-native response_format parameter.
Changes:
- _validate_response_format(): validates type (json_schema/json_object/text)
and required sub-fields; returns None if valid, error message if invalid
- HermesA2AExecutor.__init__: new response_format kwarg, stored as _response_format
- execute(): validates before API call — invalid schema enqueues error and
returns early without hitting Hermes API; valid and non-None adds
response_format= to create_kwargs; None omits the field entirely
Tests (12 new):
- _validate_response_format: all valid types, invalid type, missing fields
- constructor stores response_format correctly
- valid response_format forwarded to API call
- response_format omitted when None (no key in call kwargs)
- invalid schema → error message enqueued, API not called
Closes#498
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Instead of injecting tool definitions as text into the system prompt,
HermesA2AExecutor now accepts a tools: list[dict] | None constructor
parameter containing OpenAI-format tool definitions and forwards them
via the native tools= parameter on chat.completions.create().
Empty list / None rule: when tools is falsy, the tools key is omitted
from the API call entirely — never sent as tools=[] — so providers
that reject an empty tools array don't return a 400.
Tool-call response handling: when the model returns finish_reason
"tool_calls" with no text content, the executor serialises the call
list as a JSON string and enqueues it as the A2A reply. This keeps
the executor thin (single API call per turn, no ReAct loop) while
surfacing function-call intent in a structured, parseable format.
Changes:
- HermesA2AExecutor.__init__: new tools kwarg; stored as self._tools
(copy; mutating the input list has no effect)
- execute(): builds create_kwargs dict and conditionally adds tools=
only when self._tools is non-empty; handles tool_calls response
- Module docstring: new "Native tools (#497)" section with schema
reference and edge-case explanation
Tests (12 new, 47 total in hermes test file, 1002 total suite):
- tools stored correctly in constructor (copy, None, [], non-empty)
- non-empty tools forwarded as tools= in API call
- multiple tools all forwarded
- empty list ([] and None and default) → tools key absent from call
- model tool_call response → JSON-serialised list as A2A reply
- multiple tool_calls → all in JSON reply
- text content present → text wins over tool_calls
Closes#497
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Hermes 4 is a hybrid-reasoning model trained on <think> tags; without asking
for thinking we pay flagship $/tok but get non-reasoning quality. This adds a
dedicated HermesA2AExecutor that dispatches to any OpenAI-compat endpoint
(OpenRouter, Nous Portal) and enables native reasoning for Hermes 4 models.
Key decisions:
- ProviderConfig + _reasoning_supported() detect Hermes 4 by model slug
substring ("hermes-4", "hermes4") — case-insensitive, no config needed
- extra_body={"reasoning": {"enabled": True}} sent only to Hermes 4 entries;
Hermes 3 path unchanged (no extra_body, no regressions)
- choices[0].message.reasoning + reasoning_details extracted and written to
an OTEL span (hermes.reasoning) — deliberately NOT echoed in the A2A reply
so the reasoning trace never contaminates the agent's next-turn context
- API key / base URL default to OPENAI_API_KEY / OPENAI_BASE_URL env vars
with openrouter.ai/api/v1 as the fallback endpoint
- _client injection parameter for unit tests (no live API calls needed)
- Error sanitization: only exception class name surfaces to user (mirrors
sanitize_agent_error() convention from cli_executor.py)
Test coverage: 35 tests, 100% coverage on all new code paths including:
- _reasoning_supported() — Hermes 4/3/unknown/empty/uppercase
- ProviderConfig — field assignment and capability flags
- extra_body presence for Hermes 4, absence for Hermes 3
- reasoning not in A2A reply; _log_reasoning called when trace present
- reasoning_details forwarded; span attributes set correctly
- Telemetry failure swallowed (never blocks response)
- API error → sanitized class-name-only reply
- cancel() → TaskStatusUpdateEvent(state=canceled)
Full suite: 990 passed, 0 failed (no regressions).
Resolves#496
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>