Commit Graph

6 Commits

Author SHA1 Message Date
rabbitblood
88ce2a18cd feat(hermes): Phase 2d-i — system-prompt.md injection on all 3 dispatch paths
The Hermes adapter never read /configs/system-prompt.md. Any role that
switched to runtime: hermes was silently losing its role identity because
the system prompt wasn't passed to the model. This PR fixes that by:

1. HermesA2AExecutor.__init__ takes new optional `config_path` kwarg
2. `create_executor(config_path=...)` forwards to the constructor
3. `adapter.py` passes `config.config_path` through from AdapterConfig
4. `execute()` reads system-prompt.md via executor_helpers.get_system_prompt
   (hot-reload-capable — reads on every turn, not just at startup)
5. `_do_inference(user_message, history, system_prompt)` — new arg threads
   through the dispatch to each native path
6. Each path uses the provider's NATIVE system field:
   - OpenAI-compat: prepends `{"role":"system", "content":...}` to messages
   - Anthropic: top-level `system=` kwarg (NOT in messages — Anthropic
     requires system at the top level)
   - Gemini: `config=GenerateContentConfig(system_instruction=...)`

## Phase scoreboard
- 2a (in main) — native Anthropic dispatch infra
- 2b (in main) — native Gemini dispatch
- 2c (in main) — multi-turn history on all paths
- **2d-i (this PR)** — system prompts on all paths
- 2d-ii (future) — tool calling on native paths
- 2d-iii (future) — vision content blocks on native paths
- 2d-iv (future) — streaming

## Test coverage

46/46 tests pass (20 Phase 2 dispatch + 26 Phase 1 registry):

- Existing dispatch tests updated to assert the 3-arg call shape
  `("hello", None, None)` — history + system_prompt both None
- 4 new tests:
  - `dispatch_passes_system_prompt_to_anthropic` — happy path, third arg flows
  - `dispatch_passes_system_prompt_to_gemini` — happy path
  - `dispatch_passes_system_prompt_to_openai` — happy path
  - `executor_accepts_config_path_kwarg` — constructor stores config_path
  - `create_executor_forwards_config_path` — both back-compat and registry
    resolution paths forward config_path through to the executor

## Back-compat

- `config_path=None` (default) → execute() skips system-prompt injection,
  same behavior as pre-2d-i
- Workspaces with `runtime: hermes` but no `/configs/system-prompt.md`
  file get `system_prompt=None` (get_system_prompt returns fallback),
  same as before
- The 13 OpenAI-compat providers work identically — system_prompt just
  adds a leading message, which every OpenAI-compat endpoint already
  supports
- Anthropic + Gemini previously got zero system context; now they get
  the same system prompt the workspace's system-prompt.md carries

## Why this matters

Before this PR: if someone flipped a workspace from `runtime: claude-code`
to `runtime: hermes`, the agent would act generically (no role identity,
no project conventions, no CLAUDE.md context) because the Hermes executor
never looked at system-prompt.md. That's a silent correctness regression
the test suite wouldn't catch because none of our live workspaces use
the hermes runtime today.

With this PR: Hermes workspaces get the same system prompt injection as
Claude-code workspaces, making the `runtime: hermes` switch a true drop-in
alternative.

## Related
- #267 Phase 2c (multi-turn history — in main)
- #255 Phase 2b (gemini native — in main)
- #240 Phase 2a (anthropic native — in main)
- #208 Phase 1 (provider registry — in main)
- project_hermes_multi_provider.md — Phase 2d-i was the next queued item
2026-04-15 16:21:47 -07:00
rabbitblood
d40a9d940c feat(hermes): Phase 2c — multi-turn history passed natively to all paths
Completes the Phase 2 scope by keeping conversation turns as turns across
all three dispatch paths. Pre-2c, history was flattened into a single user
message via shared_runtime.build_task_text, which worked as a fallback but
lost the model's native multi-turn awareness (role attribution,
instruction-following on mid-conversation corrections, system-prompt
grounding against prior turns).

Phase 2a + 2b shipped the dispatch infrastructure + per-provider native
paths. This PR uses them properly.

## What's new

- **`_history_to_openai_messages(user_message, history)`** (static) — maps
  A2A `(role, text)` tuples to OpenAI Chat Completions
  `[{"role":"user"|"assistant","content":str}]`. Roles: `human`→`user`,
  `ai`→`assistant`. Current turn appended as the final user message.

- **`_history_to_anthropic_messages`** (static) — identical wire shape to
  OpenAI for text-only turns, so it delegates. Phase 2d tool_use/vision
  blocks will diverge here.

- **`_history_to_gemini_contents`** (static) — Gemini uses a different
  shape: `role="user"|"model"` (NOT "assistant") and text wrapped in
  `parts=[{"text":...}]`. Delegates to none of the others.

- **`_do_openai_compat(user_message, history=None)`** — accepts history,
  builds messages via `_history_to_openai_messages`. Back-compat: pass
  `history=None` to get the old single-turn behavior.

- **`_do_anthropic_native(user_message, history=None)`** — same signature
  change, calls `_history_to_anthropic_messages`. Still uses
  `anthropic.AsyncAnthropic().messages.create()`, just with proper
  multi-turn.

- **`_do_gemini_native(user_message, history=None)`** — same pattern,
  calls `_history_to_gemini_contents`, passes to Gemini's
  `generate_content(contents=...)`.

- **`_do_inference(user_message, history=None)`** — new signature,
  dispatches by auth_scheme as before, passes both args through.

- **`execute()`** — no longer calls `build_task_text`. Calls
  `extract_history(context)` directly and forwards to `_do_inference`.
  Removes the `build_task_text` import (not needed in this file anymore).

## Tests

Existing 7 dispatch tests updated for the new `(user_message, history)`
signature — they assert the path is called with `("hello", None)` since
they pass no history.

5 NEW tests:

- `test_history_to_openai_messages_empty_history` — empty history degrades
  to single user message (back-compat)
- `test_history_to_openai_messages_multi_turn` — round-trip of a 3-turn
  history + current turn
- `test_history_to_anthropic_messages_same_as_openai` — cross-check that
  anthropic path produces identical wire shape for text-only
- `test_history_to_gemini_contents_uses_model_role_and_parts_wrapper` —
  verifies the Gemini-specific role mapping (`ai`→`model`) + parts wrapper
- `test_dispatch_passes_history_through` — end-to-end: _do_inference
  forwards history to the chosen provider path

All 41 tests pass (15 Phase 2 dispatch + 26 Phase 1 registry):

    pytest tests/test_hermes_phase2_dispatch.py tests/test_hermes_providers.py
    41 passed in 0.07s

## Back-compat

- No public API changes to `create_executor()`. Callers that hit
  `execute()` via A2A get the new multi-turn behavior automatically via
  `extract_history(context)`.
- Callers that passed an empty history list (or None) get the same
  single-turn behavior as pre-2c.
- The `build_task_text` helper in shared_runtime is unchanged — other
  adapters (AutoGen, LangGraph) that use it keep working. Only Hermes
  bypasses it now.

## What's NOT in this PR (Phase 2d)

- Tool calling / function calling on native paths (anthropic `tools=`,
  gemini `tools=Tool(function_declarations=[...])`)
- Vision content blocks (image_url → anthropic `{type:"image", source:
  {type:"base64",...}}` / gemini `{inline_data:{mime_type,data}}`)
- System instructions pass-through (anthropic `system=`, gemini
  `system_instruction=`)
- Streaming (`astream_messages` / `streamGenerateContent` stream variants)
- Extended thinking (anthropic `thinking={"type":"enabled"}`) / Gemini
  thinking config

Phase 2c is the **multi-turn upgrade**. Tool + vision + streaming are
Phase 2d, scoped in project_hermes_multi_provider.md.

## Related

- #240 Phase 2a (native Anthropic dispatch — in main)
- #255 Phase 2b (native Gemini dispatch — in main)
- Phase 1 (#208 — provider registry baseline, in main)
- `project_hermes_multi_provider.md` queued memory
- CEO 2026-04-15: "focus on supporting hermes agent"
2026-04-15 14:21:10 -07:00
rabbitblood
485dcb4cae feat(hermes): Phase 2b — native Google Gemini generateContent dispatch path
Completes Hermes Phase 2 by adding the second native SDK path: Google Gemini
via the official `google-genai` Python SDK. Stacked on top of Phase 2a
(feat/hermes-phase2-native-sdks) which introduced the dispatch infra +
the anthropic native path.

## What's new in this PR

1. `providers.py`: flip `gemini` entry to `auth_scheme="gemini"` and
   update `base_url` from the OpenAI-compat endpoint
   (`/v1beta/openai`) to the bare host
   (`https://generativelanguage.googleapis.com`) which the native SDK
   uses.

2. `executor.py`: new method `_do_gemini_native(task_text)` that uses
   `google.genai.Client().aio.models.generate_content(...)`. Dispatch
   table in `_do_inference` now routes `"gemini"` → `_do_gemini_native`.
   Same fail-loud semantics as `_do_anthropic_native` — missing SDK
   raises a clear RuntimeError with install instructions.

3. `requirements.txt`: add `google-genai>=1.0.0`.

4. `test_hermes_phase2_dispatch.py`: +3 tests
   - `test_gemini_entry_has_gemini_scheme` — registry flip + base URL
     validated
   - `test_dispatch_gemini_scheme_calls_gemini_native` — dispatch runs
     gemini native, not openai-compat or anthropic-native
   - `test_gemini_native_raises_clear_error_when_sdk_missing` — fail-loud
     on missing `google-genai` package
   Plus updated existing dispatch tests to mock `_do_gemini_native`
   alongside the other paths so "no cross-calls" assertions stay tight.

All 36 tests pass locally (10 Phase 2 dispatch + 26 Phase 1 registry):

    pytest tests/test_hermes_phase2_dispatch.py tests/test_hermes_providers.py
    36 passed in 0.07s

## Dispatch table after this PR

    auth_scheme="openai"     → _do_openai_compat (13 providers)
    auth_scheme="anthropic"  → _do_anthropic_native (1 provider, Phase 2a)
    auth_scheme="gemini"     → _do_gemini_native (1 provider, Phase 2b) ← NEW
    <unknown>                → _do_openai_compat + warning (forward-compat)

## Back-compat

- All 13 openai-scheme providers unchanged
- `hermes_api_key` / `HERMES_API_KEY` / `OPENROUTER_API_KEY` paths unchanged
- Only `gemini` provider changes behavior: now uses native generateContent
  instead of the `/v1beta/openai` compat shim
- Existing Gemini callers setting `GEMINI_API_KEY` get the native path
  automatically — no caller changes needed

## What's NOT in this PR (future phases)

- Streaming support (`astream_messages` / `streamGenerateContent` stream
  variants) for either native path
- Tool calling / function calling on native paths
- Vision content blocks (image_url → anthropic image blocks; image_url →
  gemini inline_data with base64 + mime_type)
- Extended thinking (anthropic) / thinking config (gemini)
- System instructions pass-through on the gemini native path

Phase 2c/2d will layer these on. This PR is the minimum-viable native
dispatch — single-turn text in, text out — same shape as Phase 2a.

## Stacking

This PR targets `feat/hermes-phase2-native-sdks` (Phase 2a) as its base
branch, NOT main, so the diff shows only the Gemini-specific additions.
When Phase 2a merges to main, GitHub auto-rebases this PR onto the new
main head. If reviewer prefers a single combined PR, close #240 and land
this one instead — the commits on feat/hermes-phase2-native-sdks are
already included in this branch's history.

## Related

- #240 Phase 2a (parent branch)
- #208 Phase 1 (registry + openai-compat path — already in main)
- `project_hermes_multi_provider.md` queued memory — Phase 2 was the next
  item, this PR completes it
- `docs/ecosystem-watch.md` → `### Hermes Agent` — Research Lead's
  eco-watch entry that catalogued Hermes's native provider list and
  shaped the original Phase 2 scope
2026-04-15 13:20:39 -07:00
rabbitblood
3985d80220 feat(hermes): Phase 2a — native Anthropic Messages API dispatch path
Completes the Hermes adapter's native-SDK plan for the provider that gains
the most from leaving OpenAI-compat: Anthropic. OpenAI-compat works fine for
plain text turns on every provider (Phase 1 covered that with one code path
for all 15 providers), but Anthropic's Messages API has first-class tool use,
vision content blocks, and extended thinking that the OpenAI-compat shim
strips or mis-translates.

Rather than ship all native SDK paths in one PR (Anthropic + Gemini + future),
this lands Anthropic only (Phase 2a). Gemini is Phase 2b, shipping after a
production measurement window on Phase 2a.

## Design

Providers now dispatch by `auth_scheme` field. Phase 1 added the field but
every provider used `"openai"`. Phase 2 flips `anthropic` to `"anthropic"`
and wires a second inference path keyed on that:

- `HermesA2AExecutor._do_openai_compat(task_text)` — existing path, handles
  14 of 15 providers (Nous Portal, OpenRouter, OpenAI, xAI, Gemini, Qwen,
  GLM, Kimi, MiniMax, DeepSeek, Groq, Together, Fireworks, Mistral)
- `HermesA2AExecutor._do_anthropic_native(task_text)` — NEW, uses the
  official `anthropic` Python SDK's `AsyncAnthropic().messages.create(...)`
- `HermesA2AExecutor._do_inference(task_text)` — dispatches by
  `self.provider_cfg.auth_scheme`

Unknown schemes fall back to OpenAI-compat with a logged warning, so future
provider additions don't crash if a native SDK path ships late.

## Fail-loud on missing SDK

`_do_anthropic_native` raises a clear `RuntimeError` with install
instructions if the `anthropic` package is missing at runtime:

    Hermes anthropic native path requires the `anthropic` package. Install
    in the workspace image with `pip install anthropic>=0.39.0` or set
    HERMES provider=openrouter to route Claude models through OpenRouter's
    OpenAI-compat shim instead.

This is intentional: silent fallback would mask fidelity loss (tool_use
blocks become plain text, vision gets stripped). Loud failure is better.

`requirements.txt` adds `anthropic>=0.39.0` so the package is baked into
the workspace-template image build path. Operators building custom workspace
images without anthropic installed get the loud error.

## Back-compat

- `create_executor(hermes_api_key="x")` → still routes to Nous Portal
  (`auth_scheme="openai"`), unchanged
- `HERMES_API_KEY` env var → still first in RESOLUTION_ORDER
- `OPENROUTER_API_KEY` env var → still second
- All 14 OpenAI-compat providers unchanged — they take the same code path
  as before
- ONLY `anthropic` provider changes behavior: it now uses the native
  Messages API instead of the `/v1/chat/completions` compat shim

## Constructor signature change

`HermesA2AExecutor.__init__` now takes `provider_cfg: ProviderConfig`
instead of separate `api_key + base_url + model`. The three fields are
derived from `provider_cfg` + an optional model override. This is a
breaking change for any external caller building an executor directly,
but the only documented public entry point is `create_executor()`, which
is updated in the same commit to pass the cfg through.

## Test coverage

`workspace-template/tests/test_hermes_phase2_dispatch.py` — 7 new tests:

1. `test_anthropic_entry_has_anthropic_scheme` — registry flip
2. `test_all_other_providers_still_openai_scheme` — regression guard
3. `test_dispatch_openai_scheme_calls_openai_compat` — happy path
4. `test_dispatch_anthropic_scheme_calls_anthropic_native` — happy path
5. `test_dispatch_unknown_scheme_falls_back_to_openai_compat` — forward compat
6. `test_anthropic_native_raises_clear_error_when_sdk_missing` — fail-loud
7. `test_create_executor_passes_provider_cfg` — constructor wiring

All pass locally (pytest tests/test_hermes_phase2_dispatch.py -v, 0.04s).
Phase 1 tests unchanged: `test_hermes_providers.py` 26/26 pass, no
regressions.

## What's NOT in this PR (Phase 2b)

- Gemini native `generateContent` path (`auth_scheme="gemini"`)
- Streaming support across both native paths (`astream_messages`, `streamGenerateContent`)
- Tool calling on the anthropic native path (the `tools` + `tool_use` blocks)
- Vision content blocks (image_url → anthropic image blocks)
- Extended thinking parameter passthrough

All scoped in `project_hermes_multi_provider.md`. Phase 2a is the minimum
viable native Anthropic dispatch — single-turn text in, text out, no tools.

## Related

- Phase 1 baseline (already in main): #208 — provider registry + OpenAI-compat path
- Queued memory: `project_hermes_multi_provider.md` — full phased plan
- Triggering directive: CEO 2026-04-15 — "once current works are cleared,
  focus on supporting hermes agent"
2026-04-15 12:23:56 -07:00
rabbitblood
8d8ca18bc0 feat(hermes): Phase 1 — multi-provider registry (15 providers, back-compat preserved)
Ships the first half of the queued Hermes adapter expansion. PR 2 only
supported Nous Portal + OpenRouter; this adds 13 more providers reachable
via OpenAI-compat endpoints. Native SDK paths for Anthropic + Gemini are
Phase 2 (better tool-calling + vision fidelity).

## What's new

**`workspace-template/adapters/hermes/providers.py`** (new file, 220 LOC):
- ``ProviderConfig`` dataclass: name, env vars, base URL, default model, auth scheme, docs
- ``PROVIDERS`` dict with 15 entries across 4 groups:
  - PR 2 baseline: nous_portal, openrouter
  - Frontier commercial: openai, anthropic, xai, gemini
  - Chinese providers: qwen, glm, kimi, minimax, deepseek
  - OSS/alt: groq, together, fireworks, mistral
- ``RESOLUTION_ORDER`` tuple: priority for auto-detect (back-compat first,
  then commercial, then Chinese, then OSS/alt)
- ``resolve_provider(explicit=None)`` -> (ProviderConfig, api_key)
  - With explicit name: routes to that provider, raises if env var empty
  - Without: walks RESOLUTION_ORDER, first env-var-set provider wins

**`workspace-template/adapters/hermes/executor.py`** (refactored):
- `create_executor(hermes_api_key=None, provider=None, model=None)` now has
  three parameters:
  - `hermes_api_key`: PR 2 back-compat — routes to Nous Portal
  - `provider`: canonical short name from the registry (e.g. "anthropic")
  - `model`: optional override of the provider's default model
- Delegates all resolution to `providers.resolve_provider()` — no more
  hardcoded URLs or env var lookups in the executor itself
- `HermesA2AExecutor.__init__` no longer has Nous-specific defaults; callers
  pass base_url + model explicitly (which create_executor always does)

**`workspace-template/tests/test_hermes_providers.py`** (new file, 26 tests):
- Registry shape invariants (count >= 15, no duplicates, every config valid)
- PR 2 back-compat: HERMES_API_KEY / OPENROUTER_API_KEY still route correctly
- Auto-detect for every provider in the registry (parametrized — guards against
  typos in env var lists)
- Explicit `provider=` bypass of auto-detect
- Error cases: unknown provider, explicit-but-empty, auto-detect-with-no-env
- All 26 tests pass locally in 0.08s

## Back-compat guarantees

| Scenario | PR 2 behavior | This PR behavior |
|---|---|---|
| `create_executor(hermes_api_key="x")` | Nous Portal | Nous Portal (unchanged) |
| `HERMES_API_KEY=x` env, auto-detect | Nous Portal | Nous Portal (unchanged) |
| `OPENROUTER_API_KEY=x` env, auto-detect | OpenRouter | OpenRouter (unchanged) |
| Both env + explicit hermes_api_key param | Nous Portal (param wins) | Nous Portal (param wins, unchanged) |

Nothing existing can break. New callers gain access to 13 more providers.

## What's NOT in this PR (Phase 2)

- **Native Anthropic Messages API path** — better tool calling, vision, extended
  thinking. Requires pulling in `anthropic` SDK. ~50 LOC.
- **Native Gemini generateContent path** — for vision + google tools. Requires
  `google-genai` SDK. ~50 LOC.
- **Streaming support across all providers** — current executor is non-streaming
  (single chat.completions.create call). Streaming works with openai.AsyncOpenAI
  but hasn't been wired to the A2A event queue path. ~30 LOC.
- **Per-provider model overrides in config.yaml** — Phase 1 uses the registry's
  default_model. Phase 2 adds a `hermes: { provider: qwen, model: qwen3-coder-plus }`
  block in the workspace config.
- **`.env.example` updates** — not critical since the registry itself documents
  every env var via the `env_vars` field, but nice-to-have.

## Related
- Queued memory: `project_hermes_multi_provider.md`
- CEO directive 2026-04-15: *"once current works are cleared, I want you to
  focus on supporting hermes agent, right now it doesnt take too much providers"*
- `docs/ecosystem-watch.md` → `### Hermes Agent` — Research Lead's eco-watch
  entry listed "Nous Portal, OpenRouter, GLM, Kimi, MiniMax, OpenAI, …" which
  shaped this registry's initial set

## Test plan
- [x] Unit tests: 26/26 pass locally (pytest)
- [ ] CI will run on the self-hosted macOS arm64 runner
- [ ] Smoke test in a real workspace: set QWEN_API_KEY and verify Technical
      Researcher actually hits Alibaba DashScope successfully
- [ ] Integration test per provider with real API keys (gated on env, skip
      when not set — Phase 2 CI addition)
2026-04-15 11:14:35 -07:00
Dev Lead Agent
08fe37aee1 feat: implement Hermes adapter create_executor() with OpenRouter fallback
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 16:47:29 -07:00