Commit Graph

1608 Commits

Author SHA1 Message Date
Teknium
bc79e227e6 feat(curator): background skill maintenance (issue #7816)
Adds the Curator — an auxiliary-model background task that periodically
reviews AGENT-CREATED skills and keeps the collection tidy: tracks usage,
transitions unused skills through active → stale → archived, and spawns
a forked AIAgent to consolidate overlaps and patch drift.

Default: enabled, inactivity-triggered (no cron daemon). Runs on CLI
startup and gateway boot when the last run is older than interval_hours
(default 24) AND the agent has been idle for min_idle_hours (default 2).

Invariants (all load-bearing):
- Never touches bundled or hub-installed skills (.bundled_manifest +
  .hub/lock.json double-filter)
- Never auto-deletes — archive only. Archives are recoverable
  via `hermes curator restore <skill>`
- Pinned skills bypass all auto-transitions
- Uses the aux client; never touches the main session's prompt cache

New files:
- tools/skill_usage.py — sidecar .usage.json telemetry, atomic writes,
  provenance filter
- agent/curator.py — orchestrator: config, idle gating, state-machine
  transitions (pure, no LLM), forked-agent review prompt
- hermes_cli/curator.py — `hermes curator {status,run,pause,resume,
  pin,unpin,restore}` subcommand
- tests/tools/test_skill_usage.py — 29 tests
- tests/agent/test_curator.py — 25 tests

Modified files (surgical patches):
- tools/skills_tool.py — bump view_count on successful skill_view
- tools/skill_manager_tool.py — bump patch_count on skill_manage
  patch/edit/write_file/remove_file; forget record on delete
- hermes_cli/config.py — add curator: section to DEFAULT_CONFIG
- hermes_cli/commands.py — add /curator CommandDef with subcommands
- hermes_cli/main.py — register `hermes curator` subparser via
  register_cli() from hermes_cli.curator
- cli.py — /curator slash-command dispatch + startup hook
- gateway/run.py — gateway-boot hook (mirrors CLI)

Validation:
- 54 new tests across skill_usage + curator, all passing in 3s
- 346 tests across all touched files' neighbors green
- 2783 tests across hermes_cli/ + gateway/test_run_progress_topics.py green
- CLI smoke: `hermes curator status/pause/resume` work end-to-end

Companion to PR #16026 (class-first skill review prompt) — together
they form a loop: the review prompt stops near-duplicate skill creation
at the source, and the curator prunes/consolidates what still accumulates.

Refs #7816.
2026-04-28 22:33:33 -07:00
JackJin
88e07c42b4 fix(cli): prevent .env sanitizer from splitting GLM_API_KEY by LM_API_KEY suffix
The known-key splitter in `_sanitize_env_lines` used substring matching
to find concatenated KEY=VALUE pairs. When a registered key was a suffix
of another (LM_API_KEY is a suffix of GLM_API_KEY), the shorter key's
needle would match inside the longer one, causing the sanitizer to
rewrite `GLM_API_KEY=...` as `G\nLM_API_KEY=...` and silently break
Z.AI/GLM auth (and similarly `GLM_BASE_URL` -> `G\nLM_BASE_URL`).

Drop matches whose needle range is fully contained within a longer
overlapping match. Two regression tests cover the suffix-collision case
and confirm a real concatenation that happens to start with the longer
key still splits where it should.

Fixes #17138
2026-04-28 22:22:45 -07:00
Brooklyn Nicholson
6b4ef00a2c review(copilot): keep /reload cli_only since gateway has no handler 2026-04-28 22:22:30 -07:00
Brooklyn Nicholson
4858e26eaa feat(tui): port classic CLI /reload (.env hot-reload) to TUI
Classic CLI exposes ``/reload`` (re-reads ~/.hermes/.env into
``os.environ`` via ``hermes_cli.config.reload_env``) so newly added API
keys take effect without restarting the session.  The TUI was missing
the parity command, so users had to Ctrl+C out and ``hermes --tui``
again whenever they added or rotated a credential.

Three small wires:

* New ``reload.env`` JSON-RPC method in ``tui_gateway/server.py`` that
  delegates to ``hermes_cli.config.reload_env`` and returns the count
  of vars updated.
* New ``/reload`` slash command in ``ui-tui/src/app/slash/commands/ops.ts``
  matching the existing ``/reload-mcp`` pattern (native RPC, no slash
  worker).
* Drop ``cli_only=True`` from the ``reload`` ``CommandDef`` in
  ``hermes_cli/commands.py`` so help/menus surface it in the TUI too.
  ``reload_env`` itself is environment-agnostic.

Same caveat as classic CLI: the *currently constructed* agent's
credential pool / provider routing does not auto-rebuild.  Users who
want a brand-new credential resolution should follow with ``/new``.

Tests:
* New ``test_reload_env_rpc_calls_hermes_cli_reload_env`` confirms
  RPC delegates and reports the count.
* New ``test_reload_env_rpc_surfaces_errors`` confirms exceptions are
  rendered as JSON-RPC errors.
* ``createSlashHandler.test.ts`` slash-parity matrix extended with
  ``['/reload', 'reload.env', {}]`` so we can't regress the routing.

Validation:
  scripts/run_tests.sh tests/test_tui_gateway_server.py — 92/92.
  scripts/run_tests.sh tests/hermes_cli/test_commands.py — 128/128.
  cd ui-tui && npm run type-check — clean; npm test --run — 390/390.
2026-04-28 22:22:30 -07:00
Brooklyn Nicholson
d1ee4915f3 fix(browser): address Copilot review on /browser connect
Fixes from Copilot's two passes on PR #17238:

* Validate parsed URL once: reject missing host, invalid port, and
  unsupported scheme up front so malformed inputs (e.g. http://:9222
  or http://localhost:abc) don't fall through to a generic 5031.
* Tighten _is_default_local_cdp to require a discovery-style path so
  ws://127.0.0.1:9222/devtools/browser/<id> is not collapsed to bare
  http://127.0.0.1:9222 (which would lose the path and break the
  connect).
* Move browser.manage into _LONG_HANDLERS so the up-to-10s
  launch-and-retry loop runs on the RPC pool instead of blocking the
  main dispatcher.
* try_launch_chrome_debug uses Windows-appropriate detach kwargs
  (creationflags=DETACHED_PROCESS|CREATE_NEW_PROCESS_GROUP) instead
  of POSIX-only start_new_session=True.
* manual_chrome_debug_command uses subprocess.list2cmdline on
  Windows so the printed instruction is cmd.exe-compatible.
* Mirror host/port validation in cli.py /browser connect so the
  classic CLI never persists an invalid BROWSER_CDP_URL.
2026-04-28 22:11:10 -07:00
Brooklyn Nicholson
26816d1f77 refactor(tui): tighten /browser connect plumbing
Split browser.manage into a small dispatcher with named connect/disconnect
helpers, fold _http_ok / _probe_urls / _normalize_cdp_url out of the nested
probe loop, collapse the failure-message scaffolding, and DRY the chrome
candidate path tables. Behaviour and event shape unchanged.
2026-04-28 22:11:10 -07:00
Brooklyn Nicholson
69ff114ee2 fix(browser): avoid bogus Chrome launch fallback
Detect an actual Chrome/Chromium executable before printing a manual CDP launch command, including common WSL-mounted Windows browser paths, so /browser connect does not suggest google-chrome when it is unavailable.
2026-04-28 22:11:10 -07:00
Brooklyn Nicholson
f10a3df632 fix(tui): align /browser connect local CDP handling
Share Chrome CDP launch helpers between the classic CLI and TUI so default /browser connect uses loopback consistently, retries local Chrome launch, and reports a copyable manual-start command instead of claiming a dead connection.
2026-04-28 22:11:10 -07:00
Teknium
8c892c1453
refactor(redact): canonical mask_secret helper; fix status.py DIM drift (#17207)
Three modules independently implemented the same "preserve head+tail of
a secret, mask the middle" logic with slightly different behaviors that
had started to drift:

  hermes_cli/config.py redact_key  — 12-char floor, 4+4, DIM '(not set)'
  hermes_cli/status.py redact_key  — 12-char floor, 4+4, plain '(not set)'  ← drift
  hermes_cli/dump.py _redact       — 12-char floor, 4+4, empty string

The visible bug: 'hermes status' displayed the '(not set)' placeholder
in plain text while 'hermes config' showed it in dim text. Same concept,
inconsistent UI.

Introduces mask_secret() in agent/redact.py as the canonical helper,
with head/tail/floor/placeholder/empty kwargs. The three call sites
become one-line wrappers that differ only in the 'empty' handling:

  config.redact_key  → mask_secret(k, empty=color('(not set)', Colors.DIM))
  status.redact_key  → mask_secret(k, empty=color('(not set)', Colors.DIM))
  dump._redact       → mask_secret(v)  # empty → ''

agent.redact._mask_token (log redactor, different policy: 18-char floor,
6+4 visible, '***' on empty) also ports to mask_secret but retains its
own empty-case handling to preserve the historical '***' return.

Net: the three display-time redactors now agree on formatting, the
canonical helper lives in one place, and future tweaks (e.g. adding
bullet-point masking, changing the head/tail widths) happen once.

Verified:
- 3/3 tests/hermes_cli/test_web_server.py::TestRedactKey pass
- 89/89 agent/tests/test_redact.py + tests/tools/test_browser_secret_exfil.py
  + tests/hermes_cli/test_redact_config_bridge.py pass
- Live 'hermes status', 'hermes config', 'hermes dump' all render the
  same way they did before (verified against actual env with real
  keys: OpenRouter, Firecrawl, Browserbase, FAL, Tinker all show
  'prefix...suffix'; Kimi shows '***' at <12 chars; unset shows
  '(not set)' uniformly).

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 21:04:35 -07:00
Scott Trinh
fd943461ca fix(doctor): accept catalog provider aliases
Validate configured providers against both Hermes runtime provider ids and
catalog-normalized provider ids. This keeps providers like ai-gateway from
being rejected after catalog resolution maps them to models.dev ids.

Keep credential checks and vendor-slug warnings anchored to the runtime id
so doctor reports actionable provider names in follow-up diagnostics.
2026-04-28 18:27:42 -07:00
brooklyn!
6b09df39be
fix(tui): restore macOS copy behavior and theme polish (#17131)
This PR groups the TUI fixes that restore macOS Terminal usability and clean up the theme/composer regressions:

- copy transcript selections on macOS drag-release so Terminal.app users can copy while mouse tracking is enabled
- copy composer selections on macOS drag-release; composer selection is internal to TextInput and does not use the global Ink selection bus
- keep IDE Cmd+C forwarding setup macOS-only, and make keybinding conflict checks respect simple when-clause overlap/negation
- force truecolor before chalk initializes (unless NO_COLOR / FORCE_COLOR / HERMES_TUI_TRUECOLOR opt-outs apply) so the default banner keeps its gold/amber/bronze gradient in Terminal.app
- move TUI surfaces onto semantic theme tokens and preserve skin prompt symbols as bare tokens with renderer-owned spacing
- render focused placeholders as dim hint text in TTY mode instead of inverse/selected-looking synthetic cursor text
2026-04-28 18:47:14 -05:00
brooklyn!
7d81d76366
feat(tui): pluggable busy-indicator styles (#13610) (#17150)
* feat(tui): pluggable busy-indicator styles (kaomoji/emoji/unicode/ascii)

The status-bar `FaceTicker` rotated through wide-and-variable kaomoji
glyphs (`(。•́︿•̀。)`, `( ͡° ͜ʖ ͡°)`, …) every 2.5s.  Real display widths range
from ~5 to ~16 columns, so the rest of the bar (cwd, ctx %, voice,
bg counter) shifted on every cycle.  Padding the verb alone (#17116)
helped but didn't address the dominant jitter source — the glyph
itself.

Add four indicator styles, configurable + hot-swappable:

* `kaomoji` (default — preserves the existing vibe; verb is now
  pad-stable so the only width churn left is the kaomoji itself).
* `emoji`  — single 2-col emoji frame (`⚕ 🌀 🤔  🍵 🔮`).
* `unicode` — `unicode-animations` braille spinner (1-col, smooth).
* `ascii`  — `| / - \` (1-col, max compat).

Wires:

* `display.tui_status_indicator` in `DEFAULT_CONFIG` (default
  `kaomoji`).
* New JSON-RPC `config.set/get indicator` keys, narrow allow-list.
* `applyDisplay` reads the field and patches `UiState.indicatorStyle`,
  so the existing `mtime` poll picks up `~/.hermes/config.yaml` edits
  within ~5s without a TUI restart.
* `/indicator [style]` slash command (alias `/indicator-style`,
  subcommand completion `kaomoji|emoji|unicode|ascii`).  Bare form
  shows the current style; setter fires `config.set` and
  optimistically `patchUiState({ indicatorStyle })` so the live TUI
  swaps immediately, matching the `/skin` UX.
* `CommandDef("indicator", ..., subcommands=...)` so classic CLI
  autocomplete + TUI `complete.slash` both surface it.
* `FaceTicker` decouples spinner cadence from verb cadence — the
  glyph runs at the spinner's authored interval (or `FACE_TICK_MS`
  for kaomoji), the verb stays on the original 2.5s cycle, and both
  re-arm cleanly when style changes.

Tests:

* `normalizeIndicatorStyle` rejects unknown / non-string input.
* `applyDisplay → tui_status_indicator` covers fan-out + fallback.
* `/indicator <style>` hot-swaps `UiState.indicatorStyle` after a
  successful `config.set`.
* `/indicator sparkle` rejects with the usage hint and never hits
  the gateway.
* Slash-parity matrix gets `'/indicator'` → `config.get`.

Validation:
  cd ui-tui && npm run type-check — clean; npm test --run — 398/398.
  scripts/run_tests.sh tests/test_tui_gateway_server.py
  tests/hermes_cli/test_commands.py — 220/220.

* chore(tui): drop /indicator-style alias to declutter autocomplete

* fix(tui): drop verb-width pad — /indicator handles glyph jitter directly

* fix(tui): unicode indicator style hides the verb (cleanest option)

* refactor(tui): single source of truth for INDICATOR_STYLES; cleaner error format

Round 1 Copilot review on PR #17150:

- Exported `INDICATOR_STYLES` const tuple from `interfaces.ts`;
  `IndicatorStyle` union type is derived from it. `useConfigSync`
  builds its validation Set from the tuple, and `session.ts` uses it
  for both the usage hint and the runtime allow-list — adding/removing
  a style now touches one line.
- Backend `config.set indicator` error message: switched
  `sorted(allowed)` list repr to `pick one of ascii|emoji|kaomoji|unicode`
  (matches the TUI usage hint), and reports the normalized `raw`
  instead of the original `value`. Backend allowed tuple now has a
  comment pointing back at `INDICATOR_STYLES` so the two stay aligned.

Note: kept the verb portion unpadded per design intent — fixed-width
padding was the exact UX the `/indicator` command was added to remove.
Stable width comes from the glyph; verbs cycling is part of the kawaii
aesthetic. Reply on the verb thread will explain.

* fix(tui): drop type collapse + gate verb timer + DEFAULT_INDICATOR_STYLE

Round 2 Copilot review on PR #17150:

- `tui_status_indicator?: 'ascii' | ... | string` collapses to `string`
  in TS — consumers got no narrowing. Documented as plain `string` with
  a comment about runtime validation via `normalizeIndicatorStyle`.
- `FaceTicker` always started a 2.5s verb interval, even for the
  `unicode` style which hides the verb entirely. Now gated on
  `showVerb` from `renderIndicator` — `unicode` stays calm.

Pre-emptive self-review (avoid round 3):
- Three call sites duplicated the literal `'kaomoji'` default
  (uiStore, normalizeIndicatorStyle, slash command). Added
  `DEFAULT_INDICATOR_STYLE` to interfaces.ts and threaded it through
  so changing the default touches one line.

* fix(tui-gateway): normalize config.get indicator output to match TUI render

Round 4 Copilot review on PR #17150: `config.get` for `indicator`
returned the raw `display.tui_status_indicator` value without
validation, so a hand-edited config.yaml with stray casing or an
unknown style would leave `/indicator` printing one thing while
the TUI rendered the kaomoji default (frontend's
`normalizeIndicatorStyle` does this normalization on receive).

Lifted the allow-list to module scope as `_INDICATOR_STYLES` /
`_INDICATOR_DEFAULT`, reused by both `config.set` and `config.get`.
Comment notes the alignment with `INDICATOR_STYLES` /
`DEFAULT_INDICATOR_STYLE` in interfaces.ts so adding/removing a
style is a one-line change on each end.

Tests cover: known value verbatim, casing/whitespace normalize,
unknown→default, unset→default.

* fix(tui-gateway): preserve falsy-input diagnostics in config.set indicator error

Round 5 Copilot review on PR #17150: `raw = str(value or "").strip().lower()`
collapsed any falsy non-string (`0`, `False`, `[]`) to empty string,
so the error message read `unknown indicator: ` with nothing after —
losing the original input.

Switched to `("" if value is None else str(value)).strip().lower()`
so only `None` (the genuine 'no value' case) becomes blank.  Used
`{raw!r}` in the error so the diagnostic is unambiguous (`'0'` vs `0`).

Tests:
- known-value happy path (`'EMOJI'` → `'emoji'`)
- falsy non-string inputs (`0` / `False` / `[]`) surface meaningfully
- `None` keeps the blank-repr error
2026-04-28 18:19:16 -05:00
brooklyn!
87d3fa6f1c
feat(tui): opt-in auto-resume of the most recent session (#17130)
* feat(tui): opt-in auto-resume of the most recent session

`hermes --tui` always forges a fresh session at startup unless the user
sets `HERMES_TUI_RESUME=<id>`.  Disconnects, terminal-window crashes,
and accidental Ctrl+D therefore lose every piece of in-flight context
even though `state.db` still has the full history a `/resume` away.

Add an opt-in path that mirrors classic CLI's `hermes -c` muscle
memory: when `display.tui_auto_resume_recent: true` is set in
`~/.hermes/config.yaml`, the TUI looks up the most recent human-facing
session and resumes it instead of starting fresh.  Default off so
existing users aren't surprised; explicit `HERMES_TUI_RESUME` always
wins.

Wires:

* New `session.most_recent` JSON-RPC in `tui_gateway/server.py` that
  returns the first non-`tool` row from `list_sessions_rich`, or
  `{"session_id": null}` when none.  Uses the same deny-list as
  `session.list` so sub-agent rows can't sneak in.
* `createGatewayEventHandler.handleReady` re-ordered: explicit
  `STARTUP_RESUME_ID` first (unchanged), then conditional auto-resume
  via `config.get full → display.tui_auto_resume_recent`, then the
  legacy `newSession()` fallback.  Failures of either RPC fall back
  to `newSession()` so the path is always finite.
* Default `display.tui_auto_resume_recent: False` added to
  `DEFAULT_CONFIG` in `hermes_cli/config.py` (no `_config_version`
  bump per AGENTS.md — deep-merge handles the additive key).

Tests:

* 4 new vitest cases in `createGatewayEventHandler.test.ts` cover
  every gate-and-fallback combination (env wins, config off, config
  on with hit, config on with miss).
* 3 new pytest cases for `session.most_recent` (denied row skip,
  tool-only → null, db-unavailable → null).

Validation:
  scripts/run_tests.sh tests/test_tui_gateway_server.py — 93/93.
  cd ui-tui && npm run type-check — clean; npm test --run — 393/393.

* review(copilot): fold session.most_recent errors into null + extend ConfigDisplayConfig

* review(copilot): cover RPC-rejection fallbacks in auto-resume tests
2026-04-28 16:53:38 -05:00
kshitijk4poor
5d2f9b5d7d fix: follow-up for salvaged PR #17061
- Remove dead _lmstudio_loaded_context attribute from run_agent.py (set
  but never read — the loaded context is pushed to context_compressor.update_model
  which is the actual consumer)
- Cache empty reasoning options with 60s TTL to avoid per-turn HTTP probe
  for non-reasoning LM Studio models. Non-empty results cached permanently.
- Extract _lmstudio_server_root(), _lmstudio_request_headers(), and
  _lmstudio_fetch_raw_models() shared helpers in models.py — eliminates
  URL-strip + auth-header + HTTP-call duplication across probe_lmstudio_models,
  ensure_lmstudio_model_loaded, and lmstudio_model_reasoning_options
- Revert runtime_provider.py base_url precedence change: preserve the
  established contract (saved config.base_url > env var > default) for all
  api_key providers
- Remove unnecessary config version bump 22→23
- Fix TUI test: relax target_model assertion to avoid module-cache flake
- AUTHOR_MAP: added rugved@lmstudio.ai → rugvedS07
2026-04-28 12:27:36 -07:00
Rugved Somwanshi
214ca943ac feat(agent): add lmstudio integration 2026-04-28 12:27:36 -07:00
Teknium
b53a091b97
remove: BOOT.md built-in hook (#17093)
BOOT.md was merged in PR #3733 before the feature was ready — the
built-in hook spawned a bare AIAgent() with no model/runtime kwargs,
which immediately 401s on any provider with a custom endpoint. Three
separate community PRs (#5240, #12514, #14992) tried to paper over it.

Remove the BOOT.md hook entirely and its user-facing docs/tips. Keep
the gateway/builtin_hooks/ package and the HookRegistry._register_builtin_hooks()
hook-point intact as the extension surface for future always-on
gateway hooks.

Closes #5239.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 09:50:27 -07:00
Teknium
df51ad7973
perf(config): mtime-cache load_config() and read_raw_config() (#17041)
load_config() and read_raw_config() now cache their result keyed on
the config file's (mtime_ns, size). On cache hit they return a deepcopy
of the cached value, skipping yaml.safe_load + deep-merge + normalize +
env-var expansion entirely. save_config() + migrate_config() write via
atomic_yaml_write which produces a fresh inode, so stat() sees a new
mtime_ns and the next load repopulates automatically — no explicit
invalidation hook needed.

Measured per-call cost:
  load_config() cold:   13.3 ms
  load_config() cached:  0.23 ms  (57x faster)
  read_raw_config() cached: 0.13 ms

A single gateway turn hits the config 5-15 times (session context,
auxiliary client resolution, memory config, plugin hooks, approval
lookups, per-tool settings). That's 65-200 ms/turn of pure YAML
re-parsing on main. After this change: 1-3 ms/turn.

Also migrates gateway/run.py's 6 direct yaml.safe_load(config.yaml)
call sites through _load_gateway_config, which now shares the
read_raw_config cache when _hermes_home agrees with the canonical
config path. The direct-read fallback is retained for tests that
monkeypatch gateway_run._hermes_home without touching HERMES_HOME.

Safety:
- load_config() returns a deepcopy on every call; the 67+ call sites
  that mutate the result (cfg["model"]["default"] = ..., etc.) can't
  corrupt the cache.
- save_config() / atomic_yaml_write bump mtime, naturally invalidating
  the cache for the next reader.
- Cache is keyed on str(config_path), so HERMES_HOME profile switches
  don't collide.

Verified:
- 112 config tests pass (test_config, test_config_env_expansion,
  test_config_env_refs, test_config_drift, test_config_validation,
  test_aux_config).
- 87 gateway tests pass (test_verbose_command, test_session_info,
  test_compress_focus, test_runtime_footer, test_resume_command,
  test_reasoning_command, test_approve_deny_commands,
  test_run_progress_interrupt).
- Live hermes chat smoke — 2 turns + /model switch + tool calls,
  zero errors in agent.log.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 07:06:35 -07:00
Teknium
42be5e49b0
fix(browser): detect missing Chromium and fail fast with actionable error (#17039)
Previously, check_browser_requirements() only checked for the agent-browser
CLI, not the Chromium binary it drives. When the CLI was present but
Chromium wasn't (common in Docker images predating the playwright install
step), the browser tool was advertised to the agent, every call hung for
the full command timeout (~30s each, ~220s for a chained navigate), and
the agent eventually gave up with no useful error — users saw 'browser
not working' with empty errors.log.

Changes:
- tools/browser_tool.py: add _chromium_installed() checking
  PLAYWRIGHT_BROWSERS_PATH + default Playwright cache paths for
  chromium-* / chromium_headless_shell-* dirs; wire into
  check_browser_requirements() for local mode (cloud providers
  unaffected). _run_browser_command fails fast with an actionable
  Docker vs. host message instead of hanging. _running_in_docker()
  checks /.dockerenv and /proc/1/cgroup.
- hermes_cli/tools_config.py: post_setup for 'Local Browser' now runs
  'agent-browser install --with-deps' after npm install to actually
  download Chromium. In Docker, points user at the updated image pull
  instead of trying to install into a read-only layer. Cloud-provider
  post_setup (browserbase) skips Chromium install entirely.
- tests/tools/test_browser_chromium_check.py: new tests covering
  search roots, install detection, requirements branches (local/cloud/
  camofox), and the fast-fail guard in docker/non-docker contexts.
- tests/tools/test_browser_homebrew_paths.py: 5 existing subprocess-path
  tests now mock _chromium_installed=True since they exercise the
  post-guard subprocess path.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 07:03:44 -07:00
Teknium
5ed1eb0d0f
docs(config): surface telegram.reactions in DEFAULT_CONFIG (#17028)
The telegram.reactions key was already wired up (gateway/config.py bridges
it to TELEGRAM_REACTIONS at startup) but was undocumented and missing from
DEFAULT_CONFIG, so users had no way to discover it. Add it with the
existing off-by-default behavior preserved.

No behavior change — runtime default stays False.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 07:02:30 -07:00
Teknium
e123f4ecf0
feat(gateway): opt-in runtime-metadata footer on final replies (#17026)
Append a compact 'model · 68% · ~/projects/hermes' footer to the FINAL
message of each turn, disabled by default (display.runtime_footer.enabled).
Answers the Telegram-side parity ask: runtime context that the CLI status
bar already shows is now available in messaging replies when enabled.

Wiring:
- gateway/runtime_footer.py: resolve_footer_config + format_runtime_footer +
  build_footer_line. Pure-function renderer; per-platform overrides under
  display.platforms.<platform>.runtime_footer.
- gateway/run.py: appends footer to response right after reasoning prepend
  so it lands only on the final message (never tool progress or streaming
  chunks). When streaming already delivered the body (already_sent), the
  footer is sent as a small trailing message instead.
- agent_result now exposes context_length alongside last_prompt_tokens so
  the footer can compute the pct; both gateway return paths updated.
- /footer [on|off|status] slash command, wired in CLI (cli.py) and gateway
  (gateway/run.py both running-agent bypass and main dispatch). Global
  toggle only; per-platform overrides via config.yaml.

Graceful degradation:
- Missing context_length (unknown model) → pct field silently dropped
  (no '?%' artifact).
- Empty final_response → no footer appended.
- Unknown field names in config → silently ignored.

Tests: 25-case unit suite (tests/gateway/test_runtime_footer.py) plus E2E
harness covering streaming vs non-streaming branches, per-platform override,
and the exact argument contract gateway/run.py uses.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 06:50:04 -07:00
Teknium
6085d7a93e
chore: remove unused imports and dead locals (ruff F401, F841) (#17010)
Mechanical cleanup across 43 files — removes 46 unused imports
(F401) and 14 unused local variables (F841) detected by
`ruff check --select F401,F841`. Net: -49 lines.

Also fixes a latent NameError in rl_cli.py where `get_hermes_home()`
was called at module line 32 before its import at line 65 — the
module never imported successfully on main. The ruff audit surfaced
this because it correctly saw the symbol as imported-but-unused
(the call happened before the import ran); the fix moves the import
to the top of the file alongside other stdlib imports.

One `# noqa: F401` kept in hermes_cli/status.py for `subprocess`:
tests monkeypatch `hermes_cli.status.subprocess` as a regression
guard that systemctl isn't called on Termux, so the name must
exist at module scope even though the module body doesn't reference
it. Docstring explains the reason.

Also fixes an invalid `# noqa:` directive in
gateway/platforms/discord.py:308 that lacked a rule code.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 06:46:45 -07:00
Dejie Guo
8cced33784 fix(model): prefer live models for user providers 2026-04-28 06:45:35 -07:00
Teknium
72dea9f4f7
feat(gateway): make hygiene hard message limit configurable (#17000)
The gateway session-hygiene pre-compression safety valve had a hardcoded
400-message threshold. On long-lived sessions with short turns this was
either too high (users with aggressive compression preferences) or too
low (users with very large context models who want to keep more history
in-flight).

Add compression.hygiene_hard_message_limit (default 400) so it can be
tuned without forking the gateway.

Reported by @OP (Apr 26 feedback bundle).

## Changes
- hermes_cli/config.py: new DEFAULT_CONFIG key with 400 default
- gateway/run.py: read compression.hygiene_hard_message_limit at
  hygiene-time, fall back to 400 if missing/invalid
- tests/gateway/test_session_hygiene.py: two tests — override fires at
  the configured limit, default does not fire below 400

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 05:43:12 -07:00
Teknium
06164a7b28
fix(codex): resync pool entry from auth.json after reauth (#17001)
When openai-codex tokens expire or the ChatGPT account hits a 429
window, the pool entry gets marked STATUS_EXHAUSTED with
last_error_reset_at many hours in the future. If the user then runs
`hermes model` / `hermes auth openai-codex` to reauth, fresh tokens
land in ~/.hermes/auth.json but the pool entry stayed frozen behind
its reset_at — every request kept failing with 'credential pool: no
available entries (all exhausted or empty)' until the original window
elapsed.

_available_entries() already had auth.json/credentials-file resync
branches for anthropic/claude_code and nous/device_code; openai-codex
was missing. Added _sync_codex_entry_from_auth_store() mirroring the
nous version (reads state["tokens"][{access,refresh}_token] +
state["last_refresh"]) and wired it into the exhausted-entry resync
loop.

Also softens the 'codex CLI not found' doctor warning — native
device-code OAuth does not require the Codex binary, only
importing existing Codex CLI tokens does. Downgraded to an info line.

Reported on Discord by p1aceho1der: Codex stalled indefinitely after
a rate-limit reset, reauth didn't help, and doctor falsely warned
that the codex CLI was required.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 05:43:09 -07:00
helix4u
d8c5573ffe fix(profiles): migrate Honcho host on rename 2026-04-28 05:37:09 -07:00
teknium1
7444e49d4e fix(gateway): use transcript timestamp for auto-continue freshness
Follow-up to PR #16802 (BeliefanX). The original fix read
`agent_history[-1].get("timestamp")` for the tool-tail freshness gate,
but `gateway/run.py` strips the `timestamp` field off all tool/tool_call
rows when building `agent_history` from the raw transcript (see
`clean_msg = {k: v for k, v in msg.items() if k != "timestamp"}`).  At
runtime the tool-tail branch always saw `None` and silently took the
legacy-fresh path — the stale-guard never fired for the tool-tail case
it was supposed to cover.

Changes:
- Read the freshness signal from the RAW `history` list (via new
  `_last_transcript_timestamp()` helper) BEFORE the strip.  Both the
  resume_pending branch and the tool-tail branch use this single signal,
  replacing the two divergent ones.
- Default window bumped 15 min → 1 hour via new
  `_AUTO_CONTINUE_FRESHNESS_SECS_DEFAULT`.  The 15-minute default was
  shorter than the default `gateway_timeout` of 30 min, so a legitimate
  long-running turn interrupted near its timeout boundary and resumed
  shortly after would have been misclassified as stale.
- Configurable via `config.yaml` `agent.gateway_auto_continue_freshness`
  (bridged to `HERMES_AUTO_CONTINUE_FRESHNESS` at gateway startup — same
  pattern as `gateway_timeout`).  Set to 0 to disable the gate.
- `_coerce_gateway_timestamp` now explicitly rejects bool (which is a
  subclass of int and would otherwise coerce to 0.0/1.0).
- Tests rewritten to exercise the real production data shape: raw
  `history` → `_build_agent_history` strip → freshness decision.  A
  regression guard (`test_stale_tool_tail_with_production_data_shape`)
  asserts `agent_history` tool rows carry NO timestamp, protecting
  against someone "fixing" the original bug by re-adding the stripped
  field (which would break the OpenAI tool-result message contract).

Add BeliefanX to scripts/release.py AUTHOR_MAP.

E2E verified: config.yaml → env var bridge → helper returns configured
value; default 1h window; malformed/empty env var falls back to default;
ISO-Z timestamps parse; ms-epoch coerced; bool rejected.
2026-04-28 05:20:35 -07:00
Teknium
b61d9b297a refactor: consolidate symlink-safe atomic replace into shared helper
Extract the islink/realpath guard from the 16743 fix into a single
atomic_replace() helper in utils.py, then migrate every os.replace()
call site in the codebase to use it.

The original PR #16777 correctly identified and fixed the bug, but
only patched 9 of ~24 call sites. The same bug class (managed
deployments that symlink state files silently losing the link on
every write) still existed at auth.json, sessions file, gateway
config, env_loader, webhook subscriptions, debug store, model
catalog, pairing, google OAuth, nous rate guard, and more.

Rather than add another 10+ copies of the same three-line guard,
consolidate into atomic_replace(tmp, target) which:
- resolves symlinks via os.path.realpath before os.replace
- returns the resolved real path so callers can re-apply permissions
- is a drop-in replacement for os.replace at the use sites

Changes:
- utils.py: new atomic_replace() helper + atomic_json_write /
  atomic_yaml_write now call it instead of inlining the guard
- 16 files: all os.replace() call sites migrated to atomic_replace()
  - agent/{google_oauth, nous_rate_guard, shell_hooks}.py
  - cron/jobs.py
  - gateway/{pairing, session, platforms/telegram}.py
  - hermes_cli/{auth, config, debug, env_loader, model_catalog, webhook}.py
  - tools/{memory_tool, skill_manager_tool, skills_sync}.py

Tests: tests/test_atomic_replace_symlinks.py pins the invariant for
atomic_replace + atomic_json_write + atomic_yaml_write, covers plain
files, first-time creates, broken symlinks, and permission preservation.

Refs #16743
Builds on #16777 by @vominh1919.
2026-04-28 04:58:22 -07:00
vominh1919
3ab97a32d1 fix: preserve symlinks during atomic file writes (#16743)
os.replace(tmp, path) replaces the symlink itself with a regular file,
breaking users who symlink config.yaml, SOUL.md, or .env from ~/.hermes/
to a dotfiles repo or managed profile package.

Fix: resolve symlinks via os.path.realpath() before os.replace(), so the
real file is overwritten in-place while the symlink survives.

Fixed in 7 files covering all os.replace call sites:
- utils.py (atomic_json_write, atomic_yaml_write — fixes save_config)
- hermes_cli/config.py (env sanitizer, save_env_value, remove_env_value)
- tools/skill_manager_tool.py (_atomic_write_text — SOUL.md writes)
- tools/memory_tool.py (memory file writes)
- tools/skills_sync.py (manifest writes)
- cron/jobs.py (job state + output file writes)
- agent/shell_hooks.py (hook file writes)

Fixes NousResearch/hermes-agent#16743
2026-04-28 04:58:22 -07:00
Ruda Porto Filgueiras
a23f18cc3e fix(bedrock): add live model discovery and region resolution for non-US regions
provider_model_ids("bedrock") fell through to a static _PROVIDER_MODELS
table containing only hardcoded us.* model IDs.  Users configured for
non-US AWS regions (eu-central-1, ap-northeast-1, etc.) saw wrong or no
models in /model and autocomplete.

Root causes fixed:

1. models.py: provider_model_ids() now calls discover_bedrock_models()
   keyed by the resolved region before falling back to the static table.
   A new bedrock_model_ids_or_none() helper in bedrock_adapter.py
   consolidates the discover -> extract IDs -> fallback pattern used by
   all three call sites.

2. providers.py: registers bedrock in HERMES_OVERLAYS with
   transport=bedrock_converse and auth_type=aws_sdk so
   get_provider("bedrock") and resolve_provider_full("bedrock") work.

3. model_switch.py: list_authenticated_providers() sections 2 and 3
   detect AWS credentials via has_aws_credentials() for aws_sdk
   overlays and use live discovery for the model list.

4. bedrock_adapter.py: resolve_bedrock_region() reads the configured
   region from botocore.session before falling back to us-east-1,
   covering users who set their region in ~/.aws/config via a named
   profile rather than env vars.

5. tui_gateway/server.py: passes provider= to get_model_context_length()
   so context window lookups work correctly for the Bedrock provider.
2026-04-28 03:53:11 -07:00
simonweng
a6a6cf047d feat(providers): add tencent-tokenhub provider support
Registers tencent-tokenhub (https://tokenhub.tencentmaas.com/v1) as a
new API-key provider with model tencent/hy3-preview (256K context).

- PROVIDER_REGISTRY entry + TOKENHUB_API_KEY / TOKENHUB_BASE_URL env vars
- Aliases: tencent, tokenhub, tencent-cloud, tencentmaas
- openai_chat transport with is_tokenhub branch for top-level
  reasoning_effort (Hy3 is a reasoning model)
- tencent/hy3-preview:free added to OpenRouter curated list
- 60+ tests (provider registry, aliases, runtime resolution,
  credentials, model catalog, URL mapping, context length)
- Docs: integrations/providers.md, environment-variables.md,
  model-catalog.json

Author: simonweng <simonweng@tencent.com>
Salvaged from PR #16860 onto current main (resolved conflicts with
#16935 Azure Anthropic env-var hint tests and the --provider choices=
list removal in chat_parser).
2026-04-28 03:45:52 -07:00
Teknium
bd10acd747
fix(providers): honor key_env/api_key_env on Azure Anthropic + accept alias in normalizer (#16935)
Three related fixes around custom env-var-name hints for provider entries.

1. Azure Anthropic path: previously hardcoded to look up AZURE_ANTHROPIC_KEY
   then ANTHROPIC_API_KEY with no way to override.  If a user wrote
     model:
       provider: anthropic
       base_url: https://my-resource.services.ai.azure.com/anthropic
       key_env: MY_CUSTOM_KEY
   the key_env hint was silently ignored and the resolver raised
   'No Azure Anthropic API key found' even when MY_CUSTOM_KEY was set
   in the environment.  The runtime now checks, in order:
     (1) os.getenv(model_cfg.key_env)
     (2) os.getenv(model_cfg.api_key_env)    # docs alias
     (3) model_cfg.api_key                     # inline value
     (4) AZURE_ANTHROPIC_KEY                   # historical default
     (5) ANTHROPIC_API_KEY                     # historical default
   Error message updated to mention key_env as an option.

2. Provider entry normalizer (_normalize_custom_provider_entry): accept
   'api_key_env' as a snake_case alias for 'key_env', and 'apiKeyEnv' as a
   camelCase alias.  Adds both to the _KNOWN_KEYS set so the 'unknown
   config keys ignored' warning doesn't fire on valid configs.

3. _VALID_CUSTOM_PROVIDER_FIELDS: add 'key_env'.  That set documents
   supported custom_providers entry fields; it was drifting from reality
   since key_env has been read at runtime in auxiliary_client.py,
   runtime_provider.py, and main.py for a while.

Docs: website/docs/guides/azure-foundry.md now uses the canonical key_env
field and notes that api_key_env / keyEnv / apiKeyEnv are accepted as
aliases.

Validation: 12 new tests in test_runtime_provider_resolution.py covering
all 5 Azure Anthropic resolution paths + 4 normalizer-alias tests.  Pass
rate across related suites (165 + 46 tests): 100%.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 02:12:08 -07:00
Teknium
e63364b8df
revert: computer-use cua-driver (PR #16919) (#16927)
Reverts PR #16919 (commits dad10a78d, 413ee1a28, b4a8031b2, afb958829)
which was merged prematurely. Restoring the pre-merge state so #14817
and #15328 can be revisited as standing PRs.

Reverted commits:
- afb958829 fix(computer-use): harden image-rejection fallback + AUTHOR_MAP
- b4a8031b2 fix(computer-use): unwrap _multimodal tool results
- 413ee1a28 feat(computer-use): background focus-safe backend
- dad10a78d feat(computer-use): cua-driver backend, universal any-model schema

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 01:57:21 -07:00
Teknium
cf0852f92e
feat(claw-migrate): harden OpenClaw import with plan-first apply, redaction, and pre-migration backup (#16911)
* feat(claw-migrate): harden OpenClaw import with plan-first apply, redaction, and pre-migration backup

Adopts four design patterns from OpenClaw's reciprocal migrate-hermes
importer so both migration paths have the same safety posture.

- **Refuse-on-conflict apply.** 'hermes claw migrate' now refuses to
  execute when the plan has any conflict items, unless --overwrite is
  set. Previously the user could say 'yes, proceed' and end up with a
  silent partial migration that skipped every conflicting item.
- **Engine-level secret redaction.** The report.json and summary.md
  written to disk (and --json stdout) run through a redactor that
  matches OpenClaw's key-name markers and value-shape patterns
  (sk-*, ghp_*, xox*-, AIza*, Bearer *). Prevents accidental API key
  leakage in bug reports and support channels.
- **Pre-migration tarball snapshot.** Apply creates one timestamped
  restore-point archive of ~/.hermes/ at ~/.hermes/migration/pre-migration-backups/
  before any mutation, excluding regenerable directories
  (sessions, logs, cache). Opt out with --no-backup.
- **Blocked-by-earlier-conflict sequencing.** If a config.yaml write
  hits conflict/error mid-apply, subsequent config-mutating options
  are marked skipped with reason 'blocked by earlier apply conflict'
  rather than attempting partial writes.
- **Structured warnings[] and next_steps[] on the report** — actionable
  guidance surfaces in both JSON output and summary.md.
- **--json output mode** — emits the redacted report on stdout for CI.

Also flips --preset full to NOT auto-enable --migrate-secrets. Users
now have to opt in to secret import explicitly, mirroring OpenClaw's
two-phase posture.

Status/kind/action constants are defined (STATUS_MIGRATED etc) with
values that match the existing strings the script emits, so the
report schema is backward-compatible. ItemResult gains a 'sensitive'
bool field that redaction and consumers can key off.

Validation: 26 new unit tests + 1 updated test in tests/skills/
test_openclaw_migration_hardening.py and test_claw.py cover redaction
(key markers, value patterns, recursion, on-disk), warnings/next_steps,
blocked-by-earlier sequencing, --json mode, and the preset-flip.
Manual E2E against a fake $HERMES_HOME with real-shaped secrets
confirmed: (1) secrets never appear in stdout or on disk,
(2) _cmd_migrate refuses apply when plan has conflicts,
(3) --overwrite proceeds past the guard and the backup tarball is
created, (4) --no-backup skips the archive.

Related docs: website/docs/guides/migrate-from-openclaw.md and
website/docs/reference/cli-commands.md updated to reflect the
preset-flip and new --no-backup flag.

* refactor(claw-migrate): reuse hermes backup system for pre-migration snapshot

Drops the inline tarball in hermes_cli/claw.py in favor of
hermes_cli.backup.create_pre_migration_backup(), which shares an
implementation with create_pre_update_backup via a new
_write_full_zip_backup helper.  Benefits:

- Consistent exclusion rules with hermes backup (_EXCLUDED_DIRS,
  _EXCLUDED_SUFFIXES, _EXCLUDED_NAMES — single source of truth).
- SQLite safe-copy via _safe_copy_db (state.db restores cleanly).
- Zip format restorable with 'hermes import <archive>'.
- Lives under ~/.hermes/backups/pre-migration-*.zip alongside
  pre-update-*.zip — one place for all snapshot archives.
- Auto-prune rotation with separate keep counters (pre-migration
  keeps 5, pre-update keeps 5, they don't touch each other's files).

7 new tests in tests/hermes_cli/test_backup.py lock the contract:
directory location, shared exclusion rules, _validate_backup_zip
acceptance (i.e. restorable with 'hermes import'), non-recursive
into prior backups, rotation, missing-home handling, and the
invariant that pre-migration rotation never touches pre-update
backups.

Help text and docs updated — the restore hint now says
'hermes import <name>' instead of 'tar -xzf <archive> -C ~/'.

* chore(claw-migrate): use backup._format_size and drop duplicate output line

Minor polish using another existing primitive from hermes_cli.backup:

- Show backup archive size with _format_size (e.g. '(245 B)' or '(2.4 MB)')
  matching the format hermes backup already uses.
- Drop the duplicate 'Pre-migration backup saved' line after Migration
  Results — the earlier 'Pre-migration backup: <path> (<size>)' line
  already surfaces the path before apply runs.

---------

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 01:50:23 -07:00
Teknium
a83f669bcf fix(models): auto-derive xAI model list from models.dev cache (#16699)
Follow-up to the static list refresh: replace the hardcoded xAI entries
with _xai_curated_models(), mirroring the _codex_curated_models()
pattern from PR #7844. The helper reads $HERMES_HOME/models_dev_cache.json
at import time (no network call) and falls back to a small static list
when the cache is missing or malformed.

Why: _PROVIDER_MODELS["xai"] has drifted once already (issue #16699) and
will drift again next time xAI renames a model. Hermes already maintains
the models.dev cache and uses it for context-length lookups; pointing
_PROVIDER_MODELS at the same source means the /model picker self-heals on
the next cache refresh instead of requiring a PR.

Behavior:
- With cache populated (normal user): shows every current xAI model ID,
  picks up renames automatically on next refresh.
- Without cache (fresh install, offline): falls back to a static snapshot
  of the 9 current flagship IDs.
- Malformed cache / unexpected shape: same static fallback, no crash.

Import time verified <20ms — disk read only, no HTTP.

Addresses the structural piece of #16699 ("consider a single
_provider_models(provider) resolver") for xAI. Other per-provider lists
can adopt the same pattern as drift is observed.
2026-04-28 01:49:50 -07:00
vominh1919
6c78305294 fix(models): update stale xAI model list (#16699)
_PROVIDER_MODELS["xai"] was pointing at model IDs the xAI direct API
no longer accepts:
- grok-4.20-reasoning
- grok-4-1-fast-reasoning

Replaced with the actual current xAI catalog IDs from models.dev
($HERMES_HOME/models_dev_cache.json, mirror of https://models.dev/api.json):
  grok-4.20-0309-reasoning
  grok-4.20-0309-non-reasoning
  grok-4.20-multi-agent-0309
  grok-4-1-fast
  grok-4-1-fast-non-reasoning
  grok-4-fast
  grok-4-fast-non-reasoning
  grok-4
  grok-code-fast-1

The xAI-direct API (https://api.x.ai/v1) serves the dated IDs shown
above; the bare aliases (grok-4.20, grok-4.1-fast, etc.) are
OpenRouter/Vercel-gateway normalizations and are not accepted on
xAI-direct. Those gateways remain unaffected.

Fixes #16699
2026-04-28 01:49:50 -07:00
westers
632ddf2a0a fix(cli): honor user-defined providers via chat --provider and -m <alias>
Three related issues prevented user-defined providers in `providers:` and
`model_aliases:` from being reachable through standard CLI flags. Requests
silently routed to the configured `model.base_url` instead of the user-
intended endpoint.

* hermes_cli/model_switch.py — root cause of the silent misrouting:
  `_ensure_direct_aliases()` rebound `DIRECT_ALIASES` to a freshly-loaded
  dict, leaving every `from hermes_cli.model_switch import DIRECT_ALIASES`
  caller stuck on the stale empty original. Switched to `.update()` so
  module attribute references stay valid.

* hermes_cli/main.py — chat subcommand `--provider` had `choices=[...]`
  hardcoded to built-in providers, rejecting valid keys from user
  `providers:` config. Dropped the choices list; runtime resolution
  validates correctly downstream.

* hermes_cli/oneshot.py — `-m <alias>` only resolved the model name; the
  alias's base_url was never propagated. Now consults `DIRECT_ALIASES`
  before falling through to `detect_provider_for_model`, and threads the
  alias's base_url to `resolve_runtime_provider(explicit_base_url=...)`.

* hermes_cli/runtime_provider.py — `_resolve_named_custom_runtime` now
  honors `(provider="custom", explicit_base_url=...)` so a base_url
  propagated from a direct-alias resolution actually builds a runtime
  instead of falling through to provider-registry handlers that don't
  know about ad-hoc local endpoints.

Verified: `hermes chat --provider <user-key> -m <model> -q "..."` and
`hermes -m <user-alias> -z "..."` both route to the user-intended
endpoint, observable via the target server's request log.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 01:47:20 -07:00
Teknium
dad10a78d0 feat(computer-use): cua-driver backend, universal any-model schema
Background macOS desktop control via cua-driver MCP — does NOT steal the
user's cursor or keyboard focus, works with any tool-capable model.

Replaces the Anthropic-native `computer_20251124` approach from the
abandoned #4562 with a generic OpenAI function-calling schema plus SOM
(set-of-mark) captures so Claude, GPT, Gemini, and open models can all
drive the desktop via numbered element indices.

- `tools/computer_use/` package — swappable ComputerUseBackend ABC +
  CuaDriverBackend (stdio MCP client to trycua/cua's cua-driver binary).
- Universal `computer_use` tool with one schema for all providers.
  Actions: capture (som/vision/ax), click, double_click, right_click,
  middle_click, drag, scroll, type, key, wait, list_apps, focus_app.
- Multimodal tool-result envelope (`_multimodal=True`, OpenAI-style
  `content: [text, image_url]` parts) that flows through
  handle_function_call into the tool message. Anthropic adapter converts
  into native `tool_result` image blocks; OpenAI-compatible providers
  get the parts list directly.
- Image eviction in convert_messages_to_anthropic: only the 3 most
  recent screenshots carry real image data; older ones become text
  placeholders to cap per-turn token cost.
- Context compressor image pruning: old multimodal tool results have
  their image parts stripped instead of being skipped.
- Image-aware token estimation: each image counts as a flat 1500 tokens
  instead of its base64 char length (~1MB would have registered as
  ~250K tokens before).
- COMPUTER_USE_GUIDANCE system-prompt block — injected when the toolset
  is active.
- Session DB persistence strips base64 from multimodal tool messages.
- Trajectory saver normalises multimodal messages to text-only.
- `hermes tools` post-setup installs cua-driver via the upstream script
  and prints permission-grant instructions.
- CLI approval callback wired so destructive computer_use actions go
  through the same prompt_toolkit approval dialog as terminal commands.
- Hard safety guards at the tool level: blocked type patterns
  (curl|bash, sudo rm -rf, fork bomb), blocked key combos (empty trash,
  force delete, lock screen, log out).
- Skill `apple/macos-computer-use/SKILL.md` — universal (model-agnostic)
  workflow guide.
- Docs: `user-guide/features/computer-use.md` plus reference catalog
  entries.

44 new tests in tests/tools/test_computer_use.py covering schema
shape (universal, not Anthropic-native), dispatch routing, safety
guards, multimodal envelope, Anthropic adapter conversion, screenshot
eviction, context compressor pruning, image-aware token estimation,
run_agent helpers, and universality guarantees.

469/469 pass across tests/tools/test_computer_use.py + the affected
agent/ test suites.

- `model_tools.py` provider-gating: the tool is available to every
  provider. Providers without multi-part tool message support will see
  text-only tool results (graceful degradation via `text_summary`).
- Anthropic server-side `clear_tool_uses_20250919` — deferred;
  client-side eviction + compressor pruning cover the same cost ceiling
  without a beta header.

- macOS only. cua-driver uses private SkyLight SPIs
  (SLEventPostToPid, SLPSPostEventRecordTo,
  _AXObserverAddNotificationAndCheckRemote) that can break on any macOS
  update. Pin with HERMES_CUA_DRIVER_VERSION.
- Requires Accessibility + Screen Recording permissions — the post-setup
  prints the Settings path.

Supersedes PR #4562 (pyautogui/Quartz foreground backend, Anthropic-
native schema). Credit @0xbyt4 for the original #3816 groundwork whose
context/eviction/token design is preserved here in generic form.
2026-04-28 01:46:36 -07:00
kshitijk4poor
42cc905c13 feat(plugins): add bundled observability/langfuse plugin
Opt-in Langfuse tracing for Hermes conversations — LLM calls, tool
usage, usage/cost breakdown per span. Hooks into pre/post_api_request,
pre/post_llm_call, pre/post_tool_call. SDK is optional; missing SDK or
credentials renders the plugin inert.

Salvaged from PR #16845 by @kshitijk4poor, who wrote the plugin
(~875 LOC, 6 hooks, Langfuse usage-details/cost-details normalization,
read_file payload summarization).

Salvage scope (why this isn't PR #16845 as-authored):
- Lives at plugins/observability/langfuse/ (standalone kind, opt-in via
  plugins.enabled) instead of a new parallel optional-plugins/
  directory. Standalone bundled plugins are already opt-in — only their
  plugin.yaml is scanned at startup; the Python module is not imported
  unless the user enables it. The premise of optional-plugins/ (avoid
  import cost for users who don't want it) is already solved by the
  existing plugin system.
- Dropped the triple activation gate (plugins.enabled +
  plugins.langfuse.enabled + HERMES_LANGFUSE_ENABLED). The Hermes plugin
  system's own enable/disable is authoritative; runtime credentials
  gate whether the hook actually traces.
- Rewrote _is_enabled() → cached _get_langfuse() with an _INIT_FAILED
  sentinel. The original called hermes_cli.config.load_config() from
  every hook invocation (full yaml parse + deep merge + env expansion
  on every pre/post_tool_call, potentially 100+ times per turn). The
  cached version reads env once and returns the cached client or None
  on every subsequent call with zero further work.
- hermes tools → Langfuse Observability post-setup adds
  observability/langfuse to plugins.enabled directly (via
  _save_enabled_set) instead of going through an install-copy flow.

Enable:
  hermes tools                                        # interactive
  hermes plugins enable observability/langfuse        # manual

Required env (set by `hermes tools` or in ~/.hermes/.env):
  HERMES_LANGFUSE_PUBLIC_KEY
  HERMES_LANGFUSE_SECRET_KEY
  HERMES_LANGFUSE_BASE_URL                            # optional

Co-authored-by: kshitijk4poor <kshitijk4poor@gmail.com>
2026-04-28 01:40:59 -07:00
Surat Srichan
a8f9c56cb4 fix(config): accept fallback_model list (chain) in validator + save
Runtime already supports list-form fallback_model (run_agent.py:1459
iterates fallback_chain; fallback_cmd.py migrates legacy single-dict
configs to list format). The config validator and save_config comment
gate still assumed single-dict form and flagged list-form configs as
errors. Fix both:

- validate_config_structure: when fallback_model is a list, validate
  each entry has provider+model; keep the existing single-dict path.
- save_config: suppress the "add fallback_model" comment when any list
  entry is well-formed.

Adds 4 list-form validator tests.
2026-04-28 01:40:25 -07:00
vominh1919
0169c51820 fix(config): add request_timeout_seconds and stale_timeout_seconds to provider _KNOWN_KEYS
Both keys are documented in cli-config.yaml.example and read at runtime by
hermes_cli/timeouts.py (get_provider_request_timeout and get_provider_stale_timeout),
but the provider-entry validator in config.py flagged them as unknown, producing
noisy warnings on every CLI invocation for users who followed the documented config.

Fixes #16779
2026-04-28 01:28:25 -07:00
Hafiy Zakaria
40bd6d4709 fix: honor agent.disabled_toolsets in gateway sessions
Previously, agent.disabled_toolsets in config.yaml only worked for CLI
mode (run_agent.py --disabled_toolsets). The gateway always passed
enabled_toolsets to AIAgent, and get_tool_definitions() ignored
disabled_toolsets when enabled_toolsets was set.

Fix: _get_platform_tools() now reads agent.disabled_toolsets from config
and excludes those toolsets from the returned set. This runs last so it
overrides everything above.

Added 3 tests covering cross-platform suppression, explicit platform
config override, and empty/missing config no-op behavior.
2026-04-28 01:23:16 -07:00
briandevans
66a05e44d6 fix(copilot): require successful exchange when walking credential_pool catalog tokens
Address Copilot review on #16868:

1. Tighten pool iteration. ``validate_copilot_token`` only rejects empty
   strings and classic PATs (``ghp_*``); a malformed/unsupported ``gho_*``
   token at ``credential_pool.copilot[0]`` would pass the gate and short-
   circuit the loop, hiding a later valid entry. Switch to calling
   ``exchange_copilot_token`` directly: only entries that actually exchange
   into a live Copilot API token are returned. Bad/expired entries fall
   through to the next, and an exhausted pool returns ``""`` so the picker
   falls back to the curated list (existing behaviour).

2. Reword the docstring + test module docstring to describe the pool seed
   path accurately — ``hermes auth add copilot`` adds an api-key-typed
   credential whose ``access_token`` field stores the pasted token, and
   ``_seed_from_env`` mirrors ``COPILOT_GITHUB_TOKEN`` from
   ``~/.hermes/.env`` into the pool. The previous wording implied
   ``auth add copilot`` itself ran the device-code flow, which it does
   not (the device-code flow lives in ``hermes model``).

Two new tests cover the iteration change:
  - ``test_skips_pool_entry_that_fails_to_exchange`` — pool[0] raises,
    pool[1] succeeds, picker uses pool[1].
  - ``test_all_pool_entries_fail_exchange_returns_empty`` — every entry
    raises, return ``""``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 01:18:09 -07:00
briandevans
fdfe40a48b fix(copilot): fall back to credential_pool OAuth access_token for /model picker (#16708)
Users whose only Copilot credential is the OAuth `access_token` saved by
`hermes auth add copilot` (device-code flow) saw the `/model` picker drop
back to a stale hardcoded list. Reason: `_resolve_copilot_catalog_api_key`
only consulted env vars (`COPILOT_GITHUB_TOKEN` / `GH_TOKEN` /
`GITHUB_TOKEN`) and the `gh auth token` CLI fallback, never the credential
pool that Hermes's own login flow writes into `auth.json`. With no token,
the live catalog fetch silently 401s and the picker hides current models
(claude-opus-4.7, claude-sonnet-4.6, gpt-5.5, grok-code-fast-1) — even
though `/model <id>` works fine because runtime inference reads the pool
through a different code path.

Mirror the Codex catalog resolver pattern: env-var first (unchanged), then
walk `read_credential_pool("copilot")` for the first entry with a
supported `access_token` (`gho_*` / `github_pat_*` / `ghu_*`). Run it
through `get_copilot_api_token()` so the catalog request uses the same
exchanged token the runtime path uses. Classic PATs (`ghp_*`) are still
rejected up-front via `validate_copilot_token` since the Copilot API
doesn't accept them.

Strictly additive: env still wins, and a missing/locked auth.json (or any
exception during pool read) still returns "" so the caller falls through
to the curated catalog.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 01:18:09 -07:00
Teknium
dd789a4fdf
fix(mcp): move discovery out of model_tools import side effect (#16856) (#16899)
model_tools.py ran discover_mcp_tools() as a module-level side effect.
discover_mcp_tools() uses a blocking 120s wait internally (via
_run_on_mcp_loop -> future.result(timeout=120)).

The gateway lazy-imports run_agent -> model_tools on the first user
message, which happens inside the asyncio event loop thread.  A slow or
unreachable MCP server therefore froze Discord shard heartbeats and
Telegram polling for up to 120s on the first message after gateway
start.

Fix: remove the module-level call.  Every entry point now runs
discovery explicitly at its own startup, using the context-appropriate
blocking/non-blocking pattern:

- gateway/run.py:       loop.run_in_executor(None, discover_mcp_tools)
                        before platforms start accepting traffic
- hermes_cli/main.py:   inline (no event loop at CLI startup)
- tui_gateway/entry.py: inline (sync stdin loop, no event loop)
- acp_adapter/entry.py: inline before asyncio.run()

Closes #16856.
2026-04-28 01:17:58 -07:00
ztexydt-cqh
1d5e25f353 fix(gateway): persist /sethome home channel to .env across all platforms
_handle_set_home_command wrote FEISHU_HOME_CHANNEL / DISCORD_HOME_CHANNEL /
etc. as top-level keys into config.yaml, but load_gateway_config() only
reads home channels from env vars. After every gateway restart the home
channel was lost — on every platform, not just Feishu.

Fix: switch /sethome to save_env_value(), which atomically writes to
~/.hermes/.env and updates the current process env in one shot. The
handler builds the env key from platform_name.upper(), so one line
change repairs /sethome for every platform that has a HOME_CHANNEL
env var.

Also widen _EXTRA_ENV_KEYS in hermes_cli/config.py so HOME_CHANNEL and
HOME_CHANNEL_NAME for every platform are treated as managed env vars:
SIGNAL, SLACK, SMS, DINGTALK, BLUEBUBBLES, FEISHU, WECOM, YUANBAO, plus
the missing *_NAME variants for DISCORD/TELEGRAM/MATTERMOST.

Closes #16806

Co-authored-by: teknium1 <screenmachine@gmail.com>
2026-04-28 01:17:17 -07:00
Teknium
9048fd020f fix(cli): tighten stale-dashboard match to explicit patterns
Replace the Linux/macOS pgrep regex ("hermes.*dashboard") with a ps
scan + the same explicit patterns list already used on the Windows
branch and in hermes_cli.gateway._scan_gateway_pids:

    hermes dashboard
    hermes_cli.main dashboard
    hermes_cli/main.py dashboard

The old greedy regex would match any cmdline containing both words —
e.g. a chat session whose argv mentions "dashboard" or an unrelated
grafana/dashboard-server process. Added regression tests for both.

Follow-up tightening on #16881.
2026-04-28 01:14:44 -07:00
Societus
66b1142384 fix(cli): warn about stale dashboard processes after hermes update
The dashboard is a long-lived server process users start and forget.
When hermes update replaces files on disk, the running process holds
the old Python backend in memory while the JS bundle gets updated,
producing a silent frontend/backend mismatch (e.g. v0.11.0 changed
the session token header -- old backends reject every API call).

Scan for running dashboard processes after a successful update (both
git and ZIP paths) and print a warning with their PIDs and restart
instructions. Mirrors the existing pattern for gateway processes.

Fixes #16872
2026-04-28 01:14:44 -07:00
Sanjay
b52ceccfa8 fix(opencode): re-derive api_mode per target model on /model switch
opencode-zen and opencode-go each serve both anthropic_messages
(e.g. minimax-m2.7) and chat_completions (e.g. deepseek-v4-flash)
models behind a single base_url. The api_mode resolver in
hermes_cli/runtime_provider.py honoured the persisted
model_cfg.api_mode (set by the previous default model) before checking
the opencode model registry, so /model deepseek-v4-flash from a session
whose default was minimax-m2.7 inherited 'anthropic_messages', stripped
'/v1' from base_url (the Anthropic SDK adds its own /v1/messages), and
404'd.

Promote the opencode detection branch above the configured_mode check
in both api_mode resolution paths:

- _resolve_runtime_from_pool_entry (pool-backed providers)
- _resolve_api_key_runtime          (api-key providers, fallback path)

Both branches now call opencode_model_api_mode(provider, effective_model)
unconditionally for opencode-zen/go before considering any persisted
api_mode, so the mode always reflects the model the user just switched
to.

Existing tests pass (12/12 in tests/hermes_cli/test_model_switch_opencode_anthropic.py).

Fixes #16878
2026-04-28 01:14:35 -07:00
左奇银
07a818804e feat(alibaba): add qwen3.6-plus to supported models
- Add qwen3.6-plus to the Alibaba DashScope curated model list
- Enables model switching via /model qwen3.6-plus without auto-correction warning
2026-04-28 01:14:31 -07:00
Teknium
8269f9056c
feat(fast): broaden /fast whitelist to all OpenAI + Anthropic models (#16883)
Switch _PRIORITY_PROCESSING_MODELS and _ANTHROPIC_FAST_MODE_MODELS from
hardcoded frozensets to prefix-based matching. Any gpt-*, o1*, o3*, o4*
(OpenAI) and any claude-* (Anthropic) now exposes /fast.

Fixes the case where gpt-5.5 and other post-catalog models silently
skipped Priority Processing because they weren't in the frozenset.
Future OpenAI/Anthropic releases will work without a catalog bump.

Safety:
- Codex-series (*codex*) still excluded — they route through the Codex
  Responses API which doesn't take service_tier.
- Anthropic adapter already gates speed=fast on native endpoints only
  (_is_third_party_anthropic_endpoint), so claude-sonnet-4.6 on
  OpenRouter/Bedrock/opencode-zen won't leak the unknown beta.
- service_tier=priority is silently dropped by non-OpenAI proxies, so
  false positives are harmless.
2026-04-28 00:44:43 -07:00
Teknium
8081425a1c
feat(security): make secret redaction off by default (#16794)
Flips security.redact_secrets from true to false in DEFAULT_CONFIG, and
the HERMES_REDACT_SECRETS env-var fallback in agent/redact.py now
requires explicit opt-in ("1"/"true"/"yes"/"on") to enable.

New installs and users without a security.redact_secrets key get pass-
through tool output. Existing users whose config.yaml explicitly sets
redact_secrets: true keep redaction on — the config-yaml -> env-var
bridges in hermes_cli/main.py and gateway/run.py still honor their
setting.

Also updates the inline config comments, website docs, and the
hermes-agent skill so /hermes config set security.redact_secrets true
is now the documented way to turn it on.
2026-04-27 21:24:08 -07:00
Adam Rummer
1eab5960f0 feat(matrix): add dm_auto_thread config for DM auto-threading
Adds MATRIX_DM_AUTO_THREAD env var (default: false) to control
auto-threading in DM rooms independently from channel auto-threading.

Closes #15398
2026-04-27 21:22:44 -07:00
Teknium
30307a9802
feat(plugins): add pre_approval_request / post_approval_response hooks (#16776)
Plugins can now observe dangerous-command approval events in real time,
on both the CLI-interactive path and the async gateway path. This is the
missing hook surface external tools need to build approval notifiers
(macOS menu-bar allow/deny, Slack alerts, audit logs, etc.) without
forking Hermes or running a parallel gateway adapter.

Changes:
- hermes_cli/plugins.py: add two entries to VALID_HOOKS
- tools/approval.py: fire both hooks from check_all_command_guards --
  around prompt_dangerous_approval (CLI surface) and around the
  notify_cb + blocking event.wait loop (gateway surface)
- website/docs/user-guide/features/hooks.md: document both hooks with
  a macOS-notification example
- tests/tools/test_approval_plugin_hooks.py: 5 tests covering CLI once,
  CLI deny, plugin-crash resilience, gateway approve, gateway timeout

Hooks are observer-only: return values are ignored, so plugins cannot
veto or pre-answer an approval (use pre_tool_call for that). A crashing
plugin cannot break the approval flow -- invoke_hook swallows per-
callback errors, and the wrapper logs and swallows dispatch-layer
errors too.

Surface kwarg distinguishes "cli" from "gateway"; post hook reports
choice as one of once/session/always/deny/timeout.
2026-04-27 20:08:33 -07:00
brooklyn!
e0e67a99bb
fix(tui): address copilot follow-up review on PR #16732 (#16740)
- moveCursor(extend=true) now collapses to the bare cursor when the
  computed offset equals the existing anchor instead of leaving a
  zero-length sel. Without this, Shift+Left at col 0 / Shift+Home at
  start would silently hide the hardware cursor (selected truthy)
  without rendering any highlight.
- _tui_need_npm_install also catches UnicodeDecodeError so a corrupted
  / non-UTF8 lockfile falls back to the mtime path the docstring
  promises instead of crashing.

Made-with: Cursor
2026-04-27 16:54:25 -07:00
brooklyn!
e7091bb326
fix(tui): mouse + keyboard text selection in the composer (#16732)
* feat(tui): auto copy-on-select for transcript text

Drag in the transcript already highlighted but you had to press Cmd+C to
land it on the clipboard, and the highlight cleared on copy — most users
never realised selection existed. Now drag-release fires copySelectionNoClear
so the text is on the clipboard immediately while the highlight stays put,
matching iTerm2's "Copy to pasteboard on selection" default. Esc clears.

Behaviour:
- Single click in the input still positions the cursor (TextInput onClick).
- Single click in the transcript still does nothing destructive.
- Double / triple click select word / line, then drag extends.
- /copyselect [on|off|toggle] (alias /cos) flips the setting at runtime,
  HERMES_TUI_DISABLE_COPY_ON_SELECT=1 disables at startup, persists via
  display.tui_copy_on_select in config.yaml.

Help overlay now lists drag-select, multi-click, and click-to-position
so the gestures are discoverable.

Made-with: Cursor

* fix(tui): support prompt text selection gestures

Add mouse drag selection and Shift+Arrow/Home/End extension inside the TUI composer so prompt text behaves like a normal editable field while keeping click-to-position and right-click paste intact.

Made-with: Cursor

* Revert "feat(tui): auto copy-on-select for transcript text"

This reverts commit 6701288fe07a53af873e1ef53855a9618d733327.

* fix(tui): allow composer selection from prompt whitespace

Give the composer a one-cell mouse capture pad before the editable text. The prompt glyph/gutter still does not become selectable, but dragging from the edge now anchors at input offset 0 so users do not need to hit the first character precisely.

Made-with: Cursor

* fix(tui): clear selections from blank composer space

Clicking blank space in the transcript or composer now clears active TUI/input selections like a normal text surface. TextInput clicks stop bubbling so cursor placement and selection gestures keep their local behavior.

Made-with: Cursor

* fix(tui): delegate prompt gutter drags to composer text

The prompt gutter is now an input gesture region, not selectable content. Dragging from the whitespace or prompt area anchors the composer selection at offset 0, while selection highlight/copy remains limited to actual input text.

Made-with: Cursor

* fix(tui): move composer cursor to end on selection clear

External clear actions now collapse the composer selection to the end of the input, matching normal text-field behavior after dismissing a selection.

Made-with: Cursor

* fix(tui): capture composer padding before prompt

Add an explicit mouse capture cell over the left padding before the prompt glyph. Drags starting there now delegate to the composer input at offset 0 instead of starting terminal-level selection over the prompt chrome.

Made-with: Cursor

* fix(tui): avoid npm install on lockfile mtime churn

Compare package-lock.json against npm's hidden node_modules lock by content instead of mtimes. Git checkouts and npm lock rewrites can make the root lockfile newer even when installed dependencies already match, causing hermes --tui to print Installing TUI dependencies on every launch.

Made-with: Cursor

* fix(tui): include prompt leading cell in gesture region

Use the prompt box's real layout region to cover the leading whitespace cell before the glyph. The cell now participates in mouse hit testing and delegates to composer selection instead of starting terminal-level selection.

Made-with: Cursor

* fix(tui): widen prompt-side gesture capture band

Capture a wider left-side band around the composer prompt row so drags starting in terminal gutter/padding cells are consumed and delegated to input selection, instead of triggering terminal-level selection chrome.

Made-with: Cursor

* fix(tui): make pre-prompt spacer non-selectable content

Replace the sticky-prompt fallback `Text(' ')` with an empty spacer box so the visual gap remains but no literal space character is rendered/copyable before the composer prompt.

Made-with: Cursor

* fix(tui): capture pre-prompt spacer without shifting prompt layout

Revert the widened negative-margin prompt capture band and instead capture drags on the dedicated spacer row above the prompt. This keeps prompt/text alignment stable while still delegating whitespace-start drags to composer selection.

Made-with: Cursor

* fix(tui): align prompt with status bar and capture full input row

Drop the leading prompt column from 3 to 2 so the input first character lines up with the status bar text. Wrap the prompt+input row in a single mouse-capture box and stop event propagation from TextInput's own handlers so any drag in that row delegates to composer selection without leaking to terminal-level selection.

Made-with: Cursor

* fix(tui): anchor hardware cursor during composer selection

When a composer selection covers a row exactly the column width, the rendered text fills the row and the terminal auto-wraps the hardware cursor to col 0 of the next row, leaving a ghost block beneath the prompt. Park the cursor at the start of the input box during selection so it can't escape the input region.

Made-with: Cursor

* fix(tui): hide hardware cursor during composer selection

Stop fighting auto-wrap by hiding the hardware cursor outright while the
composer has an active selection. This prevents both the ghost block under
the prompt (cursor wrapping past the last cell) and the parked-cursor block
on the first selected character. The cursor restores as soon as the
selection clears or focus changes.

Made-with: Cursor

* chore(tui): /clean — drop dead capture-pad path, dedupe gutter handlers

- TextInput: remove unused leftCaptureColumns prop and capture-pad math, drop
  unused mouseApi.startAt, fold mouse offset into a single offsetAt helper,
  share a MouseEventLite type across the four handlers.
- appLayout: hoist a GutterMouseEvent type and an endInputDrag callback so the
  spacer/prompt/input rows share one shape.
- _tui_need_npm_install: lift the runtime-only key set to a module constant,
  collapse nested isinstance checks, and document the mtime fallback.

Made-with: Cursor

* fix(tui): address copilot review on PR #16732

- Split InputSelection.clear() into clear() (cursor-preserving) and
  collapseToEnd() (clear + jump to end). Cmd+C copy paths keep using
  clear() so the cursor stays put; the blank-area click in useMainApp
  switches to collapseToEnd() to match the requested UX.
- Spacer-row drags now force row=0 when forwarding into the input,
  since the spacer's vertical origin doesn't align with the input box
  and Ink mouse-capture keeps dispatching motion to the original
  target. Prompt+input row drag keeps localRow because origins match.

Made-with: Cursor

* fix(tui): give TextInput Box an explicit width

After the /clean pass dropped the unused capture-pad math, the wrapping
Box also lost its explicit width and started sizing to its rendered
content. Clicks past the last character missed TextInput and fell
through to the parent prompt-row Box, which collapsed the cursor to
offset 0. Pin the Box back to `columns` so the input owns its full
column span regardless of value length.

Made-with: Cursor

* feat(tui): double-click select-all + hide cursor on terminal blur

- Track click time/offset in TextInput so a quick second click on the
  same offset triggers select-all. Ink's screen-level multi-click is
  bypassed once our onMouseDown captures, so the gesture has to be
  detected locally.
- Extend the cursor-hide effect to also fire when the terminal loses
  focus, so the hollow-rect ghost most terminals draw at the parked
  cursor position disappears too.

Made-with: Cursor

* chore(tui): /clean — extract isMultiClickAt helper

Pull the click-recurrence math out of TextInput's onMouseDown into a
small isMultiClickAt(offset) helper so the handler reads as the gesture
list it actually is (multi-click → select-all, otherwise start).
Drop the redundant length>0 guard now that selectAll() already noops on
an empty value.

Made-with: Cursor

* docs(tui): explain _tui_need_npm_install content-vs-mtime comparison

Expand the docstring so future readers understand why we parse the
lockfiles instead of comparing mtimes, what the optional/peer skip
covers, how stale hidden-lock entries are handled, and when we fall
back to mtime.
2026-04-27 16:43:48 -07:00
kshitijk4poor
56724147ef fix(providers/gmi): post-salvage review fixes
- config.py: remove dead ENV_VARS_BY_VERSION[17] entry (current _config_version
  is 22, so all users are past version 17 and would never be prompted for
  GMI_API_KEY on upgrade — consistent with how arcee was added)
- auxiliary_client.py: use google/gemini-3.1-flash-lite-preview as GMI aux
  model instead of anthropic/claude-opus-4.6 (matches cheap fast-model pattern
  used by all other providers: zai→glm-4.5-flash, kimi→kimi-k2-turbo-preview,
  stepfun→step-3.5-flash, kilocode→google/gemini-3-flash-preview)
- test_gmi_provider.py: fix malformed write_text() call in doctor test
  (was: write_text("GMI_API_KEY=*** encoding="utf-8") → missing closing quote,
  wrote literal string 'GMI_API_KEY=*** encoding=' to .env file)
- test_gmi_provider.py + test_auxiliary_client.py: update aux model assertions
  to match new cheaper default
- docs/integrations/providers.md: add 'gmi' to inline 'Supported providers'
  fallback list (was only in the table, not the inline list at line ~1181)
- docs/reference/cli-commands.md: add 'gmi' to --provider choices list
2026-04-27 11:17:59 -07:00
Isaac Huang
c53fcb0173 feat(providers): add GMI Cloud as a first-class API-key provider (#11955)
Add GMI Cloud (api.gmi-serving.com) as a full first-class API-key provider
with built-in auth, aliases, model catalog, CLI entry points, auxiliary client
routing, context length resolution, doctor checks, env var tracking, and docs.

- auth.py: ProviderConfig for 'gmi' (api_key, GMI_API_KEY / GMI_BASE_URL)
- providers.py: HermesOverlay with extra_env_vars for models.dev detection
- models.py: curated slash-form model catalog; live /v1/models fetch
- main.py: 'gmi' in _named_custom_provider_map and --provider choices
- model_metadata.py: _URL_TO_PROVIDER, _PROVIDER_PREFIXES, dedicated
  context-length probe block (GMI's /models has authoritative data)
- auxiliary_client.py: alias entries; _compat_model fix for slash-form
  models on cached aggregator-style clients; gmi aux default model
- doctor.py: GMI in provider connectivity checks
- config.py: GMI_API_KEY / GMI_BASE_URL in OPTIONAL_ENV_VARS
- conftest.py: explicit GMI_BASE_URL clearing (not caught by _API_KEY suffix)
- docs: providers.md, environment-variables.md, fallback-providers.md,
  configuration.md, quickstart.md (expands provider table)

Co-authored-by: Isaac Huang <isaachuang@Isaacs-MacBook-Pro.local>
2026-04-27 11:17:59 -07:00
Brooklyn Nicholson
7a6128cc4f fix(tui): harden active-session temp file handling
- create HERMES_TUI_ACTIVE_SESSION_FILE with mkstemp instead of a predictable tmp path and always cleanup in finally
- add assertions that launch wiring uses a randomized session file path and removes it on exit
2026-04-27 08:52:12 -07:00
Brooklyn Nicholson
4b28140912 fix(cli): tighten MRU lookup and session DB cleanup
- use a grouped last_active join in search_sessions to avoid per-row correlated max lookups
- always close SessionDB in _resolve_last_session via finally and add regression coverage for search failure cleanup
2026-04-27 08:52:12 -07:00
Brooklyn Nicholson
653b5ec128 fix(tui): report actual session on exit 2026-04-27 08:52:12 -07:00
Brooklyn Nicholson
164e33aa46 fix(cli): resolve -c by true MRU session
- order session listing by computed last_active in SessionDB so callers get MRU rows directly
- keep _resolve_last_session as a single-row lookup and add regression coverage for >20 session sampling
2026-04-27 08:52:12 -07:00
Teknium
817633bc5d
feat(backup): exclude SQLite WAL/SHM/journal sidecars (#16576)
The backup takes a consistent snapshot of each .db via sqlite3.backup(),
so shipping the live .db-wal / .db-shm / .db-journal alongside pairs the
fresh snapshot with stale sidecar state and produces a torn restore on
first open. Sidecars are transient and SQLite regenerates them on next
connection anyway.

This also trims multi-MB of junk from every zip — state.db-wal alone was
~9 MB here, doubled by the fact the WAL is the live write-ahead log, not
data.
2026-04-27 06:43:52 -07:00
Teknium
a9033c9220
feat(backup): exclude checkpoints/ from backups (#16572)
Session-local trajectory cache — keyed by session hash, regenerated
per-session, won't port to another machine anyway. On a large install
this was multiple GB of pure noise in every zip.

Also adds a regression test for the pre-existing backups/ exclusion
so the two machine-local dirs share coverage.
2026-04-27 06:40:18 -07:00
Teknium
ea3c5a14c3
feat(update): make pre-update backup opt-in (off by default) (#16566)
The zip backup could add minutes to every 'hermes update' on large
HERMES_HOME directories. Flip the default to off and add a --backup
flag for one-off opt-in runs.

- updates.pre_update_backup default: True -> False
- hermes update: new --backup flag (opposite of existing --no-backup)
- Silent no-op when disabled (no message spam on every update)
- Existing --no-backup still works and wins over --backup
- Users who explicitly set pre_update_backup: true keep the old behavior
- Tests updated to cover default-off, --backup opt-in, and config-enabled paths
2026-04-27 06:36:35 -07:00
Teknium
ec671c4154
feat(image-input): native multimodal routing based on model vision capability (#16506)
* feat(image-input): native multimodal routing based on model vision capability

Attach user-sent images as OpenAI-style content parts on the user turn when
the active model supports native vision, so vision-capable models see real
pixels instead of a lossy text description from vision_analyze.

Routing decision (agent/image_routing.py::decide_image_input_mode):

  agent.image_input_mode = auto | native | text  (default: auto)

In auto mode:
  - If auxiliary.vision.provider/model is explicitly configured, keep the
    text pipeline (user paid for a dedicated vision backend).
  - Else if models.dev reports supports_vision=True for the active
    provider/model, attach natively.
  - Else fall back to text (current behaviour).

Call sites updated: gateway/run.py (all messaging platforms), tui_gateway
(dashboard/Ink), cli.py (interactive /attach + drag-drop).

run_agent.py changes:
  - _prepare_anthropic_messages_for_api now passes image parts through
    unchanged when the model supports vision — the Anthropic adapter
    translates them to native image blocks. Previous behaviour
    (vision_analyze → text) only runs for non-vision Anthropic models.
  - New _prepare_messages_for_non_vision_model mirrors the same contract
    for chat.completions and codex_responses paths, so non-vision models
    on any provider get text-fallback instead of failing at the provider.
  - New _model_supports_vision() helper reads models.dev caps.

vision_analyze description rewritten: positions it as a tool for images
NOT already visible in the conversation (URLs, tool output, deeper
inspection). Prevents the model from redundantly calling it on images
already attached natively.

Config default: agent.image_input_mode = auto.

Tests: 35 new (test_image_routing.py + test_vision_aware_preprocessing.py),
all existing tests that reference _prepare_anthropic_messages_for_api
still pass (198 targeted + new tests green).

* feat(image-input): size-cap + resize oversized images, charge image tokens in compressor

Two follow-ups that make the native image routing safer for long / heavy
sessions:

1) Oversize handling in build_native_content_parts:
   - 20 MB ceiling per image (matches vision_tools._MAX_BASE64_BYTES,
     the most restrictive provider — Gemini inline data).
   - Delegates to vision_tools._resize_image_for_vision (Pillow-based,
     already battle-tested) to downscale to 5 MB first-try.
   - If Pillow is missing or resize still overshoots, the image is
     dropped and reported back in skipped[]; caller falls back to text
     enrichment for that image.

2) Image-token accounting in context_compressor:
   - New _IMAGE_TOKEN_ESTIMATE = 1600 (matches Claude Code's constant;
     within the realistic range for Anthropic/GPT-4o/Gemini billing).
   - _content_length_for_budget() helper: sums text-part lengths and
     charges _IMAGE_CHAR_EQUIVALENT (1600 * 4 chars) per image/image_url/
     input_image part.  Base64 payload inside image_url is NOT counted
     as chars — dimensions don't matter, only image-presence.
   - Both tail-cut sites (_prune_old_tool_results L527 and
     _find_tail_cut_by_tokens L1126) now call the helper so multi-image
     conversations don't slip past compression budget.

Tests: 9 new in test_image_routing.py (oversize triggers resize,
resize-fails-returns-None, oversize-skipped-reported), 11 new in
test_compressor_image_tokens.py (flat charge per image, multiple images,
Responses-API / Anthropic-native / OpenAI-chat shapes, no-inflation on
raw base64, bounds-check on the constant, integration test that an
image-heavy tail actually gets trimmed).

* fix(image-input): replace blanket 20MB ceiling with empirically-verified per-provider limits

The previous commit imposed a hardcoded 20 MB base64 ceiling on all
providers, triggering auto-resize on anything larger. This was wrong in
both directions:

  * Too loose for Anthropic — actual limit is 5 MB (returns HTTP 400
    'image exceeds 5 MB maximum' above that).
  * Too strict for OpenAI / Codex / OpenRouter — accept 49 MB+ without
    complaint (empirically verified April 2026 with progressive PNG
    sizes).

New behaviour:

  * _PROVIDER_BASE64_CEILING table: only anthropic and bedrock have a
    ceiling (5 MB, since bedrock-on-Claude shares Anthropic's decoder).
  * Providers NOT in the table get no ceiling — images attach at native
    size and we trust the provider to return its own error if it
    disagrees. A provider-specific 400 message is clearer than us
    guessing wrong and silently degrading image quality.
  * build_native_content_parts() gains a keyword-only provider arg;
    gateway/CLI/TUI pass the active provider so Anthropic users get
    auto-resize protection while OpenAI users don't pay it.
  * Resize target dropped from 5 MB to 4 MB to slide safely under
    Anthropic's boundary with header overhead.

Empirical measurements (direct API, no Hermes in the loop):

    image b64     anthropic   openrouter/gpt5.5   codex-oauth/gpt5.5
    0.19 MB       ✓           ✓                   ✓
    12.37 MB      ✗ 400 5MB   ✓                   ✓
    23.85 MB      ✗ 400 5MB   ✓                   ✓
    49.46 MB      ✗ 413       ✓                   ✓

Tests: rewrote TestOversizeHandling (5 tests): no-ceiling pass-through,
Anthropic resize fires, Anthropic skip on resize-fail, build_native_parts
routes ceiling by provider, unknown provider gets no ceiling. All 52
targeted tests pass.

* refactor(image-input): attempt native, shrink-and-retry on provider reject

Replace proactive per-provider size ceilings with a reactive shrink path
on the provider's actual rejection. All providers now attempt native
full-size attachment first; if the provider returns an image-too-large
error, the agent silently shrinks and retries once.

Why the previous design was wrong: hardcoding provider ceilings
(anthropic=5MB, others=unlimited) meant OpenAI users on a 10MB image
paid no tax, but Anthropic users lost quality on anything >5MB even
though the empirical behaviour at provider-reject time is the same
(shrink + retry). Baking the table into the routing layer also
requires updating Hermes every time a provider's limit changes.

Reactive design:
  - image_routing.py: _file_to_data_url encodes native size, no ceiling.
    build_native_content_parts drops its provider kwarg.
  - error_classifier.py: new FailoverReason.image_too_large + pattern
    match ("image exceeds", "image too large", etc.) checked BEFORE
    context_overflow so Anthropic's 5MB rejection lands in the right
    bucket.
  - run_agent.py: new _try_shrink_image_parts_in_messages walks api
    messages in-place, re-encodes oversized data: URL image parts
    through vision_tools._resize_image_for_vision to fit under 4MB,
    handles both chat.completions (dict image_url) and Responses
    (string image_url) shapes, ignores http URLs (provider-fetched).
    New image_shrink_retry_attempted flag in the retry loop fires the
    shrink exactly once per turn after credential-pool recovery but
    before auth retries.

E2E verified live against Anthropic claude-sonnet-4-6:
  - 17.9MB PNG (23.9MB b64) attached at native size
  - Anthropic returns 400 "image exceeds 5 MB maximum"
  - Agent logs '📐 Image(s) exceeded provider size limit — shrank and
    retrying...'
  - Retry succeeds, correct response delivered in 6.8s total.

Tests: 12 new (8 shrink-helper shapes + 4 classifier signals),
replaces 5 proactive-ceiling tests with 3 simpler 'native attach works'
tests. 181 targeted tests pass. test_enum_members_exist in
test_error_classifier.py updated for the new enum value.
2026-04-27 06:27:59 -07:00
Teknium
8ed599dc05
feat(update): auto-backup HERMES_HOME before hermes update (#16539)
Every 'hermes update' now runs a full backup of ~/.hermes/ first, so
users can always roll back to the exact state they had before the
update if anything goes wrong (corrupted sessions.db, broken skills,
config migrations that don't round-trip, etc.).

Changes:
- hermes_cli/backup.py: new create_pre_update_backup() helper. Writes
  to <HERMES_HOME>/backups/pre-update-<stamp>.zip using the same
  exclusion rules and SQLite safe-copy as 'hermes backup'. Auto-rotates
  (keep last N, pre-update-*.zip only — hand-dropped zips in backups/
  are untouched). Adds 'backups' to _EXCLUDED_DIRS so subsequent backups
  don't nest prior ones.
- hermes_cli/main.py: _run_pre_update_backup() wired into
  _cmd_update_impl before any git operation. Prints save path, restore
  command, and how to disable. Swallows failures so a broken backup
  never blocks the update itself. New --no-backup flag on 'hermes
  update' for one-off override.
- hermes_cli/config.py: new 'updates' section in DEFAULT_CONFIG with
  pre_update_backup (default true) and backup_keep (default 5).
  Auto-surfaces in the dashboard config UI.
- tests/hermes_cli/test_backup.py: +11 tests covering backup location,
  content parity with 'hermes backup', no-recursion, rotation, manual
  file preservation, config gate, --no-backup flag, flag-wins-over-config.
2026-04-27 05:36:19 -07:00
Teknium
bb00b783fb fix(cli): eliminate ghost status-bar + DSR input leaks from terminal drift
The CLI renders through prompt_toolkit in non-full-screen mode, so every
repaint uses the renderer's tracked _cursor_pos.y to cursor_up() + erase
before drawing the new frame. Any time that tracked position drifts from
terminal reality, redraws stack on top of stale content instead of
overwriting it. Four user-visible bugs share this root cause.

Fixes:

- #5474 (SIGWINCH ghosts): the resize wrapper previously only handled
  column-shrink reflow. Generalize it to force a full screen-clear
  (erase_screen + cursor_goto(0,0)) and renderer.reset() on every resize
  — covers widen, row-shrink, and multiplexer SIGWINCH-less redraws.

- #8688 (cmux/tmux tab switch): no SIGWINCH fires on focus regain, so
  prompt_toolkit has no signal to recover. Add a _force_full_redraw()
  helper, bound to Ctrl+L (standard bash/zsh/vim convention) and exposed
  as /redraw. Users can manually clear drift without restarting Hermes.

- #14692 (DSR response leaks — ^[[53;1R): resize storms make
  prompt_toolkit's CSI 6n queries race past the input parser; the
  terminal's reply ends up as literal input text. Add a sibling of the
  bracketed-paste sanitizer that strips \x1b[<row>;<col>R and the
  caret-escape visible form from paste text, buffer text-filter, and
  the input-processing loop.

The idle-redraw removal (#12641) is in the preceding commit from
@foxion37 — keeping them as separate commits preserves attribution.
2026-04-27 05:31:47 -07:00
Teknium
90a3e73daf
fix(debug): sweep expired paste.rs uploads on a real timer (#16431)
Previously 'hermes debug share' uploads only got DELETEd when the user
ran 'hermes debug share' again — opportunistic-sweep-on-invoke was the
only cleanup path. A user who uploaded once and never ran debug again
left pastes up until paste.rs's retention kicked in (which, empirically,
never actually expires them).

Hook _sweep_expired_pastes into the gateway cron ticker at the same
hourly cadence as the image/document cache cleanups. The opportunistic
sweep in 'hermes debug share' stays as a fallback for CLI-only users
who never start the gateway.
2026-04-27 00:36:33 -07:00
Teknium
21f503c23c
feat(update): snapshot pairing data before git pull (#16383)
Quick state snapshot now includes pairing JSONs (generic + legacy +
Feishu comment pairing), and `hermes update` takes a pre-update
snapshot labeled `pre-update` before pulling.

Pairing data lives outside state.db in platform-specific JSONs under
~/.hermes/pairing/, ~/.hermes/platforms/pairing/, and
~/.hermes/feishu_comment_pairing.json.  The update command already
couldn't touch $HERMES_HOME, but #15733 reports lost pairing after
an update — this gives users something to restore from via
`/snapshot list` / `/snapshot restore <id>` if anything clobbers
the approved-user lists.

- Extend _QUICK_STATE_FILES with pairing paths (files + dirs)
- Snapshot walks directories recursively and records each file in the
  manifest individually so restore logic is unchanged
- _cmd_update_impl calls create_quick_snapshot(label='pre-update')
  after 'Found N new commits' and before 'Pulling updates'
- Snapshot failures are logged at debug and never block the update

Refs #15733.
2026-04-27 00:19:12 -07:00
Teknium
8258f4dcb7
fix(model): avoid persisting key_env-resolved secrets to providers entry (#16372)
When 'hermes model' runs against a providers: (keyed-schema) entry that
relies only on key_env, the picker resolves the env var for the live
/models request and then wrote a synthesized 'api_key: ${KEY_ENV}' back
to the providers.<key> entry. That's redundant — the runtime already
resolves from key_env directly — and it clutters configs that
intentionally keep credentials out of config.yaml.

Only persist provider_entry['api_key'] when the user originally had an
inline value (literal secret or ${VAR} template). Entries that declared
only key_env stay clean on save.

Fixes #15803.
2026-04-26 21:52:12 -07:00
Teknium
c5781d50c7
fix(azure-foundry): auto-route gpt-5.x / codex / o-series to Responses API (#16361)
Azure Foundry deploys GPT-5.x, codex-*, and o1/o3/o4 reasoning models as
Responses-API-only.  Calling /chat/completions against these deployments
returns 400 'The requested operation is unsupported.', which broke any
user who ran 'hermes model' on Azure, picked a gpt-5/codex deployment,
and kept the default api_mode: chat_completions.  Verified in a user
debug bundle on 2026-04-26: gpt-5.3-codex failed on synopsisse.openai.azure.com
with that exact payload while gpt-4o-pure on the same endpoint worked.

Adds azure_foundry_model_api_mode(model_name) that returns
codex_responses when the model name starts with gpt-5, codex, o1, o3,
or o4 — otherwise None so chat_completions / anthropic_messages stay
untouched for gpt-4o, Llama, Claude-via-Anthropic, etc.

Resolver (both the direct Azure Foundry path and the pool-entry path)
consults it and upgrades api_mode unless the user explicitly picked
anthropic_messages.  target_model (from /model mid-session switch)
takes precedence over the persisted default so switching from gpt-4o
to gpt-5.3-codex routes correctly before the next request.

Docs: correct the azure-foundry guide which previously claimed Azure
keeps gpt-5.x on chat completions — that was only true for early Azure
OpenAI, not Azure Foundry codex/o-series deployments.

Tests: 14 unit tests for azure_foundry_model_api_mode + 6 integration
tests in TestAzureFoundryResolution covering Bob's exact scenario,
target_model override, anthropic_messages guard, and o3-mini.
2026-04-26 21:33:31 -07:00
brooklyn!
e63929d4f3
Merge pull request #15926 from NousResearch/bb/tui-long-session-perf
perf(tui): stabilize long-session scrolling
2026-04-26 23:10:08 -05:00
Teknium
9c416e20ab
feat(skills): install skills from a direct HTTP(S) URL (#16323)
* feat(skills): install skills from a direct HTTP(S) URL

Adds UrlSource adapter so `hermes skills install <url-to-SKILL.md>` and
`/skills install <url>` work as first-class operations — no more
improvising with curl + patch + cp.

- Claims identifiers that start with http(s):// and end in .md
- Skips /.well-known/skills/ URLs (WellKnownSkillSource handles those)
- Skill name from YAML frontmatter, URL-slug fallback
- Single-file SKILL.md only (v1 scope — multi-file skills need a manifest)
- Trust level 'community'; full security scan still runs
- Lock file stores the URL as identifier so `hermes skills update`
  re-fetches from the same URL cleanly

Scope matches real user need from @versun's docx feedback where
`https://sharethis.chat/SKILL.md` had no first-class install path.

* feat(skills): interactive name/category for URL installs + --name override

Follow-up to the UrlSource adapter. The previous commit fell back to weak
heuristics when frontmatter had no ``name:`` and could produce garbage names
like ``SKILL`` or ``unnamed-skill``. Now:

tools/skills_hub.py
- ``UrlSource._is_valid_skill_name()`` — strict identifier check
  (``^[a-z][a-z0-9_-]*$``), rejects sentinel values (``SKILL``, ``README``,
  ``INDEX``, ``unnamed-skill``, empty, non-strings).
- ``_resolve_skill_name()`` returns ``Optional[str]`` — ``None`` when
  nothing valid is resolvable. Also ignores unsafe frontmatter names
  (``../evil``) and falls through to URL slug instead of returning None
  immediately, so a URL with a bad frontmatter but a good path still
  works.
- ``fetch()``/``inspect()`` carry an ``awaiting_name=True`` marker in
  metadata/extra when resolution fails, letting ``do_install`` decide
  whether to prompt, apply an override, or error out.

hermes_cli/skills_hub.py
- ``do_install`` gains a ``name_override`` parameter.
- On URL-sourced bundles with ``awaiting_name=True``:
  1. If ``name_override`` is valid → use it.
  2. If ``name_override`` is invalid → refuse with a clear error.
  3. Else if ``skip_confirm=True`` (non-interactive: slash / TUI /
     gateway / scripts) → refuse with an actionable retry hint pointing
     at ``--name <your-name>`` on both CLI and slash forms.
  4. Else (interactive TTY) → prompt for the name.
- Interactive TTY also prompts for a category when none is given for a
  URL-sourced install, hinting existing category buckets so users can
  reuse ``productivity``, ``devops``, etc. Empty input → flat install.
- ``_existing_categories()`` scans ``~/.hermes/skills/`` for subdirs that
  look like category buckets (contain nested SKILL.md files); skips
  top-level skills and hidden dirs.
- ``_prompt_for_skill_name()`` / ``_prompt_for_category()`` helpers
  (EOF/Ctrl-C-safe, match the existing ``Confirm [y/N]`` prompt style).

hermes_cli/main.py
- ``hermes skills install`` argparse gains ``--name <name>``.

hermes_cli/skills_hub.py (slash)
- ``/skills install <url> --name <x>`` parsing added.

Tests
- tests/tools/test_skills_hub.py: updated ``UrlSource`` tests to assert
  the new ``awaiting_name`` metadata; added 4 new tests for
  ``_is_valid_skill_name`` rejection sets and the awaiting-name marker.
- tests/hermes_cli/test_skills_hub.py: 8 new tests covering --name
  override accept/reject, non-interactive error, interactive name prompt,
  interactive category prompt, cancel-aborts-install, and
  ``_existing_categories`` scan behavior (buckets vs flat skills).
- E2E verified all four paths (no-name/no-override → error;
  --name override → install; frontmatter name → install;
  invalid --name → rejection).

---------

Co-authored-by: teknium1 <teknium@noreply.github.com>
2026-04-26 20:57:10 -07:00
Teknium
366351b94d refactor(timeouts): drop redundant ImportError in except clause
Exception already covers ImportError; (ImportError, Exception) was a
cosmetic wart from the bugfix. Pure no-op.
2026-04-26 20:48:20 -07:00
sprmn24
16e243e067 fix(timeouts): guard load_config() call against runtime exceptions
Both get_provider_request_timeout() and get_provider_stale_timeout()
wrapped the load_config import in try/except ImportError but left the
actual load_config() call unprotected. A corrupt config file, YAML
parse error, or permission failure would raise instead of returning
None safely.

Move load_config() inside the try block so any exception returns None.
2026-04-26 20:48:20 -07:00
Brooklyn Nicholson
3e1664923d Revert "fix(tui): report actual session on exit"
This reverts commit 1566f1eecc.
2026-04-26 22:43:34 -05:00
Brooklyn Nicholson
c23463fce9 chore(tui): keep MRU resume split out of perf PR
- remove the temporary -c MRU logic and companion test from this branch so PR #15926 stays focused on TUI perf work
- keep the resume-ordering change isolated in the dedicated follow-up PR
2026-04-26 22:40:35 -05:00
Brooklyn Nicholson
625c31fcea fix(tui): run built TUI with production React by default
CPU profiling showed the built TUI loading React development modules unless NODE_ENV was set. Default CLI and dashboard TUI children to production while preserving explicit user overrides.
2026-04-26 21:34:31 -05:00
Brooklyn Nicholson
7da2f07641 Merge remote-tracking branch 'origin/main' into bb/tui-long-session-perf 2026-04-26 21:07:15 -05:00
Teknium
478444c262
feat(checkpoints): auto-prune orphan and stale shadow repos at startup (#16303)
Every working dir hermes ever touches gets its own shadow git repo under
~/.hermes/checkpoints/{sha256(abs_dir)[:16]}/.  The per-repo _prune is a
no-op (comment in CheckpointManager._prune says so), so abandoned repos
from deleted/moved projects or one-off tmp dirs pile up forever.  Field
reports put the typical offender at 1000+ repos / ~12 GB on active
contributor machines.

Adds an opt-in startup sweep that mirrors the sessions.auto_prune
pattern from #13861 / #16286:

- tools/checkpoint_manager.py: new prune_checkpoints() and
  maybe_auto_prune_checkpoints() helpers.  Deletes shadow repos that
  are orphan (HERMES_WORKDIR marker points to a path that no longer
  exists) or stale (newest in-repo mtime older than retention_days).
  Idempotent via a CHECKPOINT_BASE/.last_prune marker file so it only
  runs once per min_interval_hours regardless of how many hermes
  processes start up.
- hermes_cli/config.py: new checkpoints.auto_prune /
  retention_days / delete_orphans / min_interval_hours knobs.
  Default auto_prune: false so users who rely on /rollback against
  long-ago sessions never lose data silently.
- cli.py / gateway/run.py: startup hooks gated on checkpoints.auto_prune,
  called right next to the existing state.db maintenance block.
- Docs updated with the new config knobs.
- 11 regression tests: orphan/stale deletion, precedence, byte-freed
  tracking, non-shadow dir skip, interval gating, corrupt marker
  recovery.

Refs #3015 (session-file disk growth was fixed in #16286; this covers
the checkpoint side noted out-of-scope there).
2026-04-26 19:05:52 -07:00
Teknium
87610ce380 fix(tools): coerce quoted use_gateway in image_gen UI detection
Follow-up to #15960 — the provider-active detection in tools_config.py
also read use_gateway with raw truthiness (is False, not dict.get), so
quoted 'false' caused the FAL-direct row to show wrong active status in
the hermes tools picker. Route both sites through is_truthy_value().
2026-04-26 19:02:55 -07:00
Yoimex
f66ebe64e8 fix(cli): coerce use_gateway config flags in tool routing 2026-04-26 19:02:55 -07:00
Teknium
34eb1aaa9a
fix(update): use npm ci to stop rewriting package-lock on every update (#16295)
`npm install --silent` (used by `_build_web_ui` and `_update_node_dependencies`)
silently rewrites package-lock.json on npm ≥ 10 (strips "peer": true etc.),
leaving the working tree dirty after every `hermes update`. The next update
then detects the dirty lockfile and stashes it — producing a trail of
hermes-update-autostash entries for web/package-lock.json, ui-tui/package-lock.json,
and root package-lock.json.

Switch to `npm ci` (strict, lockfile-preserving) via a new
`_run_npm_install_deterministic` helper that falls back to `npm install`
when the lockfile is missing or out of sync (WIP forks).

Verified locally: all three lockfiles stay byte-identical after the real
_build_web_ui / _update_node_dependencies run twice back-to-back. Fallback
path tested with a deliberately out-of-sync lockfile and a no-lockfile case.
2026-04-26 18:51:31 -07:00
Teknium
ab6879634e
yuanbao platform (#16298)
Co-authored-by: loongzhao <loongzhao@tencent.com>
2026-04-26 18:50:49 -07:00
Teknium
5eb6cd82b2
fix(sessions): /save lands under $HERMES_HOME, widen browse+TUI picker, force-refresh ollama-cloud on setup (#16296)
Four independent session-UX bugs reported by an external user (#16294).

/save wrote hermes_conversation_<ts>.json to CWD — invisible to
'hermes sessions browse' and easy to lose. Snapshots now write under
~/.hermes/sessions/saved/ and the command prints the absolute path plus
a 'hermes --resume <id>' hint for the live DB-indexed session.

'hermes sessions browse' default --limit raised from 50 to 500. With the
old ceiling, users with moderately long histories saw only the most
recent 50 rows and assumed older sessions had been lost.

TUI session.list (`/resume` picker) switched from a hardcoded allow-list
of 13 gateway source names to a deny-list of just { 'tool' }. Sessions
tagged acp / webhook / user-defined HERMES_SESSION_SOURCE values and
any newly-added platform now surface. Default limit 20 → 200.

ollama-cloud provider setup passes force_refresh=True to
fetch_ollama_cloud_models() so a user entering their API key sees the
fresh catalog (e.g. deepseek v4 flash, kimi k2.6) immediately instead
of waiting up to an hour for the disk cache TTL to expire.

Closes #16294.
2026-04-26 18:49:48 -07:00
Teknium
55e9329ee6 feat(config): register bundled-skill API keys in OPTIONAL_ENV_VARS
Adds NOTION_API_KEY, LINEAR_API_KEY, TENOR_API_KEY, and AIRTABLE_API_KEY
to OPTIONAL_ENV_VARS so:

- They persist to ~/.hermes/.env via save_env_value like every other
  key Hermes knows about, instead of being ad-hoc variables the user
  has to hand-edit the dotfile for.
- load_env() / reload_env() populate os.environ from .env on every
  startup — the user sets the key once, skills keep working across
  restarts without losing access.
- hermes setup / hermes config show surface them as known optional
  vars with the correct signup URL (linear.app/settings/api,
  airtable.com/create/tokens, etc.).

These four entries use category="skill" (new) rather than "tool".
tools/environments/local.py auto-adds every category=tool/messaging
entry to _HERMES_PROVIDER_ENV_BLOCKLIST, which stops env passthrough
from leaking provider credentials into the execute_code sandbox
(GHSA-rhgp-j443-p4rf). Skill API keys are the opposite case — the
point is for the agent's subprocess to see them so curl can read
Authorization headers — so they must be outside the blocklist. The
new category is inert for that check.

All four entries are advanced=True: they show up in 'hermes config'
and 'hermes status' displays, but do not nag users who have never
touched those skills during setup checklists.

E2E verified: save_env_value → reload_env → os.environ populated →
skill_view reports setup_needed=False → env_passthrough registers
the key for subprocess inheritance.
2026-04-26 18:45:15 -07:00
George Glessner
5b5a53a155 fix(cli): check hermes_cli/web_dist/ not web/dist/ for build staleness
_web_ui_build_needed() in PR #14914 checked web_dir/"dist" as the
sentinel, but vite.config.ts sets outDir: "../hermes_cli/web_dist" so
the build output lands in hermes_cli/web_dist/, never in web/dist/.
The sentinel was therefore always missing → _web_ui_build_needed always
returned True → npm install + Vite build ran on every startup → OOM on
low-memory VPS persisted unchanged.

Fix: derive dist_dir as web_dir.parent / "hermes_cli" / "web_dist" so
the sentinel points to the actual build output directory.

Fixes #14898
2026-04-26 18:43:57 -07:00
Yang Zhi
3b60abb6bb fix(sessions): delete on-disk transcript files during prune and delete (#3015)
`delete_session()` and `prune_sessions()` only removed SQLite records,
leaving .json/.jsonl transcript files on disk forever. Over time this
causes unbounded disk growth (~27MB/day observed).

Changes:
- Add `_remove_session_files()` static helper that cleans up
  `{session_id}.json`, `.jsonl`, and `request_dump_{session_id}_*.json`
- `delete_session()` accepts optional `sessions_dir` param and removes
  files for the deleted session and its children
- `prune_sessions()` accepts optional `sessions_dir` param and removes
  files for all pruned sessions after the DB transaction
- Wire up CLI `hermes sessions delete` and `hermes sessions prune` to
  pass `sessions_dir`
- File cleanup is best-effort (OSError silenced) so DB operations are
  never blocked by filesystem issues
- Fully backward-compatible: `sessions_dir=None` (default) preserves
  existing behavior
2026-04-26 18:31:07 -07:00
Teknium
635253b918
feat(busy): add 'steer' as a third display.busy_input_mode option (#16279)
Enter while the agent is busy can now inject the typed text via /steer —
arriving at the agent after the next tool call — instead of interrupting
(current default) or queueing for the next turn.

Changes:
- cli.py: keybinding honors busy_input_mode='steer' by calling
  agent.steer(text) on the UI thread (thread-safe), with automatic
  fallback to 'queue' when the agent is missing, steer() is unavailable,
  images are attached, or steer() rejects the payload. /busy accepts
  'steer' as a fourth argument alongside queue/interrupt/status.
- gateway/run.py: busy-message handler and the PRIORITY running-agent
  path both route through running_agent.steer() when the mode is 'steer',
  with the same fallback-to-queue safety net. Ack wording tells users
  their message was steered into the current run. Restart-drain queueing
  now also activates for 'steer' so messages aren't lost across restarts.
- agent/onboarding.py: first-touch hint has a steer branch for both
  CLI and gateway.
- hermes_cli/commands.py: /busy args_hint updated to include steer,
  and 'steer' is registered as a subcommand (completions).
- hermes_cli/web_server.py: dashboard select widget offers steer.
- hermes_cli/config.py, cli-config.yaml.example, hermes_cli/tips.py:
  inline docs updated.
- website/docs/user-guide/cli.md + messaging/index.md: documented.
- Tests: steer set/status path for /busy; onboarding hints;
  _load_busy_input_mode accepts steer; busy-session ack exercises
  steer success + two fallback-to-queue branches.

Requested on X by @CodingAcct.

Default is unchanged (interrupt).
2026-04-26 18:21:29 -07:00
Brooklyn Nicholson
bde89c169b fix(cli): -c picks the most recently used session 2026-04-26 16:17:39 -05:00
Brooklyn Nicholson
1566f1eecc fix(tui): report actual session on exit 2026-04-26 15:55:01 -05:00
Teknium
4921b26945
fix(cron): keep homeassistant toolset enabled when HASS_TOKEN is set (#16208)
After #14798 made cron honor per-platform `hermes tools` config, the
`_DEFAULT_OFF_TOOLSETS` filter silently stripped `homeassistant` from
cron jobs for users who'd been relying on the previous blanket toolset.
Norbert's HA cron reports regressed as a result.

The HA toolset is already runtime-gated by its `check_fn` (requires
HASS_TOKEN to register any tools). When HASS_TOKEN is set the user has
explicitly opted in — `_DEFAULT_OFF_TOOLSETS` adds nothing in that case,
so stop double-gating and restore HA for cron / cli / other platforms
without an explicit saved toolset list.

moa and rl stay off by default (original #14798 goal preserved).

Fixes HA cron regression reported by Norbert.
2026-04-26 12:55:58 -07:00
Teknium
541cd732e8
chore(models): drop deepseek from OpenRouter and Nous Portal curated picker lists (#16197)
Removes deepseek/deepseek-v4-pro and deepseek/deepseek-v4-flash from
OPENROUTER_MODELS and _PROVIDER_MODELS['nous'], then regenerates
website/static/api/model-catalog.json so the hosted picker JSON drops
them too. Direct-API deepseek provider support is unchanged.
2026-04-26 12:28:17 -07:00
Teknium
897dc3a2bb
fix(install+update): add /usr/local/bin PATH guard for RHEL root non-login shells (#16191)
* fix(install): add /usr/local/bin PATH guard for RHEL root non-login shells

The FHS-layout branch assumed /usr/local/bin is on PATH for every
standard shell. That holds for login shells (via /etc/profile's
pathmunge) but breaks on RHEL/CentOS/Rocky/Alma 8+ root in non-login
interactive shells (su, sudo -s, tmux panes, some web terminals) —
/etc/bashrc does not add /usr/local/bin and /root/.bash_profile
doesn't either. Result: hermes command links to /usr/local/bin/hermes
but the user has to type the absolute path each time.

Probe a fresh 'bash -i -c' (non-login interactive, matching the user
scenario) after symlinking. If hermes isn't resolvable, append an
idempotent PATH guard to /root/.bashrc and /root/.bash_profile, same
grep pattern already used by the ~/.local/bin branch below. No change
on distros where /usr/local/bin is already inherited.

* fix(update): repair RHEL root PATH on hermes update

Existing RHEL/CentOS/Rocky/Alma root installs won't be repaired by the
install.sh fix alone because 'hermes update' is an in-place git pull, not
a rerun of install.sh. Port the same probe + idempotent .bashrc write
into cmd_update so affected users get fixed automatically on next update.

_ensure_fhs_path_guard() runs after 'Update complete!':
- Linux + root + FHS-layout install (command at /usr/local/bin/hermes) only
- Probe: env -i bash -i -c 'command -v hermes' — fresh non-login interactive
  shell, same scenario the user reports
- On failure, append PATH guard to /root/.bashrc and /root/.bash_profile,
  skipping if any uncommented PATH line already mentions /usr/local/bin
- Silent no-op on macOS, non-root, legacy layout, or shells that already
  resolve hermes
2026-04-26 12:22:37 -07:00
Teknium
087e74d4d7
feat(slack): register every gateway command as a native slash (Discord/Telegram parity) (#16164)
Every command in COMMAND_REGISTRY (/btw, /stop, /model, /help, /new,
/bg, /reset, ...) is now a first-class Slack slash command instead of
a /hermes <subcommand>. Users get the same autocomplete-driven slash
picker experience Slack users expect and that Discord and Telegram
already provide.

Previously Slack registered ONE native slash (/hermes) and split on
the first word, so typing /btw in Slack's composer got 'couldn't find
an app for /btw' because the workspace manifest never declared it.

Changes
- hermes_cli/commands.py: slack_native_slashes() + slack_app_manifest()
  generate a Slack manifest from the registry (canonical names +
  aliases + plugin commands), clamped to Slack's 50-slash cap with
  /hermes reserved as the catch-all.
- gateway/platforms/slack.py: single regex matcher dispatches every
  registered slash to _handle_slash_command, which dispatches on
  command['command']. Legacy /hermes <subcommand> keeps working for
  backward compat with older workspace manifests.
- hermes_cli/slack_cli.py + hermes_cli/main.py: new 'hermes slack
  manifest' command prints/writes a full manifest (display info,
  OAuth scopes, event subs, socket mode, slash commands) ready to
  paste into 'Create from manifest' or Features → App Manifest.
- hermes_cli/setup.py: _setup_slack() now writes the manifest up-front
  and points users at the 'From an app manifest' flow; also offers
  to refresh the manifest on reconfigure for picking up new commands.
- Tests: 14 new tests covering native-slash dispatch (/btw, /stop,
  /model), legacy /hermes <sub> compat, manifest structure, and
  telegram<->slack parity (every Telegram command must also register
  as a Slack slash). Existing /hermes-registration test updated to
  assert the new regex matches /hermes, /btw, /stop, /model, /help.
- Docs: slack.md gains a 'Slash Commands' section + Option A manifest
  flow in Step 1; cli-commands.md documents 'hermes slack manifest'.

Users pick up the new slashes by running 'hermes slack manifest --write'
and pasting into Features → App Manifest → Edit in their Slack app
config, then Save (Slack prompts for reinstall if scopes changed).
2026-04-26 11:38:32 -07:00
Teknium
42c076d349
feat(browser): auto-spawn local Chromium for LAN/localhost URLs in cloud mode (#16136)
When a cloud browser provider (Browserbase / Browser-Use / Firecrawl) is
configured, browser_navigate now transparently spawns a local Chromium
sidecar for URLs whose host resolves to a private/loopback/LAN address
(localhost, 127.0.0.1, 192.168.x.x, 10.x.x.x, *.local, *.lan, *.internal,
::1, 169.254.x.x). Public URLs continue to use the cloud provider in the
same conversation.

Previously, setting BROWSERBASE_API_KEY / cloud_provider: browserbase
pinned the whole tool to cloud for the process — localhost URLs were
either SSRF-blocked (default) or sent to Browserbase (where they 404'd
because the cloud can't reach your LAN). Users who wanted 'cloud for
public, local for localhost' had no way to express it short of toggling
providers mid-session.

Implementation uses a composite session key scheme: the bare task_id
serves the cloud session, and a '{task_id}::local' sidecar serves the
local Chromium. _last_active_session_key[task_id] tracks which of the
two served the most recent nav so snapshot/click/fill/etc. hit the
correct one. cleanup_browser(bare_task_id) reaps both.

Feature is on by default. Opt out via:
  browser:
    auto_local_for_private_urls: false

The cloud provider never sees private URLs. Post-redirect SSRF guard
is preserved: redirects from public onto private addresses still block.
2026-04-26 09:57:58 -07:00
Teknium
0e2a53eab2
feat(skills): show enabled/disabled status in 'skills list' (#16129)
'hermes skills list' now shows every skill's enabled/disabled status
and accepts --enabled-only to filter down to what will actually load
for the active profile:

    hermes -p dario skills list --enabled-only

Previously the command was a flat catalog — it did not apply
skills.disabled from config.yaml, so there was no way to see the
live skill set for a profile without reading config by hand.
Profile switching already works via -p (swaps HERMES_HOME); this
just surfaces the result visibly.

Changes:
- hermes_cli/skills_hub.py: do_list adds a Status column and an
  enabled_only filter; summary reports enabled/disabled split
- hermes_cli/main.py: --enabled-only flag on 'skills list'
- /skills list slash command accepts --enabled-only too
- tests: 4 new (status column, disabled marking, enabled-only
  hiding, no platform leakage into get_disabled_skill_names);
  existing fixtures updated to accept skip_disabled kwarg

Reported by @mochizukimr on X.
2026-04-26 09:20:53 -07:00
Teknium
f2d655529a fix(auth): hoist get_env_value import + strengthen .env fallback tests
Follow-up to cherry-picked PR #15920:

- agent/credential_pool.py: hoist 'from hermes_cli.config import get_env_value'
  to module top instead of inline try/except in each seed site (3 sites).
  No import cycle — hermes_cli/config.py doesn't depend on agent.credential_pool.
- hermes_cli/auth.py: same hoist for the _resolve_api_key_provider_secret loop.
- tests/tools/test_credential_pool_env_fallback.py: replace smoke-only tests
  with real .env file I/O. Each test writes a temp ~/.hermes/.env, verifies
  _seed_from_env / _resolve_api_key_provider_secret read from it, and asserts
  the full priority chain: os.environ > .env > credential_pool. Uses
  'deepseek' as the test provider since 'openai' isn't in PROVIDER_REGISTRY
  and _seed_from_env's generic path requires a real pconfig lookup.
2026-04-26 08:32:09 -07:00
阿泥豆
8443998dc3 fix(auth): resolve API keys from ~/.hermes/.env and credential_pool
_resolve_api_key_provider_secret() and _seed_from_env() only checked
os.environ for provider API keys. When keys exist in ~/.hermes/.env but
are not loaded into the process environment (e.g. ACP adapter entry
point, post-session-start .env edits, or non-CLI entry points), the
resolution returns an empty string, causing HTTP 401 failures.

Changes:
- credential_pool._seed_from_env: use get_env_value() which checks both
  os.environ and ~/.hermes/.env file, preventing _prune_stale_seeded_entries
  from removing valid entries whose env var isn't in os.environ
- credential_pool._seed_from_env: same fix for openrouter and
  base_url_env_var resolution
- auth._resolve_api_key_provider_secret: use get_env_value() instead of
  os.getenv(), and add credential_pool fallback when env resolution fails

Fixes #15914
2026-04-26 08:32:09 -07:00
Teknium
06f81752ed
Revert "feat(kanban): durable multi-profile collaboration board (#16081)" (#16098)
This reverts commit 15937a6b46.
2026-04-26 08:29:37 -07:00
Teknium
15937a6b46
feat(kanban): durable multi-profile collaboration board (#16081)
New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for
worker and orchestrator profiles. SQLite-backed task board
(~/.hermes/kanban.db) shared across all profiles on the host. Zero
changes to run_agent.py, no new core tools, no tool-schema bloat.

Motivation: delegate_task is a function call — sync fork/join, anonymous
subagent, no resumability, no human-in-the-loop. Kanban is the durable
shape needed for research triage, scheduled ops, digital twins,
engineering pipelines, and fleet work. They coexist (workers may call
delegate_task internally).

What this adds
- hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution,
  dispatcher, workspace resolution, worker-context builder.
- hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash()
  entry point used by both CLI and gateway.
- skills/devops/kanban-worker — how a profile should work a claimed task.
- skills/devops/kanban-orchestrator — "you are a dispatcher, not a
  worker" template with anti-temptation rules.
- /kanban slash command wired into cli.py and gateway/run.py. Bypasses
  the running-agent guard (board writes don't touch agent state), so
  /kanban unblock can free a stuck worker mid-conversation.
- Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis
  vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns;
  4 user stories; implementation plan; concurrency correctness.
- Docs: website/docs/user-guide/features/kanban.md, CLI reference
  updated, sidebar entry added.

Architecture highlights
- Three planes: control (user + gateway), state (board + dispatcher),
  execution (pool of profile processes).
- Every worker is a full OS process, spawned as `hermes -p <profile>`.
  No in-process subagent swarms — solves NanoClaw's SDK-lifecycle
  failure class.
- Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale
  claims reclaimed 15 min after their TTL expires.
- Tenant namespacing via one nullable column — one specialist fleet
  can serve many businesses with data isolation by workspace path.

Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution,
dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass
hermetic via scripts/run_tests.sh.
2026-04-26 08:24:26 -07:00
Teknium
7fa70b6c87
refactor: /btw is now an alias for /background (#16053)
The ephemeral no-tools side-question variant of /btw confused users who
expected 'by-the-way' to mean 'run this off to the side with tools' —
they'd type /btw and get a toolless agent that couldn't do the work.
/bg worked because it was /background with full tools.

Collapse the two: /btw and /bg both alias to /background. One command,
one behavior, no more gotchas about which variant has tools.

Removed:
- _handle_btw_command in cli.py and gateway/run.py
- _run_btw_task + _active_btw_tasks state in gateway/run.py
- prompt.btw JSON-RPC method + btw.complete event in tui_gateway
- BtwStartResponse type + btw.complete case in ui-tui
- Standalone /btw slash tree registration in Discord
- Standalone btw CommandDef in hermes_cli/commands.py

Updated:
- background CommandDef aliases: (bg,) -> (bg, btw)
- TUI session.ts: local btw handler merged into background
- Docs and tips updated to describe /btw as a /background alias
2026-04-26 07:11:08 -07:00
Teknium
1e37ddc929
feat(cli): add 'hermes fallback' command to manage fallback providers (#16052)
Manage the fallback_providers chain from the CLI instead of hand-editing
config.yaml. The picker reuses select_provider_and_model() from 'hermes
model' — same provider list, same credential prompts, same model picker.

  hermes fallback [list]   Show the current chain (primary + fallbacks)
  hermes fallback add      Run the model picker, append selection to chain
  hermes fallback remove   Pick an entry to delete (arrow-key menu)
  hermes fallback clear    Remove all entries (with confirmation)

'add' snapshots config['model'] before calling the picker, extracts the
user's selection from the post-picker state, then restores the primary
and appends {provider, model, base_url?, api_mode?} to fallback_providers.
Auth store's active_provider is snapshot/restored too so OAuth-provider
fallbacks don't silently deactivate the user's primary. Duplicates and
self-as-fallback are rejected. Legacy single-dict 'fallback_model' entries
are auto-migrated to the list format on first write.
2026-04-26 06:19:04 -07:00
Teknium
83c1c201f6
feat(onboarding): contextual first-touch hints for /busy and /verbose (#16046)
Instead of a blocking first-run questionnaire, show a one-time hint the first
time the user hits each behavior fork:

1. First message while the agent is working — appends a hint to the busy-ack
   explaining the /busy queue vs /busy interrupt knob, phrased to match the
   mode that was just applied (don't tell a queue-mode user to switch to
   queue).

2. First tool that runs for >= 30s in the noisiest progress mode
   (tool_progress: all) — prints a hint about /verbose to cycle display
   modes (all -> new -> off -> verbose). Gated on /verbose actually being
   usable on the surface: always shown on CLI; on gateway only shown when
   display.tool_progress_command is enabled.

Each hint is latched in config.yaml under onboarding.seen.<flag>, so it
fires exactly once per install across CLI, gateway, and cron, then never
again. Users can wipe the section to re-see hints.

New:
- agent/onboarding.py — is_seen / mark_seen / hint strings, shared by
  both CLI and gateway.
- onboarding.seen in DEFAULT_CONFIG (hermes_cli/config.py) and in
  load_cli_config defaults (cli.py). No _config_version bump — deep
  merge handles new keys.

Wired:
- gateway/run.py: _handle_active_session_busy_message appends the hint
  after building the ack.  progress_callback tracks tool.completed
  duration and queues the tool-progress hint into the progress bubble.
- cli.py: CLI input loop appends the busy-input hint on the first busy
  Enter; _on_tool_progress appends the tool-progress hint on the first
  >=30s tool completion.  In-memory CLI_CONFIG is also updated so
  subsequent fires in the same process are suppressed immediately.

All writes go through atomic_yaml_write and are wrapped in try/except
so onboarding can never break the input/busy-ack paths.
2026-04-26 06:06:27 -07:00
Teknium
855366909f
feat(models): remote model catalog manifest for OpenRouter + Nous Portal (#16033)
OpenRouter and Nous Portal curated picker lists now resolve via a JSON
manifest served by the docs site, falling back to the in-repo snapshot
when unreachable. Lets us update model lists without shipping a release.

Live URL: https://hermes-agent.nousresearch.com/docs/api/model-catalog.json
(source at website/static/api/model-catalog.json; auto-deploys via the
existing deploy-site.yml GitHub Pages pipeline on every merge to main).

Schema (v1) carries id + optional description + free-form metadata at
manifest, provider, and model levels. Pricing and context length stay
live-fetched via existing machinery (/v1/models endpoints, models.dev).

Config (new model_catalog section, default enabled):
  model_catalog.url       master manifest URL
  model_catalog.ttl_hours disk cache TTL (default 24h)
  model_catalog.providers.<name>.url   optional per-provider override

Fetch pipeline: in-process cache -> disk cache (fresh < TTL) -> HTTP
fetch -> disk-cache-on-failure fallback -> in-repo snapshot as last
resort. Never raises to callers; at worst returns the bundled list.

Changes:
- website/static/api/model-catalog.json    initial manifest (35 OR + 31 Nous)
- scripts/build_model_catalog.py           regenerator from in-repo lists
- hermes_cli/model_catalog.py              fetch + validate + cache module
- hermes_cli/models.py                     fetch_openrouter_models() +
                                           new get_curated_nous_model_ids()
- hermes_cli/main.py, hermes_cli/auth.py   Nous flows use the helper
- hermes_cli/config.py                     model_catalog defaults
- website/docs/reference/model-catalog.md  + sidebars.ts
- tests/hermes_cli/test_model_catalog.py   21 tests (validation, fetch
                                           success/failure, accessors,
                                           disabled, overrides, integration)
2026-04-26 05:46:43 -07:00
Teknium
59b56d445c
feat(hooks): add duration_ms to post_tool_call + transform_tool_result (#15429)
Plugin hooks fired after a tool dispatch now receive an integer
duration_ms kwarg measuring how long the tool's registry.dispatch()
call took (time.monotonic() before/after). Inspired by Claude Code
2.1.119 which added the same field to PostToolUse hook inputs.

Wire points:
- model_tools.py: measure dispatch latency, pass duration_ms to
  invoke_hook("post_tool_call", ...) and invoke_hook("transform_tool_result", ...)
- hermes_cli/hooks.py: include duration_ms in the synthetic payload
  used by 'hermes hooks test' and 'hermes hooks doctor' so shell-hook
  authors see the same shape at development time as runtime
- shell hooks (agent/shell_hooks.py): no code change needed;
  _serialize_payload already surfaces non-top-level kwargs under
  payload['extra'], so duration_ms lands at extra.duration_ms for
  shell-hook scripts

Plugin authors can now build latency dashboards, per-tool SLO alerts,
and regression canaries without having to wrap every tool manually.

Test: tests/test_model_tools.py::test_post_tool_call_receives_non_negative_integer_duration_ms
E2E: real PluginManager + dispatch monkey-patched with a 50ms sleep,
hook callback observes duration_ms=50 (int).

Refs: https://code.claude.com/docs/en/changelog (2.1.119, Apr 23 2026)
2026-04-25 22:13:12 -07:00
Teknium
a55de5bcd0
feat(setup): auto-reconfigure on existing installs (#15879)
Bare `hermes setup` on a returning user now drops straight into the
full reconfigure wizard — every prompt shows the current value as its
default, press Enter to keep or type a new value to change it. The
returning-user menu is gone.

Behavior:
- First-time user: first-time wizard (unchanged)
- Returning user, bare command: full reconfigure wizard (new default)
- Returning user, `--quick`: only prompt for missing/unset items
- Returning user, one section: `hermes setup model|terminal|gateway|tools|agent`
- `--reconfigure`: preserved as backwards-compat alias (no-op since it's now default)

The section functions already used current values as prompt defaults —
this change just removes the extra click to get to them.

The 'Quick Setup - configure missing items only' menu option is now
exposed as the explicit `--quick` flag; it's the narrow case of
filling in missing config (e.g. after a partial OpenClaw migration or
when a required API key got cleared).

Inspired by Mercury Agent's `mercury doctor` UX.

Also removes:
- RETURNING_USER_MENU_SECTION_KEYS (orphaned constant)
- Two returning-user menu tests in test_setup_noninteractive.py
  (guarding behavior that no longer exists — covered by
  test_setup_reconfigure.py instead)
2026-04-25 22:02:02 -07:00
Teknium
731e1ef8cb feat(azure-foundry): auto-detect transport, models, context length
The azure-foundry wizard now probes the endpoint before asking the user
to pick anything by hand:

  1. URL path sniff — endpoints ending in /anthropic are Azure Foundry
     Claude routes and skip to anthropic_messages.
  2. GET <base>/models probe — if the endpoint returns an OpenAI-shaped
     model list, we switch to chat_completions and prefill the picker
     with the returned deployment/model IDs.
  3. Anthropic Messages probe — fallback for endpoints that don't expose
     /models but do speak the Anthropic Messages shape.
  4. Manual fallback — private endpoints / custom routes still work;
     the user picks API mode + types a deployment name.

Context length for the selected model is resolved through the existing
agent.model_metadata.get_model_context_length chain (models.dev,
provider metadata, hardcoded family fallbacks) and stored in
model.context_length when a non-default value is found.

Also refactors runtime_provider so Azure Foundry resolution is reused
between the explicit-credentials path and the default top-level path —
previously the /v1 strip for Anthropic-style Azure only ran when the
caller passed explicit_* args, which meant config-driven sessions
hit a double-/v1 URL.

New module hermes_cli/azure_detect.py with 19 unit tests covering:
- path sniff, model ID extraction, probe fallbacks
- HTTP error handling (URLError, HTTPError)
- context-length lookup passthrough
- DEFAULT_FALLBACK_CONTEXT rejection

New runtime tests cover:
- OpenAI-style Azure Foundry
- Anthropic-style Azure Foundry with /v1 stripping
- Missing base_url / API key raising AuthError

Rationale: Microsoft confirms there's no pure-API-key endpoint to list
Azure deployments (that requires ARM management auth).  The v1 Azure
OpenAI endpoint does expose /models with the resource's available
model catalog, which is good enough for picker prefill in the common
case.  Users on private/gated endpoints fall through to manual entry.
2026-04-25 18:48:43 -07:00
HangGlidersRule
d8e4c7214e fix: Azure Anthropic short-circuit in resolve_runtime_provider — bypass custom runtime when provider=anthropic + azure.com URL 2026-04-25 18:48:43 -07:00
HangGlidersRule
6ef3a47ce5 fix: use Azure API key directly for Azure endpoints, bypass OAuth token priority chain 2026-04-25 18:48:43 -07:00
TechPrototyper
3a7653dd1f feat: Add Azure Foundry provider with OpenAI/Anthropic API mode selection
Add support for Azure Foundry as a new inference provider. Azure Foundry
endpoints can use either OpenAI-style (/v1/chat/completions) or
Anthropic-style (/v1/messages) API formats.

Changes:
- Add azure-foundry to PROVIDER_REGISTRY (auth.py)
- Add azure-foundry overlay in HERMES_OVERLAYS (providers.py)
- Add empty model list for azure-foundry (models.py)
- Add _model_flow_azure_foundry() interactive setup (main.py)
- Add azure-foundry runtime resolution with api_mode support (runtime_provider.py)
- Add AZURE_FOUNDRY_API_KEY and AZURE_FOUNDRY_BASE_URL env vars (config.py)

Usage:
  hermes model -> More providers -> Azure Foundry

The setup wizard prompts for:
- Endpoint URL
- API format (OpenAI or Anthropic-style)
- API key
- Model name

Configuration is saved to config.yaml (model.provider, model.base_url,
model.api_mode, model.default) and ~/.hermes/.env (AZURE_FOUNDRY_API_KEY).
2026-04-25 18:48:43 -07:00
Teknium
125de02056
fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K (#15844)
Fixes #15779. Custom-provider per-model context_length (`custom_providers[].models.<id>.context_length`) is now honored across every resolution path, not just agent startup. Also adds 256K as the top probe tier and default fallback.

## What changed

New helper `hermes_cli.config.get_custom_provider_context_length()` — single source of truth for the per-model override lookup, with trailing-slash-insensitive base-url matching.

`agent.model_metadata.get_model_context_length()` gains an optional `custom_providers=` kwarg (step 0b — runs after explicit `config_context_length` but before every other probe).

Wired through five call sites that previously either duplicated the lookup or ignored it entirely:
- `run_agent.py` startup — refactored to use the new helper (dedups legacy inline loop, keeps invalid-value warning)
- `AIAgent.switch_model()` — re-reads custom_providers from live config on every /model switch
- `hermes_cli.model_switch.resolve_display_context_length()` — new `custom_providers=` kwarg
- `gateway/run.py` /model confirmation (picker callback + text path)
- `gateway/run.py` `_format_session_info` (/info)

## Context probe tiers

`CONTEXT_PROBE_TIERS = [256_000, 128_000, 64_000, 32_000, 16_000, 8_000]` — was `[128_000, ...]`. `DEFAULT_FALLBACK_CONTEXT` follows tier[0], so unknown models now default to 256K. The stale `128000` literal in the OpenRouter metadata-miss path is replaced with `DEFAULT_FALLBACK_CONTEXT` for consistency.

## Repro (from #15779)

```yaml
custom_providers:
  - name: my-custom-endpoint
    base_url: https://example.invalid/v1
    model: gpt-5.5
    models:
      gpt-5.5:
        context_length: 1050000
```

`/model gpt-5.5 --provider custom:my-custom-endpoint` → previously "Context: 128,000", now "Context: 1,050,000".

## Tests

- `tests/hermes_cli/test_custom_provider_context_length.py` — new file, 19 tests covering the helper, step-0b integration, and the 256K tier invariants
- `tests/hermes_cli/test_model_switch_context_display.py` — added regression tests for #15779 through the display resolver
- `tests/gateway/test_session_info.py` — updated default-fallback assertion (128K → 256K)
- `tests/agent/test_model_metadata.py` — updated tier assertions for the new top tier
2026-04-25 18:47:53 -07:00
Oluwadare Feranmi
dc5e02ea7f feat(cli): implement hermes update --check flag (fixes #10318) 2026-04-25 18:39:55 -07:00
Teknium
8bbeaea6c7 fix(config): broaden api-key ref lookup to templated base_url
The raw-template lookup added in PR #15817 went through
`get_compatible_custom_providers(read_raw_config())`, which calls
`_normalize_custom_provider_entry` → `urlparse(base_url)`. Any
entry whose `base_url` is itself an env-ref (`${NEURALWATT_API_BASE}`)
was dropped as 'not a valid URL', so `api_key_ref` stayed empty and the
resolved secret was still written to `model.api_key` — the exact case
the original Discord report described.

Replace the normalizer-gated lookup with a direct read of
`raw['custom_providers']` and `raw['providers']`, indexed by name
(case-insensitive, optionally qualified by model) so the loaded
(expanded) entry can be matched regardless of how `base_url` is
written.

Add an integration regression test driving the real
`select_provider_and_model` entry point with the Discord-reported
NeuralWatt config (`${VAR}` in both `base_url` and `api_key`).
This test fails on the PR-only fix and passes with the broadened
lookup.
2026-04-25 18:10:52 -07:00
helix4u
1fdc31b214 fix(config): preserve custom provider api key refs 2026-04-25 18:10:52 -07:00
kshitijk4poor
2c56dce0ed fix(model): preserve custom endpoint credentials and accept cloud models not in /v1/models
When switching models on a custom endpoint (ollama-launch):
- Same-provider switches no longer re-resolve credentials (fixes base_url
  being lost for 'custom' provider on subsequent switches)
- Named providers (ollama-launch) are resolved via user_providers so
  switch_model can find their base_url from config
- Models not in the /v1/models probe but present in the user's saved
  provider config are accepted with a warning instead of rejected
- CLI /model and TUI /model both pass user_providers/custom_providers
  to switch_model so the config model list is available for validation

Closes #15088
2026-04-25 18:03:47 -07:00
helix4u
b2d3308f98 fix(doctor): accept bare custom provider 2026-04-25 18:01:36 -07:00
Brooklyn Nicholson
c6fdf48b79 fix(tui): sync inference model after switches
- keep HERMES_INFERENCE_MODEL aligned with HERMES_MODEL after in-TUI model switches
- clarify static provider detection remapping docs
2026-04-25 14:17:57 -05:00
Brooklyn Nicholson
fdcbd2257b fix(tui): resolve startup model aliases statically
- expand short model aliases like sonnet/opus via static catalogs during startup runtime resolution
- keep startup alias resolution network-free and add regression tests in models and tui gateway suites
2026-04-25 14:13:02 -05:00
Brooklyn Nicholson
5e52011de3 fix(tui): bind provider as model alias 2026-04-25 13:58:59 -05:00
Brooklyn Nicholson
e48a497d16 fix(tui): share static model detection 2026-04-25 13:56:16 -05:00
Brooklyn Nicholson
e9c47c7042 fix(tui): honor launch model overrides 2026-04-25 13:21:59 -05:00
brooklyn!
ee0728c6c4
Merge pull request #15351 from helix4u/fix/tui-rebuild-missing-ink-bundle
fix(tui): rebuild when ink bundle is missing
2026-04-25 13:14:23 -05:00
Teknium
5006b2204b
fix(update): honor RestartSec when polling for gateway respawn (#15707)
The post-graceful-drain is-active poll used a fixed 10s timeout, but
systemd's hermes-gateway.service has RestartSec=30 — so systemd won't
respawn the unit for 30s after exit-75, and our poll gives up during
the cooldown. Result: every 'hermes update' printed

  ⚠ hermes-gateway drained but didn't relaunch — forcing restart

followed by a redundant 'systemctl restart' that kicked the newly-
respawning gateway again (and re-started WhatsApp / Discord a second
time in the process).

Fix: read RestartUSec from the unit via 'systemctl show' and set the
poll budget to max(10s, RestartSec + 10s slack). Units without
RestartSec set (or value=infinity) fall back to the original 10s.

Observed timeline from journalctl before fix:
  08:56:22.262  old PID exits 75
  08:56:32.707  systemd logs Stopped -> Started  (10.4s gap, > 10s budget)

After fix the poll covers 40s — comfortably inside RestartSec + slack.

Validation:
- RestartUSec parser tested against '30s', '100ms', '1min 30s',
  'infinity', '', 'garbage', '500us', '2min' — all correct.
- Against the live hermes-gateway.service: parses to 30.0s.
- tests/hermes_cli/test_update_gateway_restart.py: 41/41 pass.
2026-04-25 09:08:27 -07:00
Teknium
a9fa73a620
feat(oneshot): add --model / --provider / HERMES_INFERENCE_MODEL (#15704)
Makes hermes -z usable by sweeper without mutating user config.

- Top-level -m/--model and --provider flags that apply to -z/--oneshot
  (mirrors hermes chat's plumbing).
- HERMES_INFERENCE_MODEL env var as the parallel to HERMES_INFERENCE_PROVIDER
  for CI / scripted invocations.
- resolve_runtime_provider() gets the requested provider; when --model is
  given without --provider, detect_provider_for_model() auto-selects the
  provider that serves it (same semantic as /model in an interactive session).
- --provider without --model errors out with exit 2 — carrying a config
  model across to a different provider is usually wrong, and silently
  picking the provider's catalog default hides the mismatch.

Config defaults still used when both flags are omitted (existing behavior).

Validation (all live against OpenRouter):
  -z 'x' ....................... uses config default (opus-4.7)
  -z 'x' --model haiku-4.5 ..... haiku-4.5 via auto-detected openrouter
  -z 'x' --model ... --provider  pair as given
  HERMES_INFERENCE_MODEL=... -z  haiku-4.5 via env var
  -z 'x' --provider anthropic .. exits 2 with error to stderr
2026-04-25 08:55:36 -07:00
Teknium
7c8c031f60
feat: add hermes -z <prompt> one-shot mode (#15702)
* feat: add `hermes -z <prompt>` one-shot mode

Top-level flag that runs a single prompt and prints ONLY the final
response text to stdout. No banner, no spinner, no tool previews, no
session_id line — stdout is machine-readable, stderr is silent.

Tools, memory, rules, and AGENTS.md in the CWD are loaded as normal.
Approvals are auto-bypassed (sets HERMES_YOLO_MODE=1 for the call).
Bypasses cli.py entirely — goes straight to AIAgent.chat().

* feat(oneshot): handle interactive-callback gaps explicitly

Document (and where needed, patch) the interactive surfaces that have
no user to answer in oneshot mode:

  - clarify       — inject a callback that tells the agent to pick the
                    best default and continue (previously returned a
                    generic 'not available in this execution context'
                    error that wastes a tool call)
  - sudo password — terminal_tool already gates on HERMES_INTERACTIVE
                    (we don't set it); sudo fails gracefully
  - shell hooks   — HERMES_ACCEPT_HOOKS=1 auto-approves; also falls
                    back to deny on non-tty stdin
  - dangerous cmd — HERMES_YOLO_MODE=1 short-circuits before input()
  - secret capture— tool returns gracefully when no callback wired

Live-tested: agent asked clarify(['red','blue']) and got 'red' back,
replied with only 'red'.
2026-04-25 08:44:38 -07:00
Teknium
ea01bdcebe
refactor(memory): remove flush_memories entirely (#15696)
The AIAgent.flush_memories pre-compression save, the gateway
_flush_memories_for_session, and everything feeding them are
obsolete now that the background memory/skill review handles
persistent memory extraction.

Problems with flush_memories:

- Pre-dates the background review loop.  It was the only memory-save
  path when introduced; the background review now fires every 10 user
  turns on CLI and gateway alike, which is far more frequent than
  compression or session reset ever triggered flush.
- Blocking and synchronous.  Pre-compression flush ran on the live agent
  before compression, blocking the user-visible response.
- Cache-breaking.  Flush built a temporary conversation prefix
  (system prompt + memory-only tool list) that diverged from the live
  conversation's cached prefix, invalidating prompt caching.  The
  gateway variant spawned a fresh AIAgent with its own clean prompt
  for each finalized session — still cache-breaking, just in a
  different process.
- Redundant.  Background review runs in the live conversation's
  session context, gets the same content, writes to the same memory
  store, and doesn't break the cache.  Everything flush_memories
  claimed to preserve is already covered.

What this removes:

- AIAgent.flush_memories() method (~248 LOC in run_agent.py)
- Pre-compression flush call in _compress_context
- flush_memories call sites in cli.py (/new + exit)
- GatewayRunner._flush_memories_for_session + _async_flush_memories
  (and the 3 call sites: session expiry watcher, /new, /resume)
- 'flush_memories' entry from DEFAULT_CONFIG auxiliary tasks,
  hermes tools UI task list, auxiliary_client docstrings
- _memory_flush_min_turns config + init
- #15631's headroom-deduction math in
  _check_compression_model_feasibility (headroom was only needed
  because flush dragged the full main-agent system prompt along;
  the compression summariser sends a single user-role prompt so
  new_threshold = aux_context is safe again)
- The dedicated test files and assertions that exercised
  flush-specific paths

What this renames (with read-time backcompat on sessions.json):

- SessionEntry.memory_flushed -> SessionEntry.expiry_finalized.
  The session-expiry watcher still uses the flag to avoid re-running
  finalize/eviction on the same expired session; the new name
  reflects what it now actually gates.  from_dict() reads
  'expiry_finalized' first, falls back to the legacy 'memory_flushed'
  key so existing sessions.json files upgrade seamlessly.

Supersedes #15631 and #15638.

Tested: 383 targeted tests pass across run_agent/, agent/, cli/,
and gateway/ session-boundary suites.  No behavior regressions —
background memory review continues to handle persistent memory
extraction on both CLI and gateway.
2026-04-25 08:21:14 -07:00
Teknium
6e561ffa6d
fix(update): poll is-active instead of one-shot sleep(3) after gateway restart (#15639)
The auto-restart path in `hermes update` verifies systemd unit health with
`time.sleep(3)` + a single `systemctl is-active` call.  The unit's
Stopped -> Started transition after a graceful SIGUSR1 exit (or a hard
restart) is not always complete inside that 3s window, so the verify
races and reports 'drained but didn't relaunch' even though systemd is
about to bring the unit back up a fraction of a second later.  Users
then see a spurious warning, a redundant fallback `systemctl restart`
fires, and adapters (Discord, WhatsApp) get restarted twice.

Replace the three sleep+oneshot sites with a small `_wait_for_service_active()`
closure that polls `is-active` every 0.5s for up to 10s.  Behaviour
is unchanged when the unit is healthy or truly dead — only the race
window around a clean restart is now handled correctly.

Tests: tests/hermes_cli/test_update_gateway_restart.py (41/41).
2026-04-25 06:11:22 -07:00
Teknium
ac05daa189
fix(tools): dedupe bundled plugin toolsets with built-in entries (#15634)
`hermes tools` → "reconfigure existing" listed Spotify twice because
the Apr 24 refactor that moved Spotify into plugins/spotify/ (PR #15174)
left the entry in CONFIGURABLE_TOOLSETS. _get_effective_configurable_toolsets()
unconditionally appended get_plugin_toolsets() on top, so the same
'spotify' key showed up from both sources.

Dedupe by key — built-in CONFIGURABLE_TOOLSETS entry wins (it has the
nicer label and description). Also guards against future bundled plugins
that share a toolset key with a built-in.
2026-04-25 05:53:08 -07:00
Teknium
6ed37e0f42 feat(tools): make discord/discord_admin opt-in, Discord-only
Both discord (read/participate) and discord_admin (server admin) are now
configurable via `hermes tools` with default-OFF. Previously the core
discord tool (fetch_messages, search_members, create_thread) auto-loaded
on every Discord install with DISCORD_BOT_TOKEN set — 19 tools the user
never opted into.

Adds a platform-scoping mechanism (_TOOLSET_PLATFORM_RESTRICTIONS) so
the discord toolsets only show up in the Discord platform's checklist,
not on CLI/Telegram/Slack/etc. Applied at four gates:
  - _prompt_toolset_checklist: checklist filter
  - _get_platform_tools: resolution filter (both branches)
  - _save_platform_tools: save-time filter (covers 'Configure all
    platforms' and hand-edited config.yaml)
  - tools_disable_enable_command: rejects `hermes tools enable discord`
    on non-Discord platforms with a clear error

build_session_context_prompt now injects the Discord IDs block only
when both conditions hold: the discord/discord_admin toolset is
enabled AND DISCORD_BOT_TOKEN is set. Toolset alone isn't enough —
the tool's check_fn gates on the token at registry time, so opting
in without a token yields no tools and the IDs block would lie.
Otherwise keep the stale-API disclaimer.
2026-04-25 04:51:11 -07:00
alt-glitch
81987f0350 feat(discord): split discord_server into discord + discord_admin tools
Split the monolithic discord_server tool (14 actions) into two:

- discord: core actions (fetch_messages, search_members, create_thread)
  that are useful for the agent's normal operation. Auto-enabled on
  the discord platform via the pipeline fix.

- discord_admin: server management actions (list channels/roles, pins,
  role assignment) that require explicit opt-in via hermes tools.
  Added to CONFIGURABLE_TOOLSETS and _DEFAULT_OFF_TOOLSETS.
2026-04-25 04:50:14 -07:00
alt-glitch
9830905dab fix(tools): recover non-configurable toolsets from composite resolution
The reverse-mapping loop in _get_platform_tools only checked
CONFIGURABLE_TOOLSETS, silently dropping platform-specific toolsets
like discord and feishu_doc whose tools were in the composite but
had no configurable key. Add a second pass over TOOLSETS that picks
up unclaimed toolsets whose tools are present in the resolved
composite.
2026-04-25 04:50:14 -07:00
alt-glitch
9d7b64b5dd fix(tools): normalize numeric entries and clear stale no_mcp in _save_platform_tools
YAML parses bare numeric toolset names (e.g. 12306:) as int, causing
TypeError in sorted() since the read path normalizes to str but the
save path did not.

The no_mcp sentinel was preserved in existing entries even when the
user re-enabled MCP servers, causing MCP to stay silently disabled.
2026-04-25 04:49:02 -07:00
Teknium
023b1bff11
fix(delegate): resolve subagent approval prompts without deadlocking parent TUI (#15491)
Subagents run inside a ThreadPoolExecutor. The CLI's interactive approval
callback lives in tools/terminal_tool.py's threading.local(), which worker
threads do not inherit. When a subagent hits a dangerous-command guard,
prompt_dangerous_approval() falls back to input() from the worker thread,
deadlocking against the parent's prompt_toolkit TUI that owns stdin.

Fix: install a non-interactive callback into every subagent worker thread
via ThreadPoolExecutor(initializer=set_approval_callback, initargs=(cb,)).
The callback is config-gated by delegation.subagent_auto_approve:

  false (default) -> _subagent_auto_deny (safe; matches leaf tool blocklist)
  true            -> _subagent_auto_approve (opt-in YOLO for cron/batch)

Both emit a logger.warning audit line. Gateway sessions are unaffected
because they resolve approvals via tools/approval.py's per-session queue,
not through these TLS callbacks. Diagnosis credit: @MorAlekss (#14685).

- hermes_cli/config.py: DEFAULT_CONFIG.delegation.subagent_auto_approve: False
- cli-config.yaml.example: documented, commented (default)
- tools/delegate_tool.py: _subagent_auto_deny, _subagent_auto_approve,
  _get_subagent_approval_callback, wired into the child timeout executor
- tests/tools/test_delegate.py: 7 tests covering defaults, truthy coercion,
  and TLS scoping in the worker thread
2026-04-24 22:37:22 -07:00
Teknium
05d8f11085
fix(/model): show provider-enforced context length, not raw models.dev (#15438)
/model gpt-5.5 on openai-codex showed 'Context: 1,050,000 tokens' because
the display block used ModelInfo.context_window directly from models.dev.
Codex OAuth actually enforces 272K for the same slug, and the agent's
compressor already runs at 272K via get_model_context_length() — so the
banner + real context budget said 272K while /model lied with 1M.

Route the display context through a new resolve_display_context_length()
helper that always prefers agent.model_metadata.get_model_context_length
(which knows about Codex OAuth, Copilot, Nous caps) and only falls back
to models.dev when that returns nothing.

Fix applied to all 3 /model display sites:
  cli.py _handle_model_switch
  gateway/run.py picker on_model_selected callback
  gateway/run.py text-fallback confirmation

Reported by @emilstridell (Telegram, April 2026).
2026-04-24 17:21:38 -07:00
sprmn24
c599a41b84 fix(auth): preserve corrupt auth.json and warn instead of silently resetting
_load_auth_store() caught all parse/read exceptions and silently
returned an empty store, making corruption look like a logout with
no diagnostic information and no way to recover the original file.

Now copies the corrupt file to auth.json.corrupt before resetting,
and logs a warning with the exception and backup path.
2026-04-24 15:22:44 -07:00
sprmn24
7957da7a1d fix(web_server): hold _oauth_sessions_lock during PKCE session state writes
_submit_anthropic_pkce() retrieved sess under _oauth_sessions_lock but
wrote back to sess["status"] and sess["error_message"] outside the lock.
A concurrent session GC or cancel could race with these writes, producing
inconsistent session state.

Wrap all 4 sess write sites in _oauth_sessions_lock:
- network exception path (Token exchange failed)
- missing access_token path
- credential save failure path
- success path (approved)
2026-04-24 15:22:04 -07:00
Julia Bennet
1dcf79a864 feat: add slash command for busy input mode 2026-04-24 15:15:26 -07:00
Teknium
b7c1d77e55 fix(dashboard): remove unimplemented 'block' busy_input_mode option
The web UI schema advertised 'block' as a busy_input_mode choice, but
no implementation ever existed — the gateway and CLI both silently
collapsed 'block' (and anything other than 'queue') to 'interrupt'.
Users who picked 'block' in the dashboard got interrupts anyway.

Drop 'block' from the select options. The two supported modes are
'interrupt' (default) and 'queue'.
2026-04-24 15:01:38 -07:00
helix4u
0738b80833 fix(tui): rebuild when ink bundle is missing 2026-04-24 15:51:38 -06:00
Teknium
db9d6375fb
feat(models): add openai/gpt-5.5 and gpt-5.5-pro to OpenRouter + Nous Portal (#15343)
Replaces gpt-5.4 / gpt-5.4-pro entries in the OpenRouter fallback snapshot
and the Nous Portal curated list. Other aggregators (Vercel AI Gateway)
and provider-native lists are unchanged.
2026-04-24 14:31:47 -07:00
Austin Pickett
63975aa75b fix: mobile chat in new layout 2026-04-24 12:07:46 -04:00
emozilla
f49afd3122 feat(web): add /api/pty WebSocket bridge to embed TUI in dashboard
Exposes hermes --tui over a PTY-backed WebSocket so the dashboard can
embed the real TUI rather than reimplement its surface. The browser
attaches xterm.js to the socket; keystrokes flow in, PTY output bytes
flow out.

Architecture:

    browser <Terminal> (xterm.js)
           │  onData ───► ws.send(keystrokes)
           │  onResize ► ws.send('\x1b[RESIZE:cols;rows]')
           │  write   ◄── ws.onmessage (PTY bytes)
           ▼
    FastAPI /api/pty (token-gated, loopback-only)
           ▼
    PtyBridge (ptyprocess) ── spawns node ui-tui/dist/entry.js ──► tui_gateway + AIAgent

Components
----------

hermes_cli/pty_bridge.py
  Thin wrapper around ptyprocess.PtyProcess: byte-safe read/write on the
  master fd via os.read/os.write (not PtyProcessUnicode — ANSI is
  inherently byte-oriented and UTF-8 boundaries may land mid-read),
  non-blocking select-based reads, TIOCSWINSZ resize, idempotent
  SIGHUP→SIGTERM→SIGKILL teardown, platform guard (POSIX-only; Windows
  is WSL-supported only).

hermes_cli/web_server.py
  @app.websocket("/api/pty") endpoint gated by the existing
  _SESSION_TOKEN (via ?token= query param since browsers can't set
  Authorization on WS upgrades). Loopback-only enforcement. Reader task
  uses run_in_executor to pump PTY bytes without blocking the event
  loop. Writer loop intercepts a custom \x1b[RESIZE:cols;rows] escape
  before forwarding to the PTY. The endpoint resolves the TUI argv
  through a _resolve_chat_argv hook so tests can inject fake commands
  without building the real TUI.

Tests
-----

tests/hermes_cli/test_pty_bridge.py — 12 unit tests: spawn, stdout,
stdin round-trip, EOF, resize (via TIOCSWINSZ + tput readback), close
idempotency, cwd, env forwarding, unavailable-platform error.

tests/hermes_cli/test_web_server.py — TestPtyWebSocket adds 7 tests:
missing/bad token rejection (close code 4401), stdout streaming,
stdin round-trip, resize escape forwarding, unavailable-platform ANSI
error frame + 1011 close, resume parameter forwarding to argv.

96 tests pass under scripts/run_tests.sh.

(cherry picked from commit 29b337bca70fc9efb082a5a852ea2cd5381af1a9)

feat(web): add Chat tab with xterm.js terminal + Sessions resume button

(cherry picked from commit 3d21aee8 by emozilla, conflicts resolved
 against current main: BUILTIN_ROUTES table + plugin slot layout)

fix(tui): replace OSC 52 jargon in /copy confirmation

When the user ran /copy successfully, Ink confirmed with:

  sent OSC52 copy sequence (terminal support required)

That reads like a protocol spec to everyone who isn't a terminal
implementer. The caveat was a historical artifact — OSC 52 wasn't
universally supported when this message was written, so the TUI
honestly couldn't guarantee the copy had landed anywhere.

Today every modern terminal (including the dashboard's embedded
xterm.js) handles OSC 52 reliably. Say what the user actually wants
to know — that it copied, and how much — matching the message the
TUI already uses for selection copy:

  copied 1482 chars

(cherry picked from commit a0701b1d5a598dd1d3b94038a7bcbb2a3ab559fc)

docs: document the dashboard Chat tab

AGENTS.md — new subsection under TUI Architecture explaining that the
dashboard embeds the real hermes --tui rather than rewriting it,
with pointers to the pty_bridge + WebSocket endpoint and the rule
'never add a parallel chat surface in React.'

website/docs/user-guide/features/web-dashboard.md — user-facing Chat
section inside the existing Web Dashboard page, covering how it works
(WebSocket + PTY + xterm.js), the Sessions-page resume flow, and
prerequisites (Node.js, ptyprocess, POSIX kernel / WSL on Windows).

(cherry picked from commit 2c2e32cc4519973c77b63016316b065c0f656704)

feat(tui-gateway): transport-aware dispatch + WebSocket sidecar

Decouples the JSON-RPC dispatcher from its I/O sink so the same handler
surface can drive multiple transports concurrently. The PTY chat tab
already speaks to the TUI binary as bytes — this adds a structured
event channel alongside it for dashboard-side React widgets that need
typed events (tool.start/complete, model picker state, slash catalog)
that PTY can't surface.

- `tui_gateway/transport.py` — `Transport` protocol + `contextvars` binding
  + module-level `StdioTransport` fallback. The stdio stream resolves
  through a lambda so existing tests that monkey-patch `_real_stdout`
  keep passing without modification.
- `tui_gateway/ws.py` — WebSocket transport implementation; FastAPI
  endpoint mounting lives in hermes_cli/web_server.py.
- `tui_gateway/server.py`:
  - `write_json` routes via session transport (for async events) →
    contextvar transport (for in-request writes) → stdio fallback.
  - `dispatch(req, transport=None)` binds the transport for the request
    lifetime and propagates it to pool workers via `contextvars.copy_context`
    so async handlers don't lose their sink.
  - `_init_session` and the manual-session create path stash the
    request's transport so out-of-band events (subagent.complete, etc.)
    fan out to the right peer.

`tui_gateway.entry` (Ink's stdio handshake) is unchanged externally —
it falls through every precedence step into the stdio fallback, byte-
identical to the previous behaviour.

feat(web): ChatSidebar — JSON-RPC sidecar next to xterm.js terminal

Composes the two transports into a single Chat tab:

  ┌─────────────────────────────────────────┬──────────────┐
  │  xterm.js / PTY  (emozilla #13379)      │ ChatSidebar  │
  │  the literal hermes --tui process       │  /api/ws     │
  └─────────────────────────────────────────┴──────────────┘
        terminal bytes                          structured events

The terminal pane stays the canonical chat surface — full TUI fidelity,
slash commands, model picker, mouse, skin engine, wide chars all paint
inside the terminal. The sidebar opens a parallel JSON-RPC WebSocket
to the same gateway and renders metadata that PTY can't surface to
React chrome:

  • model + provider badge with connection state (click → switch)
  • running tool-call list (driven by tool.start / tool.progress /
    tool.complete events)
  • model picker dialog (gateway-driven, reuses ModelPickerDialog)

The sidecar is best-effort. If the WS can't connect (older gateway,
network hiccup, missing token) the terminal pane keeps working
unimpaired — sidebar just shows the connection-state badge in the
appropriate tone.

- `web/src/components/ChatSidebar.tsx` — new component (~270 lines).
  Owns its GatewayClient, drives the model picker through
  `slash.exec`, fans tool events into a capped tool list.
- `web/src/pages/ChatPage.tsx` — split layout: terminal pane
  (`flex-1`) + sidebar (`w-80`, `lg+` only).
- `hermes_cli/web_server.py` — mount `/api/ws` (token + loopback
  guards mirror /api/pty), delegate to `tui_gateway.ws.handle_ws`.

Co-authored-by: emozilla <emozilla@nousresearch.com>

refactor(web): /clean pass on ChatSidebar + ChatPage lint debt

- ChatSidebar: lift gw out of useRef into a useMemo derived from a
  reconnect counter. React 19's react-hooks/refs and react-hooks/
  set-state-in-effect rules both fire when you touch a ref during
  render or call setState from inside a useEffect body. The
  counter-derived gw is the canonical pattern for "external resource
  that needs to be replaceable on user action" — re-creating the
  client comes from bumping `version`, the effect just wires + tears
  down. Drops the imperative `gwRef.current = …` reassign in
  reconnect, drops the truthy ref guard in JSX. modelLabel +
  banner inlined as derived locals (one-off useMemo was overkill).
- ChatPage: lazy-init the banner state from the missing-token check
  so the effect body doesn't have to setState on first run. Drops
  the unused react-hooks/exhaustive-deps eslint-disable. Adds a
  scoped no-control-regex disable on the SGR mouse parser regex
  (the \\x1b is intentional for xterm escape sequences).

All my-touched files now lint clean. Remaining warnings on web/
belong to pre-existing files this PR doesn't touch.

Verified: vitest 249/249, ui-tui eslint clean, web tsc clean,
python imports clean.

chore: uptick

fix(web): drop ChatSidebar tool list — events can't cross PTY/WS boundary

The /api/pty endpoint spawns `hermes --tui` as a child process with its
own tui_gateway and _sessions dict; /api/ws runs handle_ws in-process in
the dashboard server with a separate _sessions dict. Tool events fire on
the child's gateway and never reach the WS sidecar, so the sidebar's
tool.start/progress/complete listeners always observed an empty list.

Drop the misleading list (and the now-orphaned ToolCall primitive),
keep model badge + connection state + model picker + error banner —
those work because they're sidecar-local concerns. Surfacing tool calls
in the sidebar requires cross-process forwarding (PTY child opens a
back-WS to the dashboard, gateway tees emits onto stdio + sidecar
transport) — proper feature for a follow-up.

feat(web): wire ChatSidebar tool list to PTY child via /api/pub broadcast

The dashboard's /api/pty spawns hermes --tui as a child process; tool
events fire in the python tui_gateway grandchild and never crossed the
process boundary into the in-process WS sidecar — so the sidebar tool
list was always empty.

Cross-process forwarding:

- tui_gateway: TeeTransport (transport.py) + WsPublisherTransport
  (event_publisher.py, sync websockets client). entry.py installs the
  tee on _stdio_transport when HERMES_TUI_SIDECAR_URL is set, mirroring
  every dispatcher emit to a back-WS without disturbing Ink's stdio
  handshake.

- hermes_cli/web_server.py: new /api/pub (publisher) + /api/events
  (subscriber) endpoints with a per-channel registry. /api/pty now
  accepts ?channel= and propagates the sidecar URL via env. start_server
  also stashes app.state.bound_port so the URL is constructable.

- web/src/pages/ChatPage.tsx: generates a channel UUID per mount,
  passes it to /api/pty and as a prop to ChatSidebar.

- web/src/components/ChatSidebar.tsx: opens /api/events?channel=, fans
  tool.start/progress/complete back into the ToolCall list. Restores
  the ToolCall primitive.

Tests: 4 new TestPtyWebSocket cases cover channel propagation,
broadcast fan-out, and missing-channel rejection (10 PTY tests pass,
120 web_server tests overall).

fix(web): address Copilot review on #14890

Five threads, all real:

- gatewayClient.ts: register `message`/`close` listeners BEFORE awaiting
  the open handshake.  Server emits `gateway.ready` immediately after
  accept, so a listener attached after the open promise could race past
  the initial skin payload and lose it.

- ChatSidebar.tsx: wire `error`/`close` on the /api/events subscriber
  WS into the existing error banner.  4401/4403 (auth/loopback reject)
  surface as a "reload the page" message; mid-stream drops surface as
  "events feed disconnected" with the existing reconnect button.  Clean
  unmount closes (1000/1001) stay silent.

- web-dashboard.md: install hint was `pip install hermes-agent[web]` but
  ptyprocess lives in the `pty` extra, not `web`.  Switch to
  `hermes-agent[web,pty]` in both prerequisite blocks.

- AGENTS.md: previous "never add a parallel React chat surface" guidance
  was overbroad and contradicted this PR's sidebar.  Tightened to forbid
  re-implementing the transcript/composer/PTY terminal while explicitly
  allowing structured supporting widgets (sidebar / model picker /
  inspectors), matching the actual architecture.

- web/package-lock.json: regenerated cleanly so the wterm sibling
  workspace paths (extraneous machine-local entries) stop polluting CI.

Tests: 249/249 vitest, 10/10 PTY/events, web tsc clean.

refactor(web): /clean pass on ChatSidebar events handler

Spotted in the round-2 review:

- Banner flashed on clean unmount: `ws.close()` from the effect cleanup
  fires `close` with code 1005, opened=true, neither 1000 nor 1001 —
  hit the "unexpected drop" branch.  Track `unmounting` in the effect
  scope and gate the banner through a `surface()` helper so cleanup
  closes stay silent.

- DRY the duplicated "events feed disconnected" string into a local
  const used by both the error and close handlers.

- Drop the `opened` flag (no longer needed once the unmount guard is
  the source of truth for "is this an expected close?").
2026-04-24 10:51:49 -04:00
Teknium
1840c6a57d
feat(spotify): wire setup wizard into 'hermes tools' + document cron usage (#15180)
A — 'hermes tools' activation now runs the full Spotify wizard.

Previously a user had to (1) toggle the Spotify toolset on in 'hermes
tools' AND (2) separately run 'hermes auth spotify' to actually use
it. The second step was a discovery gap — the docs mentioned it but
nothing in the TUI pointed users there.

Now toggling Spotify on calls login_spotify_command as a post_setup
hook. If the user has no client_id yet, the interactive wizard walks
them through Spotify app creation; if they do, it skips straight to
PKCE. Either way, one 'hermes tools' pass leaves Spotify toggled on
AND authenticated. SystemExit from the wizard (user abort) leaves the
toolset enabled and prints a 'run: hermes auth spotify' hint — it
does NOT fail the toolset toggle.

Dropped the TOOL_CATEGORIES env_vars list for Spotify. The wizard
handles HERMES_SPOTIFY_CLIENT_ID persistence itself, and asking users
to type env var names before the wizard fires was UX-backwards — the
point of the wizard is that they don't HAVE a client_id yet.

B — Docs page now covers cron + Spotify.

New 'Scheduling: Spotify + cron' section with two working examples
(morning playlist, wind-down pause) using the real 'hermes cron add'
CLI surface (verified via 'cron add --help'). Covers the active-device
gotcha, Premium gating, memory isolation, and links to the cron docs.

Also fixed a stale '9 Spotify tools' reference in the setup copy —
we consolidated to 7 tools in #15154.

Validation:
- scripts/run_tests.sh tests/hermes_cli/test_tools_config.py
    tests/hermes_cli/test_spotify_auth.py
    tests/tools/test_spotify_client.py
  → 54 passed
- website: node scripts/prebuild.mjs && npx docusaurus build
  → SUCCESS, no new warnings
2026-04-24 07:24:28 -07:00
5park1e
e1106772d9 fix: re-auth on stale OAuth token; read Claude Code credentials from macOS Keychain
Bug 3 — Stale OAuth token not detected in 'hermes model':
- _model_flow_anthropic used 'has_creds = bool(existing_key)' which treats
  any non-empty token (including expired OAuth tokens) as valid.
- Added existing_is_stale_oauth check: if the only credential is an OAuth
  token (sk-ant- prefix) with no valid cc_creds fallback, mark it stale
  and force the re-auth menu instead of silently accepting a broken token.

Bug 4 — macOS Keychain credentials never read:
- Claude Code >=2.1.114 migrated from ~/.claude/.credentials.json to the
  macOS Keychain under service 'Claude Code-credentials'.
- Added _read_claude_code_credentials_from_keychain() using the 'security'
  CLI tool; read_claude_code_credentials() now tries Keychain first then
  falls back to JSON file.
- Non-Darwin platforms return None from Keychain read immediately.

Tests:
- tests/agent/test_anthropic_keychain.py: 11 cases covering Darwin-only
  guard, security command failures, JSON parsing, fallback priority.
- tests/hermes_cli/test_anthropic_model_flow_stale_oauth.py: 8 cases
  covering stale OAuth detection, API key passthrough, cc_creds fallback.

Refs: #12905
2026-04-24 07:14:00 -07:00
Teknium
8d12fb1e6b
refactor(spotify): convert to built-in bundled plugin under plugins/spotify (#15174)
Moves the Spotify integration from tools/ into plugins/spotify/,
matching the existing pattern established by plugins/image_gen/ for
third-party service integrations.

Why:
- tools/ should be reserved for foundational capabilities (terminal,
  read_file, web_search, etc.). tools/providers/ was a one-off
  directory created solely for spotify_client.py.
- plugins/ is already the home for image_gen backends, memory
  providers, context engines, and standalone hook-based plugins.
  Spotify is a third-party service integration and belongs alongside
  those, not in tools/.
- Future service integrations (eventually: Deezer, Apple Music, etc.)
  now have a pattern to copy.

Changes:
- tools/spotify_tool.py → plugins/spotify/tools.py (handlers + schemas)
- tools/providers/spotify_client.py → plugins/spotify/client.py
- tools/providers/ removed (was only used for Spotify)
- New plugins/spotify/__init__.py with register(ctx) calling
  ctx.register_tool() × 7. The handler/check_fn wiring is unchanged.
- New plugins/spotify/plugin.yaml (kind: backend, bundled, auto-load).
- tests/tools/test_spotify_client.py: import paths updated.

tools_config fix — _DEFAULT_OFF_TOOLSETS now wins over plugin auto-enable:
- _get_platform_tools() previously auto-enabled unknown plugin
  toolsets for new platforms. That was fine for image_gen (which has
  no toolset of its own) but bad for Spotify, which explicitly
  requires opt-in (don't ship 7 tool schemas to users who don't use
  it). Added a check: if a plugin toolset is in _DEFAULT_OFF_TOOLSETS,
  it stays off until the user picks it in 'hermes tools'.

Pre-existing test bug fix:
- tests/hermes_cli/test_plugins.py::test_list_returns_sorted
  asserted names were sorted, but list_plugins() sorts by key
  (path-derived, e.g. image_gen/openai). With only image_gen plugins
  bundled, name and key order happened to agree. Adding plugins/spotify
  broke that coincidence (spotify sorts between openai-codex and xai
  by name but after xai by key). Updated test to assert key order,
  which is what the code actually documents.

Validation:
- scripts/run_tests.sh tests/hermes_cli/test_plugins.py \
    tests/hermes_cli/test_tools_config.py \
    tests/hermes_cli/test_spotify_auth.py \
    tests/tools/test_spotify_client.py \
    tests/tools/test_registry.py
  → 143 passed
- E2E plugin load: 'spotify' appears in loaded plugins, all 7 tools
  register into the spotify toolset, check_fn gating intact.
2026-04-24 07:06:11 -07:00
Teknium
e5d41f05d4
feat(spotify): consolidate tools (9→7), add spotify skill, surface in hermes setup (#15154)
Three quality improvements on top of #15121 / #15130 / #15135:

1. Tool consolidation (9 → 7)
   - spotify_saved_tracks + spotify_saved_albums → spotify_library with
     kind='tracks'|'albums'. Handler code was ~90 percent identical
     across the two old tools; the merge is a behavioral no-op.
   - spotify_activity dropped. Its 'now_playing' action was a duplicate
     of spotify_playback.get_currently_playing (both return identical
     204/empty payloads). Its 'recently_played' action moves onto
     spotify_playback as a new action — history belongs adjacent to
     live state.
   - Net: each API call ships 2 fewer tool schemas when the Spotify
     toolset is enabled, and the action surface is more discoverable
     (everything playback-related is on one tool).

2. Spotify skill (skills/media/spotify/SKILL.md)
   Teaches the agent canonical usage patterns so common requests don't
   balloon into 4+ tool calls:
   - 'play X' = one search, then play by URI (not search + scan +
     describe + play)
   - 'what's playing' = single get_currently_playing (no preflight
     get_state chain)
   - Don't retry on '403 Premium required' or '403 No active device' —
     both require user action
   - URI/URL/bare-ID format normalization
   - Full failure-mode reference for 204/401/403/429

3. Surfaced in 'hermes setup' tool status
   Adds 'Spotify (PKCE OAuth)' to the tool status list when
   auth.json has a Spotify access/refresh token. Matches the
   homeassistant pattern but reads from auth.json (OAuth-based) rather
   than env vars.

Docs updated to reflect the new 7-tool surface, and mention the
companion skill in the 'Using it' section.

Tests: 54 passing (client 22, auth 15, tools_config 35 — 18 = 54 after
renaming/replacing the spotify_activity tests with library +
recently_played coverage). Docusaurus build clean.
2026-04-24 06:14:51 -07:00
XieNBi
4a51ab61eb fix(cli): non-zero /model counts for native OpenAI and direct API rows 2026-04-24 05:48:15 -07:00
Brian D. Evans
7f26cea390 fix(models): strip models/ prefix in Gemini validator (#12532)
Salvage of the Gemini-specific piece from PR #12585 by @briandevans.
Gemini's OpenAI-compat /v1beta/openai/models endpoint returns IDs prefixed
with 'models/' (native Gemini-API convention), so set-membership against
curated bare IDs drops every model. Strip the prefix before comparison.

The Anthropic static-catalog piece of #12585 was subsumed by #12618's
_fetch_anthropic_models() branch landing earlier in the same salvage PR.
Full branch cherry-pick was skipped because it also carried unrelated
catalog-version regressions.
2026-04-24 05:48:15 -07:00
H-Ali13381
2303dd8686 fix(models): use Anthropic-native headers for model validation
The generic /v1/models probe in validate_requested_model() sent a plain
'Authorization: Bearer <key>' header, which works for OpenAI-compatible
endpoints but results in a 401 Unauthorized from Anthropic's API.
Anthropic requires x-api-key + anthropic-version headers (or Bearer for
OAuth tokens from Claude Code).

Add a provider-specific branch for normalized == 'anthropic' that calls
the existing _fetch_anthropic_models() helper, which already handles
both regular API keys and Claude Code OAuth tokens correctly.  This
mirrors the pattern already used for openai-codex, copilot, and bedrock.

The branch also includes:
- fuzzy auto-correct (cutoff 0.9) for near-exact model ID typos
- fuzzy suggestions (cutoff 0.5) when the model is not listed
- graceful fall-through when the token cannot be resolved or the
  network is unreachable (accepts with a warning rather than hard-fail)
- a note that newer/preview/snapshot model IDs can be gate-listed
  and may still work even if not returned by /v1/models

Fixes Anthropic provider users seeing 'service unreachable' errors
when running /model <claude-model> because every probe 401'd.
2026-04-24 05:48:15 -07:00