Commit Graph

584 Commits

Author SHA1 Message Date
Teknium
c1ef9b2250
fix(cli): ensure on_session_end hook fires on interrupted exits (#4159)
- Add SIGTERM/SIGHUP signal handlers for graceful shutdown
- Add BrokenPipeError to exit exception handling (SSH disconnects)
- Fire on_session_end plugin hook in finally block, guarded by
  _agent_running to avoid double-firing on normal exits (the hook
  already fires per-turn from run_conversation)

Co-authored-by: kelsia14 <kelsia14@users.noreply.github.com>
2026-03-30 20:37:17 -07:00
Teknium
13f3e67165
ux: show 'Initializing agent...' on first message (#4086)
Display a brief status message before the heavy agent initialization
(OpenAI client setup, tool loading, memory init, etc.) so users
aren't staring at a blank screen for several seconds.

Only prints when self.agent is None (first use or after model switch).

Closes #4060

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
2026-03-30 17:05:40 -07:00
Teknium
f93637b3a1
feat: add /profile slash command to show active profile (#4027)
Adds /profile to COMMAND_REGISTRY (Info category) with handlers in
both CLI and gateway. Shows the active profile name and home directory.

Works on all platforms — CLI, Telegram, Discord, Slack, etc.
Detects profile by checking if HERMES_HOME is under ~/.hermes/profiles/.
Shows 'default' when running without a profile.
2026-03-30 13:20:06 -07:00
Teknium
0976bf6cd0
feat: add /yolo slash command to toggle dangerous command approvals (#3990)
Adds a /yolo command that toggles HERMES_YOLO_MODE at runtime, skipping
all dangerous command approval prompts for the current session. Works in
both CLI and gateway (Telegram, Discord, etc.).

- /yolo -> ON: all commands auto-approved, no confirmation prompts
- /yolo -> OFF: normal approval flow restored

The --yolo CLI flag already existed for launch-time opt-in. This adds
the ability to toggle mid-session without restarting.

Session-scoped — resets when the process ends. Uses the existing
HERMES_YOLO_MODE env var that check_all_command_guards() already
respects.
2026-03-30 11:17:09 -07:00
Teknium
0e592aa5b4
fix(cli): remove input() from /tools disable that freezes the terminal (#3918)
input() hangs inside prompt_toolkit's TUI event loop — this is a known
pitfall (AGENTS.md). The /tools disable and /tools enable commands used
input() for a Y/N confirmation prompt, causing the terminal to freeze
with no way to type a response.

Fix: remove the confirmation prompt. The user typing '/tools disable web'
is implicit consent. The change is applied directly with a status message.
2026-03-30 02:53:21 -07:00
Wing Lian
efae525dc5
feat(plugins): add inject_message interface for remote message injection (#3778) 2026-03-30 02:48:06 -07:00
kshitij
c288bbfb57
fix(cli): prevent status bar wrapping into duplicate rows (#3883)
- measure status bar display width using prompt_toolkit cell widths
- trim rendered status text when fragments would overflow
- add a final single-fragment fallback to prevent wrapping
- update width assertions to validate display cells instead of len()
2026-03-29 23:59:07 -07:00
Teknium
86ac23c8da
fix(auth): stop silently falling back to OpenRouter when no provider is configured (#3862)
Previously, when no API keys or provider credentials were found, Hermes
silently defaulted to OpenRouter + Claude Opus. This caused confusion
when users configured local servers (LM Studio, Ollama, etc.) with a
typo or unrecognized provider name — the system would silently route to
OpenRouter instead of telling them something was wrong.

Changes:
- resolve_provider() now raises AuthError when no credentials are found
  instead of returning 'openrouter' as a silent fallback
- Added local server aliases: lmstudio, ollama, vllm, llamacpp → custom
- Removed hardcoded 'anthropic/claude-opus-4.6' fallback from gateway
  and cron scheduler (they read from config.yaml instead)
- Updated cli-config.yaml.example with complete provider documentation
  including all supported providers, aliases, and local server setup
2026-03-29 21:06:35 -07:00
Teknium
e314833c9d
feat(display): configurable tool preview length -- show full paths by default (#3841)
Tool call previews (paths, commands, queries) were hardcoded to truncate
at 35-40 chars across CLI spinners, completion lines, and gateway progress
messages. Users could not see full file paths in tool output.

New config option: display.tool_preview_length (default 0 = no limit).
Set a positive number to truncate at that length.

Changes:
- display.py: module-level _tool_preview_max_len with getter/setter;
  build_tool_preview() and get_cute_tool_message() _trunc/_path respect it
- cli.py: reads config at startup, spinner widget respects config
- gateway/run.py: reads config per-message, progress callback respects config
- run_agent.py: removed redundant 30-char quiet-mode spinner truncation
- config.py: added display.tool_preview_length to DEFAULT_CONFIG

Reported by kriskaminski
2026-03-29 18:02:42 -07:00
Teknium
252fbea005
feat(providers): add ordered fallback provider chain (salvage #1761) (#3813)
Extends the single fallback_model mechanism into an ordered chain.
When the primary model fails, Hermes tries each fallback provider in
sequence until one succeeds or the chain is exhausted.

Config format (new):
  fallback_providers:
    - provider: openrouter
      model: anthropic/claude-sonnet-4
    - provider: openai
      model: gpt-4o

Legacy single-dict fallback_model format still works unchanged.

Key fix vs original PR: the call sites in the retry loop now use
_fallback_index < len(_fallback_chain) instead of the old one-shot
_fallback_activated guard, so the chain actually advances through
all configured providers.

Changes:
- run_agent.py: _fallback_chain list + _fallback_index replaces
  one-shot _fallback_model; _try_activate_fallback() advances
  through chain; failed provider resolution skips to next entry;
  call sites updated to allow chain advancement
- cli.py: reads fallback_providers with legacy fallback_model compat
- gateway/run.py: same
- hermes_cli/config.py: fallback_providers: [] in DEFAULT_CONFIG
- tests: 12 new chain tests + 6 existing test fixtures updated

Co-authored-by: uzaylisak <uzaylisak@users.noreply.github.com>
2026-03-29 16:04:53 -07:00
Teknium
0fd3b59ba1
feat(cli): add Ctrl+Z process suspend support (#3802)
Adds a Ctrl+Z key binding to suspend the hermes CLI to background
using standard Unix job control. Uses prompt_toolkit's run_in_terminal()
to properly save/restore terminal state, then sends SIGTSTP to the
process group. Prints a branded message with resume instructions.
Shows a not-supported notice on Windows.

Co-authored-by: CharlieKerfoot <CharlieKerfoot@users.noreply.github.com>
2026-03-29 15:47:55 -07:00
Teknium
f6db1b27ba
feat: add profiles — run multiple isolated Hermes instances (#3681)
Each profile is a fully independent HERMES_HOME with its own config,
API keys, memory, sessions, skills, gateway, cron, and state.db.

Core module: hermes_cli/profiles.py (~900 lines)
  - Profile CRUD: create, delete, list, show, rename
  - Three clone levels: blank, --clone (config), --clone-all (everything)
  - Export/import: tar.gz archive for backup and migration
  - Wrapper alias scripts (~/.local/bin/<name>)
  - Collision detection for alias names
  - Sticky default via ~/.hermes/active_profile
  - Skill seeding via subprocess (handles module-level caching)
  - Auto-stop gateway on delete with disable-before-stop for services
  - Tab completion generation for bash and zsh

CLI integration (hermes_cli/main.py):
  - _apply_profile_override(): pre-import -p/--profile flag + sticky default
  - Full 'hermes profile' subcommand: list, use, create, delete, show,
    alias, rename, export, import
  - 'hermes completion bash/zsh' command
  - Multi-profile skill sync in hermes update

Display (cli.py, banner.py, gateway/run.py):
  - CLI prompt: 'coder ❯' when using a non-default profile
  - Banner shows profile name
  - Gateway startup log includes profile name

Gateway safety:
  - Token locks: Discord, Slack, WhatsApp, Signal (extends Telegram pattern)
  - Port conflict detection: API server, webhook adapter

Diagnostics (hermes_cli/doctor.py):
  - Profile health section: lists profiles, checks config, .env, aliases
  - Orphan alias detection: warns when wrapper points to deleted profile

Tests (tests/hermes_cli/test_profiles.py):
  - 71 automated tests covering: validation, CRUD, clone levels, rename,
    export/import, active profile, isolation, alias collision, completion
  - Full suite: 6760 passed, 0 new failures

Documentation:
  - website/docs/user-guide/profiles.md: full user guide (12 sections)
  - website/docs/reference/profile-commands.md: command reference (12 commands)
  - website/docs/reference/faq.md: 6 profile FAQ entries
  - website/sidebars.ts: navigation updated
2026-03-29 10:41:20 -07:00
Teknium
9f01244137
fix: replace user-facing hardcoded ~/.hermes paths with display_hermes_home()
Prep for profiles: user-facing messages now use display_hermes_home() so
diagnostic output shows the correct path for each profile.

New helper: display_hermes_home() in hermes_constants.py
12 files swept, ~30 user-facing string replacements.
Includes dynamic TTS schema description.
2026-03-28 23:47:21 -07:00
Teknium
0bd7e95dfc
fix(honcho): allow self-hosted local instances without API key (#3644)
Self-hosted Honcho on localhost doesn't require authentication, but
both the activation gates and the SDK client required an API key.

Combined fix from three contributor PRs:
- Relax all 8 activation gates to accept (api_key OR base_url) as
  valid credentials (#3482 by @cameronbergh)
- Use 'local' placeholder for the SDK client when base_url points to
  localhost/127.0.0.1/::1 (#3570 by @ygd58)

Files changed: run_agent.py (2 gates), cli.py (1 gate),
gateway/run.py (1 gate), honcho_integration/cli.py (2 gates),
hermes_cli/doctor.py (2 gates), honcho_integration/client.py (SDK).

Co-authored-by: cameronbergh <cameronbergh@users.noreply.github.com>
Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
Co-authored-by: devorun <devorun@users.noreply.github.com>
2026-03-28 17:49:56 -07:00
Teknium
bea49e02a3
fix: route /bg spinner through TUI widget to prevent status bar collision (#3643)
Background agent's KawaiiSpinner wrote \r-based animation and stop()
messages through StdoutProxy, colliding with prompt_toolkit's status bar.

Two fixes:
- display.py: use isinstance(out, StdoutProxy) instead of fragile
  hasattr+name check for detecting prompt_toolkit's stdout wrapper
- cli.py: silence bg agent's raw spinner (_print_fn=no-op) and route
  thinking updates through the TUI widget only when no foreground
  agent is active; clear spinner text in finally block with same guard

Closes #2718

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-03-28 17:29:37 -07:00
Teknium
857a5d7b47
fix: sanitize surrogate characters from clipboard paste to prevent UnicodeEncodeError (#3624)
Pasting text from rich-text editors (Google Docs, Word, etc.) can inject
lone surrogate characters (U+D800..U+DFFF) that are invalid UTF-8.
The OpenAI SDK serializes messages with ensure_ascii=False, then encodes
to UTF-8 for the HTTP body — surrogates crash this with:
  UnicodeEncodeError: 'utf-8' codec can't encode character '\udce2'

Three-layer fix:
1. Primary: sanitize user_message at the top of run_conversation()
2. CLI: sanitize in chat() before appending to conversation_history
3. Safety net: catch UnicodeEncodeError in the API error handler,
   sanitize the entire messages list in-place, and retry once.
   Also exclude UnicodeEncodeError from is_local_validation_error
   so it doesn't get classified as non-retryable.

Includes 14 new tests covering the sanitization helpers and the
integration with run_conversation().
2026-03-28 16:53:14 -07:00
Teknium
b029742092
fix(cli): strengthen paste collapse fallback for terminals without bracketed paste (#3625)
The _on_text_changed fallback only detected pastes when all characters
arrived in a single event (chars_added > 1).  Some terminals (notably
VSCode integrated terminal in certain configs) may deliver paste data
differently, causing the fallback to miss.

Add a second heuristic: if the newline count jumps by 4+ in a single
text-change event, treat it as a paste.  Alt+Enter only adds 1 newline
per event, so this never false-positives on manual multi-line input.

Also fixes: the fallback path was missing _paste_just_collapsed flag
set before replacing buffer text, which could cause a re-trigger loop.
2026-03-28 15:40:49 -07:00
Teknium
e4480ff426
fix(config): accept 'model' key as alias for 'default' in model config (#3603)
Users intuitively write model: { model: my-model } instead of
model: { default: my-model } and it silently falls back to the
hardcoded default. Now both spellings work across all three config
consumers: runtime_provider, CLI, and gateway.

Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
2026-03-28 14:55:27 -07:00
Teknium
9a364f2805
fix: cap percentage displays at 100% in stats, gateway, and memory tool (#3599)
Salvage of PR #3533 (binhnt92). Follow-up to #3480 — applies min(100, ...) to 5 remaining unclamped percentage display sites in context_compressor, cli /stats, gateway /stats, and memory tool. Defensive clamps now that the root cause (estimation heuristic) was already removed in #3480.

Co-Authored-By: binhnt92 <binhnt92@users.noreply.github.com>
2026-03-28 14:55:18 -07:00
Teknium
1b2d4f21f3
feat(cli): show resume-by-title command in exit summary (#3607)
When exiting a session that has a title (auto-generated or manual),
the exit summary now also shows:
  hermes -c "Session Title"
alongside the existing hermes --resume <id> command.

Also adds the title to the session info block.
2026-03-28 14:54:53 -07:00
Teknium
be39292633
fix(cli): guard .strip() against None values from YAML config (#3552)
dict.get(key, default) only returns default when key is ABSENT.
When YAML has 'key:' with no value, it parses as None — .get()
returns None, then .strip() crashes with AttributeError.

Use (x or '') pattern to handle both missing and null cases.


Salvaged from PR #3217.

Co-authored-by: erosika <erosika@users.noreply.github.com>
2026-03-28 11:39:01 -07:00
Teknium
e4e04c2005
fix: make tirith block verdicts approvable instead of hard-blocking (#3428)
Previously, tirith exit code 1 (block) immediately rejected the command
with no approval prompt — users saw 'BLOCKED: Command blocked by
security scan' and the agent moved on.  This prevented gateway/CLI users
from approving pipe-to-shell installs like 'curl ... | sh' even when
they understood the risk.

Changes:
- Tirith 'block' and 'warn' now both go through the approval flow.
  Users see the full tirith findings (severity, title, description,
  safer alternatives) and can choose to approve or deny.
- New _format_tirith_description() builds rich descriptions from tirith
  findings JSON so the approval prompt is informative.
- CLI startup now warns when tirith is enabled but not available, so
  users know command scanning is degraded to pattern matching only.

The default approval choice is still deny, so the security posture is
unchanged for unattended/timeout scenarios.

Reported via Discord by pistrie — 'curl -fsSL https://mandex.dev/install.sh | sh'
was hard-blocked with no way to approve.
2026-03-27 13:22:01 -07:00
Teknium
8ecd7aed2c
fix: prevent reasoning box from rendering 3x during tool-calling loops (#3405)
Two independent bugs caused the reasoning box to appear three times when
the model produced reasoning + tool_calls:

Bug A: _build_assistant_message() re-fired reasoning_callback with the full
reasoning text even when streaming had already displayed it. The original
guard only checked structured reasoning_content deltas, but reasoning also
arrives via content tag extraction (<REASONING_SCRATCHPAD>/<think> tags
in delta.content), which went through _fire_stream_delta not
_fire_reasoning_delta. Fix: skip the callback entirely when streaming is
active — both paths display reasoning during the stream. Any reasoning not
shown during streaming is caught by the CLI post-response fallback.

Bug B: The post-response reasoning display checked _reasoning_stream_started,
but that flag was reset by _reset_stream_state() during intermediate turn
boundaries (when stream_delta_callback(None) fires between tool calls).
Introduced _reasoning_shown_this_turn flag that persists across the tool
loop and is only reset at the start of each user turn.

Live-tested in PTY: reasoning now shows exactly once per API call, no
duplicates across tool-calling loops.
2026-03-27 09:57:50 -07:00
Teknium
e0dbbdb2c9
fix: eliminate 'Event loop is closed' / 'Press ENTER to continue' during idle (#3398)
The OpenAI SDK's AsyncHttpxClientWrapper.__del__ schedules aclose() via
asyncio.get_running_loop().create_task().  When an AsyncOpenAI client is
garbage-collected while prompt_toolkit's event loop is running (the common
CLI idle state), the aclose() task runs on prompt_toolkit's loop but the
underlying TCP transport is bound to a different (dead) worker loop.
The transport's self._loop.call_soon() then raises RuntimeError('Event
loop is closed'), which prompt_toolkit surfaces as the disruptive
'Unhandled exception in event loop ... Press ENTER to continue...' error.

Three-layer fix:

1. neuter_async_httpx_del(): Monkey-patches __del__ to a no-op at CLI
   startup before any AsyncOpenAI clients are created.  Safe because
   cached clients are explicitly cleaned via _force_close_async_httpx,
   and uncached clients' TCP connections are cleaned by the OS on exit.

2. Custom asyncio exception handler: Installed on prompt_toolkit's event
   loop to silently suppress 'Event loop is closed' RuntimeError.
   Defense-in-depth for SDK upgrades that might change the class name.

3. cleanup_stale_async_clients(): Called after each agent turn (when the
   agent thread joins) to proactively evict cache entries whose event
   loop is closed, preventing stale clients from accumulating.
2026-03-27 09:45:25 -07:00
Teknium
1519c4d477
fix(session): add /resume CLI handler, session log truncation guard, reopen_session API (#3315)
Three improvements salvaged from PR #3225 by Mibayy:

1. Add /resume slash command handler in CLI process_command(). The
   command was registered in the commands registry but had no handler,
   so typing /resume produced 'Unknown command'. The handler resolves
   by title or session ID, ends the current session cleanly, loads
   conversation history from SQLite, re-opens the target session, and
   syncs the AIAgent instance. Follows the same pattern as new_session().

2. Add truncation guard in _save_session_log(). When resuming a session
   whose messages weren't fully written to SQLite, the agent starts with
   partial history and the first save would overwrite the full JSON log
   on disk. The guard reads the existing file and skips the write if it
   already has more messages than the current batch.

3. Add reopen_session() method to SessionDB. Proper API for clearing
   ended_at/end_reason instead of reaching into _conn directly.

Note: Bug 1 from the original PR (INSERT OR IGNORE + _session_db = None)
is already fixed on main — skipped as redundant.

Closes #3123.
2026-03-26 19:04:28 -07:00
Teknium
3c57eaf744
fix: YAML boolean handling for tool_progress config (#3300)
YAML 1.1 parses bare `off` as boolean False, which is falsy in
Python's `or` chain and silently falls through to the 'all' default.
Users setting `display.tool_progress: off` in config.yaml saw no
effect — tool progress stayed on.

Normalise False → 'off' before the or chain in both affected paths:
- gateway/run.py _run_agent() tool progress reader
- cli.py HermesCLI.__init__() tool_progress_mode

Reported by @gibbsoft in #2859. Closes #2859.
2026-03-26 17:58:50 -07:00
Teknium
2d232c9991
feat(cli): configurable busy input mode + fix /queue always working (#3298)
Two changes:

1. Fix /queue command: remove the _agent_running guard that rejected
   /queue after the agent finished. The prompt was deferred in
   _pending_input until the agent completed, then the handler checked
   _agent_running (now False) and rejected it. /queue now always queues
   regardless of timing.

2. Add display.busy_input_mode config (CLI-only):
   - 'interrupt' (default): Enter while busy interrupts the current run
     (preserves existing behavior)
   - 'queue': Enter while busy queues the message for the next turn,
     with a 'Queued for the next turn: ...' confirmation
   Ctrl+C always interrupts regardless of this setting.

Salvaged from PR #3037 by StefanoChiodino. Key differences:
- Default is 'interrupt' (preserves existing behavior) not 'queue'
- No config version bump (unnecessary for new key in existing section)
- Simpler normalization (no alias map)
- /queue fix is simpler: just remove the guard instead of intercepting
  commands during busy state
2026-03-26 17:58:40 -07:00
Teknium
716e616d28
fix(tui): status bar duplicates and degrades during long sessions (#3291)
shutil.get_terminal_size() can return stale/fallback values on SSH that
differ from prompt_toolkit's actual terminal width. Fragments built for
the wrong width overflow and wrap onto a second line (wrap_lines=True
default), appearing as progressively degrading duplicates.

- Read width from get_app().output.get_size().columns when inside a
  prompt_toolkit TUI, falling back to shutil outside TUI context
- Add wrap_lines=False on the status bar Window as belt-and-suspenders
  guard against any future width mismatch

Closes #3130

Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>
2026-03-26 17:33:11 -07:00
Teknium
db241ae6ce
feat(sessions): add --source flag for third-party session isolation (#3255)
When third-party tools (Paperclip orchestrator, etc.) spawn hermes chat
as a subprocess, their sessions pollute user session history and search.

- hermes chat --source <tag> (also HERMES_SESSION_SOURCE env var)
- exclude_sources parameter on list_sessions_rich() and search_messages()
- Sessions with source=tool hidden from sessions list/browse/search
- Third-party adapters pass --source tool to isolate agent sessions

Cherry-picked from PR #3208 by HenkDz.

Co-authored-by: Henkey <noonou7@gmail.com>
2026-03-26 14:35:31 -07:00
Teknium
41ee207a5e
fix: catch KeyboardInterrupt in exit cleanup handlers (#3257)
except Exception does not catch KeyboardInterrupt (inherits from
BaseException). A second Ctrl+C during exit cleanup aborts pending
writes — Honcho observations dropped, SQLite sessions left unclosed,
cron job sessions never marked ended.

Changed to except (Exception, KeyboardInterrupt) at all five sites:
- cli.py: honcho.shutdown() and end_session() in finally exit block
- run_agent.py: _flush_honcho_on_exit atexit handler
- cron/scheduler.py: end_session() and close() in job finally block

Tests exercise the actual production code paths and confirm
KeyboardInterrupt propagates without the fix.

Co-authored-by: dieutx <dangtc94@gmail.com>
2026-03-26 14:34:31 -07:00
Teknium
62f8aa9b03
fix: MCP toolset resolution for runtime and config (#3252)
Gateway sessions had their own inline toolset resolution that only read
platform_toolsets from config, which never includes MCP server names.
MCP tools were discovered and registered but invisible to the model.

- Replace duplicated gateway toolset resolution in _run_agent() and
  _run_background_task() with calls to the shared _get_platform_tools()
- Extend _get_platform_tools() to include globally enabled MCP servers
  at runtime (include_default_mcp_servers=True), while config-editing
  flows use include_default_mcp_servers=False to avoid persisting
  implicit MCP defaults into platform_toolsets
- Add homeassistant to PLATFORMS dict (was missing, caused KeyError)
- Fix CLI entry point to use _get_platform_tools() as well, so MCP
  tools are visible in CLI mode too
- Remove redundant platform_key reassignment in _run_background_task

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-03-26 13:39:41 -07:00
Teknium
cbf195e806
chore: fix 154 f-strings, simplify getattr/URL patterns, remove dead code (#3119)
Three categories of cleanup, all zero-behavioral-change:

1. F-strings without placeholders (154 fixes across 29 files)
   - Converted f'...' to '...' where no {expression} was present
   - Heaviest files: run_agent.py (24), cli.py (20), honcho_integration/cli.py (34)

2. Simplify defensive patterns in run_agent.py
   - Added explicit self._is_anthropic_oauth = False in __init__ (before
     the api_mode branch that conditionally sets it)
   - Replaced 7x getattr(self, '_is_anthropic_oauth', False) with direct
     self._is_anthropic_oauth (attribute always initialized now)
   - Added _is_openrouter_url() and _is_anthropic_url() helper methods
   - Replaced 3 inline 'openrouter' in self._base_url_lower checks

3. Remove dead code in small files
   - hermes_cli/claw.py: removed unused 'total' computation
   - tools/fuzzy_match.py: removed unused strip_indent() function and
     pattern_stripped variable

Full test suite: 6184 passed, 0 failures
E2E PTY: banner clean, tool calls work, zero garbled ANSI
2026-03-25 19:47:58 -07:00
Teknium
08d3be0412
fix: graceful return on max retries instead of crashing thread
run_conversation raised the raw exception after exhausting retries,
which crashed the background thread in cli.py (unhandled exception
in Thread). Now returns a proper error result dict with failed=True
and persists the session, matching the pattern used by other error
paths (invalid responses, empty content, etc.).

Also wraps cli.py's run_agent thread function in try/except as a
safety net against any future unhandled exceptions from
run_conversation.

Made-with: Cursor
2026-03-25 19:00:39 -07:00
Teknium
f46542b6c6
fix(cli): read root-level provider and base_url from config.yaml into model config (#3112)
When users write root-level provider and base_url in config.yaml
(instead of nesting under model:), these keys were never merged into
defaults['model']. The CLI reads them from CLI_CONFIG['model']['provider']
so root-level keys were silently ignored, causing fallback to OpenRouter.

Merge root-level provider and base_url into defaults['model'] after
handling the model key, so custom/local provider configs work regardless
of nesting.

Cherry-picked from PR #2283 by ygd58. Fixes #2281.
2026-03-25 18:38:32 -07:00
Teknium
9783c9d5c1
refactor: remove /model slash command from CLI and gateway (#3080)
The /model command is removed from both the interactive CLI and
messenger gateway (Telegram/Discord/Slack/WhatsApp). Users can
still change models via 'hermes model' CLI subcommand or by
editing config.yaml directly.

Removed:
- CommandDef entry from COMMAND_REGISTRY
- CLI process_command() handler and model autocomplete logic
- Gateway _handle_model_command() and dispatch
- SlashCommandCompleter model_completer_provider parameter
- Two-stage Tab completion and ghost text for /model
- All /model-specific tests

Unaffected:
- /provider command (read-only, shows current model + providers)
- ACP adapter _cmd_model (separate system for VS Code/Zed/JetBrains)
- model_switch.py module (used by ACP)
- 'hermes model' CLI subcommand

Author: Teknium
2026-03-25 17:03:05 -07:00
Teknium
9d1e13019e
fix(cli): prevent TypeError on startup when base_url is None (#3068)
Description
This PR fixes the startup crash introduced in v0.4.0 where `self.base_url` being `None` throws a `TypeError`.

Root Cause:
At `cli.py:1108`, a membership check (`"openrouter.ai" in self.base_url`) is performed. If a user's config doesn't explicitly set a `base_url` (meaning it's `None`), Python raises a `TypeError: argument of type 'NoneType' is not iterable`, causing the entire CLI to crash on boot.

Fix:
Added a simple truthiness guard (`if self.base_url and ...`) to ensure the membership check only occurs if `base_url` is a valid string.

Closes #2842

Co-authored-by: devorun <130918800+devorun@users.noreply.github.com>
2026-03-25 16:21:00 -07:00
Teknium
841401f588
feat(cli): preserve user input on multiline paste (#3065)
When pasting 5+ lines, the CLI previously replaced the entire input
buffer with a file reference placeholder. If the user had already typed
a question, it was lost.

Fix: move paste collapsing into handle_paste (BracketedPaste handler)
so only the pasted content is saved to file. The placeholder is inserted
at the cursor position, preserving existing buffer text.

Also fixes:
- Multi-ref expansion on submit (re.sub instead of re.match) so
  multiple paste blocks and surrounding text are all preserved
- Double-collapse prevention via _paste_just_collapsed flag
- Consistent Unicode arrow character across all paste paths

Salvaged from PR #2607 by crazywriter1 (option B: core fix only,
without keybinding overrides for solid-object navigation/deletion).
2026-03-25 16:00:36 -07:00
Teknium
77bcaba2d7
refactor: consolidate get_hermes_home() and parse_reasoning_effort() (#3062)
Centralizes two widely-duplicated patterns into hermes_constants.py:

1. get_hermes_home() — Path resolution for ~/.hermes (HERMES_HOME env var)
   - Was copy-pasted inline across 30+ files as:
     Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
   - Now defined once in hermes_constants.py (zero-dependency module)
   - hermes_cli/config.py re-exports it for backward compatibility
   - Removed local wrapper functions in honcho_integration/client.py,
     tools/website_policy.py, tools/tirith_security.py, hermes_cli/uninstall.py

2. parse_reasoning_effort() — Reasoning effort string validation
   - Was copy-pasted in cli.py, gateway/run.py, cron/scheduler.py
   - Same validation logic: check against (xhigh, high, medium, low, minimal, none)
   - Now defined once in hermes_constants.py, called from all 3 locations
   - Warning log for unknown values kept at call sites (context-specific)

31 files changed, net +31 lines (125 insertions, 94 deletions)
Full test suite: 6179 passed, 0 failed
2026-03-25 15:54:28 -07:00
Teknium
8bb1d15da4
chore: remove ~100 unused imports across 55 files (#3016)
Automated cleanup via pyflakes + autoflake with manual review.

Changes:
- Removed unused stdlib imports (os, sys, json, pathlib.Path, etc.)
- Removed unused typing imports (List, Dict, Any, Optional, Tuple, Set, etc.)
- Removed unused internal imports (hermes_cli.auth, hermes_cli.config, etc.)
- Fixed cli.py: removed 8 shadowed banner imports (imported from hermes_cli.banner
  then immediately redefined locally — only build_welcome_banner is actually used)
- Added noqa comments to imports that appear unused but serve a purpose:
  - Re-exports (gateway/session.py SessionResetPolicy, tools/terminal_tool.py
    is_interrupted/_interrupt_event)
  - SDK presence checks in try/except (daytona, fal_client, discord)
  - Test mock targets (auxiliary_client.py Path, mcp_config.py get_hermes_home)

Zero behavioral changes. Full test suite passes (6162/6162, 2 pre-existing
streaming test failures unrelated to this change).
2026-03-25 15:02:03 -07:00
Teknium
861624d4e9
fix(cli): refresh TUI before background task output to prevent status bar overlap (#3048)
When a background task (/bg command) prints its output while the main agent
is processing with the thinking spinner visible, the status bar could render
on the same row as the spinner, causing visual overlap.

This fix adds an explicit app.invalidate() call with a brief pause before
printing background task output, ensuring the TUI layout is in a consistent
state before the output is written.

Changes:
- Add TUI refresh before success output in _handle_background_command
- Add TUI refresh before error output in the exception handler
- Add tests for the refresh behavior

Closes #2718

Co-authored-by: Bartok9 <bartokmagic@proton.me>
2026-03-25 15:00:33 -07:00
Teknium
e4033b2baf
fix(cli): catch KeyboardInterrupt during flush_memories on exit (#3025)
KeyboardInterrupt inherits from BaseException, not Exception, so the
except Exception: clauses wrapping flush_memories() on exit paths
silently skipped the flush when the user pressed Ctrl+C. This could
lose conversation memory.

Change both call sites to except (Exception, KeyboardInterrupt): so
the memory flush is attempted even during interrupt.

Salvaged from PR #2855 by RufusLin (dropped unrelated bundled changes).
2026-03-25 12:47:51 -07:00
Teknium
8f6ef042c1
fix(cli): buffer reasoning preview chunks and fix duplicate display (#3013)
Three improvements to reasoning/thinking display in the CLI:

1. Buffer tiny reasoning chunks: providers like DeepSeek stream reasoning
   one word at a time, producing a separate [thinking] line per token.
   Add a buffer that coalesces chunks and flushes at natural boundaries
   (newlines, sentence endings, terminal width).

2. Fix duplicate reasoning display: centralize callback selection into
   _current_reasoning_callback() — one place instead of 4 scattered
   inline ternaries. Prevents both the streaming box AND the preview
   callback from firing simultaneously.

3. Fix post-response reasoning box guard: change the check from
   'not self._stream_started' to 'not self._reasoning_stream_started'
   so the final reasoning box is only suppressed when reasoning was
   actually streamed live, not when any text was streamed.

Cherry-picked from PR #2781 by juanfradb.
2026-03-25 12:16:39 -07:00
Teknium
52c5e491f5
fix(session): surface silent SessionDB failures that cause session data loss (#2999)
* fix(session): surface silent SessionDB failures that cause session data loss

SessionDB initialization and operation failures were logged at debug level
or silently swallowed, causing sessions to never be indexed in the FTS5
database. This made session_search unable to find affected conversations.

In practice, ~48% of sessions can be lost without any visible indication.
The JSON session files are still written (separate code path), but the
SQLite/FTS5 index gets nothing — making session_search return empty results
for affected sessions.

Changes:
- cli.py: Log warnings (not debug) when SessionDB init fails at both
  __init__ and _start_session entry points
- run_agent.py: Log warnings on create_session, append_message, and
  compression split failures
- run_agent.py: Set _session_db = None after create_session failure to
  fail fast instead of silently dropping every message for the session

Root cause: When gateway restarts or DB lock contention occurs during
SessionDB() init, the exception is caught and swallowed. The agent
continues running normally — JSON session logs are written to disk —
but no messages reach the FTS5 index.

* fix: use module logger instead of root logging for SessionDB warnings

Follow-up to cherry-picked PR #2939 — the original used logging.warning()
(root logger) instead of logger.warning() (module logger) in the 5 new
warning calls. Module logger preserves the logger hierarchy and shows the
correct module name in log output.

---------

Co-authored-by: LucidPaths <lc77@outlook.de>
2026-03-25 11:10:19 -07:00
Teknium
fd292e676b
fix: skip KawaiiSpinner when TUI handles tool progress (#2973)
* docs: unify hooks documentation — add plugin hooks to hooks page, add session:end event

The hooks page only documented gateway event hooks (HOOK.yaml system).
The plugins page listed plugin hooks (pre_tool_call, etc.) that weren't
referenced from the hooks page, which was confusing.

Changes:
- hooks.md: Add overview table showing both hook systems
- hooks.md: Add Plugin Hooks section with available hooks, callback
  signatures, and example
- hooks.md: Add missing session:end gateway event (emitted but undocumented)
- hooks.md: Mark pre_llm_call, post_llm_call, on_session_start,
  on_session_end as planned (defined in VALID_HOOKS but not yet invoked)
- hooks.md: Update info box to cross-reference plugin hooks
- hooks.md: Fix heading hierarchy (gateway content as subsections)
- plugins.md: Add cross-reference to hooks page for full details
- plugins.md: Mark planned hooks as (planned)

* feat(session_search): add recent sessions mode when query is omitted

When session_search is called without a query (or with an empty query),
it now returns metadata for the most recent sessions instead of erroring.
This lets the agent quickly see what was worked on recently without
needing specific keywords.

Returns for each session: session_id, title, source, started_at,
last_active, message_count, preview (first user message).
Zero LLM cost — pure DB query. Current session lineage and child
delegation sessions are excluded.

The agent can then keyword-search specific sessions if it needs
deeper context from any of them.

* docs: clarify two-mode behavior in session_search schema description

* fix(compression): restore sane defaults and cap summary at 12K tokens

- threshold: 0.80 → 0.50 (compress at 50%, not 80%)
- target_ratio: 0.40 → 0.20, now relative to threshold not total context
  (20% of 50% = 10% of context as tail budget)
- summary ceiling: 32K → 12K (Gemini can't output more than ~12K)
- Updated DEFAULT_CONFIG, config display, example config, and tests

* fix: browser_vision ignores auxiliary.vision.timeout config (#2901)

* docs: unify hooks documentation — add plugin hooks to hooks page, add session:end event

The hooks page only documented gateway event hooks (HOOK.yaml system).
The plugins page listed plugin hooks (pre_tool_call, etc.) that weren't
referenced from the hooks page, which was confusing.

Changes:
- hooks.md: Add overview table showing both hook systems
- hooks.md: Add Plugin Hooks section with available hooks, callback
  signatures, and example
- hooks.md: Add missing session:end gateway event (emitted but undocumented)
- hooks.md: Mark pre_llm_call, post_llm_call, on_session_start,
  on_session_end as planned (defined in VALID_HOOKS but not yet invoked)
- hooks.md: Update info box to cross-reference plugin hooks
- hooks.md: Fix heading hierarchy (gateway content as subsections)
- plugins.md: Add cross-reference to hooks page for full details
- plugins.md: Mark planned hooks as (planned)

* fix: browser_vision ignores auxiliary.vision.timeout config

browser_vision called call_llm() without passing a timeout parameter,
so it always used the 30-second default in auxiliary_client.py. This
made vision analysis with local models (llama.cpp, ollama) impossible
since they typically need more than 30s for screenshot analysis.

Now browser_vision reads auxiliary.vision.timeout from config.yaml
(same config key that vision_analyze already uses) and passes it
through to call_llm().

Also bumped the default vision timeout from 30s to 120s in both
browser_vision and vision_analyze — 30s is too aggressive for local
models and the previous default silently failed for anyone running
vision locally.

Fixes user report from GamerGB1988.

* fix(skills): agent-created skills were incorrectly treated as untrusted community content

_resolve_trust_level() didn't handle 'agent-created' source, so it
fell through to 'community' trust level. Community policy blocks on
any caution or dangerous findings, which meant common patterns like
curl with env vars, systemctl, crontab, cloudflared references etc.
would block skill creation/patching.

The agent-created policy row already existed in INSTALL_POLICY with
permissive settings (allow caution, ask on dangerous) but was never
reached. Now it is.

Fixes reports of skill_manage being blocked by security scanner.

* fix(cli): enhance real-time reasoning output by forcing flush of long partial lines

Updated the reasoning output mechanism to emit complete lines and force-flush long partial lines, ensuring reasoning is visible in real-time even without newlines. This improves user experience during reasoning sessions.

* fix: skip KawaiiSpinner when TUI handles tool progress

In the interactive CLI, the agent runs with quiet_mode=True and
tool_progress_callback set. The quiet_mode condition triggered
KawaiiSpinner for every tool call, but the TUI was already handling
progress display via the spinner widget.

The KawaiiSpinner writes carriage-return animation through StdoutProxy,
triggering run_in_terminal() erase/redraw cycles on every flush. These
redundant cycles cause the status bar to ghost into terminal scrollback.

The thinking spinner already had this guard (checks thinking_callback).
This extends the same pattern to the three tool spinner creation sites:
concurrent tools, delegate_task, and single tool execution.
2026-03-25 08:33:44 -07:00
Teknium
ad1bf16f28
chore: remove all remaining mini-swe-agent references
Complete cleanup after dropping the mini-swe-agent submodule (PR #2804):

- Remove MSWEA_SILENT_STARTUP and MSWEA_GLOBAL_CONFIG_DIR env var
  settings from cli.py, run_agent.py, hermes_cli/main.py, doctor.py
- Remove mini-swe-agent health check from hermes doctor
- Remove 'minisweagent' from logger suppression lists
- Remove litellm/typer/platformdirs from requirements.txt
- Remove mini-swe-agent install steps from install.ps1 (Windows)
- Remove mini-swe-agent install steps from website docs
- Update all stale comments/docstrings referencing mini-swe-agent
  in terminal_tool.py, tools/__init__.py, code_execution_tool.py,
  environments/README.md, environments/agent_loop.py
- Remove mini_swe_runner from pyproject.toml py-modules
  (still exists as standalone script for RL training use)
- Shrink test_minisweagent_path.py to empty stub

The orphaned mini-swe-agent/ directory on disk needs manual removal:
  rm -rf mini-swe-agent/
2026-03-24 08:19:23 -07:00
Teknium
02b38b93cb
refactor: remove mini-swe-agent dependency — inline Docker/Modal backends (#2804)
Drop the mini-swe-agent git submodule. All terminal backends now use
hermes-agent's own environment implementations directly.

Docker backend:
- Inline the `docker run -d` container startup (was 15 lines in
  minisweagent's DockerEnvironment). Our wrapper already handled
  execute(), cleanup(), security hardening, volumes, and resource limits.

Modal backend:
- Import swe-rex's ModalDeployment directly instead of going through
  minisweagent's 90-line passthrough wrapper.
- Bake the _AsyncWorker pattern (from environments/patches.py) directly
  into ModalEnvironment for Atropos compatibility without monkey-patching.

Cleanup:
- Remove minisweagent_path.py (submodule path resolution helper)
- Remove submodule init/install from install.sh and setup-hermes.sh
- Remove mini-swe-agent from .gitmodules
- environments/patches.py is now a no-op (kept for backward compat)
- terminal_tool.py no longer does sys.path hacking for minisweagent
- mini_swe_runner.py guards imports (optional, for RL training only)
- Update all affected tests to mock the new direct subprocess calls
- Update README.md, CONTRIBUTING.md

No functionality change — all Docker, Modal, local, SSH, Singularity,
and Daytona backends behave identically. 6093 tests pass.
2026-03-24 07:30:25 -07:00
Teknium
2e524272b1
refactor(model): extract shared switch_model() from CLI and gateway handlers
Phase 4 of the /model command overhaul.

Both the CLI (cli.py) and gateway (gateway/run.py) /model handlers
had ~50 lines of duplicated core logic: parsing, provider detection,
credential resolution, and model validation. This extracts that
pipeline into hermes_cli/model_switch.py.

New module exports:
- ModelSwitchResult: dataclass with all fields both handlers need
- CustomAutoResult: dataclass for bare '/model custom' results
- switch_model(): core pipeline — parse → detect → resolve → validate
- switch_to_custom_provider(): resolve endpoint + auto-detect model

The shared functions are pure (no I/O side effects). Each caller
handles its own platform-specific concerns:
- CLI: sets self.model/provider/etc, calls save_config_value(), prints
- Gateway: writes config.yaml directly, sets env vars, returns markdown

Net result: -244 lines from handlers, +234 lines in shared module.
The handlers are now ~80 lines each (down from ~150+) and can't drift
apart on core logic.
2026-03-24 07:08:07 -07:00
Teknium
b641ee88f4
feat(model): /model command overhaul — Phases 2, 3, 5
* feat(model): persist base_url on /model switch, auto-detect for bare /model custom

Phase 2+3 of the /model command overhaul:

Phase 2 — Persist base_url on model switch:
- CLI: save model.base_url when switching to a non-OpenRouter endpoint;
  clear it when switching away from custom to prevent stale URLs
  leaking into the new provider's resolution
- Gateway: same logic using direct YAML write

Phase 3 — Better feedback and edge cases:
- Bare '/model custom' now auto-detects the model from the endpoint
  using _auto_detect_local_model() and saves all three config values
  (model, provider, base_url) atomically
- Shows endpoint URL in success messages when switching to/from
  custom providers (both CLI and gateway)
- Clear error messages when no custom endpoint is configured
- Updated test assertions for the additional save_config_value call

Fixes #2562 (Phase 2+3)

* feat(model): support custom:name:model triple syntax for named custom providers

Phase 5 of the /model command overhaul.

Extends parse_model_input() to handle the triple syntax:
  /model custom:local-server:qwen → provider='custom:local-server', model='qwen'
  /model custom:my-model          → provider='custom', model='my-model' (unchanged)

The 'custom:local-server' provider string is already supported by
_get_named_custom_provider() in runtime_provider.py, which matches
it against the custom_providers list in config.yaml. This just wires
the parsing so users can do it from the /model slash command.

Added 4 tests covering single, triple, whitespace, and empty model cases.
2026-03-24 06:58:04 -07:00
Teknium
4313b8aff6
fix(cli): ensure single closure of streaming boxes during tool generation
- Updated `_on_tool_gen_start` method in `HermesCLI` to close open streaming boxes exactly once, preventing potential multiple closures.
- Added a check for `_stream_box_opened` to manage the state of the streaming box more effectively, enhancing user experience during large payload streaming.
2026-03-24 06:33:21 -07:00
Teknium
87e2626cf6
feat(cli, agent): add tool generation callback for streaming updates
- Introduced `_on_tool_gen_start` in `HermesCLI` to indicate when tool-call arguments are being generated, enhancing user feedback during streaming.
- Updated `AIAgent` to support a new `tool_gen_callback`, notifying the display layer when tool generation starts, allowing for better user experience during large payloads.
- Ensured that the callback is triggered appropriately during streaming events to prevent user interface freezing.
2026-03-23 23:10:58 -07:00
Teknium
4ff73fb32c
feat(config): support ${ENV_VAR} substitution in config.yaml (#2684)
* feat(config): support ${ENV_VAR} substitution in config.yaml

* fix: extend env var expansion to CLI and gateway config loaders

The original PR (#2680) only wired _expand_env_vars into load_config(),
which is used by 'hermes tools' and 'hermes setup'. The two primary
config paths were missed:

- load_cli_config() in cli.py (interactive CLI)
- Module-level _cfg in gateway/run.py (gateway — bridges api_keys to env vars)

Also:
- Remove redundant 'import re' (already imported at module level)
- Add missing blank lines between top-level functions (PEP 8)
- Add tests for load_cli_config() expansion

---------

Co-authored-by: teyrebaz33 <hakanerten02@hotmail.com>
2026-03-23 16:02:06 -07:00
Teknium
f60ebc7bf2
fix: move activated skills line below welcome text
Previously 'Activated skills: xxx' was printed above the banner in
show_banner(). Now it prints directly after the 'Welcome to Hermes
Agent!' line in run(), which is a more natural placement.
2026-03-23 06:20:19 -07:00
Teknium
1b5fb36c9d
fix(cli): allow custom/local endpoints without API key
Local LLM servers (llama.cpp, ollama, vLLM, etc.) typically don't
require authentication. When a custom base_url is configured but no
API key is found, use a placeholder instead of failing with
'Provider resolver returned an empty API key.'

The OpenAI SDK accepts any string as api_key, and local servers
simply ignore the Authorization header.

Fixes issue reported by @ThatWolfieGuy — llama.cpp stopped working
after updating because the new runtime provider resolver enforces
non-empty API keys even for keyless local endpoints.
2026-03-22 16:08:21 -07:00
Teknium
1f21ef7488
fix(cli): prevent 'Press ENTER to continue...' on exit
When AsyncOpenAI clients are garbage-collected after the event loop
closes, their AsyncHttpxClientWrapper.__del__ tries to schedule
aclose() on the dead loop, causing RuntimeError: Event loop is closed.
prompt_toolkit catches this as an unhandled exception and shows
'Press ENTER to continue...' which blocks CLI exit.

Fix: Add shutdown_cached_clients() to auxiliary_client.py that marks
all cached async clients' underlying httpx transport as CLOSED before
GC runs. This prevents __del__ from attempting the aclose() call.

- _force_close_async_httpx(): sets httpx AsyncClient._state to CLOSED
- shutdown_cached_clients(): iterates _client_cache, closes sync clients
  normally and marks async clients as closed
- Also fix stale client eviction in _get_cached_client to mark evicted
  async clients as closed (was just del-ing them, triggering __del__)
- Call shutdown_cached_clients() from _run_cleanup() in cli.py
2026-03-22 15:31:54 -07:00
Teknium
bfe4baa6ed
chore: remove unused imports, dead code, and stale comments
Mechanical cleanup — no behavior changes.

Unused imports removed:
- model_tools.py: import os
- run_agent.py: OPENROUTER_MODELS_URL, get_model_context_length
- cli.py: Table, VERSION, RELEASE_DATE, resolve_toolset, get_skill_commands
- terminal_tool.py: signal, uuid, tempfile, set_interrupt_event,
  DANGEROUS_PATTERNS, _load_permanent_allowlist, _detect_dangerous_command

Dead code removed:
- toolsets.py: print_toolset_tree() (zero callers)
- browser_tool.py: _get_session_name() (never called)

Stale comments removed:
- toolsets.py: duplicated/garbled comment line
- web_tools.py: 3 aspirational TODO comments from early development
2026-03-22 08:33:34 -07:00
Mibayy
0698ddb496
fix(compression): remove hardcoded gemini-3-flash-preview as default summary model
Closes #2453

The DEFAULT_CONFIG was hardcoding google/gemini-3-flash-preview as the
summary_model for context compression. This caused unexpected OpenRouter
charges for users who configured a different provider/model, because the
compression task would silently fall back to gemini via OpenRouter even
when the user's main model was on a different provider.

Fix: change summary_model default to empty string. When empty,
call_llm() resolves the model through the standard auto-detection chain
(auxiliary.compression config -> env vars -> main provider), which
correctly uses the user's configured provider and model.

Users who want a dedicated cheap model for compression can still
explicitly set compression.summary_model in their config.yaml.
2026-03-22 04:36:36 -07:00
Teknium
f69c47d9ae
fix: /stop command crash + UnboundLocalError in streaming media delivery
Two fixes:

1. CLI /stop command crashed with 'cannot import name get_registry' —
   the code imported a non-existent function. Fixed to use the actual
   process_registry singleton and list_sessions() method.
   (Reported in #2458 by haiyuzhong1980)

2. Streaming media delivery used undefined 'adapter' variable —
   our PR #2382 called _deliver_media_from_response(adapter=adapter)
   but 'adapter' wasn't guaranteed to be defined in that scope.
   Fixed to resolve via self.adapters.get(source.platform).
   (Reported in #2424 by 42-evey)
2026-03-22 04:35:27 -07:00
Teknium
8cb7864110
fix: resolve garbled ANSI escape codes in status printouts (#2262) (#2448)
Two related root causes for the '?[33mTool progress: NEW?[0m' garbling
reported on kitty, alacritty, ghostty and gnome-console:

1. /verbose label printing used self.console.print() with Rich markup
   ([yellow]...[/]).  self.console is a plain Rich Console() whose output
   goes directly to sys.stdout, which patch_stdout's StdoutProxy
   intercepts and mangles raw ANSI sequences.

2. Context pressure status lines (e.g. 'approaching compaction') from
   AIAgent._safe_print() had the same problem -- _safe_print() was a
   @staticmethod that always called builtin print(), bypassing the
   prompt_toolkit renderer entirely.

Fix:
- Convert AIAgent._safe_print() from @staticmethod to an instance method
  that delegates to self._print_fn (defaults to builtin print, preserving
  all non-CLI behaviour).
- After the CLI creates its AIAgent instance, wire self.agent._print_fn to
  the existing _cprint() helper which routes through
  prompt_toolkit.print_formatted_text(ANSI(text)).
- Rewrite the /verbose feedback labels to use hermes_cli.colors.Colors
  ANSI constants in f-strings and emit them via _cprint() directly,
  removing the Rich-markup-inside-patch_stdout anti-pattern.

Fixes #2262

Co-authored-by: Animesh Mishra <animesh.m.7523@gmail.com>
2026-03-22 04:07:06 -07:00
Teknium
2a5f86ed6d
Merge pull request #2343 from NousResearch/hermes/hermes-31d7db3b
feat: @ context references + Honcho config fixes
2026-03-21 16:10:19 -07:00
Teknium
8da410ed95
feat(plugins): add slash command registration for plugins (#2359)
Plugins can now register slash commands via ctx.register_command()
in their register() function. Commands automatically appear in:
- /help and COMMANDS_BY_CATEGORY (under 'Plugins' category)
- Tab autocomplete in CLI
- Telegram bot menu
- Slack subcommand mapping
- Gateway dispatch

Handler signature: handler(args: str) -> str | None
Async handlers are supported in gateway context.

Changes:
- commands.py: add register_plugin_command() and rebuild_lookups()
- plugins.py: add register_command() to PluginContext, track in
  PluginManager._plugin_commands and LoadedPlugin.commands_registered
- cli.py: dispatch plugin commands in process_command()
- gateway/run.py: dispatch plugin commands before skill commands
- tests: 5 new tests for registration, help, tracking, handler, gateway
- docs: update plugins feature page and build guide
2026-03-21 16:00:30 -07:00
Teknium
da44c196b6
feat: @ context references — inline file, folder, diff, git, and URL injection
Add @file:path, @folder:dir, @diff, @staged, @git:N, and @url:
references that expand inline before the message reaches the LLM.
Supports line ranges (@file:main.py:10-50), token budget enforcement
(soft warn at 25%, hard block at 50%), and path sandboxing for gateway.

Core module from PR #2090 by @kshitijk4poor. CLI and gateway wiring
rewritten against current main. Fixed asyncio.run() crash when called
from inside a running event loop (gateway).

Closes #682.
2026-03-21 15:57:13 -07:00
christopher-kapic
97108db038
fix(cli): pass conversation_history in quiet mode with --resume
hermes chat -q 'msg' --resume SESSION_ID loaded the session history
but never passed it to run_conversation(), so the model responded
without prior context. The interactive mode already does this correctly.

Based on work by christopher-kapic in PR #2081. Fixes #2106.
2026-03-21 12:51:34 -07:00
Teknium
29520df44f
revert: remove Shift+Enter keybindings that crash prompt_toolkit
Reverts the s-enter and Kitty CSI keybindings from PR #2345/#2346.
The s-enter key notation causes 'Invalid key: s-enter' crash on
some prompt_toolkit versions, breaking hermes startup entirely.
2026-03-21 10:41:07 -07:00
Teknium
42cef9c282
fix: resolve merge conflict markers in cli.py breaking hermes startup
PR #2346 was merged with unresolved git conflict markers (<<<<<<,
=======, >>>>>>>) in cli.py at line 6047, causing SyntaxError on
startup. Resolved by keeping both the Shift+Enter keybindings and
the tab handler.
2026-03-21 10:34:21 -07:00
ygd58
356122e990
fix(cli): handle Kitty keyboard protocol Shift+Enter for Ghostty/WezTerm
Kitty-protocol terminals (Ghostty, WezTerm) encode Shift+Enter as
CSI 13;2u instead of plain Enter. Without this binding, raw escape
characters appear in the input buffer. Adds s-enter and the Kitty
escape sequence as newline-insert bindings.

Based on work by ygd58 in PR #1798. Fixes #1795.
Registry.py apostrophe sanitization change excluded (unrelated scope).
2026-03-21 10:03:55 -07:00
Teknium
c4e787d47b
feat: enable streaming by default in CLI
Streaming provides a better UX — tokens appear as they arrive instead
of waiting for the full response. show_reasoning remains false so
thinking blocks are not streamed to the user.
2026-03-21 09:49:47 -07:00
Teknium
d70e07fc45
refactor(cli): add protected TUI extension hooks for wrapper CLIs
Based on PR #1749 by @erosika (reimplemented on current main).

Extracts three protected methods from run() so wrapper CLIs can extend
the TUI without overriding the entire method:

- _get_extra_tui_widgets(): inject widgets between spacer and status bar
- _register_extra_tui_keybindings(kb, input_area): add keybindings
- _build_tui_layout_children(**widgets): full control over ordering

Default implementations reproduce existing layout exactly. The inline
HSplit in run() now delegates to _build_tui_layout_children().

5 tests covering defaults, widget insertion position, and keybinding
registration.
2026-03-21 09:42:07 -07:00
Teknium
f4a74d3ac7
fix(honcho): hide session banner when not explicitly configured
Add explicitly_configured field to HonchoClientConfig — set when the
config has a hosts.hermes block or explicit enabled flag, vs auto-enabled
from a stray HONCHO_API_KEY env var.  Banner only shows when this is true.

Based on #1960 by @erosika, reimplemented without duplicating config parsing.
2026-03-21 08:33:44 -07:00
Teknium
fd1d6c03cb
fix(cli): correct truncated AUXILIARY_WEB_EXTRACT_API_KEY env var name
Cherry-picked from PR #2295 by @dlkakbs.

The web_extract auxiliary client api_key env var was literally stored as
'AUXILI..._KEY' (dots in the source) instead of the full name. Users
configuring an auxiliary web_extract model with an API key would have
auth failures because the key was written to a non-existent var.
2026-03-21 07:09:28 -07:00
Teknium
eb537b5db4
fix(cli): prevent multiple reasoning boxes from rendering
Added a check to suppress further reasoning rendering once the response box is open, preventing potential overlap of reasoning boxes during late thinking blocks. This enhances the user experience by maintaining a clean output in the CLI.
2026-03-21 06:28:47 -07:00
Teknium
3585019831
feat(cli): enhance user input display with consistent formatting
- Added a user bar separator for improved visual clarity when displaying pasted text and user input in the HermesCLI.
- Ensured consistent formatting for both multi-line and single-line user inputs, enhancing the overall user experience in the command-line interface.

These changes contribute to a more organized and visually appealing output during interactions.
2026-03-20 23:36:49 -07:00
Test
d0ac8d9fc7 chore: remove dead top-level toolsets config key
The top-level 'toolsets' key in config.yaml was never read at runtime.
Tool selection uses platform_toolsets (per-platform) or the --toolsets
CLI flag. The key existed in load_cli_config() defaults and the example
config as 'toolsets: [all]', misleading users into thinking it
controlled tool availability.

- Remove from load_cli_config() hardcoded defaults
- Remove from hermes config show output
- Replace in cli-config.yaml.example with deprecation note pointing
  to platform_toolsets and hermes tools
2026-03-20 22:27:13 -07:00
Test
f7e2ed20fa feat(cli): implement true-color ANSI support for response text
- Added support for true-color ANSI escape codes in the HermesCLI to enhance the visual appearance of streamed content.
- Introduced a fallback mechanism for text color in case of errors while retrieving the color from the active skin.
- Updated the output formatting to include the new text color in both line emissions and buffer flushing.

These changes improve the user experience by ensuring consistent and visually appealing text output in the command-line interface.
2026-03-20 21:02:36 -07:00
Teknium
2416b2b7af
refactor(cli, banner): update gold ANSI color to true-color format (#2246)
- Changed the ANSI escape code for gold color in cli.py and banner.py to use true-color format (#FFD700) for better visual consistency.
- Enhanced the _on_tool_progress method in HermesCLI to update the TUI spinner with tool execution status, improving user feedback during operations.

These changes improve the visual representation and user experience in the command-line interface.

Co-authored-by: Test <test@test.com>
2026-03-20 18:17:38 -07:00
Teknium
8e884fb3f1
Merge pull request #2215 from NousResearch/hermes/hermes-31d7db3b
fix: infer provider from base URL for models.dev context length lookup
2026-03-20 12:52:07 -07:00
Test
76bc27199f fix(cli, agent): improve streaming handling and state management
- Updated _stream_delta method in HermesCLI to handle None values, flushing the stream and resetting state for clean tool execution.
- Enhanced quiet mode handling in AIAgent to ensure proper display closure before tool execution, preventing display issues with intermediate streamed content.

These changes improve the robustness of the streaming functionality and ensure a smoother user experience during tool interactions.
2026-03-20 10:02:42 -07:00
Teknium
66a1942524
feat: add /queue command to queue prompts without interrupting (#2191)
Adds /queue <prompt> (alias /q) that queues a message for the next
turn while the agent is busy, without interrupting the current run.

- CLI: /queue <prompt> puts it in _pending_input for the next turn
- Gateway: /queue <prompt> creates a pending MessageEvent on the
  adapter, picked up after the current agent run finishes
- Enter still interrupts as usual (no behavior change)
- /queue with no prompt shows usage
- /queue when agent is idle tells user to just type normally

Co-authored-by: Test <test@test.com>
2026-03-20 09:44:27 -07:00
Test
1055d4356a fix: skip model auto-detection for custom/local providers
When the user is on a custom provider (provider=custom, localhost, or
127.0.0.1 endpoint), /model <name> no longer tries to auto-detect a
provider switch. The model name changes on the current endpoint as-is.

To switch away from a custom endpoint, users must use explicit
provider:model syntax (e.g. /model openai-codex:gpt-5.2-codex).
A helpful tip is printed when changing models on a custom endpoint.

This prevents the confusing case where someone on LM Studio types
/model gpt-5.2-codex, the auto-detection tries to switch providers,
fails or partially succeeds, and requests still go to the old endpoint.

Also fixes the missing prompt_toolkit.auto_suggest mock stub in
test_cli_init.py (same issue already fixed in test_cli_new_session.py).
2026-03-20 04:35:17 -07:00
Teknium
b19f5133c3
Merge pull request #2118 from NousResearch/hermes/hermes-e83093f0
feat: show reasoning/thinking blocks when show_reasoning is enabled
2026-03-20 04:35:12 -07:00
Test
b1832faaae feat: show reasoning/thinking blocks when show_reasoning is enabled
- Add <thinking> tag to streaming filter's tag list
- When show_reasoning is on, route XML reasoning content to the
  reasoning display box instead of silently discarding it
- Expand _strip_think_blocks to handle all tag variants:
  <think>, <thinking>, <THINKING>, <reasoning>, <REASONING_SCRATCHPAD>
2026-03-19 19:44:31 -07:00
InB4DevOps
fe331ed9bd fix: Reset token counters on new session for accurate usage display (#2099) 2026-03-20 01:21:25 +01:00
Teknium
4c0c7f4c6e
fix: /model command — bare provider names, custom endpoint display
Two issues with /model preventing proper provider switching:

1. Bare provider names not detected: typing '/model nous' treated 'nous'
   as a model name instead of triggering a provider switch. Fixed by adding
   step 0 in detect_provider_for_model() that checks if the input matches
   a known provider name/alias (excluding 'custom'/'openrouter' which need
   explicit model names) and returns that provider's default model.

2. Custom endpoint details hidden: /model (no args) showed '[custom]' with
   just a usage hint but no endpoint URL or model name. Now displays the
   configured base_url for custom providers in both CLI and gateway.

Note: config base_url and OPENAI_BASE_URL are intentionally NOT cleared on
provider switch — dedicated provider paths (nous, anthropic, codex) have
their own credential resolution that ignores these, and clearing them would
destroy the user's custom endpoint config, preventing switching back.

Co-authored-by: Test <test@test.com>
2026-03-19 12:06:48 -07:00
StefanIsMe
04b6ecadc4 feat(cli): Tab now accepts auto-suggestions (ghost text)
Previously, Tab only handled dropdown completions. Users seeing gray
ghost text from history-based suggestions had no way to accept them
with Tab - they had to use Right arrow or Ctrl+E.

Now Tab follows priority:
1. Completion menu open → accept selected completion
2. Ghost text suggestion available → accept auto-suggestion
3. Otherwise → start completion menu

This matches user intuition that Tab should 'complete what I see.'
2026-03-19 10:40:37 -07:00
cmcleay
bb59057d5d fix: normalize live Chrome CDP endpoints for browser tools 2026-03-19 10:17:03 -07:00
Teknium
d76fa7fc37
fix: detect context length for custom model endpoints via fuzzy matching + config override (#2051)
* fix: detect context length for custom model endpoints via fuzzy matching + config override

Custom model endpoints (non-OpenRouter, non-known-provider) were silently
falling back to 2M tokens when the model name didn't exactly match what the
endpoint's /v1/models reported. This happened because:

1. Endpoint metadata lookup used exact match only — model name mismatches
   (e.g. 'qwen3.5:9b' vs 'Qwen3.5-9B-Q4_K_M.gguf') caused a miss
2. Single-model servers (common for local inference) required exact name
   match even though only one model was loaded
3. No user escape hatch to manually set context length

Changes:
- Add fuzzy matching for endpoint model metadata: single-model servers
  use the only available model regardless of name; multi-model servers
  try substring matching in both directions
- Add model.context_length config override (highest priority) so users
  can explicitly set their model's context length in config.yaml
- Log an informative message when falling back to 2M probe, telling
  users about the config override option
- Thread config_context_length through ContextCompressor and AIAgent init

Tests: 6 new tests covering fuzzy match, single-model fallback, config
override (including zero/None edge cases).

* fix: auto-detect local model name and context length for local servers

Cherry-picked from PR #2043 by sudoingX.

- Auto-detect model name from local server's /v1/models when only one
  model is loaded (no manual model name config needed)
- Add n_ctx_train and n_ctx to context length detection keys for llama.cpp
- Query llama.cpp /props endpoint for actual allocated context (not just
  training context from GGUF metadata)
- Strip .gguf suffix from display in banner and status bar
- _auto_detect_local_model() in runtime_provider.py for CLI init

Co-authored-by: sudo <sudoingx@users.noreply.github.com>

* fix: revert accidental summary_target_tokens change + add docs for context_length config

- Revert summary_target_tokens from 2500 back to 500 (accidental change
  during patching)
- Add 'Context Length Detection' section to Custom & Self-Hosted docs
  explaining model.context_length config override

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: sudo <sudoingx@users.noreply.github.com>
2026-03-19 06:01:16 -07:00
Test
e7844e9c8d Merge origin/main, resolve conflicts (self._base_url_lower) 2026-03-18 04:09:00 -07:00
Test
c1750bb32d feat(cli): add /statusbar command to toggle context bar
Adds /statusbar (alias /sb) to show/hide the bottom status bar that
displays model name, context usage, and session duration.

Uses ConditionalContainer so the bar takes zero space when hidden
rather than leaving a blank line.
2026-03-18 03:49:49 -07:00
Test
8422196e89 Merge PR #1879: feat: integrate GitHub Copilot providers 2026-03-18 03:18:33 -07:00
Teknium
24ac577046
fix: respect model.default from config.yaml for openai-codex provider (#1896)
When config.yaml had a non-default model (e.g. gpt-5.3-codex) and the
provider was openai-codex, _normalize_model_for_provider() would replace
it with the latest available codex model because _model_is_default only
checked the CLI argument, not the config value.

Now _model_is_default is False when config.yaml has a model that differs
from the global fallback (anthropic/claude-opus-4.6), so the user's
explicit config choice is preserved.

Fixes #1887

Co-authored-by: Test <test@test.com>
2026-03-18 02:50:31 -07:00
Test
a8132d1252 fix: respect model.default from config.yaml for openai-codex provider
When config.yaml had a non-default model (e.g. gpt-5.3-codex) and the
provider was openai-codex, _normalize_model_for_provider() would replace
it with the latest available codex model because _model_is_default only
checked the CLI argument, not the config value.

Now _model_is_default is False when config.yaml has a model that differs
from the global fallback (anthropic/claude-opus-4.6), so the user's
explicit config choice is preserved.

Fixes #1887
2026-03-18 02:24:41 -07:00
max
0c392e7a87 feat: integrate GitHub Copilot providers across Hermes
Add first-class GitHub Copilot and Copilot ACP provider support across
model selection, runtime provider resolution, CLI sessions, delegated
subagents, cron jobs, and the Telegram gateway.

This also normalizes Copilot model catalogs and API modes, introduces a
Copilot ACP OpenAI-compatible shim, and fixes service-mode auth by
resolving Homebrew-installed gh binaries under launchd.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-17 23:40:22 -07:00
Test
a71e3f4d98 fix: add /browser to COMMAND_REGISTRY so it shows in help and autocomplete
The /browser command handler existed in cli.py but was never added to
COMMAND_REGISTRY after the centralized command registry refactor. This
meant:
- /browser didn't appear in /help
- No tab-completion or subcommand suggestions
- Dispatch used _base_word fallback instead of canonical resolution

Added CommandDef with connect/disconnect/status subcommands and
switched dispatch to use canonical instead of _base_word.
2026-03-17 13:29:36 -07:00
Teknium
7ac9088d5c
fix: Telegram streaming — config bridge, not-modified, flood control (#1782)
* fix: NameError in OpenCode provider setup (prompt_text -> prompt)

The OpenCode Zen and OpenCode Go setup sections used prompt_text()
which is undefined. All other providers correctly use the local
prompt() function defined in setup.py. Fixes crash during
'hermes setup' when selecting either OpenCode provider.

* fix: Telegram streaming — config bridge, not-modified, flood control

Three fixes for gateway streaming:

1. Bridge streaming config from config.yaml into gateway runtime.
   load_gateway_config() now reads the 'streaming' key from config.yaml
   (same pattern as session_reset, stt, etc.), matching the docs.
   Previously only gateway.json was read.

2. Handle 'Message is not modified' in Telegram edit_message().
   This Telegram API error fires when editing with identical content —
   a no-op, not a real failure. Previously it returned success=False
   which made the stream consumer disable streaming entirely.

3. Handle RetryAfter / flood control in Telegram edit_message().
   Fast providers can hit Telegram rate limits during streaming.
   Now waits the requested retry_after duration and retries once,
   instead of treating it as a fatal edit failure.

Also fixed double-edit on stream finish: the consumer now tracks
last-sent text and skips redundant edits, preventing the not-modified
error at the source.

* refactor: make config.yaml the primary gateway config source

Eliminates the per-key bridge pattern in load_gateway_config().
Previously gateway.json was the primary source and each config.yaml
key needed an individual bridge — easy to forget (streaming was
missing, causing garl4546's bug).

Now config.yaml is read first and its keys are mapped directly into
the GatewayConfig.from_dict() schema. gateway.json is kept as a
legacy fallback layer (loaded first, then overwritten by config.yaml
keys). If gateway.json exists, a log message suggests migrating.

Also:
- Removed dead save_gateway_config() (never called anywhere)
- Updated CLI help text and send_message error to reference
  config.yaml instead of gateway.json

---------

Co-authored-by: Test <test@test.com>
2026-03-17 10:51:54 -07:00
teknium1
c881209b92 Revert "feat(cli): skin-aware light/dark theme mode with terminal auto-detection"
This reverts commit a1c81360a5.
2026-03-17 10:04:53 -07:00
Teknium
d1d17f4f0a
feat(compression): add summary_base_url + move compression config to YAML-only
- Add summary_base_url config option to compression block for custom
  OpenAI-compatible endpoints (e.g. zai, DeepSeek, Ollama)
- Remove compression env var bridges from cli.py and gateway/run.py
  (CONTEXT_COMPRESSION_* env vars no longer set from config)
- Switch run_agent.py to read compression config directly from
  config.yaml instead of env vars
- Fix backwards-compat block in _resolve_task_provider_model to also
  fire when auxiliary.compression.provider is 'auto' (DEFAULT_CONFIG
  sets this, which was silently preventing the compression section's
  summary_* keys from being read)
- Add test for summary_base_url config-to-client flow
- Update docs to show compression as config.yaml-only

Closes #1591
Based on PR #1702 by @uzaylisak
2026-03-17 04:46:15 -07:00
teknium1
0897e4350e merge: resolve conflicts with origin/main 2026-03-17 04:30:37 -07:00
teknium1
e5fc916814 feat: auto-generate session titles after first exchange
After the first user→assistant exchange, Hermes now generates a short
descriptive session title via the auxiliary LLM (compression task config).
Title generation runs in a background thread so it never delays the
user-facing response.

Key behaviors:
- Fires only on the first 1-2 exchanges (checks user message count)
- Skips if a title already exists (user-set titles are never overwritten)
- Uses call_llm with compression task config (cheapest/fastest model)
- Truncates long messages to keep the title generation request small
- Cleans up LLM output: strips quotes, 'Title:' prefixes, enforces 80 char max
- Works in both CLI and gateway (Telegram/Discord/etc.)

Also updates /title (no args) to show the session ID alongside the title
in both CLI and gateway.

Implements #1426
2026-03-17 04:14:40 -07:00
Teknium
d417ba2a48
feat: add route-aware pricing estimates (#1695)
Salvaged from PR #1563 by @kshitijk4poor. Cherry-picked with authorship preserved.

- Route-aware pricing architecture replacing static MODEL_PRICING + heuristics
- Canonical usage normalization (Anthropic/OpenAI/Codex API shapes)
- Cache-aware billing (separate cache_read/cache_write rates)
- Cost status tracking (estimated/included/unknown/actual)
- OpenRouter live pricing via models API
- Schema migration v4→v5 with billing metadata columns
- Removed speculative forward-looking entries
- Removed cost display from CLI status bar
- Threaded OpenRouter metadata pre-warm

Co-authored-by: kshitij <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-17 03:44:44 -07:00
Teknium
1d5a39e002
fix: thread safety for concurrent subagent delegation (#1672)
* fix: thread safety for concurrent subagent delegation

Four thread-safety fixes that prevent crashes and data races when
running multiple subagents concurrently via delegate_task:

1. Remove redirect_stdout/stderr from delegate_tool — mutating global
   sys.stdout races with the spinner thread when multiple children start
   concurrently, causing segfaults. Children already run with
   quiet_mode=True so the redirect was redundant.

2. Split _run_single_child into _build_child_agent (main thread) +
   _run_single_child (worker thread). AIAgent construction creates
   httpx/SSL clients which are not thread-safe to initialize
   concurrently.

3. Add threading.Lock to SessionDB — subagents share the parent's
   SessionDB and call create_session/append_message from worker threads
   with no synchronization.

4. Add _active_children_lock to AIAgent — interrupt() iterates
   _active_children while worker threads append/remove children.

5. Add _client_cache_lock to auxiliary_client — multiple subagent
   threads may resolve clients concurrently via call_llm().

Based on PR #1471 by peteromallet.

* feat: Honcho base_url override via config.yaml + quick command alias type

Two features salvaged from PR #1576:

1. Honcho base_url override: allows pointing Hermes at a remote
   self-hosted Honcho deployment via config.yaml:

     honcho:
       base_url: "http://192.168.x.x:8000"

   When set, this overrides the Honcho SDK's environment mapping
   (production/local), enabling LAN/VPN Honcho deployments without
   requiring the server to live on localhost. Uses config.yaml instead
   of env var (HONCHO_URL) per project convention.

2. Quick command alias type: adds a new 'alias' quick command type
   that rewrites to another slash command before normal dispatch:

     quick_commands:
       sc:
         type: alias
         target: /context

   Supports both CLI and gateway. Arguments are forwarded to the
   target command.

Based on PR #1576 by redhelix.

---------

Co-authored-by: peteromallet <peteromallet@users.noreply.github.com>
Co-authored-by: redhelix <redhelix@users.noreply.github.com>
2026-03-17 02:53:33 -07:00
teknium1
a1c81360a5 feat(cli): skin-aware light/dark theme mode with terminal auto-detection
Add display.theme_mode setting (auto/light/dark) that makes the CLI
readable on light terminal backgrounds.

- Auto-detect terminal background via COLORFGBG, OSC 11, and macOS
  appearance (fallback chain in hermes_cli/colors.py)
- Add colors_light overrides to all 7 built-in skins with dark/readable
  colors for light backgrounds
- SkinConfig.get_color() now returns light overrides when theme is light
- get_prompt_toolkit_style_overrides() uses light bg colors for
  completion menus in light mode
- init_skin_from_config() reads display.theme_mode from config
- 7 new tests covering theme mode resolution, detection fallbacks,
  and light-mode skin overrides

Salvaged from PR #1187 by @peteromallet. Core design preserved;
adapted to current main (kept all existing helpers, tool_emojis,
convenience functions that were added after the PR branched).

Co-authored-by: Peter O'Mallet <peteromallet@users.noreply.github.com>
2026-03-17 02:51:40 -07:00
Teknium
556e0f4b43 fix(docker): add explicit env allowlist for container credentials (#1436)
Docker terminal sessions are secret-dark by default. This adds
terminal.docker_forward_env as an explicit allowlist for env vars
that may be forwarded into Docker containers.

Values resolve from the current shell first, then fall back to
~/.hermes/.env. Only variables the user explicitly lists are
forwarded — nothing is auto-exposed.

Cherry-picked from PR #1449 by @teknium1, conflict-resolved onto
current main.

Fixes #1436
Supersedes #1439
2026-03-17 02:34:35 -07:00
Teknium
8992babaa3
fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)
* fix: prevent infinite 400 failure loop on context overflow (#1630)

When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message.  This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error.  Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.

Three-layer fix:

1. run_agent.py — Fallback heuristic: when a 400 error has a very short
   generic message AND the session is large (>40% of context or >80
   messages), treat it as a probable context overflow and trigger
   compression instead of aborting.

2. run_agent.py + gateway/run.py — Don't persist failed messages:
   when the agent returns failed=True before generating any response,
   skip writing the user's message to the transcript/DB. This prevents
   the session from growing on each failure.

3. gateway/run.py — Smarter error messages: detect context-overflow
   failures and suggest /compact or /reset specifically, instead of a
   generic 'try again' that will fail identically.

* fix(skills): detect prompt injection patterns and block cache file reads

Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):

1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
   (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
   was the original injection vector — untrusted skill descriptions
   in the catalog contained adversarial text that the model executed.

2. skill_view: warns when skills are loaded from outside the trusted
   ~/.hermes/skills/ directory, and detects common injection patterns
   in skill content ("ignore previous instructions", "<system>", etc.).

Cherry-picked from PR #1562 by ygd58.

* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)

Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.

- Apply truncate_message() chunking in _send_to_platform() before
  dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
  in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement

Cherry-picked from PR #1557 by llbn.

* fix(approval): show full command in dangerous command approval (#1553)

Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:

- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
  in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests

Cherry-picked from PR #1566 by crazywriter1.

* fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)

The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.

Fix: call _invalidate() on each queue timeout (every ~100ms, throttled
to 150ms) to force the renderer to flush buffered agent output.

---------

Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
2026-03-17 02:09:26 -07:00
Teknium
49043b7b7d
feat: add /tools disable/enable/list slash commands with session reset (#1652)
Add in-session tool management via /tools disable/enable/list, plus
hermes tools list/disable/enable CLI subcommands. Supports both
built-in toolsets (web, memory) and MCP tools (github:create_issue).

To preserve prompt caching, /tools disable/enable in a chat session
saves the change to config and resets the session cleanly — the user
is asked to confirm before the reset happens.

Also improves prefix matching: /qui now dispatches to /quit instead
of showing ambiguous when longer skill commands like /quint-pipeline
are installed.

Based on PR #1520 by @YanSte.

Co-authored-by: Yannick Stephan <YanSte@users.noreply.github.com>
2026-03-17 02:05:26 -07:00
Teknium
3744118311
feat(cli): two-stage /model autocomplete with ghost text suggestions (#1641)
* feat(cli): two-stage /model autocomplete with ghost text suggestions

- SlashCommandCompleter: Tab-complete providers first (anthropic:, openrouter:, etc.)
  then models within the selected provider
- SlashCommandAutoSuggest: inline ghost text for slash commands, subcommands,
  and /model provider:model two-stage suggestions
- Custom Tab key binding: accepts provider completion and immediately
  re-triggers completions to show that provider's models
- COMMANDS_BY_CATEGORY: structured format with explicit subcommands for
  tab completion and ghost text (prompt, reasoning, voice, skills, cron, browser)
- SUBCOMMANDS dict auto-extracted from command definitions
- Model/provider info cached 60s for responsive completions

* fix: repair test regression and restore gold color from PR #1622

- Fix test_unknown_command_still_shows_error: patch _cprint instead of
  console.print to match the _cprint switch in process_command()
- Restore gold color on 'Type /help' hint using _DIM + _GOLD constants
  instead of bare \033[2m (was losing the #B8860B gold)
- Use _GOLD constant for ambiguous command message for consistency
- Add clarifying comment on SUBCOMMANDS regex fallback

---------

Co-authored-by: Lars van der Zande <lmvanderzande@gmail.com>
2026-03-17 01:47:32 -07:00
Teknium
46176c8029
refactor: centralize slash command registry (#1603)
* refactor: centralize slash command registry

Replace 7+ scattered command definition sites with a single
CommandDef registry in hermes_cli/commands.py. All downstream
consumers now derive from this registry:

- CLI process_command() resolves aliases via resolve_command()
- Gateway _known_commands uses GATEWAY_KNOWN_COMMANDS frozenset
- Gateway help text generated by gateway_help_lines()
- Telegram BotCommands generated by telegram_bot_commands()
- Slack subcommand map generated by slack_subcommand_map()

Adding a command or alias is now a one-line change to
COMMAND_REGISTRY instead of touching 6+ files.

Bugfixes included:
- Telegram now registers /rollback, /background (were missing)
- Slack now has /voice, /update, /reload-mcp (were missing)
- Gateway duplicate 'reasoning' dispatch (dead code) removed
- Gateway help text can no longer drift from CLI help

Backwards-compatible: COMMANDS and COMMANDS_BY_CATEGORY dicts are
rebuilt from the registry, so existing imports work unchanged.

* docs: update developer docs for centralized command registry

Update AGENTS.md with full 'Slash Command Registry' and 'Adding a
Slash Command' sections covering CommandDef fields, registry helpers,
and the one-line alias workflow.

Also update:
- CONTRIBUTING.md: commands.py description
- website/docs/reference/slash-commands.md: reference central registry
- docs/plans/centralize-command-registry.md: mark COMPLETED
- plans/checkpoint-rollback.md: reference new pattern
- hermes-agent-dev skill: architecture table

* chore: remove stale plan docs
2026-03-16 23:21:03 -07:00
Teknium
6794e79bb4
feat: add /bg as alias for /background slash command (#1590)
* feat: add optional smart model routing

Add a conservative cheap-vs-strong routing option that can send very short/simple turns to a cheaper model across providers while keeping the primary model for complex work. Wire it through CLI, gateway, and cron, and document the config.yaml workflow.

* fix(gateway): remove recursive ExecStop from systemd units, extend TimeoutStopSec to 60s

* fix(gateway): avoid recursive ExecStop in user systemd unit

* fix: extend ExecStop removal and TimeoutStopSec=60 to system unit

The cherry-picked PR #1448 fix only covered the user systemd unit.
The system unit had the same TimeoutStopSec=15 and could benefit
from the same 60s timeout for clean shutdown. Also adds a regression
test for the system unit.

---------

Co-authored-by: Ninja <ninja@local>

* feat(skills): add blender-mcp optional skill for 3D modeling

Control a running Blender instance from Hermes via socket connection
to the blender-mcp addon (port 9876). Supports creating 3D objects,
materials, animations, and running arbitrary bpy code.

Placed in optional-skills/ since it requires Blender 4.3+ desktop
with a third-party addon manually started each session.

* feat(acp): support slash commands in ACP adapter (#1532)

Adds /help, /model, /tools, /context, /reset, /compact, /version
to the ACP adapter (VS Code, Zed, JetBrains). Commands are handled
directly in the server without instantiating the TUI — each command
queries agent/session state and returns plain text.

Unrecognized /commands fall through to the LLM as normal messages.

/model uses detect_provider_for_model() for auto-detection when
switching models, matching the CLI and gateway behavior.

Fixes #1402

* fix(logging): improve error logging in session search tool (#1533)

* fix(gateway): restart on retryable startup failures (#1517)

* feat(email): add skip_attachments option via config.yaml

* feat(email): add skip_attachments option via config.yaml

Adds a config.yaml-driven option to skip email attachments in the
gateway email adapter. Useful for malware protection and bandwidth
savings.

Configure in config.yaml:
  platforms:
    email:
      skip_attachments: true

Based on PR #1521 by @an420eth, changed from env var to config.yaml
(via PlatformConfig.extra) to match the project's config-first pattern.

* docs: document skip_attachments option for email adapter

* fix(telegram): retry on transient TLS failures during connect and send

Add exponential-backoff retry (3 attempts) around initialize() to
handle transient TLS resets during gateway startup. Also catches
TimedOut and OSError in addition to NetworkError.

Add exponential-backoff retry (3 attempts) around send_message() for
NetworkError during message delivery, wrapping the existing Markdown
fallback logic.

Both imports are guarded with try/except ImportError for test
environments where telegram is mocked.

Based on PR #1527 by cmd8. Closes #1526.

* feat: permissive block_anchor thresholds and unicode normalization (#1539)

Salvaged from PR #1528 by an420eth. Closes #517.

Improves _strategy_block_anchor in fuzzy_match.py:
- Add unicode normalization (smart quotes, em/en-dashes, ellipsis,
  non-breaking spaces → ASCII) so LLM-produced unicode artifacts
  don't break anchor line matching
- Lower thresholds: 0.10 for unique matches (was 0.70), 0.30 for
  multiple candidates — if first/last lines match exactly, the
  block is almost certainly correct
- Use original (non-normalized) content for offset calculation to
  preserve correct character positions

Tested: 3 new scenarios fixed (em-dash anchors, non-breaking space
anchors, very-low-similarity unique matches), zero regressions on
all 9 existing fuzzy match tests.

Co-authored-by: an420eth <an420eth@users.noreply.github.com>

* feat(cli): add file path autocomplete in the input prompt (#1545)

When typing a path-like token (./  ../  ~/  /  or containing /),
the CLI now shows filesystem completions in the dropdown menu.
Directories show a trailing slash and 'dir' label; files show
their size. Completions are case-insensitive and capped at 30
entries.

Triggered by tokens like:
  edit ./src/ma     → shows ./src/main.py, ./src/manifest.json, ...
  check ~/doc       → shows ~/docs/, ~/documents/, ...
  read /etc/hos     → shows /etc/hosts, /etc/hostname, ...
  open tools/reg    → shows tools/registry.py

Slash command autocomplete (/help, /model, etc.) is unaffected —
it still triggers when the input starts with /.

Inspired by OpenCode PR #145 (file path completion menu).

Implementation:
- hermes_cli/commands.py: _extract_path_word() detects path-like
  tokens, _path_completions() yields filesystem Completions with
  size labels, get_completions() routes to paths vs slash commands
- tests/hermes_cli/test_path_completion.py: 26 tests covering
  path extraction, prefix filtering, directory markers, home
  expansion, case-insensitivity, integration with slash commands

* feat(privacy): redact PII from LLM context when privacy.redact_pii is enabled

Add privacy.redact_pii config option (boolean, default false). When
enabled, the gateway redacts personally identifiable information from
the system prompt before sending it to the LLM provider:

- Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256>
- User IDs → hashed to user_<sha256>
- Chat IDs → numeric portion hashed, platform prefix preserved
- Home channel IDs → hashed
- Names/usernames → NOT affected (user-chosen, publicly visible)

Hashes are deterministic (same user → same hash) so the model can
still distinguish users in group chats. Routing and delivery use
the original values internally — redaction only affects LLM context.

Inspired by OpenClaw PR #47959.

* fix(privacy): skip PII redaction on Discord/Slack (mentions need real IDs)

Discord uses <@user_id> for mentions and Slack uses <@U12345> — the LLM
needs the real ID to tag users. Redaction now only applies to WhatsApp,
Signal, and Telegram where IDs are pure routing metadata.

Add 4 platform-specific tests covering Discord, WhatsApp, Signal, Slack.

* feat: smart approvals + /stop command (inspired by OpenAI Codex)

* feat: smart approvals — LLM-based risk assessment for dangerous commands

Adds a 'smart' approval mode that uses the auxiliary LLM to assess
whether a flagged command is genuinely dangerous or a false positive,
auto-approving low-risk commands without prompting the user.

Inspired by OpenAI Codex's Smart Approvals guardian subagent
(openai/codex#13860).

Config (config.yaml):
  approvals:
    mode: manual   # manual (default), smart, off

Modes:
- manual — current behavior, always prompt the user
- smart  — aux LLM evaluates risk: APPROVE (auto-allow), DENY (block),
           or ESCALATE (fall through to manual prompt)
- off    — skip all approval prompts (equivalent to --yolo)

When smart mode auto-approves, the pattern gets session-level approval
so subsequent uses of the same pattern don't trigger another LLM call.
When it denies, the command is blocked without user prompt. When
uncertain, it escalates to the normal manual approval flow.

The LLM prompt is carefully scoped: it sees only the command text and
the flagged reason, assesses actual risk vs false positive, and returns
a single-word verdict.

* feat: make smart approval model configurable via config.yaml

Adds auxiliary.approval section to config.yaml with the same
provider/model/base_url/api_key pattern as other aux tasks (vision,
web_extract, compression, etc.).

Config:
  auxiliary:
    approval:
      provider: auto
      model: ''        # fast/cheap model recommended
      base_url: ''
      api_key: ''

Bridged to env vars in both CLI and gateway paths so the aux client
picks them up automatically.

* feat: add /stop command to kill all background processes

Adds a /stop slash command that kills all running background processes
at once. Currently users have to process(list) then process(kill) for
each one individually.

Inspired by OpenAI Codex's separation of interrupt (Ctrl+C stops current
turn) from /stop (cleans up background processes). See openai/codex#14602.

Ctrl+C continues to only interrupt the active agent turn — background
dev servers, watchers, etc. are preserved. /stop is the explicit way
to clean them all up.

* feat: first-class plugin architecture + hide status bar cost by default (#1544)

The persistent status bar now shows context %, token counts, and
duration but NOT $ cost by default. Cost display is opt-in via:

  display:
    show_cost: true

in config.yaml, or: hermes config set display.show_cost true

The /usage command still shows full cost breakdown since the user
explicitly asked for it — this only affects the always-visible bar.

Status bar without cost:
  ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ 15m

Status bar with show_cost: true:
  ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ $0.06 │ 15m

* feat: improve memory prioritization + aggressive skill updates (inspired by OpenAI Codex)

* feat: improve memory prioritization — user preferences over procedural knowledge

Inspired by OpenAI Codex's memory prompt improvements (openai/codex#14493)
which focus memory writes on user preferences and recurring patterns
rather than procedural task details.

Key insight: 'Optimize for reducing future user steering — the most
valuable memory prevents the user from having to repeat themselves.'

Changes:
- MEMORY_GUIDANCE (prompt_builder.py): added prioritization hierarchy
  and the core principle about reducing user steering
- MEMORY_SCHEMA (memory_tool.py): reordered WHEN TO SAVE list to put
  corrections first, added explicit PRIORITY guidance
- Memory nudge (run_agent.py): now asks specifically about preferences,
  corrections, and workflow patterns instead of generic 'anything'
- Memory flush (run_agent.py): now instructs to prioritize user
  preferences and corrections over task-specific details

* feat: more aggressive skill creation and update prompting

Press harder on skill updates — the agent should proactively patch
skills when it encounters issues during use, not wait to be asked.

Changes:
- SKILLS_GUIDANCE: 'consider saving' → 'save'; added explicit instruction
  to patch skills immediately when found outdated/wrong
- Skills header: added instruction to update loaded skills before finishing
  if they had missing steps or wrong commands
- Skill nudge: more assertive ('save the approach' not 'consider saving'),
  now also prompts for updating existing skills used in the task
- Skill nudge interval: lowered default from 15 to 10 iterations
- skill_manage schema: added 'patch it immediately' to update triggers

* feat: first-class plugin architecture (#1555)

Plugin system for extending Hermes with custom tools, hooks, and
integrations — no source code changes required.

Core system (hermes_cli/plugins.py):
  - Plugin discovery from ~/.hermes/plugins/, .hermes/plugins/, and
    pip entry_points (hermes_agent.plugins group)
  - PluginContext with register_tool() and register_hook()
  - 6 lifecycle hooks: pre/post tool_call, pre/post llm_call,
    on_session_start/end
  - Namespace package handling for relative imports in plugins
  - Graceful error isolation — broken plugins never crash the agent

Integration (model_tools.py):
  - Plugin discovery runs after built-in + MCP tools
  - Plugin tools bypass toolset filter via get_plugin_tool_names()
  - Pre/post tool call hooks fire in handle_function_call()

CLI:
  - /plugins command shows loaded plugins, tool counts, status
  - Added to COMMANDS dict for autocomplete

Docs:
  - Getting started guide (build-a-hermes-plugin.md) — full tutorial
    building a calculator plugin step by step
  - Reference page (features/plugins.md) — quick overview + tables
  - Covers: file structure, schemas, handlers, hooks, data files,
    bundled skills, env var gating, pip distribution, common mistakes

Tests: 16 tests covering discovery, loading, hooks, tool visibility.

* feat: add /bg as alias for /background slash command

Adds /bg alias across CLI, gateway, and Slack platform adapter.
Updates help text, autocomplete, known_commands set, and dispatch
logic. Includes tests for the new alias.

* docs: add plan for centralized slash command registry

Scopes a refactor to replace 7+ scattered command definition sites
with a single CommandDef registry in hermes_cli/commands.py. Includes
derived helper functions for gateway help text, Telegram BotCommands,
Slack subcommand maps, and alias resolution.

Documents current drift (Telegram missing /rollback + /background,
Slack missing /voice + /update, gateway dead code) that the refactor
fixes for free.

---------

Co-authored-by: Ninja <ninja@local>
Co-authored-by: alireza78a <alireza78a@users.noreply.github.com>
Co-authored-by: Oktay Aydin <113846926+aydnOktay@users.noreply.github.com>
Co-authored-by: JP Lew <polydegen@protonmail.com>
Co-authored-by: an420eth <an420eth@users.noreply.github.com>
2026-03-16 17:27:02 -07:00
Teknium
181077b785
fix: hide Honcho session line on CLI load when no API key configured (#1582)
HonchoClientConfig.from_env() set enabled=True unconditionally,
even when HONCHO_API_KEY was not set. When ~/.honcho/config.json
didn't exist, from_global_config() fell back to from_env() and
returned enabled=True with a null api_key, causing the Honcho
session indicator to display on every CLI launch.

Fix: from_env() now sets enabled=bool(api_key), matching the
auto-enable logic already used in from_global_config().
Also added api_key guard to the CLI display as defense-in-depth.
2026-03-16 17:22:52 -07:00
teknium1
f4d61c168b merge: resolve conflicts with main (show_cost, turn routing, docker docs) 2026-03-16 14:22:38 -07:00
Teknium
5e5c92663d
fix: hermes update causes dual gateways on macOS (launchd) (#1567)
* feat: add optional smart model routing

Add a conservative cheap-vs-strong routing option that can send very short/simple turns to a cheaper model across providers while keeping the primary model for complex work. Wire it through CLI, gateway, and cron, and document the config.yaml workflow.

* fix(gateway): remove recursive ExecStop from systemd units, extend TimeoutStopSec to 60s

* fix(gateway): avoid recursive ExecStop in user systemd unit

* fix: extend ExecStop removal and TimeoutStopSec=60 to system unit

The cherry-picked PR #1448 fix only covered the user systemd unit.
The system unit had the same TimeoutStopSec=15 and could benefit
from the same 60s timeout for clean shutdown. Also adds a regression
test for the system unit.

---------

Co-authored-by: Ninja <ninja@local>

* feat(skills): add blender-mcp optional skill for 3D modeling

Control a running Blender instance from Hermes via socket connection
to the blender-mcp addon (port 9876). Supports creating 3D objects,
materials, animations, and running arbitrary bpy code.

Placed in optional-skills/ since it requires Blender 4.3+ desktop
with a third-party addon manually started each session.

* feat(acp): support slash commands in ACP adapter (#1532)

Adds /help, /model, /tools, /context, /reset, /compact, /version
to the ACP adapter (VS Code, Zed, JetBrains). Commands are handled
directly in the server without instantiating the TUI — each command
queries agent/session state and returns plain text.

Unrecognized /commands fall through to the LLM as normal messages.

/model uses detect_provider_for_model() for auto-detection when
switching models, matching the CLI and gateway behavior.

Fixes #1402

* fix(logging): improve error logging in session search tool (#1533)

* fix(gateway): restart on retryable startup failures (#1517)

* feat(email): add skip_attachments option via config.yaml

* feat(email): add skip_attachments option via config.yaml

Adds a config.yaml-driven option to skip email attachments in the
gateway email adapter. Useful for malware protection and bandwidth
savings.

Configure in config.yaml:
  platforms:
    email:
      skip_attachments: true

Based on PR #1521 by @an420eth, changed from env var to config.yaml
(via PlatformConfig.extra) to match the project's config-first pattern.

* docs: document skip_attachments option for email adapter

* fix(telegram): retry on transient TLS failures during connect and send

Add exponential-backoff retry (3 attempts) around initialize() to
handle transient TLS resets during gateway startup. Also catches
TimedOut and OSError in addition to NetworkError.

Add exponential-backoff retry (3 attempts) around send_message() for
NetworkError during message delivery, wrapping the existing Markdown
fallback logic.

Both imports are guarded with try/except ImportError for test
environments where telegram is mocked.

Based on PR #1527 by cmd8. Closes #1526.

* feat: permissive block_anchor thresholds and unicode normalization (#1539)

Salvaged from PR #1528 by an420eth. Closes #517.

Improves _strategy_block_anchor in fuzzy_match.py:
- Add unicode normalization (smart quotes, em/en-dashes, ellipsis,
  non-breaking spaces → ASCII) so LLM-produced unicode artifacts
  don't break anchor line matching
- Lower thresholds: 0.10 for unique matches (was 0.70), 0.30 for
  multiple candidates — if first/last lines match exactly, the
  block is almost certainly correct
- Use original (non-normalized) content for offset calculation to
  preserve correct character positions

Tested: 3 new scenarios fixed (em-dash anchors, non-breaking space
anchors, very-low-similarity unique matches), zero regressions on
all 9 existing fuzzy match tests.

Co-authored-by: an420eth <an420eth@users.noreply.github.com>

* feat(cli): add file path autocomplete in the input prompt (#1545)

When typing a path-like token (./  ../  ~/  /  or containing /),
the CLI now shows filesystem completions in the dropdown menu.
Directories show a trailing slash and 'dir' label; files show
their size. Completions are case-insensitive and capped at 30
entries.

Triggered by tokens like:
  edit ./src/ma     → shows ./src/main.py, ./src/manifest.json, ...
  check ~/doc       → shows ~/docs/, ~/documents/, ...
  read /etc/hos     → shows /etc/hosts, /etc/hostname, ...
  open tools/reg    → shows tools/registry.py

Slash command autocomplete (/help, /model, etc.) is unaffected —
it still triggers when the input starts with /.

Inspired by OpenCode PR #145 (file path completion menu).

Implementation:
- hermes_cli/commands.py: _extract_path_word() detects path-like
  tokens, _path_completions() yields filesystem Completions with
  size labels, get_completions() routes to paths vs slash commands
- tests/hermes_cli/test_path_completion.py: 26 tests covering
  path extraction, prefix filtering, directory markers, home
  expansion, case-insensitivity, integration with slash commands

* feat(privacy): redact PII from LLM context when privacy.redact_pii is enabled

Add privacy.redact_pii config option (boolean, default false). When
enabled, the gateway redacts personally identifiable information from
the system prompt before sending it to the LLM provider:

- Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256>
- User IDs → hashed to user_<sha256>
- Chat IDs → numeric portion hashed, platform prefix preserved
- Home channel IDs → hashed
- Names/usernames → NOT affected (user-chosen, publicly visible)

Hashes are deterministic (same user → same hash) so the model can
still distinguish users in group chats. Routing and delivery use
the original values internally — redaction only affects LLM context.

Inspired by OpenClaw PR #47959.

* fix(privacy): skip PII redaction on Discord/Slack (mentions need real IDs)

Discord uses <@user_id> for mentions and Slack uses <@U12345> — the LLM
needs the real ID to tag users. Redaction now only applies to WhatsApp,
Signal, and Telegram where IDs are pure routing metadata.

Add 4 platform-specific tests covering Discord, WhatsApp, Signal, Slack.

* feat: smart approvals + /stop command (inspired by OpenAI Codex)

* feat: smart approvals — LLM-based risk assessment for dangerous commands

Adds a 'smart' approval mode that uses the auxiliary LLM to assess
whether a flagged command is genuinely dangerous or a false positive,
auto-approving low-risk commands without prompting the user.

Inspired by OpenAI Codex's Smart Approvals guardian subagent
(openai/codex#13860).

Config (config.yaml):
  approvals:
    mode: manual   # manual (default), smart, off

Modes:
- manual — current behavior, always prompt the user
- smart  — aux LLM evaluates risk: APPROVE (auto-allow), DENY (block),
           or ESCALATE (fall through to manual prompt)
- off    — skip all approval prompts (equivalent to --yolo)

When smart mode auto-approves, the pattern gets session-level approval
so subsequent uses of the same pattern don't trigger another LLM call.
When it denies, the command is blocked without user prompt. When
uncertain, it escalates to the normal manual approval flow.

The LLM prompt is carefully scoped: it sees only the command text and
the flagged reason, assesses actual risk vs false positive, and returns
a single-word verdict.

* feat: make smart approval model configurable via config.yaml

Adds auxiliary.approval section to config.yaml with the same
provider/model/base_url/api_key pattern as other aux tasks (vision,
web_extract, compression, etc.).

Config:
  auxiliary:
    approval:
      provider: auto
      model: ''        # fast/cheap model recommended
      base_url: ''
      api_key: ''

Bridged to env vars in both CLI and gateway paths so the aux client
picks them up automatically.

* feat: add /stop command to kill all background processes

Adds a /stop slash command that kills all running background processes
at once. Currently users have to process(list) then process(kill) for
each one individually.

Inspired by OpenAI Codex's separation of interrupt (Ctrl+C stops current
turn) from /stop (cleans up background processes). See openai/codex#14602.

Ctrl+C continues to only interrupt the active agent turn — background
dev servers, watchers, etc. are preserved. /stop is the explicit way
to clean them all up.

* feat: first-class plugin architecture + hide status bar cost by default (#1544)

The persistent status bar now shows context %, token counts, and
duration but NOT $ cost by default. Cost display is opt-in via:

  display:
    show_cost: true

in config.yaml, or: hermes config set display.show_cost true

The /usage command still shows full cost breakdown since the user
explicitly asked for it — this only affects the always-visible bar.

Status bar without cost:
  ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ 15m

Status bar with show_cost: true:
  ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ $0.06 │ 15m

* feat: improve memory prioritization + aggressive skill updates (inspired by OpenAI Codex)

* feat: improve memory prioritization — user preferences over procedural knowledge

Inspired by OpenAI Codex's memory prompt improvements (openai/codex#14493)
which focus memory writes on user preferences and recurring patterns
rather than procedural task details.

Key insight: 'Optimize for reducing future user steering — the most
valuable memory prevents the user from having to repeat themselves.'

Changes:
- MEMORY_GUIDANCE (prompt_builder.py): added prioritization hierarchy
  and the core principle about reducing user steering
- MEMORY_SCHEMA (memory_tool.py): reordered WHEN TO SAVE list to put
  corrections first, added explicit PRIORITY guidance
- Memory nudge (run_agent.py): now asks specifically about preferences,
  corrections, and workflow patterns instead of generic 'anything'
- Memory flush (run_agent.py): now instructs to prioritize user
  preferences and corrections over task-specific details

* feat: more aggressive skill creation and update prompting

Press harder on skill updates — the agent should proactively patch
skills when it encounters issues during use, not wait to be asked.

Changes:
- SKILLS_GUIDANCE: 'consider saving' → 'save'; added explicit instruction
  to patch skills immediately when found outdated/wrong
- Skills header: added instruction to update loaded skills before finishing
  if they had missing steps or wrong commands
- Skill nudge: more assertive ('save the approach' not 'consider saving'),
  now also prompts for updating existing skills used in the task
- Skill nudge interval: lowered default from 15 to 10 iterations
- skill_manage schema: added 'patch it immediately' to update triggers

* feat: first-class plugin architecture (#1555)

Plugin system for extending Hermes with custom tools, hooks, and
integrations — no source code changes required.

Core system (hermes_cli/plugins.py):
  - Plugin discovery from ~/.hermes/plugins/, .hermes/plugins/, and
    pip entry_points (hermes_agent.plugins group)
  - PluginContext with register_tool() and register_hook()
  - 6 lifecycle hooks: pre/post tool_call, pre/post llm_call,
    on_session_start/end
  - Namespace package handling for relative imports in plugins
  - Graceful error isolation — broken plugins never crash the agent

Integration (model_tools.py):
  - Plugin discovery runs after built-in + MCP tools
  - Plugin tools bypass toolset filter via get_plugin_tool_names()
  - Pre/post tool call hooks fire in handle_function_call()

CLI:
  - /plugins command shows loaded plugins, tool counts, status
  - Added to COMMANDS dict for autocomplete

Docs:
  - Getting started guide (build-a-hermes-plugin.md) — full tutorial
    building a calculator plugin step by step
  - Reference page (features/plugins.md) — quick overview + tables
  - Covers: file structure, schemas, handlers, hooks, data files,
    bundled skills, env var gating, pip distribution, common mistakes

Tests: 16 tests covering discovery, loading, hooks, tool visibility.

* fix: hermes update causes dual gateways on macOS (launchd)

Three bugs worked together to create the dual-gateway problem:

1. cmd_update only checked systemd for gateway restart, completely
   ignoring launchd on macOS. After killing the PID it would print
   'Restart it with: hermes gateway run' even when launchd was about
   to auto-respawn the process.

2. launchd's KeepAlive.SuccessfulExit=false respawns the gateway
   after SIGTERM (non-zero exit), so the user's manual restart
   created a second instance.

3. The launchd plist lacked --replace (systemd had it), so the
   respawned gateway didn't kill stale instances on startup.

Fixes:
- Add --replace to launchd ProgramArguments (matches systemd)
- Add launchd detection to cmd_update's auto-restart logic
- Print 'auto-restart via launchd' instead of manual restart hint

* fix: add launchd plist auto-refresh + explicit restart in cmd_update

Two integration issues with the initial fix:

1. Existing macOS users with old plist (no --replace) would never
   get the fix until manual uninstall/reinstall. Added
   refresh_launchd_plist_if_needed() — mirrors the existing
   refresh_systemd_unit_if_needed(). Called from launchd_start(),
   launchd_restart(), and cmd_update.

2. cmd_update relied on KeepAlive respawn after SIGTERM rather than
   explicit launchctl stop/start. This caused races: launchd would
   respawn the old process before the PID file was cleaned up.
   Now does explicit stop+start (matching how systemd gets an
   explicit systemctl restart), with plist refresh first so the
   new --replace flag is picked up.

---------

Co-authored-by: Ninja <ninja@local>
Co-authored-by: alireza78a <alireza78a@users.noreply.github.com>
Co-authored-by: Oktay Aydin <113846926+aydnOktay@users.noreply.github.com>
Co-authored-by: JP Lew <polydegen@protonmail.com>
Co-authored-by: an420eth <an420eth@users.noreply.github.com>
2026-03-16 12:36:29 -07:00
teknium1
942950f5b9 feat(cli): live reasoning token streaming — dim box above response
When both display.streaming and display.show_reasoning are enabled,
reasoning tokens stream in real-time into a dim bordered box. When
content tokens start arriving, the reasoning box closes and the
response box opens — smooth visual transition.

- _stream_reasoning_delta(): line-buffered rendering in dim text
- _close_reasoning_box(): flush + close, called on first content token
- Reasoning callback routes to streaming version when both flags set
- Skips static post-response reasoning display when streamed live
- State reset per turn via _reset_stream_state()

Works with reasoning_content deltas (OpenRouter reasoning mode) and
thinking_delta events (Anthropic extended thinking).
2026-03-16 10:29:55 -07:00
teknium1
d3687d3e81 docs: document planned live reasoning token display as future enhancement
The streaming infrastructure already fires reasoning deltas via
_fire_reasoning_delta() during streaming. The remaining work is the
CLI display layer: a dim reasoning box that opens on first reasoning
token, streams live, then transitions to the response box.

Reference: PR #1214 (raulvidis) for gateway reasoning visibility.
2026-03-16 10:22:44 -07:00
teknium1
23b9d88a76 docs: add streaming config to cli-config.yaml.example and defaults
Documents the new streaming options in the example config:
- display.streaming for CLI (under display section)
- streaming.enabled + transport/interval/threshold/cursor for gateway
- Added streaming: false to load_cli_config() defaults dict
2026-03-16 07:53:08 -07:00
teknium1
c0b88018eb feat: ship streaming disabled by default — opt-in via config
Streaming is now off by default for both CLI and gateway. Users opt in:

CLI (config.yaml):
  display:
    streaming: true

Gateway (config.yaml):
  streaming:
    enabled: true

This lets early adopters test streaming while existing users see zero
change. Once we have enough field validation, we flip the default to
true in a subsequent release.
2026-03-16 07:44:42 -07:00
teknium1
fc4080c58a fix(cli): add <THINKING> to streaming tag suppression list
Anthropic native models emit <THINKING> tags in text content (separate
from the SDK's thinking_delta events). Without suppression, these tags
leak into the streamed CLI output. Found during live provider testing.
2026-03-16 07:34:29 -07:00
teknium1
c2769dffe0 merge: resolve conflicts with main (plugins + stop commands) 2026-03-16 07:32:00 -07:00
teknium1
71e35311f5 fix(browser): model waits for user instruction after /browser connect
Updated the injected context message to tell the model to await the
user's instruction before operating the browser. Typical flow is:
user opens Chrome → logs into sites → /browser connect → tells the
agent what to do.
2026-03-16 07:20:43 -07:00
Teknium
97990e7ad5
feat: first-class plugin architecture (#1555)
Plugin system for extending Hermes with custom tools, hooks, and
integrations — no source code changes required.

Core system (hermes_cli/plugins.py):
  - Plugin discovery from ~/.hermes/plugins/, .hermes/plugins/, and
    pip entry_points (hermes_agent.plugins group)
  - PluginContext with register_tool() and register_hook()
  - 6 lifecycle hooks: pre/post tool_call, pre/post llm_call,
    on_session_start/end
  - Namespace package handling for relative imports in plugins
  - Graceful error isolation — broken plugins never crash the agent

Integration (model_tools.py):
  - Plugin discovery runs after built-in + MCP tools
  - Plugin tools bypass toolset filter via get_plugin_tool_names()
  - Pre/post tool call hooks fire in handle_function_call()

CLI:
  - /plugins command shows loaded plugins, tool counts, status
  - Added to COMMANDS dict for autocomplete

Docs:
  - Getting started guide (build-a-hermes-plugin.md) — full tutorial
    building a calculator plugin step by step
  - Reference page (features/plugins.md) — quick overview + tables
  - Covers: file structure, schemas, handlers, hooks, data files,
    bundled skills, env var gating, pip distribution, common mistakes

Tests: 16 tests covering discovery, loading, hooks, tool visibility.
2026-03-16 07:17:36 -07:00
teknium1
73f39a7761 feat(browser): auto-launch Chrome when /browser connect finds no debugger
When /browser connect detects that port 9222 isn't open, it now:
1. Finds Chrome/Chromium/Brave/Edge on the system (macOS app bundles
   or Linux PATH lookup)
2. Launches it with --remote-debugging-port=9222 (detached)
3. Waits up to 5 seconds for the port to come up
4. Falls back to manual instructions if auto-launch fails

This means GUI-only users can just type /browser connect without
needing to know about terminal flags or Chrome launch commands.
2026-03-16 07:05:48 -07:00
Teknium
447594be28
feat: first-class plugin architecture + hide status bar cost by default (#1544)
The persistent status bar now shows context %, token counts, and
duration but NOT $ cost by default. Cost display is opt-in via:

  display:
    show_cost: true

in config.yaml, or: hermes config set display.show_cost true

The /usage command still shows full cost breakdown since the user
explicitly asked for it — this only affects the always-visible bar.

Status bar without cost:
  ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ 15m

Status bar with show_cost: true:
  ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ $0.06 │ 15m
2026-03-16 06:43:57 -07:00
teknium1
9d1483c7e6 feat(browser): /browser connect — attach browser tools to live Chrome via CDP
Add /browser slash command for connecting browser tools to the user's
live Chrome instance via Chrome DevTools Protocol:

  /browser connect       — connect to Chrome on localhost:9222
  /browser connect ws://host:port  — custom CDP endpoint
  /browser disconnect    — revert to default (headless/Browserbase)
  /browser status        — show current browser mode + connectivity

When connected:
- All browser tools (navigate, snapshot, click, etc.) control the
  user's real Chrome — logged-in sessions, cookies, open tabs
- Platform-specific Chrome launch instructions are shown
- Port connectivity is tested immediately
- A context message is injected so the model knows it's controlling
  a live browser and should be mindful of user's open tabs

Implementation:
- BROWSER_CDP_URL env var drives the backend selection in browser_tool.py
- New _create_cdp_session() creates sessions using the CDP override
- _get_cdp_override() checked before local/Browserbase selection
- Existing agent-browser --cdp flag handles the actual CDP connection

Inspired by OpenClaw's browser profile system.
2026-03-16 06:38:20 -07:00
teknium1
8e07f9ca56 fix: audit fixes — 5 bugs found and resolved
Thorough code review found 5 issues across run_agent.py, cli.py, and gateway/:

1. CRITICAL — Gateway stream consumer task never started: stream_consumer_holder
   was checked BEFORE run_sync populated it. Fixed with async polling pattern
   (same as track_agent).

2. MEDIUM-HIGH — Streaming fallback after partial delivery caused double-response:
   if streaming failed after some tokens were delivered, the fallback would
   re-deliver the full response. Now tracks deltas_were_sent and only falls
   back when no tokens reached consumers yet.

3. MEDIUM — Codex mode lost on_first_delta spinner callback: _run_codex_stream
   now accepts on_first_delta parameter, fires it on first text delta. Passed
   through from _interruptible_streaming_api_call via _codex_on_first_delta
   instance attribute.

4. MEDIUM — CLI close-tag after-text bypassed tag filtering: text after a
   reasoning close tag was sent directly to _emit_stream_text, skipping
   open-tag detection. Now routes through _stream_delta for full filtering.

5. LOW — Removed 140 lines of dead code: old _streaming_api_call method
   (superseded by _interruptible_streaming_api_call). Updated 13 tests in
   test_run_agent.py and test_openai_client_lifecycle.py to use the new
   method name and signature.

4573 tests passing.
2026-03-16 06:35:46 -07:00
Teknium
57be18c026
feat: smart approvals + /stop command (inspired by OpenAI Codex)
* feat: smart approvals — LLM-based risk assessment for dangerous commands

Adds a 'smart' approval mode that uses the auxiliary LLM to assess
whether a flagged command is genuinely dangerous or a false positive,
auto-approving low-risk commands without prompting the user.

Inspired by OpenAI Codex's Smart Approvals guardian subagent
(openai/codex#13860).

Config (config.yaml):
  approvals:
    mode: manual   # manual (default), smart, off

Modes:
- manual — current behavior, always prompt the user
- smart  — aux LLM evaluates risk: APPROVE (auto-allow), DENY (block),
           or ESCALATE (fall through to manual prompt)
- off    — skip all approval prompts (equivalent to --yolo)

When smart mode auto-approves, the pattern gets session-level approval
so subsequent uses of the same pattern don't trigger another LLM call.
When it denies, the command is blocked without user prompt. When
uncertain, it escalates to the normal manual approval flow.

The LLM prompt is carefully scoped: it sees only the command text and
the flagged reason, assesses actual risk vs false positive, and returns
a single-word verdict.

* feat: make smart approval model configurable via config.yaml

Adds auxiliary.approval section to config.yaml with the same
provider/model/base_url/api_key pattern as other aux tasks (vision,
web_extract, compression, etc.).

Config:
  auxiliary:
    approval:
      provider: auto
      model: ''        # fast/cheap model recommended
      base_url: ''
      api_key: ''

Bridged to env vars in both CLI and gateway paths so the aux client
picks them up automatically.

* feat: add /stop command to kill all background processes

Adds a /stop slash command that kills all running background processes
at once. Currently users have to process(list) then process(kill) for
each one individually.

Inspired by OpenAI Codex's separation of interrupt (Ctrl+C stops current
turn) from /stop (cleans up background processes). See openai/codex#14602.

Ctrl+C continues to only interrupt the active agent turn — background
dev servers, watchers, etc. are preserved. /stop is the explicit way
to clean them all up.
2026-03-16 06:20:11 -07:00
teknium1
ac739e485f fix(cli): reasoning tag suppression during streaming + fix fallback detection
Fixes two issues found during live testing:

1. Reasoning tag suppression: close tags like </REASONING_SCRATCHPAD>
   that arrive split across stream tokens (e.g. '</REASONING_SCRATCH' +
   'PAD>\n\nHello') were being lost because the buffer was discarded.
   Fix: keep a sliding window of the tail (max close tag length) so
   partial tags survive across tokens.

2. Streaming fallback detection was too broad — 'stream' matched any
   error containing that word (including 'stream_options' rejections).
   Narrowed to specific phrases: 'streaming is not', 'streaming not
   support', 'does not support stream', 'not available'.

Verified with real API calls: streaming works end-to-end with
reasoning block suppression, response box framing, and proper
fallback to Rich Panel when streaming isn't active.
2026-03-16 05:28:10 -07:00
teknium1
780ddd102b fix(docker): gate cwd workspace mount behind config
Keep Docker sandboxes isolated by default. Add an explicit terminal.docker_mount_cwd_to_workspace opt-in, thread it through terminal/file environment creation, and document the security tradeoff and config.yaml workflow clearly.
2026-03-16 05:20:56 -07:00
teknium1
d23e9a9bed feat(cli): streaming token display — line-buffered rendering with response box framing
Stage 2 of streaming support. CLI now streams tokens in real-time:

- _stream_delta(): line-buffered rendering via _cprint (prompt_toolkit safe)
- _flush_stream(): emits remaining buffer and closes response box
- Response box opens on first token, closes on flush
- Skip Rich Panel when streaming already displayed content
- Reset streaming state before each agent turn
- Compatible with existing TTS streaming (both can fire simultaneously)
- Uses skin engine for response label branding

Credit: OutThisLife (#798 CLI streaming concept).
2026-03-16 05:10:15 -07:00
Teknium
9e845a6e53
feat: major /rollback improvements — enabled by default, diff preview, file-level restore, conversation undo, terminal checkpoints
Checkpoint & rollback upgrades:

1. Enabled by default — checkpoints are now on for all new sessions.
   Zero cost when no file-mutating tools fire. Disable with
   checkpoints.enabled: false in config.yaml.

2. Diff preview — /rollback diff <N> shows a git diff between the
   checkpoint and current working tree before committing to a restore.

3. File-level restore — /rollback <N> <file> restores a single file
   from a checkpoint instead of the entire directory.

4. Conversation undo on rollback — when restoring files, the last
   chat turn is automatically undone so the agent's context matches
   the restored filesystem state.

5. Terminal command checkpoints — destructive terminal commands (rm,
   mv, sed -i, truncate, git reset/clean, output redirects) now
   trigger automatic checkpoints before execution. Previously only
   write_file and patch were covered.

6. Change summary in listing — /rollback now shows file count and
   +insertions/-deletions for each checkpoint.

7. Fixed dead code — removed duplicate _run_git call in
   list_checkpoints with nonsensical --all if False condition.

8. Updated help text — /rollback with no args now shows available
   subcommands (diff, file-level restore).
2026-03-16 04:43:37 -07:00
Teknium
00a0c56598
feat: add persistent CLI status bar and usage details (#1522)
Salvaged from PR #1104 by kshitijk4poor. Closes #683.

Adds a persistent status bar to the CLI showing model name, context
window usage with visual bar, estimated cost, and session duration.
Responsive layout degrades gracefully for narrow terminals.

Changes:
- agent/usage_pricing.py: shared pricing table, cost estimation with
  Decimal arithmetic, duration/token formatting helpers
- agent/insights.py: refactored to reuse usage_pricing (eliminates
  duplicate pricing table and formatting logic)
- cli.py: status bar with FormattedTextControl fragments, color-coded
  context thresholds (green/yellow/orange/red), enhanced /usage with
  cost breakdown, 1Hz idle refresh for status bar updates
- tests/test_cli_status_bar.py: status bar snapshot, width collapsing,
  usage report with/without pricing, zero-priced model handling
- tests/test_insights.py: verify zero-priced providers show as unknown

Salvage fixes:
- Resolved conflict with voice status bar (both coexist in layout)
- Import _format_context_length from hermes_cli.banner (moved since PR)

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-03-16 04:42:48 -07:00
Teknium
c1da1fdcd5
feat: auto-detect provider when switching models via /model (#1506)
When typing /model deepseek-chat while on a different provider, the
model name now auto-resolves to the correct provider instead of
silently staying on the wrong one and causing API errors.

Detection priority:
1. Direct provider with credentials (e.g. DEEPSEEK_API_KEY set)
2. OpenRouter catalog match with proper slug remapping
3. Direct provider without creds (clear error beats silent failure)

Also adds DeepSeek as a first-class API-key provider — just set
DEEPSEEK_API_KEY and /model deepseek-chat routes directly.

Bare model names get remapped to proper OpenRouter slugs:
  /model gpt-5.4 → openai/gpt-5.4
  /model claude-opus-4.6 → anthropic/claude-opus-4.6

Salvages the concept from PR #1177 by @virtaava with credential
awareness and OpenRouter slug mapping added.

Co-authored-by: virtaava <virtaava@users.noreply.github.com>
2026-03-16 04:34:45 -07:00
teknium1
33ebedc76d feat: enable persistent shell by default for SSH, add config option
SSH persistent shell now defaults to true — non-local backends benefit
most from state persistence across execute() calls. Local backend
remains opt-in via TERMINAL_LOCAL_PERSISTENT env var.

New config.yaml option: terminal.persistent_shell (default: true)
Controls the default for non-local backends. Users can disable with:
  hermes config set terminal.persistent_shell false

Precedence: per-backend env var > TERMINAL_PERSISTENT_SHELL > default.

Wired through cli.py, gateway/run.py, and hermes_cli/config.py so the
config.yaml value reaches terminal_tool via env var bridge.
2026-03-15 20:17:13 -07:00
Teknium
103f7b1ebc
fix: verbose mode shows full untruncated output
* fix(cli): silence tirith prefetch install warnings at startup

* fix: verbose mode now shows full untruncated tool args, results, content, and think blocks

When tool progress is set to 'verbose' (via /verbose or config), the display
was still truncating tool arguments to 100 chars, tool results to 100-200 chars,
assistant content to 100 chars, and think blocks to 5 lines. This defeated the
purpose of verbose mode.

Changes:
- Tool args: show full JSON args (not truncated to log_prefix_chars)
- Tool results: show full result content in both display and debug logs
- Assistant content: show full content during tool-call loops
- Think blocks: show full reasoning text (not truncated to 5 lines/100 chars)
- Auto-enable reasoning display when verbose mode is active
- Fix initial agent creation to respect verbose config (was always quiet_mode=True)
- Updated verbose label to mention think blocks
2026-03-15 20:03:37 -07:00
Teknium
5e92a4ce5a
fix: auto-reload MCP tools when mcp_servers config changes without restart (#1474)
Fixes #1036

After adding an MCP server to config.yaml, users had to restart Hermes
before the new tools became visible — even though /reload-mcp existed.

Add _check_config_mcp_changes() called from process_loop every 5s:
- stat() config.yaml for mtime changes (fast path, no YAML parse)
- On mtime change, parse and compare mcp_servers section
- If mcp_servers changed, auto-trigger _reload_mcp() and notify user
- Skip check while agent is running to avoid interrupting tool calls
- Throttled to CONFIG_WATCH_INTERVAL=5s to avoid busy-polling

/reload-mcp still works for manual force-reload.

Tests: 6 new tests in TestMCPConfigWatch, all passed

Co-authored-by: teyrebaz33 <hakanerten02@hotmail.com>
2026-03-15 19:03:34 -07:00
Teknium
471c663fdf
fix(cli): silence tirith prefetch install warnings at startup (#1452) 2026-03-15 18:07:03 -07:00
teknium1
f24c00a5bf fix(config): reload .env over stale shell overrides
Hermes startup entrypoints now load ~/.hermes/.env and project fallback env files with user config taking precedence over stale shell-exported values. This makes model/provider/base URL changes in .env actually take effect after restarting Hermes. Adds a shared env loader plus regression coverage, and reproduces the original bug case where OPENAI_BASE_URL and HERMES_INFERENCE_PROVIDER remained stuck on old shell values before import.
2026-03-15 06:46:28 -07:00
Teknium
d36b3d498d
Merge pull request #1388 from NousResearch/hermes/hermes-0fadff1b
fix: harden .worktreeinclude path containment
2026-03-14 21:53:28 -07:00
teknium1
f4c012873c fix: harden salvaged worktree include checks
Use Path.relative_to-based containment checks for the salvaged .worktreeinclude guard, remove the replayed test logic from the cherry-picked PR, and add real integration regressions for file, directory, and symlink escapes.
2026-03-14 21:51:27 -07:00
Sebastion
12bc86d9c9 fix: prevent path traversal in .worktreeinclude file processing
Resolve .worktreeinclude entries and validate that both the source path
stays within the repository root and the destination path stays within
the worktree directory before copying files or creating symlinks.

A malicious .worktreeinclude in a cloned repository could previously
reference paths like "../../etc/passwd" to copy or symlink arbitrary
files from outside the repo into the worktree.

CWE-22: Improper Limitation of a Pathname to a Restricted Directory
2026-03-14 21:48:19 -07:00
Nyk
b89177668e fix(cli): non-blocking startup update check and banner deduplication
- Add background thread mechanism (prefetch_update_check/get_update_result)
  so git fetch runs in parallel with skill sync and agent init
- Fix repo path fallback in check_for_updates() for dev installs
- Remove duplicate build_welcome_banner (~180 lines) and
  _format_context_length from cli.py — the banner.py version is
  now the single source of truth
- Port skin banner_hero/banner_logo support and terminal width check
  from cli.py's version into banner.py
- Add update status output to hermes version command
- Add unit tests for update check, prefetch, and version string
2026-03-14 21:45:50 -07:00
Teknium
b14a07315b
fix: save /plan output in workspace (#1381) 2026-03-14 21:28:51 -07:00
Teknium
ff3473a37c
feat: add /plan command (#1372)
* feat: add /plan command

* refactor: back /plan with bundled skill

* docs: document /plan skill
2026-03-14 21:18:17 -07:00
teknium1
9f6bccd76a feat: add direct endpoint overrides for auxiliary and delegation
Add base_url/api_key overrides for auxiliary tasks and delegation so users can
route those flows straight to a custom OpenAI-compatible endpoint without
having to rely on provider=main or named custom providers.

Also clear gateway session env vars in test isolation so the full suite stays
deterministic when run from a messaging-backed agent session.
2026-03-14 21:11:37 -07:00
Teknium
f8a3e37f54
Merge pull request #1343 from NousResearch/hermes/hermes-5d160594
feat: compress cron management into one tool
2026-03-14 19:34:20 -07:00
teknium1
3229e434b8 Merge origin/main into hermes/hermes-5d160594 2026-03-14 19:34:05 -07:00
Teknium
24f61d006a
feat: preload CLI skills on launch (#1359)
* feat: preload CLI skills on launch

* test: cover continue with worktree and skills flags

* feat: show activated skills before CLI banner
2026-03-14 19:33:59 -07:00
teknium1
c3ea620796 feat: add multi-skill cron editing and docs 2026-03-14 19:18:10 -07:00
teknium1
df5c61b37c feat: compress cron management into one tool 2026-03-14 12:21:50 -07:00
teknium1
eb8226daab fix(cli): repair dangerous command approval UI
Move the dangerous-command header onto its own line inside the approval box
so the panel border no longer cuts through it, and restore the long-command
expand path in the active prompt_toolkit approval callback. The CLI already
had a merged 'view full command' feature in fallback/gateway paths, but the
live TUI callback was still using an older choice set and never exposed it.
Add regression tests for long-command view state, in-place expansion, and
panel rendering.
2026-03-14 11:57:44 -07:00
teyrebaz33
fbdce27b9a fix: address prefix matching recursion and skill command coverage
Per teknium1 review on PR #968:

1. Guard against infinite recursion: if expanded name equals the typed
   token (already exact), fall through to Unknown command instead of
   redispatching the same string forever.

2. Include skill slash commands in prefix resolution so execution-time
   matching agrees with tab-completion (set(COMMANDS) | set(_skill_commands)).

3. Add missing test cases:
   - unambiguous prefix with extra args does not recurse
   - exact command with args does not loop
   - skill command prefix matches correctly
   - exact builtin takes priority over skill prefix ambiguity

8 tests passing.
2026-03-14 10:33:58 -07:00
teyrebaz33
a50550fdb4 fix: add prefix matching to slash command dispatcher
Slash commands previously required exact full names. Typing /con
returned 'Unknown command' even though /config was the only match.

Add unambiguous prefix matching in process_command():
- Unique prefix (e.g. /con -> /config): dispatch immediately
- Ambiguous prefix (e.g. /re -> /reset, /retry, /reasoning...):
  show 'Did you mean' suggestions
- No match: existing 'Unknown command' error

Prefix matching uses the COMMANDS dict from hermes_cli/commands.py
(same source as SlashCommandCompleter) so it stays in sync with
any new commands added there.

Closes #928
2026-03-14 10:33:58 -07:00
teknium1
9633ddd8d8 fix: initialize CLI voice state for single-query mode
- initialize voice and interrupt runtime state in HermesCLI.__init__
- prevent chat -q from crashing before run() has executed
- add regression coverage for single-query state initialization
2026-03-14 06:31:32 -07:00
teknium1
7b10881b9e fix: persist clean voice transcripts and /voice off state
- keep CLI voice prefixes API-local while storing the original user text
- persist explicit gateway off state and restore adapter auto-TTS suppression on restart
- add regression coverage for both behaviors
2026-03-14 06:14:22 -07:00
0xbyt4
eb34c0b09a fix: voice pipeline hardening — 7 bug fixes with tests
1. Anthropic + ElevenLabs TTS silence: forward full response to TTS
   callback for non-streaming providers (choices first, then native
   content blocks fallback).

2. Subprocess timeout kill: play_audio_file now kills the process on
   TimeoutExpired instead of leaving zombie processes.

3. Discord disconnect cleanup: leave all voice channels before closing
   the client to prevent leaked state.

4. Audio stream leak: close InputStream if stream.start() fails.

5. Race condition: read/write _on_silence_stop under lock in audio
   callback thread.

6. _vprint force=True: show API error, retry, and truncation messages
   even during streaming TTS.

7. _refresh_level lock: read _voice_recording under _voice_lock.
2026-03-14 14:27:21 +03:00
0xbyt4
cc0a453476 fix: address PR review round 5 — streaming guard, VC auth, history prefix, auto-TTS control
1. Gate _streaming_api_call to chat_completions mode only — Anthropic and
   Codex fall back to _interruptible_api_call. Preserve Anthropic base_url
   across all client rebuild paths (interrupt, fallback, 401 refresh).

2. Discord VC synthetic events now use chat_type="channel" instead of
   defaulting to "dm" — prevents session bleed into DM context.
   Authorization runs before echoing transcript. Sanitize @everyone/@here
   in voice transcripts.

3. CLI voice prefix ("[Voice input...]") is now API-call-local only —
   stripped from returned history so it never persists to session DB or
   resumed sessions.

4. /voice off now disables base adapter auto-TTS via _auto_tts_disabled_chats
   set — voice input no longer triggers TTS when voice mode is off.
2026-03-14 14:27:21 +03:00
0xbyt4
69cb373864 fix: update /voice status to show correct STT provider
Voice status was hardcoded to check API keys only. Now uses the actual
provider resolution (local/groq/openai) so it correctly shows
"local faster-whisper" when installed instead of "Groq" or "MISSING".
2026-03-14 14:27:21 +03:00
0xbyt4
0ff1b4ade2 fix: harden web gateway security and fix error swallowing
- Use hmac.compare_digest for timing-safe token comparison (3 endpoints)
- Default bind to 127.0.0.1 instead of 0.0.0.0
- Sanitize upload filenames with Path.name to prevent path traversal
- Add DOMPurify to sanitize marked.parse() output against XSS
- Replace add_static with authenticated media handler
- Hide token in group chats for /remote-control command
- Use ctypes.util.find_library for Opus instead of hardcoded paths
- Add force=True to 5 interrupt _vprint calls for visibility
- Log Opus decode errors and voice restart failures instead of swallowing
2026-03-14 14:27:21 +03:00
0xbyt4
0a8985acf9 fix: add missing load_config import in _show_voice_status 2026-03-14 14:27:21 +03:00
0xbyt4
39a77431e2 fix: use shutdown() instead of cancel() on CLI exit to release persistent audio stream 2026-03-14 14:27:20 +03:00
0xbyt4
eb79dda04b fix: persistent audio stream and silence detection improvements
- Keep InputStream alive across recordings to avoid CoreAudio hang on
  repeated open/close cycles on macOS.  New _ensure_stream() creates the
  stream once; start()/stop()/cancel() only toggle frame collection.
- Add _close_stream_with_timeout() with daemon thread to prevent
  stream.stop()/close() from blocking indefinitely.
- Add generation counter to detect stale stream-open completions after
  cancel or restart.
- Run recorder.cancel() in background thread from Ctrl+C handler to
  keep the event loop responsive.
- Add shutdown() method called on /voice off to release audio resources.
- Fix silence timer reset during active speech: use dip tolerance for
  _resume_start tracker so natural speech pauses (< 0.3s) don't prevent
  the silence timer from being reset.
- Update tests to match persistent stream behavior.
2026-03-14 14:27:20 +03:00
0xbyt4
c3dc4448bf fix: disable STT retries and stop continuous mode after 3 silent cycles
- Set max_retries=0 on the STT OpenAI client. The SDK default (2) honors
  Groq's retry-after header (often 53s), blocking the thread for up to
  ~106s on rate limits. Voice STT should fail fast, not retry silently.
- Stop continuous recording mode after 3 consecutive no-speech cycles to
  prevent infinite restart loops when nobody is talking.
2026-03-14 14:27:20 +03:00
0xbyt4
0a89933f9b fix: add STT timeout, move finally restart to thread, guard exit on recording
- Set OpenAI client timeout=30s in transcribe_audio() — default 600s
  blocks _voice_processing for 10 min if Groq/OpenAI stalls
- Move _voice_start_recording in _voice_stop_and_transcribe finally
  block to a daemon thread (same pattern as Ctrl+B handler and
  process_loop)
- Add _should_exit guard at top of _voice_start_recording so all 4
  call sites respect shutdown without individual checks
2026-03-14 14:27:20 +03:00
0xbyt4
9d58cafec9 fix: move process_loop voice restart to daemon thread, use _cprint consistently
- process_loop's continuous mode restart called _voice_start_recording()
  directly, blocking the loop if play_beep/sd.wait hangs — queued user
  input would stall silently. Dispatch to daemon thread like Ctrl+B handler.
- Replace print() with _cprint() in _handle_voice_command for consistency
  with the rest of the voice mode code.
2026-03-14 14:27:20 +03:00
0xbyt4
d0e3b39e69 fix: prevent Ctrl+B key handler from blocking prompt_toolkit event loop
The handle_voice_record key binding runs in prompt_toolkit's event-loop
thread. When silence auto-stopped recording, _voice_recording was False
but recorder.stop() still held AudioRecorder._lock. A concurrent Ctrl+B
press entered the START path and blocked on that lock, freezing all
keyboard input.

Three changes:
- Set _voice_processing atomically with _voice_recording=False in
  _voice_stop_and_transcribe to close the race window
- Add _voice_processing guard in the START path to prevent starting
  while stop/transcribe is still running
- Dispatch _voice_start_recording to a daemon thread so play_beep
  (sd.wait) and AudioRecorder.start (lock acquire) never block the
  event loop
2026-03-14 14:27:20 +03:00
0xbyt4
ddfd6e0c59 fix: resolve 6 voice mode bugs found during audit
- edge_tts NameError: _generate_edge_tts now calls _import_edge_tts()
  instead of referencing bare module name (tts_tool.py)
- TTS thread leak: chat() finally block sends sentinel to text_queue,
  sets stop_event, and joins tts_thread on exception paths (cli.py)
- output_stream leak: moved close() into finally block so audio device
  is released even on exception (tts_tool.py)
- Ctrl+C continuous mode: cancel handler now resets _voice_continuous
  to prevent auto-restart after user cancels recording (cli.py)
- _disable_voice_mode: now calls stop_playback() and sets
  _voice_tts_done so TTS stops when voice mode is turned off (cli.py)
- _show_voice_status: reads record key from config instead of
  hardcoding Ctrl+B (cli.py)
2026-03-14 14:27:20 +03:00
0xbyt4
a78249230c fix: address voice mode PR review (streaming TTS, prompt cache, _vprint)
Bug A: Replace stale _HAS_ELEVENLABS/_HAS_AUDIO boolean imports with
lazy import function calls (_import_elevenlabs, _import_sounddevice).
The old constants no longer exist in tts_tool -- the try/except
silently swallowed the ImportError, leaving streaming TTS dead.

Bug B: Use user message prefix instead of modifying system prompt for
voice mode instruction. Changing ephemeral_system_prompt mid-session
invalidates the prompt cache. Now the concise-response hint is
prepended to the user_message passed to run_conversation while
conversation_history keeps the original text.

Minor: Add force parameter to _vprint so critical error messages
(max retries, non-retryable errors, API failures) are always shown
even during streaming TTS playback.

Tests: 15 new tests in test_voice_cli_integration.py covering all
three fixes -- lazy import activation, message prefix behavior,
history cleanliness, system prompt stability, and AST verification
that all critical _vprint calls use force=True.
2026-03-14 14:27:20 +03:00
0xbyt4
fc893f98f4 fix: wrap sd.InputStream in try-except and fix config key name
- AudioRecorder.start() now catches InputStream errors gracefully
  with a clear error message about microphone availability
- Fix config key mismatch: cli.py was reading "push_to_talk_key"
  but config.py defines "record_key" -- now consistent
- Add format conversion from config format ("ctrl+b") to
  prompt_toolkit format ("c-b")
2026-03-14 14:27:20 +03:00
0xbyt4
a8838a7ae5 fix: replace all hardcoded Ctrl+R references with Ctrl+B 2026-03-14 14:27:20 +03:00
0xbyt4
b859dfab16 fix: address voice mode review feedback
1. Fully lazy imports: sounddevice, numpy, elevenlabs, edge_tts, and
   openai are never imported at module level. Each is imported only when
   the feature is explicitly activated, preventing crashes in headless
   environments (SSH, Docker, WSL, no PortAudio).

2. No core agent loop changes: streaming TTS path extracted from
   _interruptible_api_call() into separate _streaming_api_call() method.
   The original method is restored to its upstream form.

3. Configurable key binding: push-to-talk key changed from Ctrl+R
   (conflicts with readline reverse-search) to Ctrl+B by default.
   Configurable via voice.push_to_talk_key in config.yaml.

4. Environment detection: new detect_audio_environment() function checks
   for SSH, Docker, WSL, and missing audio devices before enabling voice
   mode. Auto-disables with clear warnings in incompatible environments.

5. Graceful degradation: every audio touchpoint (sd.play, sd.InputStream,
   sd.OutputStream) wrapped in try/except with ImportError/OSError
   handling. Failures produce warnings, not crashes.
2026-03-14 14:27:20 +03:00
0xbyt4
46db7aeffd fix: streaming tool call parsing, error handling, and fake HA state mutation
- Fix Gemini streaming tool call merge bug: multiple tool calls with same
  index but different IDs are now parsed as separate calls instead of
  concatenating names (e.g. ha_call_serviceha_call_service)
- Handle partial results in voice mode: show error and stop continuous
  mode when agent returns partial/failed results with empty response
- Fix error display during streaming TTS: error messages are shown in
  full response box even when streaming box was already opened
- Add duplicate sentence filter in TTS: skip near-duplicate sentences
  from LLM repetition
- Fix fake HA server state mutation: turn_on/turn_off/set_temperature
  correctly update entity states; temperature sensor simulates change
  when thermostat is adjusted
2026-03-14 14:27:20 +03:00
0xbyt4
404123aea7 feat: add persistent voice mode status bar below input area
Shows voice state (recording, transcribing, TTS/continuous toggles)
as a persistent toolbar using prompt_toolkit ConditionalContainer.
2026-03-14 14:27:20 +03:00
0xbyt4
b00c5949fc fix: suppress verbose logs during streaming TTS, improve hallucination filter, stop continuous mode on errors
- Add _vprint() helper to suppress log output when stream_callback is active
- Expand Whisper hallucination filter with multi-language phrases and regex pattern for repetitive text
- Stop continuous voice mode when agent returns a failed result (e.g. 429 rate limit)
2026-03-14 14:26:55 +03:00
0xbyt4
3a1b35ed92 fix: voice mode race conditions, temp file leak, think tag parsing
- Atomic check-and-set for _voice_recording flag with _voice_lock
- Guard _voice_stop_and_transcribe against concurrent invocation
- Remove premature flag clearing from Ctrl+R handler
- Clean up temp WAV files in finally block (_play_via_tempfile)
- Use buffer-level regex for <think> block filtering (handles chunked tags)
- Prevent /voice on prompt accumulation on repeated calls
- Include Groq in STT key error message
2026-03-14 14:26:55 +03:00
0xbyt4
7d4b4e95f1 feat: sync text display with TTS audio playback
Move screen output from stream_callback to display_callback called by
TTS consumer thread. Text now appears sentence-by-sentence in sync with
audio instead of streaming ahead at LLM speed. Removes quiet_mode hack.
2026-03-14 14:26:55 +03:00
0xbyt4
179d9e1a22 feat: add streaming sentence-by-sentence TTS via ElevenLabs
Stream audio to speaker as the agent generates tokens instead of
waiting for the full response. First sentence plays within ~1-2s
of agent starting to respond.

- run_agent: add stream_callback to run_conversation/chat, streaming
  path in _interruptible_api_call accumulates chunks into mock
  ChatCompletion while forwarding content deltas to callback
- tts_tool: add stream_tts_to_speaker() with sentence buffering,
  think block filtering, markdown stripping, ElevenLabs pcm_24000
  streaming to sounddevice OutputStream
- cli: wire up streaming TTS pipeline in chat(), detect elevenlabs
  provider + sounddevice availability, skip batch TTS when streaming
  is active, signal stop on interrupt

Falls back to batch TTS for Edge/OpenAI providers or when
elevenlabs/sounddevice are not available. Zero impact on non-voice
mode (callback defaults to None).
2026-03-14 14:26:30 +03:00
0xbyt4
d7425343ee fix: fix voice recording stuck in continuous mode
- Track submitted state locally instead of using racy qsize() check
- Allow Ctrl+R to stop recording even while agent is running
- Add double-start guard to prevent concurrent recording attempts
2026-03-14 14:26:30 +03:00
0xbyt4
dad865e920 fix: fix silence detection bugs and add Phase 4 voice mode features
Fix 3 critical bugs in silence detection:
- Micro-pause tolerance now tracks dip duration (not time since speech start)
- Peak RMS check in stop() prevents discarding recordings with real speech
- Reduced min_speech_duration from 0.5s to 0.3s for reliable speech confirmation

Phase 4 features: configurable silence params, visual audio level indicator,
voice system prompt, tool call audio cues, TTS interrupt, continuous mode
auto-restart, interruptable playback via Popen tracking.
2026-03-14 14:26:30 +03:00
0xbyt4
bfd9c97705 feat: add Phase 4 low-latency features for voice mode
- Audio cues: beep on record start (880Hz), double beep on stop (660Hz)
- Silence detection: auto-stop recording after 3s of silence (RMS-based)
- Continuous mode: auto-restart recording after agent responds
  - Ctrl+R starts continuous mode, Ctrl+R during recording exits it
  - Waits for TTS to finish before restarting to avoid recording speaker
- Tests: 7 new tests for beep generation and silence detection
2026-03-14 14:25:28 +03:00
0xbyt4
c23928d089 fix: improve voice mode robustness and add integration tests
- Show TTS errors to user instead of silently logging
- Improve markdown stripping: code blocks, URLs, links, horizontal rules
- Fix stripping order: process markdown links before removing URLs
- Add threading.Lock for voice state variables (cross-thread safety)
- Add 14 CLI integration tests (markdown stripping, command parsing, thread safety)
- Total: 47 voice-related tests
2026-03-14 14:25:28 +03:00
0xbyt4
ea5b89825a fix: voice mode TTS playback and keybinding issues
- Change record key from c-@ to c-r (Ctrl+R) for macOS compatibility
- Add missing tempfile and time imports that caused silent TTS crash
- Use MP3 output for CLI TTS playback (afplay doesn't handle OGG well)
- Strip markdown formatting from text before sending to TTS
- Remove duplicate transcript echo in voice pipeline
2026-03-14 14:25:28 +03:00
0xbyt4
ec32e9a540 feat: add Groq STT support and fix voice mode keybinding
- Add multi-provider STT support (OpenAI > Groq fallback) in transcription_tools
- Auto-correct model selection when provider doesn't support the configured model
- Change voice record key from Ctrl+Space to Ctrl+R (macOS compatibility)
- Fix duplicate transcript echo in voice pipeline
- Add GROQ_API_KEY to .env.example
2026-03-14 14:25:28 +03:00
0xbyt4
1a6fbef8a9 feat: add voice mode with push-to-talk and TTS output for CLI
Implements Issue #314 Phase 2 & 3:
- /voice command to toggle voice mode (on/off/tts/status)
- Ctrl+Space push-to-talk recording via sounddevice
- Whisper STT transcription via existing transcription_tools
- Optional TTS response playback via existing tts_tool
- Visual indicators in prompt (recording/transcribing/voice)
- 21 unit tests, all mocked (no real mic/API)
- Optional deps: sounddevice, numpy (pip install hermes-agent[voice])
2026-03-14 14:25:28 +03:00
Wayne
41f22de20f fix(cli): make TUI prompt and accent output skin-aware
Salvaged from PR #932 by Wayne onto current main.

Apply skin-aware prompt symbols and live prompt_toolkit color refresh,
replace lingering hardcoded accent output with active-skin colors, keep
ANSI-safe response rendering, preserve secret-capture and approval-prompt
state handling, and add integration coverage for prompt state and style
refresh behavior.
2026-03-14 03:12:52 -07:00
Teknium
1869e88169
Merge pull request #1256 from NousResearch/hermes/hermes-720acdad
feat(security): add tirith pre-exec command scanning
2026-03-14 00:24:56 -07:00
sheeki003
375ce8a881 feat(security): add tirith pre-exec command scanning
Integrate tirith as a pre-execution security scanner that detects
homograph URLs, pipe-to-interpreter patterns, terminal injection,
zero-width Unicode, and environment variable manipulation — threats
the existing 50-pattern dangerous command detector doesn't cover.

Architecture: gather-then-decide — both tirith and the dangerous
command detector run before any approval prompt, preventing gateway
force=True replay from bypassing one check when only the other was
shown to the user.

New files:
- tools/tirith_security.py: subprocess wrapper with auto-installer,
  mandatory cosign provenance verification, non-blocking background
  download, disk-persistent failure markers with retryable-cause
  tracking (cosign_missing auto-clears when cosign appears on PATH)
- tests/tools/test_tirith_security.py: 62 tests covering exit code
  mapping, fail_open, cosign verification, background install,
  HERMES_HOME isolation, and failure recovery
- tests/tools/test_command_guards.py: 21 integration tests for the
  combined guard orchestration

Modified files:
- tools/approval.py: add check_all_command_guards() orchestrator,
  add allow_permanent parameter to prompt_dangerous_approval()
- tools/terminal_tool.py: replace _check_dangerous_command with
  consolidated check_all_command_guards
- cli.py: update _approval_callback for allow_permanent kwarg,
  call ensure_installed() at startup
- gateway/run.py: iterate pattern_keys list on replay approval,
  call ensure_installed() at startup
- hermes_cli/config.py: add security config defaults, split
  commented sections for independent fallback
- cli-config.yaml.example: document tirith security config
2026-03-14 00:11:27 -07:00
Teknium
29176f302e
fix: sanitize chat payloads and provider precedence (#1253)
fix: sanitize chat payloads and provider precedence
2026-03-14 00:09:14 -07:00
teknium1
163fa4a9d1 refactor(cli): implement approval locking mechanism to serialize concurrent requests
- Introduced _approval_lock to ensure that approval prompts are handled sequentially, preventing state clobbering from parallel delegation subtasks.
- Updated approval_callback and HermesCLI methods to utilize the lock for managing approval state and deadlines.
- Added tests for the config bridging logic to ensure correct environment variable mapping from config.yaml.
2026-03-13 23:59:18 -07:00
Adavya Sharma
358dab52ce fix: sanitize chat payloads and provider precedence 2026-03-13 23:59:12 -07:00
teknium1
253d54a9e1 fix(cli): make /new, /reset, and /clear start real fresh sessions
Create a new session DB row when starting fresh from the CLI, reset the
agent DB flush cursor and todo state, and update session timing/session ID
bookkeeping so follow-up logging stays correct.

Also update slash-command descriptions and add regression tests for /new,
/reset, and /clear.

Supersedes PR #899.
Closes #641.
2026-03-13 21:53:54 -07:00
teknium1
206e56cc5e fix: finish HERMES_HOME path cleanup
- route CLI interrupt debug logging through HERMES_HOME
- update the remaining channel_directory test to patch HERMES_HOME
  instead of Path.home()
2026-03-13 21:35:07 -07:00
0xIbra
437ec17125 fix(cli): respect HERMES_HOME in all remaining hardcoded ~/.hermes paths
Several files resolved paths via Path.home() / ".hermes" or
os.path.expanduser("~/.hermes/..."), bypassing the HERMES_HOME
environment variable. This broke isolation when running multiple
Hermes instances with distinct HERMES_HOME directories.

Replace all hardcoded paths with calls to get_hermes_home() from
hermes_cli.config, consistent with the rest of the codebase.

Files fixed:
- tools/process_registry.py (processes.json)
- gateway/pairing.py (pairing/)
- gateway/sticker_cache.py (sticker_cache.json)
- gateway/channel_directory.py (channel_directory.json, sessions.json)
- gateway/config.py (gateway.json, config.yaml, sessions_dir)
- gateway/mirror.py (sessions/)
- gateway/hooks.py (hooks/)
- gateway/platforms/base.py (image_cache/, audio_cache/, document_cache/)
- gateway/platforms/whatsapp.py (whatsapp/session)
- gateway/delivery.py (cron/output)
- agent/auxiliary_client.py (auth.json)
- agent/prompt_builder.py (SOUL.md)
- cli.py (config.yaml, images/, pastes/, history)
- run_agent.py (logs/)
- tools/environments/base.py (sandboxes/)
- tools/environments/modal.py (modal_snapshots.json)
- tools/environments/singularity.py (singularity_snapshots.json)
- tools/tts_tool.py (audio_cache)
- hermes_cli/status.py (cron/jobs.json, sessions.json)
- hermes_cli/gateway.py (logs/, whatsapp session)
- hermes_cli/main.py (whatsapp/session)

Tests updated to use HERMES_HOME env var instead of patching Path.home().

Closes #892

(cherry picked from commit 78ac1bba43b8b74a934c6172f2c29bb4d03164b9)
2026-03-13 21:32:53 -07:00
kshitijk4poor
ccfbf42844 feat: secure skill env setup on load (core #688)
When a skill declares required_environment_variables in its YAML
frontmatter, missing env vars trigger a secure TUI prompt (identical
to the sudo password widget) when the skill is loaded. Secrets flow
directly to ~/.hermes/.env, never entering LLM context.

Key changes:
- New required_environment_variables frontmatter field for skills
- Secure TUI widget (masked input, 120s timeout)
- Gateway safety: messaging platforms show local setup guidance
- Legacy prerequisites.env_vars normalized into new format
- Remote backend handling: conservative setup_needed=True
- Env var name validation, file permissions hardened to 0o600
- Redact patterns extended for secret-related JSON fields
- 12 existing skills updated with prerequisites declarations
- ~48 new tests covering skip, timeout, gateway, remote backends
- Dynamic panel widget sizing (fixes hardcoded width from original PR)

Cherry-picked from PR #723 by kshitijk4poor, rebased onto current main
with conflict resolution.

Fixes #688

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-03-13 03:14:04 -07:00
teknium1
f562d97f13 Enhance CLI output formatting with RichText support
- Updated command output handling to use RichText for ANSI formatting.
- Improved response display in chat console with RichText integration.
- Ensured fallback for empty command outputs with a clear message.
2026-03-13 02:05:30 -07:00
Teknium
475dd58a8e
Merge PR #736: feat(honcho): async writes, memory modes, session title integration, setup CLI
Authored by erosika. Builds on #38 and #243.

Adds async write support, configurable memory modes, context prefetch pipeline,
4 new Honcho tools (honcho_context, honcho_profile, honcho_search, honcho_conclude),
full 'hermes honcho' CLI, session strategies, AI peer identity, recallMode A/B,
gateway lifecycle management, and comprehensive docs.

Cherry-picks fixes from PRs #831/#832 (adavyas).

Co-authored-by: erosika <erosika@users.noreply.github.com>
Co-authored-by: adavyas <adavyas@users.noreply.github.com>
2026-03-12 19:05:11 -07:00
Teknium
1bb8ed4495
chore: lower default compression threshold from 85% to 50% (#1096)
* fix: ClawHub skill install — use /download ZIP endpoint

The ClawHub API v1 version endpoint only returns file metadata
(path, size, sha256, contentType) without inline content or download
URLs. Our code was looking for inline content in the metadata, which
never existed, causing all ClawHub installs to fail with:
'no inline/raw file content was available'

Fix: Use the /api/v1/download endpoint (same as the official clawhub
CLI) to download skills as ZIP bundles and extract files in-memory.

Changes:
- Add _download_zip() method that downloads and extracts ZIP bundles
- Retry on 429 rate limiting with Retry-After header support
- Path sanitization and binary file filtering for security
- Keep _extract_files() as a fallback for inline/raw content
- Also fix nested file lookup (version_data.version.files)

* chore: lower default compression threshold from 85% to 50%

Triggers context compression earlier — at 50% of the model's context
window instead of 85%. Updated in all four places where the default
is defined: context_compressor.py, cli.py, run_agent.py, config.py,
and gateway/run.py.
2026-03-12 15:51:50 -07:00
Erosika
fefc709b2c merge: resolve conflict with main in subagent interrupt test 2026-03-12 16:28:57 -04:00
Erosika
cd6e5e44e4 feat(honcho): show clickable session line on CLI startup
Display a one-line Honcho session indicator with an OSC 8 terminal
hyperlink after the banner. Also shown when /title remaps the session.
2026-03-12 12:30:42 -04:00
Teknium
e004c094ea
fix: use session_key instead of chat_id for adapter interrupt lookups
* fix: use session_key instead of chat_id for adapter interrupt lookups

monitor_for_interrupt() in _run_agent was using source.chat_id to query
the adapter's has_pending_interrupt() and get_pending_message() methods.
But the adapter stores interrupt events under build_session_key(source),
which produces a different string (e.g. 'agent:main:telegram:dm' vs '123456').

This key mismatch meant the interrupt was never detected through the
adapter path, which is the only active interrupt path for all adapter-based
platforms (Telegram, Discord, Slack, etc.). The gateway-level interrupt
path (in dispatch_message) is unreachable because the adapter intercepts
the 2nd message in handle_message() before it reaches dispatch_message().

Result: sending a new message while subagents were running had no effect —
the interrupt was silently lost.

Fix: replace all source.chat_id references in the interrupt-related code
within _run_agent() with the session_key parameter, which matches the
adapter's storage keys.

Also adds regression tests verifying session_key vs chat_id consistency.

* debug: add file-based logging to CLI interrupt path

Temporary instrumentation to diagnose why message-based interrupts
don't seem to work during subagent execution. Logs to
~/.hermes/interrupt_debug.log (immune to redirect_stdout).

Two log points:
1. When Enter handler puts message into _interrupt_queue
2. When chat() reads it and calls agent.interrupt()

This will reveal whether the message reaches the queue and
whether the interrupt is actually fired.
2026-03-12 08:35:45 -07:00
Teknium
2a62514d17
feat: add 'View full command' option to dangerous command approval (#887)
When a dangerous command is detected and the user is prompted for
approval, long commands are truncated (80 chars in fallback, 70 chars
in the TUI). Users had no way to see the full command before deciding.

This adds a 'View full command' option across all approval interfaces:

- CLI fallback (tools/approval.py): [v]iew option in the prompt menu.
  Shows the full command and re-prompts for approval decision.
- CLI TUI (cli.py): 'Show full command' choice in the arrow-key
  selection panel. Expands the command display in-place and removes
  the view option after use.
- CLI callbacks (callbacks.py): 'view' choice added to the list when
  the command exceeds 70 characters.
- Gateway (gateway/run.py): 'full', 'show', 'view' responses reveal
  the complete command while keeping the approval pending.

Includes 7 new tests covering view-then-approve, view-then-deny,
short command fallthrough, and double-view behavior.

Closes community feedback about the 80-char cap on dangerous commands.
2026-03-12 06:27:21 -07:00
dmahan93
c7fc39bde0 feat: include session ID in system prompt via --pass-session-id flag
Adds --pass-session-id CLI flag. When set, the agent's system prompt
includes the session ID:

  Conversation started: Sunday, March 08, 2026 06:32 PM
  Session ID: 20260308_183200_abc123

Usage:
  hermes --pass-session-id
  hermes chat --pass-session-id

Implementation threads the flag as a proper parameter through the full
chain (main.py → cli.py → run_agent.py) rather than using an env var,
avoiding collisions in multi-agent/multitenant setups.

Based on PR #726 by dmahan93, reworked to use instance parameter
instead of HERMES_PASS_SESSION_ID environment variable.

Co-authored-by: dmahan93 <dmahan93@users.noreply.github.com>
2026-03-12 05:51:31 -07:00
teknium1
7febdf7208 fix: custom endpoint model validation + better /model error messages
- Custom endpoints can serve any model, so skip validation for
  provider='custom' in validate_requested_model(). Previously it
  would reject any model name since there's no static catalog or
  live API to check against.
- Show clear setup instructions when switching to custom endpoint
  without OPENAI_BASE_URL/OPENAI_API_KEY configured.
- Added curated model lists for Nous Portal and OpenAI Codex to
  _PROVIDER_MODELS so /model shows their available models.
2026-03-11 23:29:26 -07:00
teknium1
ec2c6dff70 feat: unified /model and /provider into single view
Both /model and /provider now show the same unified display:

  Current: anthropic/claude-opus-4.6 via OpenRouter

  Authenticated providers & models:
    [openrouter] ← active
      anthropic/claude-opus-4.6 ← current
      anthropic/claude-sonnet-4.5
      ...
    [nous]
      claude-opus-4-6
      gemini-3-flash
      ...
    [openai-codex]
      gpt-5.2-codex
      gpt-5.1-codex-mini
      ...

  Not configured: Z.AI / GLM, Kimi / Moonshot, ...

  Switch model:    /model <model-name>
  Switch provider: /model <provider>:<model-name>
  Example: /model nous:claude-opus-4-6

Users can see all authenticated providers and their models at a glance,
making it easy to switch mid-conversation.

Also added curated model lists for Nous Portal and OpenAI Codex to
hermes_cli/models.py.
2026-03-11 23:06:06 -07:00
teknium1
9302690e1b refactor: remove LLM_MODEL env var dependency — config.yaml is sole source of truth
Model selection now comes exclusively from config.yaml (set via
'hermes model' or 'hermes setup'). The LLM_MODEL env var is no longer
read or written anywhere in production code.

Why: env vars are per-process/per-user and would conflict in
multi-agent or multi-tenant setups. Config.yaml is file-based and
can be scoped per-user or eventually per-session.

Changes:
- cli.py: Read model from CLI_CONFIG only, not LLM_MODEL/OPENAI_MODEL
- hermes_cli/auth.py: _save_model_choice() no longer writes LLM_MODEL
  to .env
- hermes_cli/setup.py: Remove 12 save_env_value('LLM_MODEL', ...)
  calls from all provider setup flows
- gateway/run.py: Remove LLM_MODEL fallback (HERMES_MODEL still works
  for gateway process runtime)
- cron/scheduler.py: Same
- agent/auxiliary_client.py: Remove LLM_MODEL from custom endpoint
  model detection
2026-03-11 22:04:42 -07:00
Erosika
a0b0dbe6b2 Merge remote-tracking branch 'origin/main' into feat/honcho-async-memory
Made-with: Cursor

# Conflicts:
#	cli.py
#	tests/test_run_agent.py
2026-03-11 12:22:56 -04:00
Teknium
b16d7f2da6
Merge pull request #921 from NousResearch/hermes/hermes-ece5a45c
feat(cli): add /reasoning command for effort level and display toggle
2026-03-11 06:30:20 -07:00
teknium1
9423fda5cb feat: configurable subagent provider:model with full credential resolution
Adds delegation.model and delegation.provider config fields so subagents
can run on a completely different provider:model pair than the parent agent.

When delegation.provider is set, the system resolves the full credential
bundle (base_url, api_key, api_mode) via resolve_runtime_provider() —
the same path used by CLI/gateway startup. This means all configured
providers work out of the box: openrouter, nous, zai, kimi-coding,
minimax, minimax-cn.

Key design decisions:
- Provider resolution uses hermes_cli.runtime_provider (single source of
  truth for credential resolution across CLI, gateway, cron, and now
  delegation)
- When only delegation.model is set (no provider), the model name changes
  but parent credentials are inherited (for switching models within the
  same provider like OpenRouter)
- When delegation.provider is set, full credentials are resolved
  independently — enabling cross-provider delegation (e.g. parent on
  Nous Portal, subagents on OpenRouter)
- Clear error messages if provider resolution fails (missing API key,
  unknown provider name)
- _load_config() now falls back to hermes_cli.config.load_config() for
  gateway/cron contexts where CLI_CONFIG is unavailable

Based on PR #791 by 0xbyt4 (closes #609), reworked to use proper
provider credential resolution instead of passing provider as metadata.

Co-authored-by: 0xbyt4 <0xbyt4@users.noreply.github.com>
2026-03-11 06:12:21 -07:00
teknium1
4d873f77c1 feat(cli): add /reasoning command for effort level and display toggle
Combined implementation of reasoning management:
- /reasoning              Show current effort level and display state
- /reasoning <level>      Set reasoning effort (none, low, medium, high, xhigh)
- /reasoning show|on      Show model thinking/reasoning in output
- /reasoning hide|off     Hide model thinking/reasoning from output

Effort level changes persist to config and force agent re-init.
Display toggle updates the agent callback dynamically without re-init.

When display is enabled:
- Intermediate reasoning shown as dim [thinking] lines during tool loops
- Final reasoning shown in a bordered box above the response
- Long reasoning collapsed (5 lines intermediate, 10 lines final)

Also adds:
- reasoning_callback parameter to AIAgent
- last_reasoning in run_conversation result dict
- show_reasoning config option (display section, default: false)
- Display section in /config output
- 34 tests covering both features

Combines functionality from PR #789 and PR #790.

Co-authored-by: Aum Desai <Aum08Desai@users.noreply.github.com>
Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
2026-03-11 06:02:18 -07:00
teknium1
925f378baa Merge PR #773: feat(cli,gateway): add /personality none and custom personality support
Authored by teyrebaz33. Closes #643.

- /personality none/default/neutral clears system prompt overlay
- Dict format personalities with description, tone, style fields
- Works in both CLI and gateway
- 18 tests
2026-03-11 02:54:27 -07:00
teknium1
6e303def12 Merge PR #757: security: enforce 0600/0700 file permissions on sensitive files
Enforces owner-only permissions on files containing secrets:
- config.yaml, .env → 0600
- ~/.hermes/, cron dirs → 0700
- cron jobs.json, output files → 0600

Windows-safe (all chmod calls wrapped in try/except).
Inspired by openclaw v2026.3.7.
2026-03-11 02:48:56 -07:00
teknium1
bd2606a576 fix: initialize self.config in HermesCLI to fix AttributeError on slash commands
HermesCLI.__init__ never assigned self.config, causing an
AttributeError ('HermesCLI' object has no attribute 'config')
whenever an unrecognized slash command fell through to the
quick_commands check on line 2838. This affected skill slash
commands like /x-thread-creation since the quick_commands lookup
runs before the skill command check.

Set self.config = CLI_CONFIG in __init__ to match the pattern used
by the gateway (run.py:199).
2026-03-11 02:33:31 -07:00
teknium1
f5324f9aa5 fix: initialize self.config in HermesCLI to fix AttributeError on slash commands
HermesCLI.__init__ never assigned self.config, causing an
AttributeError ('HermesCLI object has no attribute config')
whenever an unrecognized slash command fell through to the
quick_commands check (line 2832). This broke skill slash commands
like /x-thread-creation since the quick_commands lookup runs
before the skill command check.

Set self.config = CLI_CONFIG in __init__, matching the pattern
used by the gateway (run.py:199).
2026-03-11 02:33:25 -07:00
teknium1
3be6e8a5f2 Merge PR #746: feat(cli,gateway): add user-defined quick commands that bypass agent loop
Authored by teyrebaz33. Adds config-driven quick commands that execute
shell commands without invoking the LLM — zero token usage, works from
Telegram/Discord/Slack/etc. Closes #744.
2026-03-11 00:24:34 -07:00
Bartok Moltbot
8eb9eed074 feat(ux): improve /help formatting with command categories (#640)
- Organize COMMANDS into COMMANDS_BY_CATEGORY dict
- Group commands: Session, Configuration, Tools & Skills, Info, Exit
- Add visual category headers with spacing
- Maintain backwards compat via flat COMMANDS dict
- Better visual hierarchy and scannability

Before:
  /help           - Show this help message
  /tools          - List available tools
  ... (dense list)

After:
  ── Session ──
    /new           Start a new conversation
    /reset         Reset conversation only
    ...

  ── Configuration ──
    /config        Show current configuration
    ...

Closes #640
2026-03-10 23:45:36 -07:00
teknium1
2d80ef7872 fix: _init_agent returns bool, not agent — fix quiet mode crash 2026-03-10 20:49:03 -07:00
teknium1
23270d41b9 feat: add --quiet/-Q flag for programmatic single-query mode
Adds -Q/--quiet to `hermes chat` for use by external orchestrators
(Paperclip, scripts, CI). When combined with -q, suppresses:
- Banner and ASCII art
- Spinner animations
- Tool preview lines (┊ prefix)

Only outputs:
- The agent's final response text
- A parseable 'session_id: <id>' line for session resumption

Usage: hermes chat -q 'Do something' -Q
Used by: Paperclip adapter (@nousresearch/paperclip-adapter-hermes)
2026-03-10 20:45:28 -07:00
vilkasdev
d502952bac fix(cli): add loading indicators for slow slash commands
Shows an immediate status message and braille spinner for slow slash
commands (/skills search|browse|inspect|install, /reload-mcp). Makes
input read-only while the command runs so the CLI doesn't appear frozen.

Cherry-picked from PR #714 by vilkasdev, rebased onto current main
with conflict resolution and bug fix (get_hint_text duplicate return).

Fixes #636

Co-authored-by: vilkasdev <vilkasdev@users.noreply.github.com>
2026-03-10 17:31:00 -07:00
teknium1
ad7a16dca6 fix: remove left/right borders from response box for easier copy-paste
Use rich_box.HORIZONTALS instead of the default ROUNDED box style
for the agent response panel. This keeps the top/bottom horizontal
rules (with title) but removes the vertical │ borders on left and
right, making it much easier to copy-paste response text from the
terminal.
2026-03-10 15:59:08 -07:00
Erosika
74c214e957 feat(honcho): async memory integration with prefetch pipeline and recallMode
Adds full Honcho memory integration to Hermes:

- Session manager with async background writes, memory modes (honcho/hybrid/local),
  and dialectic prefetch for first-turn context warming
- Agent integration: prefetch pipeline, tool surface gated by recallMode,
  system prompt context injection, SIGTERM/SIGINT flush handlers
- CLI commands: setup, status, mode, tokens, peer, identity, migrate
- recallMode setting (auto | context | tools) for A/B testing retrieval strategies
- Session strategies: per-session, per-repo (git tree root), per-directory, global
- Polymorphic memoryMode config: string shorthand or per-peer object overrides
- 97 tests covering async writes, client config, session resolution, and memory modes
2026-03-10 16:21:07 -04:00
teknium1
8eefbef91c fix: replace ANSI response box with Rich Panel + reduce widget flashing
Major UX improvements:

1. Response box now uses a Rich Panel rendered through ChatConsole
   instead of hand-rolled ANSI box-drawing borders. Rich Panels
   adapt to terminal width at render time, wrap content inside
   the borders properly, and use skin colors natively.

2. ChatConsole now reads terminal width at render time via
   shutil.get_terminal_size() instead of defaulting to 80 cols.
   All Rich output adapts to the current terminal size.

3. User-input separator reduced to fixed 40-char width so it
   never wraps regardless of terminal resize.

4. Approval and clarify countdown repaints throttled to every 5s
   (was 1s), dramatically reducing flicker in Kitty/ghostty.
   Selection changes still trigger instant repaints via key bindings.

5. Sudo widget now uses dynamic _panel_box_width() instead of
   hardcoded border strings.

Tests: 2860 passed.
2026-03-10 07:04:02 -07:00
teknium1
e8b19b5826 fix: cap user-input separator at 120 cols (matches response box) 2026-03-10 06:47:26 -07:00
teknium1
9ea2209a43 fix: reduce approval/clarify widget flashing + dynamic border widths
Three UI improvements:

1. Throttle countdown repaints to every 5s (was 1s) for approval
   and clarify widgets. The frequent invalidation caused visible
   blinking in Kitty, ghostty, and some other terminals. Selection
   changes (↑/↓) still trigger instant repaints via key bindings.

2. Make echo Link2them00n. | sudo -S -p '' widget use dynamic _panel_box_width() instead of
   hardcoded border strings — adapts to terminal width on resize.

3. Cap response box borders at 120 columns so they don't wrap
   when switching from fullscreen to a narrower window.

Tests: 2857 passed.
2026-03-10 06:44:13 -07:00
stablegenius49
4bd579f915 fix: normalize max turns config path 2026-03-10 06:05:02 -07:00
teknium1
695c017411 Merge PR #603: fix: return deny on approval callback timeout instead of None
Authored by 0xbyt4.

_approval_callback() had no return statement after the timeout break,
causing it to return None instead of 'deny'. Callers in approval.py
expect one of 'once', 'session', 'always', or 'deny'. This matches
the existing timeout behavior in approval.py:209.
2026-03-10 04:15:31 -07:00
teknium1
4945240fc3 feat: add poseidon/sisyphus/charizard skins + banner logo support
Adds 3 new built-in skins (poseidon, sisyphus, charizard) with full
customization — colors, spinner faces/verbs/wings, branding text, and
custom ASCII art banner logos. Total: 7 built-in skins.

Also adds banner_logo and banner_hero fields to SkinConfig, allowing
any skin to replace the HERMES-AGENT ASCII art logo and the caduceus
hero art with custom artwork. The CLI now renders the skin's logo when
available, falling back to the default Hermes logo.

Skins with custom logos: ares, poseidon, sisyphus, charizard
Skins using default logo: default, mono, slate
2026-03-10 02:11:50 -07:00
teknium1
f6bc620d39 fix: apply skin colors to local build_welcome_banner in cli.py
cli.py had a local copy of build_welcome_banner() that shadowed the
imported one from banner.py. This local copy had all colors hardcoded,
so /skin changes had no visible effect on the banner.

Now the local copy resolves skin colors at render time using
get_active_skin(), matching the banner.py behavior. All hardcoded
#FFD700/#CD7F32/#FFBF00/#B8860B/#FFF8DC/#8B8682 values in the local
function are replaced with skin-aware lookups.
2026-03-10 00:58:42 -07:00
teknium1
de6750ed23 feat: add data-driven skin/theme engine for CLI customization
Adds a skin system that lets users customize the CLI's visual appearance
through data files (YAML) rather than code changes. Skins define: color
palette, spinner faces/verbs/wings, branding text, and tool output prefix.

New files:
- hermes_cli/skin_engine.py — SkinConfig dataclass, built-in skins
  (default, ares, mono, slate), YAML loader for user skins from
  ~/.hermes/skins/, skin management API
- tests/hermes_cli/test_skin_engine.py — 26 tests covering config,
  built-in skins, user YAML skins, display integration

Modified files:
- agent/display.py — skin-aware spinner wings, faces, verbs, tool prefix
- hermes_cli/banner.py — skin-aware banner colors (title, border, accent,
  dim, text, session) via _skin_color()/_skin_branding() helpers
- cli.py — /skin command handler, skin init from config, skin-aware
  response box label and welcome message
- hermes_cli/config.py — add display.skin default
- hermes_cli/commands.py — add /skin to slash commands

Built-in skins:
- default: classic Hermes gold/kawaii
- ares: crimson/bronze war-god theme (from community PRs #579/#725)
- mono: clean grayscale
- slate: cool blue developer theme

User skins: drop a YAML file in ~/.hermes/skins/ with name, colors,
spinner, branding, and tool_prefix fields. Missing values inherit from
the default skin.
2026-03-10 00:37:28 -07:00
teknium1
8b9de366f2 Merge PR #570: feat: OpenClaw migration skill + CLI panel width improvements
Authored by unmodeled-tyler. Adds openclaw-migration skill to optional-skills/
with migration script, SKILL.md, and 7 tests. Also improves clarify/approval
panel rendering with dynamic width calculation.
2026-03-10 00:06:40 -07:00
teknium1
ee4008431a fix: stop terminal border flashing with steady cursor and TUI spinner widget
Cherry-picked and improved from PR #470 (fixes #464).

Problem: On Ubuntu 24.04 with ghostty + tmux, the prompt input box
border lines flash due to cursor blink and raw spinner terminal writes
conflicting with prompt_toolkit's rendering.

Changes:
- cli.py: Add CursorShape.BLOCK to Application() to disable cursor blink
- cli.py: Add thinking_callback + spinner_widget in TUI layout so
  thinking status displays as a proper prompt_toolkit widget instead of
  raw terminal writes that conflict with the TUI renderer
- run_agent.py: Add thinking_callback parameter to AIAgent; when set,
  uses the callback instead of KawaiiSpinner for thinking display

What was NOT changed (preserving existing behavior):
- agent/display.py: Untouched. KawaiiSpinner _write() stdout capture,
  _animate() logic, and 0.12s frame interval all preserved. This
  protects subagent stdout redirection and keeps smooth animations
  for non-CLI contexts (gateway, batch runner).
- Original emoji spinner types (brain/sparkle/pulse/moon/star) preserved
  for all non-CLI contexts.

Fixes from original PR #470:
- CursorShape.STEADY_BLOCK -> CursorShape.BLOCK (STEADY_BLOCK doesn't
  exist in prompt_toolkit 3.0.52)
- Removed duplicate self._spinner_text = '' line
- Removed redundant nested if-checks

Tested: 2706 tests pass, interactive CLI verified via tmux.
2026-03-09 23:26:43 -07:00
teknium1
fa2e72ae9c docs: document docker_volumes config for shared host directories
The Docker backend already supports user-configured volume mounts via
docker_volumes, but it was undocumented — missing from DEFAULT_CONFIG,
cli.py defaults, and configuration docs.

Changes:
- hermes_cli/config.py: Add docker_volumes to DEFAULT_CONFIG with
  inline documentation and examples
- cli.py: Add docker_volumes to load_cli_config defaults
- configuration.md: Full Docker Volume Mounts section with YAML
  examples, use cases (providing files, receiving outputs, shared
  workspaces), and env var alternative
2026-03-09 15:29:34 -07:00
teyrebaz33
c3cf88b202 feat(cli,gateway): add /personality none and custom personality support
Closes #643

Changes:
- /personality none|default|neutral — clears system prompt overlay
- Custom personalities in config.yaml support dict format with:
  name, description, system_prompt, tone, style directives
- Backwards compatible — existing string format still works
- CLI + gateway both updated
- 18 tests covering none/default/neutral, dict format, string format,
  list display, save to config
2026-03-09 17:31:54 +03:00
teknium1
c754135965 fix: banner wraps in narrow terminals (Kitty, small windows)
The full HERMES-AGENT ASCII logo needs ~95 columns, and the
side-by-side caduceus + tools panel needs ~80. In narrow terminals
(Kitty default, resized windows) everything wraps into visual garbage.

Fixes:
- show_banner() auto-detects terminal width and falls back to compact
  banner when < 80 columns
- build_welcome_banner() skips the ASCII logo when < 95 columns
- Compact banner now dynamically sized via _build_compact_banner()
  instead of a hardcoded 64-char box that also wrapped in narrow terms
- Same width checks applied to /clear command's banner refresh

The up/down arrow key issue in Kitty terminal for multiline input is
a known Kitty keyboard protocol (CSI u) vs prompt_toolkit compatibility
gap — arrow keys work correctly in standard terminals and tmux. Users
can work around it by running in tmux or setting TERM=xterm-256color.
2026-03-09 05:57:36 -07:00
teknium1
0ce190be0d security: enforce 0600/0700 file permissions on sensitive files (inspired by openclaw)
Enforce owner-only permissions on files and directories that contain
secrets or sensitive data:

- cron/jobs.py: jobs.json (0600), cron dirs (0700), job output files (0600)
- hermes_cli/config.py: config.yaml (0600), .env (0600), ~/.hermes/* dirs (0700)
- cli.py: config.yaml via save_config_value (0600)

All chmod calls use try/except for Windows compatibility.

Includes _secure_file() and _secure_dir() helpers with graceful fallback.
8 new tests verify permissions on all file types.

Inspired by openclaw v2026.3.7 file permission enforcement.
2026-03-09 02:19:32 -07:00
teknium1
57b48a81ca feat: add config toggle to disable secret redaction
New config option:

  security:
    redact_secrets: false  # default: true

When set to false, API keys, tokens, and passwords are shown in
full in read_file, search_files, and terminal output. Useful for
debugging auth issues where you need to verify the actual key value.

Bridged to both CLI and gateway via HERMES_REDACT_SECRETS env var.
The check is in redact_sensitive_text() itself, so all call sites
(terminal, file tools, log formatter) respect it.
2026-03-09 01:04:33 -07:00
teyrebaz33
1404f846a7 feat(cli,gateway): add user-defined quick commands that bypass agent loop
Implements config-driven quick commands for both CLI and gateway that
execute locally without invoking the LLM.

Config example (~/.hermes/config.yaml):
  quick_commands:
    limits:
      type: exec
      command: /home/user/.local/bin/hermes-limits
    dn:
      type: exec
      command: echo daily-note

Changes:
- hermes_cli/config.py: add quick_commands: {} default
- cli.py: check quick_commands before skill commands in process_command()
- gateway/run.py: check quick_commands before skill commands in _handle_message()
- tests/test_quick_commands.py: 11 tests covering exec, timeout, unsupported type, missing command, priority over skills

Closes #744
2026-03-09 07:38:06 +03:00
teknium1
3ffaac00dd feat: bell_on_complete — terminal bell when agent finishes
Adds a simple config option to play the terminal bell (\a) when the
agent finishes a response. Useful for long-running tasks — switch to
another window and your terminal will ding when done.

Works over SSH since the bell character propagates through the
connection. Most terminal emulators can be configured to flash the
taskbar, play a sound, or show a visual indicator on bell.

Config (default: off):
  display:
    bell_on_complete: true

Closes #318
2026-03-08 21:30:48 -07:00
Teknium
816a3ef6f1
Merge pull request #745 from NousResearch/hermes/hermes-f8d56335
feat: browser console tool, annotated screenshots, auto-recording, and dogfood QA skill
2026-03-08 21:29:52 -07:00
teknium1
a8bf414f4a feat: browser console/errors tool, annotated screenshots, auto-recording, and dogfood QA skill
New browser capabilities and a built-in skill for agent-driven web QA.

## New tool: browser_console

Returns console messages (log/warn/error/info) AND uncaught JavaScript
exceptions in a single call. Uses agent-browser's 'console' and 'errors'
commands through the existing session plumbing. Supports --clear to reset
buffers. Verified working in both local and Browserbase cloud modes.

## Enhanced tool: browser_vision(annotate=True)

New boolean parameter on browser_vision. When true, agent-browser overlays
numbered [N] labels on interactive elements — each [N] maps to ref @eN.
Annotation data (element name, role, bounding box) returned alongside the
vision analysis. Useful for QA reports and spatial reasoning.

## Config: browser.record_sessions

Auto-record browser sessions as WebM video files when enabled:
- Starts recording on first browser_navigate
- Stops and saves on browser_close
- Saves to ~/.hermes/browser_recordings/
- Works in both local and cloud modes (verified)
- Disabled by default

## Built-in skill: dogfood

Systematic exploratory QA testing for web applications. Teaches the agent
a 5-phase workflow:
1. Plan — accept URL, create output dirs, set scope
2. Explore — systematic crawl with annotated screenshots
3. Collect Evidence — screenshots, console errors, JS exceptions
4. Categorize — severity (Critical/High/Medium/Low) and category
   (Functional/Visual/Accessibility/Console/UX/Content)
5. Report — structured markdown with per-issue evidence

Includes:
- skills/dogfood/SKILL.md — full workflow instructions
- skills/dogfood/references/issue-taxonomy.md — severity/category defs
- skills/dogfood/templates/dogfood-report-template.md — report template

## Tests

21 new tests covering:
- browser_console message/error parsing, clear flag, empty/failed states
- browser_console schema registration
- browser_vision annotate schema and flag passing
- record_sessions config defaults and recording lifecycle
- Dogfood skill file existence and content validation

Addresses #315.
2026-03-08 21:28:12 -07:00
teknium1
161436cfdd feat: simple fallback model for provider resilience
When the primary model/provider fails after retries (rate limit, overload,
auth errors, connection failures), Hermes automatically switches to a
configured fallback model for the remainder of the session.

Config (in ~/.hermes/config.yaml):

  fallback_model:
    provider: openrouter
    model: anthropic/claude-sonnet-4

Supports all major providers: OpenRouter, OpenAI, Nous, DeepSeek, Together,
Groq, Fireworks, Mistral, Gemini — plus custom endpoints via base_url and
api_key_env overrides.

Design principles:
- Dead simple: one fallback model, not a chain
- One-shot: switches once, doesn't ping-pong back
- Zero new dependencies: uses existing OpenAI client
- Minimal code: ~100 lines in run_agent.py, ~5 lines in cli.py/gateway
- Three trigger points: max retries exhausted, non-retryable client errors,
  and invalid response exhaustion

Does NOT trigger on context overflow or payload-too-large errors (those
are handled by the existing compression system).

Addresses #737.

25 new tests, 2492 total passing.
2026-03-08 20:22:33 -07:00
Teknium
ebe60646db
Merge pull request #735 from NousResearch/hermes/hermes-f8d56335
fix: allow non-codex-suffixed models (e.g. gpt-5.4) with OpenAI Codex provider
2026-03-08 18:30:27 -07:00
teknium1
f996d7950b fix: trust user-selected models with OpenAI Codex provider
The Codex model normalization was rejecting any model without 'codex'
in its name, forcing a fallback to gpt-5.3-codex. This blocked models
like gpt-5.4 that the Codex API actually supports.

The fix simplifies _normalize_model_for_provider() to two operations:
1. Strip provider prefixes (API needs bare slugs)
2. Replace the *untouched default* model with a Codex-compatible one

If the user explicitly chose a model — any model — we trust them and
let the API be the judge. No allowlists, no slug checks.

Also removes the 'codex not in slug' filter from _read_cache_models()
so the local cache preserves all API-available models.

Inspired by OpenClaw's approach which explicitly lists non-codex models
(gpt-5.4, gpt-5.2) as valid Codex models.
2026-03-08 18:29:09 -07:00
teknium1
d9f373654b feat: enhance auxiliary model configuration and environment variable handling
- Added support for auxiliary model overrides in the configuration, allowing users to specify providers and models for vision and web extraction tasks.
- Updated the CLI configuration example to include new auxiliary model settings.
- Enhanced the environment variable mapping in the CLI to accommodate auxiliary model configurations.
- Improved the resolution logic for auxiliary clients to support task-specific provider overrides.
- Updated relevant documentation and comments for clarity on the new features and their usage.
2026-03-08 18:06:47 -07:00
teknium1
3aded1d4e5 feat: display previous messages when resuming a session in CLI
When resuming a session via --continue or --resume, show a compact recap
of the previous conversation inside a Rich panel before the input prompt.
This gives users immediate visual context about what was discussed.

Changes:
- Add _preload_resumed_session() to load session history early (in run(),
  before banner) so _init_agent() doesn't need a separate DB round-trip
- Add _display_resumed_history() that renders a formatted recap panel:
  * User messages shown with gold bullet (truncated at 300 chars)
  * Assistant responses shown with green diamond (truncated at 200 chars / 3 lines)
  * Tool calls collapsed to count + tool names
  * System messages and tool results hidden
  * <REASONING_SCRATCHPAD> blocks stripped from display
  * Pure-reasoning messages (no visible output) skipped entirely
  * Capped at last 10 exchanges with 'N earlier messages' indicator
  * Dim/muted styling distinguishes recap from active conversation
- Add display.resume_display config option: 'full' (default) or 'minimal'
- Store resume_display as instance variable (like compact) for testability
- 27 new tests covering all display scenarios, config, and edge cases

Closes #719
2026-03-08 17:45:45 -07:00
teknium1
1f1caa836a fix: error out when hermes -w is used outside a git repo
Previously, --worktree printed a yellow warning and continued without
isolation, silently defeating the purpose of the flag. Now it prints
a clear error message and exits immediately.
2026-03-08 17:22:24 -07:00
teknium1
95b1130485 fix: normalize incompatible models when provider resolves to Codex
When _ensure_runtime_credentials() resolves the provider to openai-codex,
check if the active model is Codex-compatible.  If not (e.g. the default
anthropic/claude-opus-4.6), swap it for the best available Codex model.
Also strips provider prefixes the Codex API rejects (openai/gpt-5.3-codex
→ gpt-5.3-codex).

Adds _model_is_default flag so warnings are only shown when the user
explicitly chose an incompatible model (not when it's the config default).

Fixes #651.

Co-inspired-by: stablegenius49 (PR #661)
Co-inspired-by: teyrebaz33 (PR #696)
2026-03-08 16:48:56 -07:00
teknium1
34b4fe495e fix: add title validation — sanitize, length limit, control char stripping
- Add SessionDB.sanitize_title() static method:
  - Strips ASCII control chars (null, bell, ESC, etc.) except whitespace
  - Strips problematic Unicode controls (zero-width, RTL override, BOM)
  - Collapses whitespace runs, strips edges
  - Normalizes empty/whitespace-only to None
  - Enforces 100 char max length (raises ValueError)
- set_session_title() now calls sanitize_title() internally,
  so all call sites (CLI, gateway, auto-lineage) are protected
- CLI /title handler sanitizes early to show correct feedback
- Gateway /title handler sanitizes early to show correct feedback
- 24 new tests: sanitize_title (17 cases covering control chars,
  zero-width, RTL, BOM, emoji, CJK, length, integration),
  gateway validation (too long, control chars, only-control-chars)
2026-03-08 15:54:51 -07:00
teknium1
60b6abefd9 feat: session naming with unique titles, auto-lineage, rich listing, resume by name
- Schema v4: unique title index, migration from v2/v3
- set/get/resolve session titles with uniqueness enforcement
- Auto-lineage: context compression auto-numbers titles (Task -> Task #2 -> Task #3)
- resolve_session_by_title: auto-latest finds most recent continuation
- list_sessions_rich: preview (first 60 chars) + last_active timestamp
- CLI: -c accepts optional name arg (hermes -c 'my project')
- CLI: /title command with deferred mode (set before session exists)
- CLI: sessions list shows Title, Preview, Last Active, ID
- 27 new tests (1844 total passing)
2026-03-08 15:20:29 -07:00
teknium1
cf810c2950 fix: pre-process CLI clipboard images through vision tool instead of raw embedding
Images pasted in the CLI were embedded as raw base64 image_url content
parts in the conversation history, which only works with vision-capable
models. If the main model (e.g. Nous API) doesn't support vision, this
breaks the request and poisons all subsequent messages.

Now the CLI uses the same approach as the messaging gateway: images are
pre-processed through the auxiliary vision model (Gemini Flash via
OpenRouter or Nous Portal) and converted to text descriptions. The
local file path is included so the agent can re-examine via
vision_analyze if needed. Works with any model.

Fixes #638.
2026-03-08 06:22:00 -07:00
teknium1
a23bcb81ce fix: improve /model user feedback + update docs
User messaging improvements:
- Rejection: '(>_<) Error: not a valid model' instead of '(^_^) Warning: Error:'
- Rejection: shows 'Model unchanged' + tip about /model and /provider
- Session-only: explains 'this session only' with reason and 'will revert on restart'
- Saved: clear '(saved to config)' confirmation

Docs updated:
- cli-commands.md, cli.md, messaging/index.md: /model now shows
  provider:model syntax, /provider command added to tables

Test fixes: deduplicated test names, assertions match new messages.
2026-03-08 06:13:12 -07:00
teknium1
666f2dd486 feat: /provider command + fix gateway bugs + harden parse_model_input
/provider command (CLI + gateway):
  Shows all providers with auth status (✓/✗), aliases, and active marker.
  Users can now discover what provider names work with provider:model syntax.

Gateway bugs fixed:
  - Config was saved even when validation.persist=False (told user 'session
    only' but actually persisted the unvalidated model)
  - HERMES_INFERENCE_PROVIDER env var not set on provider switch, causing
    the switch to be silently overridden if that env var was already set

parse_model_input hardened:
  - Colon only treated as provider delimiter if left side is a recognized
    provider name or alias. 'anthropic/claude-3.5-sonnet:beta' now passes
    through as a model name instead of trying provider='anthropic/claude-3.5-sonnet'.
  - HTTP URLs, random colons no longer misinterpreted.

56 tests passing across model validation, CLI commands, and integration.
2026-03-08 06:09:36 -07:00
teknium1
34792dd907 fix: resolve 'auto' provider properly via credential detection
'auto' doesn't always mean openrouter — it could be nous, zai,
kimi-coding, etc. depending on configured credentials. Reverted the
hardcoded mapping and now both CLI and gateway call
resolve_provider() to detect the actual active provider when 'auto'
is set. Falls back to openrouter only if resolution fails.
2026-03-08 05:58:45 -07:00
teknium1
132e5ec179 fix: resolve 'auto' provider in /model display + update gateway handler
- normalize_provider('auto') now returns 'openrouter' (the default)
  so /model shows the curated model list instead of nothing
- CLI /model display uses normalize_provider before looking up labels
- Gateway /model handler now uses the same validation logic as CLI:
  live API probe, provider:model syntax, curated model list display
2026-03-08 05:54:52 -07:00
teknium1
66d3e6a0c2 feat: provider switching via /model + enhanced model display
Add provider:model syntax to /model command for runtime provider switching:
  /model zai:glm-5           → switch to Z.AI provider with glm-5
  /model nous:hermes-3       → switch to Nous Portal with hermes-3
  /model openrouter:anthropic/claude-sonnet-4.5  → explicit OpenRouter

When switching providers, credentials are resolved via resolve_runtime_provider
and validated before committing. Both model and provider are saved to config.
Provider aliases work (glm: → zai, kimi: → kimi-coding, etc.).

Enhanced /model (no args) display now shows:
  - Current model and provider
  - Curated model list for the current provider with ← marker
  - Usage examples including provider:model syntax

39 tests covering parse_model_input, curated_models_for_provider,
provider switching (success + credential failure), and display output.
2026-03-08 05:45:59 -07:00
teknium1
245d174359 feat: validate /model against live API instead of hardcoded lists
Replace the static catalog-based model validation with a live API probe.
The /model command now hits the provider's /models endpoint to check if
the requested model actually exists:

- Model found in API → accepted + saved to config
- Model NOT found in API → rejected with 'Error: not a valid model'
  and fuzzy-match suggestions from the live model list
- API unreachable → graceful fallback to hardcoded catalog (session-only
  for unrecognized models)
- Format errors (empty, spaces, missing '/') still caught instantly
  without a network call

The API probe takes ~0.2s for OpenRouter (346 models) and works with any
OpenAI-compatible endpoint (Ollama, vLLM, custom, etc.).

32 tests covering all paths: format checks, API found, API not found,
API unreachable fallback, CLI integration.
2026-03-08 05:22:20 -07:00