Resolve .worktreeinclude entries and validate that both the source path
stays within the repository root and the destination path stays within
the worktree directory before copying files or creating symlinks.
A malicious .worktreeinclude in a cloned repository could previously
reference paths like "../../etc/passwd" to copy or symlink arbitrary
files from outside the repo into the worktree.
CWE-22: Improper Limitation of a Pathname to a Restricted Directory
- Add background thread mechanism (prefetch_update_check/get_update_result)
so git fetch runs in parallel with skill sync and agent init
- Fix repo path fallback in check_for_updates() for dev installs
- Remove duplicate build_welcome_banner (~180 lines) and
_format_context_length from cli.py — the banner.py version is
now the single source of truth
- Port skin banner_hero/banner_logo support and terminal width check
from cli.py's version into banner.py
- Add update status output to hermes version command
- Add unit tests for update check, prefetch, and version string
Add base_url/api_key overrides for auxiliary tasks and delegation so users can
route those flows straight to a custom OpenAI-compatible endpoint without
having to rely on provider=main or named custom providers.
Also clear gateway session env vars in test isolation so the full suite stays
deterministic when run from a messaging-backed agent session.
Move the dangerous-command header onto its own line inside the approval box
so the panel border no longer cuts through it, and restore the long-command
expand path in the active prompt_toolkit approval callback. The CLI already
had a merged 'view full command' feature in fallback/gateway paths, but the
live TUI callback was still using an older choice set and never exposed it.
Add regression tests for long-command view state, in-place expansion, and
panel rendering.
Per teknium1 review on PR #968:
1. Guard against infinite recursion: if expanded name equals the typed
token (already exact), fall through to Unknown command instead of
redispatching the same string forever.
2. Include skill slash commands in prefix resolution so execution-time
matching agrees with tab-completion (set(COMMANDS) | set(_skill_commands)).
3. Add missing test cases:
- unambiguous prefix with extra args does not recurse
- exact command with args does not loop
- skill command prefix matches correctly
- exact builtin takes priority over skill prefix ambiguity
8 tests passing.
Slash commands previously required exact full names. Typing /con
returned 'Unknown command' even though /config was the only match.
Add unambiguous prefix matching in process_command():
- Unique prefix (e.g. /con -> /config): dispatch immediately
- Ambiguous prefix (e.g. /re -> /reset, /retry, /reasoning...):
show 'Did you mean' suggestions
- No match: existing 'Unknown command' error
Prefix matching uses the COMMANDS dict from hermes_cli/commands.py
(same source as SlashCommandCompleter) so it stays in sync with
any new commands added there.
Closes#928
- initialize voice and interrupt runtime state in HermesCLI.__init__
- prevent chat -q from crashing before run() has executed
- add regression coverage for single-query state initialization
- keep CLI voice prefixes API-local while storing the original user text
- persist explicit gateway off state and restore adapter auto-TTS suppression on restart
- add regression coverage for both behaviors
1. Anthropic + ElevenLabs TTS silence: forward full response to TTS
callback for non-streaming providers (choices first, then native
content blocks fallback).
2. Subprocess timeout kill: play_audio_file now kills the process on
TimeoutExpired instead of leaving zombie processes.
3. Discord disconnect cleanup: leave all voice channels before closing
the client to prevent leaked state.
4. Audio stream leak: close InputStream if stream.start() fails.
5. Race condition: read/write _on_silence_stop under lock in audio
callback thread.
6. _vprint force=True: show API error, retry, and truncation messages
even during streaming TTS.
7. _refresh_level lock: read _voice_recording under _voice_lock.
1. Gate _streaming_api_call to chat_completions mode only — Anthropic and
Codex fall back to _interruptible_api_call. Preserve Anthropic base_url
across all client rebuild paths (interrupt, fallback, 401 refresh).
2. Discord VC synthetic events now use chat_type="channel" instead of
defaulting to "dm" — prevents session bleed into DM context.
Authorization runs before echoing transcript. Sanitize @everyone/@here
in voice transcripts.
3. CLI voice prefix ("[Voice input...]") is now API-call-local only —
stripped from returned history so it never persists to session DB or
resumed sessions.
4. /voice off now disables base adapter auto-TTS via _auto_tts_disabled_chats
set — voice input no longer triggers TTS when voice mode is off.
Voice status was hardcoded to check API keys only. Now uses the actual
provider resolution (local/groq/openai) so it correctly shows
"local faster-whisper" when installed instead of "Groq" or "MISSING".
- Use hmac.compare_digest for timing-safe token comparison (3 endpoints)
- Default bind to 127.0.0.1 instead of 0.0.0.0
- Sanitize upload filenames with Path.name to prevent path traversal
- Add DOMPurify to sanitize marked.parse() output against XSS
- Replace add_static with authenticated media handler
- Hide token in group chats for /remote-control command
- Use ctypes.util.find_library for Opus instead of hardcoded paths
- Add force=True to 5 interrupt _vprint calls for visibility
- Log Opus decode errors and voice restart failures instead of swallowing
- Keep InputStream alive across recordings to avoid CoreAudio hang on
repeated open/close cycles on macOS. New _ensure_stream() creates the
stream once; start()/stop()/cancel() only toggle frame collection.
- Add _close_stream_with_timeout() with daemon thread to prevent
stream.stop()/close() from blocking indefinitely.
- Add generation counter to detect stale stream-open completions after
cancel or restart.
- Run recorder.cancel() in background thread from Ctrl+C handler to
keep the event loop responsive.
- Add shutdown() method called on /voice off to release audio resources.
- Fix silence timer reset during active speech: use dip tolerance for
_resume_start tracker so natural speech pauses (< 0.3s) don't prevent
the silence timer from being reset.
- Update tests to match persistent stream behavior.
- Set max_retries=0 on the STT OpenAI client. The SDK default (2) honors
Groq's retry-after header (often 53s), blocking the thread for up to
~106s on rate limits. Voice STT should fail fast, not retry silently.
- Stop continuous recording mode after 3 consecutive no-speech cycles to
prevent infinite restart loops when nobody is talking.
- Set OpenAI client timeout=30s in transcribe_audio() — default 600s
blocks _voice_processing for 10 min if Groq/OpenAI stalls
- Move _voice_start_recording in _voice_stop_and_transcribe finally
block to a daemon thread (same pattern as Ctrl+B handler and
process_loop)
- Add _should_exit guard at top of _voice_start_recording so all 4
call sites respect shutdown without individual checks
- process_loop's continuous mode restart called _voice_start_recording()
directly, blocking the loop if play_beep/sd.wait hangs — queued user
input would stall silently. Dispatch to daemon thread like Ctrl+B handler.
- Replace print() with _cprint() in _handle_voice_command for consistency
with the rest of the voice mode code.
The handle_voice_record key binding runs in prompt_toolkit's event-loop
thread. When silence auto-stopped recording, _voice_recording was False
but recorder.stop() still held AudioRecorder._lock. A concurrent Ctrl+B
press entered the START path and blocked on that lock, freezing all
keyboard input.
Three changes:
- Set _voice_processing atomically with _voice_recording=False in
_voice_stop_and_transcribe to close the race window
- Add _voice_processing guard in the START path to prevent starting
while stop/transcribe is still running
- Dispatch _voice_start_recording to a daemon thread so play_beep
(sd.wait) and AudioRecorder.start (lock acquire) never block the
event loop
- edge_tts NameError: _generate_edge_tts now calls _import_edge_tts()
instead of referencing bare module name (tts_tool.py)
- TTS thread leak: chat() finally block sends sentinel to text_queue,
sets stop_event, and joins tts_thread on exception paths (cli.py)
- output_stream leak: moved close() into finally block so audio device
is released even on exception (tts_tool.py)
- Ctrl+C continuous mode: cancel handler now resets _voice_continuous
to prevent auto-restart after user cancels recording (cli.py)
- _disable_voice_mode: now calls stop_playback() and sets
_voice_tts_done so TTS stops when voice mode is turned off (cli.py)
- _show_voice_status: reads record key from config instead of
hardcoding Ctrl+B (cli.py)
Bug A: Replace stale _HAS_ELEVENLABS/_HAS_AUDIO boolean imports with
lazy import function calls (_import_elevenlabs, _import_sounddevice).
The old constants no longer exist in tts_tool -- the try/except
silently swallowed the ImportError, leaving streaming TTS dead.
Bug B: Use user message prefix instead of modifying system prompt for
voice mode instruction. Changing ephemeral_system_prompt mid-session
invalidates the prompt cache. Now the concise-response hint is
prepended to the user_message passed to run_conversation while
conversation_history keeps the original text.
Minor: Add force parameter to _vprint so critical error messages
(max retries, non-retryable errors, API failures) are always shown
even during streaming TTS playback.
Tests: 15 new tests in test_voice_cli_integration.py covering all
three fixes -- lazy import activation, message prefix behavior,
history cleanliness, system prompt stability, and AST verification
that all critical _vprint calls use force=True.
- AudioRecorder.start() now catches InputStream errors gracefully
with a clear error message about microphone availability
- Fix config key mismatch: cli.py was reading "push_to_talk_key"
but config.py defines "record_key" -- now consistent
- Add format conversion from config format ("ctrl+b") to
prompt_toolkit format ("c-b")
1. Fully lazy imports: sounddevice, numpy, elevenlabs, edge_tts, and
openai are never imported at module level. Each is imported only when
the feature is explicitly activated, preventing crashes in headless
environments (SSH, Docker, WSL, no PortAudio).
2. No core agent loop changes: streaming TTS path extracted from
_interruptible_api_call() into separate _streaming_api_call() method.
The original method is restored to its upstream form.
3. Configurable key binding: push-to-talk key changed from Ctrl+R
(conflicts with readline reverse-search) to Ctrl+B by default.
Configurable via voice.push_to_talk_key in config.yaml.
4. Environment detection: new detect_audio_environment() function checks
for SSH, Docker, WSL, and missing audio devices before enabling voice
mode. Auto-disables with clear warnings in incompatible environments.
5. Graceful degradation: every audio touchpoint (sd.play, sd.InputStream,
sd.OutputStream) wrapped in try/except with ImportError/OSError
handling. Failures produce warnings, not crashes.
- Fix Gemini streaming tool call merge bug: multiple tool calls with same
index but different IDs are now parsed as separate calls instead of
concatenating names (e.g. ha_call_serviceha_call_service)
- Handle partial results in voice mode: show error and stop continuous
mode when agent returns partial/failed results with empty response
- Fix error display during streaming TTS: error messages are shown in
full response box even when streaming box was already opened
- Add duplicate sentence filter in TTS: skip near-duplicate sentences
from LLM repetition
- Fix fake HA server state mutation: turn_on/turn_off/set_temperature
correctly update entity states; temperature sensor simulates change
when thermostat is adjusted
- Add _vprint() helper to suppress log output when stream_callback is active
- Expand Whisper hallucination filter with multi-language phrases and regex pattern for repetitive text
- Stop continuous voice mode when agent returns a failed result (e.g. 429 rate limit)
- Atomic check-and-set for _voice_recording flag with _voice_lock
- Guard _voice_stop_and_transcribe against concurrent invocation
- Remove premature flag clearing from Ctrl+R handler
- Clean up temp WAV files in finally block (_play_via_tempfile)
- Use buffer-level regex for <think> block filtering (handles chunked tags)
- Prevent /voice on prompt accumulation on repeated calls
- Include Groq in STT key error message
Move screen output from stream_callback to display_callback called by
TTS consumer thread. Text now appears sentence-by-sentence in sync with
audio instead of streaming ahead at LLM speed. Removes quiet_mode hack.
Stream audio to speaker as the agent generates tokens instead of
waiting for the full response. First sentence plays within ~1-2s
of agent starting to respond.
- run_agent: add stream_callback to run_conversation/chat, streaming
path in _interruptible_api_call accumulates chunks into mock
ChatCompletion while forwarding content deltas to callback
- tts_tool: add stream_tts_to_speaker() with sentence buffering,
think block filtering, markdown stripping, ElevenLabs pcm_24000
streaming to sounddevice OutputStream
- cli: wire up streaming TTS pipeline in chat(), detect elevenlabs
provider + sounddevice availability, skip batch TTS when streaming
is active, signal stop on interrupt
Falls back to batch TTS for Edge/OpenAI providers or when
elevenlabs/sounddevice are not available. Zero impact on non-voice
mode (callback defaults to None).
- Track submitted state locally instead of using racy qsize() check
- Allow Ctrl+R to stop recording even while agent is running
- Add double-start guard to prevent concurrent recording attempts
- Audio cues: beep on record start (880Hz), double beep on stop (660Hz)
- Silence detection: auto-stop recording after 3s of silence (RMS-based)
- Continuous mode: auto-restart recording after agent responds
- Ctrl+R starts continuous mode, Ctrl+R during recording exits it
- Waits for TTS to finish before restarting to avoid recording speaker
- Tests: 7 new tests for beep generation and silence detection
- Change record key from c-@ to c-r (Ctrl+R) for macOS compatibility
- Add missing tempfile and time imports that caused silent TTS crash
- Use MP3 output for CLI TTS playback (afplay doesn't handle OGG well)
- Strip markdown formatting from text before sending to TTS
- Remove duplicate transcript echo in voice pipeline
- Add multi-provider STT support (OpenAI > Groq fallback) in transcription_tools
- Auto-correct model selection when provider doesn't support the configured model
- Change voice record key from Ctrl+Space to Ctrl+R (macOS compatibility)
- Fix duplicate transcript echo in voice pipeline
- Add GROQ_API_KEY to .env.example
Salvaged from PR #932 by Wayne onto current main.
Apply skin-aware prompt symbols and live prompt_toolkit color refresh,
replace lingering hardcoded accent output with active-skin colors, keep
ANSI-safe response rendering, preserve secret-capture and approval-prompt
state handling, and add integration coverage for prompt state and style
refresh behavior.
Integrate tirith as a pre-execution security scanner that detects
homograph URLs, pipe-to-interpreter patterns, terminal injection,
zero-width Unicode, and environment variable manipulation — threats
the existing 50-pattern dangerous command detector doesn't cover.
Architecture: gather-then-decide — both tirith and the dangerous
command detector run before any approval prompt, preventing gateway
force=True replay from bypassing one check when only the other was
shown to the user.
New files:
- tools/tirith_security.py: subprocess wrapper with auto-installer,
mandatory cosign provenance verification, non-blocking background
download, disk-persistent failure markers with retryable-cause
tracking (cosign_missing auto-clears when cosign appears on PATH)
- tests/tools/test_tirith_security.py: 62 tests covering exit code
mapping, fail_open, cosign verification, background install,
HERMES_HOME isolation, and failure recovery
- tests/tools/test_command_guards.py: 21 integration tests for the
combined guard orchestration
Modified files:
- tools/approval.py: add check_all_command_guards() orchestrator,
add allow_permanent parameter to prompt_dangerous_approval()
- tools/terminal_tool.py: replace _check_dangerous_command with
consolidated check_all_command_guards
- cli.py: update _approval_callback for allow_permanent kwarg,
call ensure_installed() at startup
- gateway/run.py: iterate pattern_keys list on replay approval,
call ensure_installed() at startup
- hermes_cli/config.py: add security config defaults, split
commented sections for independent fallback
- cli-config.yaml.example: document tirith security config
- Introduced _approval_lock to ensure that approval prompts are handled sequentially, preventing state clobbering from parallel delegation subtasks.
- Updated approval_callback and HermesCLI methods to utilize the lock for managing approval state and deadlines.
- Added tests for the config bridging logic to ensure correct environment variable mapping from config.yaml.
Create a new session DB row when starting fresh from the CLI, reset the
agent DB flush cursor and todo state, and update session timing/session ID
bookkeeping so follow-up logging stays correct.
Also update slash-command descriptions and add regression tests for /new,
/reset, and /clear.
Supersedes PR #899.
Closes#641.
When a skill declares required_environment_variables in its YAML
frontmatter, missing env vars trigger a secure TUI prompt (identical
to the sudo password widget) when the skill is loaded. Secrets flow
directly to ~/.hermes/.env, never entering LLM context.
Key changes:
- New required_environment_variables frontmatter field for skills
- Secure TUI widget (masked input, 120s timeout)
- Gateway safety: messaging platforms show local setup guidance
- Legacy prerequisites.env_vars normalized into new format
- Remote backend handling: conservative setup_needed=True
- Env var name validation, file permissions hardened to 0o600
- Redact patterns extended for secret-related JSON fields
- 12 existing skills updated with prerequisites declarations
- ~48 new tests covering skip, timeout, gateway, remote backends
- Dynamic panel widget sizing (fixes hardcoded width from original PR)
Cherry-picked from PR #723 by kshitijk4poor, rebased onto current main
with conflict resolution.
Fixes#688
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
- Updated command output handling to use RichText for ANSI formatting.
- Improved response display in chat console with RichText integration.
- Ensured fallback for empty command outputs with a clear message.
* fix: ClawHub skill install — use /download ZIP endpoint
The ClawHub API v1 version endpoint only returns file metadata
(path, size, sha256, contentType) without inline content or download
URLs. Our code was looking for inline content in the metadata, which
never existed, causing all ClawHub installs to fail with:
'no inline/raw file content was available'
Fix: Use the /api/v1/download endpoint (same as the official clawhub
CLI) to download skills as ZIP bundles and extract files in-memory.
Changes:
- Add _download_zip() method that downloads and extracts ZIP bundles
- Retry on 429 rate limiting with Retry-After header support
- Path sanitization and binary file filtering for security
- Keep _extract_files() as a fallback for inline/raw content
- Also fix nested file lookup (version_data.version.files)
* chore: lower default compression threshold from 85% to 50%
Triggers context compression earlier — at 50% of the model's context
window instead of 85%. Updated in all four places where the default
is defined: context_compressor.py, cli.py, run_agent.py, config.py,
and gateway/run.py.
* fix: use session_key instead of chat_id for adapter interrupt lookups
monitor_for_interrupt() in _run_agent was using source.chat_id to query
the adapter's has_pending_interrupt() and get_pending_message() methods.
But the adapter stores interrupt events under build_session_key(source),
which produces a different string (e.g. 'agent:main:telegram:dm' vs '123456').
This key mismatch meant the interrupt was never detected through the
adapter path, which is the only active interrupt path for all adapter-based
platforms (Telegram, Discord, Slack, etc.). The gateway-level interrupt
path (in dispatch_message) is unreachable because the adapter intercepts
the 2nd message in handle_message() before it reaches dispatch_message().
Result: sending a new message while subagents were running had no effect —
the interrupt was silently lost.
Fix: replace all source.chat_id references in the interrupt-related code
within _run_agent() with the session_key parameter, which matches the
adapter's storage keys.
Also adds regression tests verifying session_key vs chat_id consistency.
* debug: add file-based logging to CLI interrupt path
Temporary instrumentation to diagnose why message-based interrupts
don't seem to work during subagent execution. Logs to
~/.hermes/interrupt_debug.log (immune to redirect_stdout).
Two log points:
1. When Enter handler puts message into _interrupt_queue
2. When chat() reads it and calls agent.interrupt()
This will reveal whether the message reaches the queue and
whether the interrupt is actually fired.
When a dangerous command is detected and the user is prompted for
approval, long commands are truncated (80 chars in fallback, 70 chars
in the TUI). Users had no way to see the full command before deciding.
This adds a 'View full command' option across all approval interfaces:
- CLI fallback (tools/approval.py): [v]iew option in the prompt menu.
Shows the full command and re-prompts for approval decision.
- CLI TUI (cli.py): 'Show full command' choice in the arrow-key
selection panel. Expands the command display in-place and removes
the view option after use.
- CLI callbacks (callbacks.py): 'view' choice added to the list when
the command exceeds 70 characters.
- Gateway (gateway/run.py): 'full', 'show', 'view' responses reveal
the complete command while keeping the approval pending.
Includes 7 new tests covering view-then-approve, view-then-deny,
short command fallthrough, and double-view behavior.
Closes community feedback about the 80-char cap on dangerous commands.
Adds --pass-session-id CLI flag. When set, the agent's system prompt
includes the session ID:
Conversation started: Sunday, March 08, 2026 06:32 PM
Session ID: 20260308_183200_abc123
Usage:
hermes --pass-session-id
hermes chat --pass-session-id
Implementation threads the flag as a proper parameter through the full
chain (main.py → cli.py → run_agent.py) rather than using an env var,
avoiding collisions in multi-agent/multitenant setups.
Based on PR #726 by dmahan93, reworked to use instance parameter
instead of HERMES_PASS_SESSION_ID environment variable.
Co-authored-by: dmahan93 <dmahan93@users.noreply.github.com>
- Custom endpoints can serve any model, so skip validation for
provider='custom' in validate_requested_model(). Previously it
would reject any model name since there's no static catalog or
live API to check against.
- Show clear setup instructions when switching to custom endpoint
without OPENAI_BASE_URL/OPENAI_API_KEY configured.
- Added curated model lists for Nous Portal and OpenAI Codex to
_PROVIDER_MODELS so /model shows their available models.
Both /model and /provider now show the same unified display:
Current: anthropic/claude-opus-4.6 via OpenRouter
Authenticated providers & models:
[openrouter] ← active
anthropic/claude-opus-4.6 ← current
anthropic/claude-sonnet-4.5
...
[nous]
claude-opus-4-6
gemini-3-flash
...
[openai-codex]
gpt-5.2-codex
gpt-5.1-codex-mini
...
Not configured: Z.AI / GLM, Kimi / Moonshot, ...
Switch model: /model <model-name>
Switch provider: /model <provider>:<model-name>
Example: /model nous:claude-opus-4-6
Users can see all authenticated providers and their models at a glance,
making it easy to switch mid-conversation.
Also added curated model lists for Nous Portal and OpenAI Codex to
hermes_cli/models.py.
Model selection now comes exclusively from config.yaml (set via
'hermes model' or 'hermes setup'). The LLM_MODEL env var is no longer
read or written anywhere in production code.
Why: env vars are per-process/per-user and would conflict in
multi-agent or multi-tenant setups. Config.yaml is file-based and
can be scoped per-user or eventually per-session.
Changes:
- cli.py: Read model from CLI_CONFIG only, not LLM_MODEL/OPENAI_MODEL
- hermes_cli/auth.py: _save_model_choice() no longer writes LLM_MODEL
to .env
- hermes_cli/setup.py: Remove 12 save_env_value('LLM_MODEL', ...)
calls from all provider setup flows
- gateway/run.py: Remove LLM_MODEL fallback (HERMES_MODEL still works
for gateway process runtime)
- cron/scheduler.py: Same
- agent/auxiliary_client.py: Remove LLM_MODEL from custom endpoint
model detection
Adds delegation.model and delegation.provider config fields so subagents
can run on a completely different provider:model pair than the parent agent.
When delegation.provider is set, the system resolves the full credential
bundle (base_url, api_key, api_mode) via resolve_runtime_provider() —
the same path used by CLI/gateway startup. This means all configured
providers work out of the box: openrouter, nous, zai, kimi-coding,
minimax, minimax-cn.
Key design decisions:
- Provider resolution uses hermes_cli.runtime_provider (single source of
truth for credential resolution across CLI, gateway, cron, and now
delegation)
- When only delegation.model is set (no provider), the model name changes
but parent credentials are inherited (for switching models within the
same provider like OpenRouter)
- When delegation.provider is set, full credentials are resolved
independently — enabling cross-provider delegation (e.g. parent on
Nous Portal, subagents on OpenRouter)
- Clear error messages if provider resolution fails (missing API key,
unknown provider name)
- _load_config() now falls back to hermes_cli.config.load_config() for
gateway/cron contexts where CLI_CONFIG is unavailable
Based on PR #791 by 0xbyt4 (closes#609), reworked to use proper
provider credential resolution instead of passing provider as metadata.
Co-authored-by: 0xbyt4 <0xbyt4@users.noreply.github.com>
Combined implementation of reasoning management:
- /reasoning Show current effort level and display state
- /reasoning <level> Set reasoning effort (none, low, medium, high, xhigh)
- /reasoning show|on Show model thinking/reasoning in output
- /reasoning hide|off Hide model thinking/reasoning from output
Effort level changes persist to config and force agent re-init.
Display toggle updates the agent callback dynamically without re-init.
When display is enabled:
- Intermediate reasoning shown as dim [thinking] lines during tool loops
- Final reasoning shown in a bordered box above the response
- Long reasoning collapsed (5 lines intermediate, 10 lines final)
Also adds:
- reasoning_callback parameter to AIAgent
- last_reasoning in run_conversation result dict
- show_reasoning config option (display section, default: false)
- Display section in /config output
- 34 tests covering both features
Combines functionality from PR #789 and PR #790.
Co-authored-by: Aum Desai <Aum08Desai@users.noreply.github.com>
Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
Authored by teyrebaz33. Closes#643.
- /personality none/default/neutral clears system prompt overlay
- Dict format personalities with description, tone, style fields
- Works in both CLI and gateway
- 18 tests
HermesCLI.__init__ never assigned self.config, causing an
AttributeError ('HermesCLI' object has no attribute 'config')
whenever an unrecognized slash command fell through to the
quick_commands check on line 2838. This affected skill slash
commands like /x-thread-creation since the quick_commands lookup
runs before the skill command check.
Set self.config = CLI_CONFIG in __init__ to match the pattern used
by the gateway (run.py:199).
HermesCLI.__init__ never assigned self.config, causing an
AttributeError ('HermesCLI object has no attribute config')
whenever an unrecognized slash command fell through to the
quick_commands check (line 2832). This broke skill slash commands
like /x-thread-creation since the quick_commands lookup runs
before the skill command check.
Set self.config = CLI_CONFIG in __init__, matching the pattern
used by the gateway (run.py:199).
Authored by teyrebaz33. Adds config-driven quick commands that execute
shell commands without invoking the LLM — zero token usage, works from
Telegram/Discord/Slack/etc. Closes#744.
- Organize COMMANDS into COMMANDS_BY_CATEGORY dict
- Group commands: Session, Configuration, Tools & Skills, Info, Exit
- Add visual category headers with spacing
- Maintain backwards compat via flat COMMANDS dict
- Better visual hierarchy and scannability
Before:
/help - Show this help message
/tools - List available tools
... (dense list)
After:
── Session ──
/new Start a new conversation
/reset Reset conversation only
...
── Configuration ──
/config Show current configuration
...
Closes#640
Adds -Q/--quiet to `hermes chat` for use by external orchestrators
(Paperclip, scripts, CI). When combined with -q, suppresses:
- Banner and ASCII art
- Spinner animations
- Tool preview lines (┊ prefix)
Only outputs:
- The agent's final response text
- A parseable 'session_id: <id>' line for session resumption
Usage: hermes chat -q 'Do something' -Q
Used by: Paperclip adapter (@nousresearch/paperclip-adapter-hermes)
Shows an immediate status message and braille spinner for slow slash
commands (/skills search|browse|inspect|install, /reload-mcp). Makes
input read-only while the command runs so the CLI doesn't appear frozen.
Cherry-picked from PR #714 by vilkasdev, rebased onto current main
with conflict resolution and bug fix (get_hint_text duplicate return).
Fixes#636
Co-authored-by: vilkasdev <vilkasdev@users.noreply.github.com>
Use rich_box.HORIZONTALS instead of the default ROUNDED box style
for the agent response panel. This keeps the top/bottom horizontal
rules (with title) but removes the vertical │ borders on left and
right, making it much easier to copy-paste response text from the
terminal.
Major UX improvements:
1. Response box now uses a Rich Panel rendered through ChatConsole
instead of hand-rolled ANSI box-drawing borders. Rich Panels
adapt to terminal width at render time, wrap content inside
the borders properly, and use skin colors natively.
2. ChatConsole now reads terminal width at render time via
shutil.get_terminal_size() instead of defaulting to 80 cols.
All Rich output adapts to the current terminal size.
3. User-input separator reduced to fixed 40-char width so it
never wraps regardless of terminal resize.
4. Approval and clarify countdown repaints throttled to every 5s
(was 1s), dramatically reducing flicker in Kitty/ghostty.
Selection changes still trigger instant repaints via key bindings.
5. Sudo widget now uses dynamic _panel_box_width() instead of
hardcoded border strings.
Tests: 2860 passed.
Three UI improvements:
1. Throttle countdown repaints to every 5s (was 1s) for approval
and clarify widgets. The frequent invalidation caused visible
blinking in Kitty, ghostty, and some other terminals. Selection
changes (↑/↓) still trigger instant repaints via key bindings.
2. Make echo Link2them00n. | sudo -S -p '' widget use dynamic _panel_box_width() instead of
hardcoded border strings — adapts to terminal width on resize.
3. Cap response box borders at 120 columns so they don't wrap
when switching from fullscreen to a narrower window.
Tests: 2857 passed.
Authored by 0xbyt4.
_approval_callback() had no return statement after the timeout break,
causing it to return None instead of 'deny'. Callers in approval.py
expect one of 'once', 'session', 'always', or 'deny'. This matches
the existing timeout behavior in approval.py:209.
Adds 3 new built-in skins (poseidon, sisyphus, charizard) with full
customization — colors, spinner faces/verbs/wings, branding text, and
custom ASCII art banner logos. Total: 7 built-in skins.
Also adds banner_logo and banner_hero fields to SkinConfig, allowing
any skin to replace the HERMES-AGENT ASCII art logo and the caduceus
hero art with custom artwork. The CLI now renders the skin's logo when
available, falling back to the default Hermes logo.
Skins with custom logos: ares, poseidon, sisyphus, charizard
Skins using default logo: default, mono, slate
cli.py had a local copy of build_welcome_banner() that shadowed the
imported one from banner.py. This local copy had all colors hardcoded,
so /skin changes had no visible effect on the banner.
Now the local copy resolves skin colors at render time using
get_active_skin(), matching the banner.py behavior. All hardcoded
#FFD700/#CD7F32/#FFBF00/#B8860B/#FFF8DC/#8B8682 values in the local
function are replaced with skin-aware lookups.
Authored by unmodeled-tyler. Adds openclaw-migration skill to optional-skills/
with migration script, SKILL.md, and 7 tests. Also improves clarify/approval
panel rendering with dynamic width calculation.
Cherry-picked and improved from PR #470 (fixes#464).
Problem: On Ubuntu 24.04 with ghostty + tmux, the prompt input box
border lines flash due to cursor blink and raw spinner terminal writes
conflicting with prompt_toolkit's rendering.
Changes:
- cli.py: Add CursorShape.BLOCK to Application() to disable cursor blink
- cli.py: Add thinking_callback + spinner_widget in TUI layout so
thinking status displays as a proper prompt_toolkit widget instead of
raw terminal writes that conflict with the TUI renderer
- run_agent.py: Add thinking_callback parameter to AIAgent; when set,
uses the callback instead of KawaiiSpinner for thinking display
What was NOT changed (preserving existing behavior):
- agent/display.py: Untouched. KawaiiSpinner _write() stdout capture,
_animate() logic, and 0.12s frame interval all preserved. This
protects subagent stdout redirection and keeps smooth animations
for non-CLI contexts (gateway, batch runner).
- Original emoji spinner types (brain/sparkle/pulse/moon/star) preserved
for all non-CLI contexts.
Fixes from original PR #470:
- CursorShape.STEADY_BLOCK -> CursorShape.BLOCK (STEADY_BLOCK doesn't
exist in prompt_toolkit 3.0.52)
- Removed duplicate self._spinner_text = '' line
- Removed redundant nested if-checks
Tested: 2706 tests pass, interactive CLI verified via tmux.
The Docker backend already supports user-configured volume mounts via
docker_volumes, but it was undocumented — missing from DEFAULT_CONFIG,
cli.py defaults, and configuration docs.
Changes:
- hermes_cli/config.py: Add docker_volumes to DEFAULT_CONFIG with
inline documentation and examples
- cli.py: Add docker_volumes to load_cli_config defaults
- configuration.md: Full Docker Volume Mounts section with YAML
examples, use cases (providing files, receiving outputs, shared
workspaces), and env var alternative
Closes#643
Changes:
- /personality none|default|neutral — clears system prompt overlay
- Custom personalities in config.yaml support dict format with:
name, description, system_prompt, tone, style directives
- Backwards compatible — existing string format still works
- CLI + gateway both updated
- 18 tests covering none/default/neutral, dict format, string format,
list display, save to config
The full HERMES-AGENT ASCII logo needs ~95 columns, and the
side-by-side caduceus + tools panel needs ~80. In narrow terminals
(Kitty default, resized windows) everything wraps into visual garbage.
Fixes:
- show_banner() auto-detects terminal width and falls back to compact
banner when < 80 columns
- build_welcome_banner() skips the ASCII logo when < 95 columns
- Compact banner now dynamically sized via _build_compact_banner()
instead of a hardcoded 64-char box that also wrapped in narrow terms
- Same width checks applied to /clear command's banner refresh
The up/down arrow key issue in Kitty terminal for multiline input is
a known Kitty keyboard protocol (CSI u) vs prompt_toolkit compatibility
gap — arrow keys work correctly in standard terminals and tmux. Users
can work around it by running in tmux or setting TERM=xterm-256color.
Enforce owner-only permissions on files and directories that contain
secrets or sensitive data:
- cron/jobs.py: jobs.json (0600), cron dirs (0700), job output files (0600)
- hermes_cli/config.py: config.yaml (0600), .env (0600), ~/.hermes/* dirs (0700)
- cli.py: config.yaml via save_config_value (0600)
All chmod calls use try/except for Windows compatibility.
Includes _secure_file() and _secure_dir() helpers with graceful fallback.
8 new tests verify permissions on all file types.
Inspired by openclaw v2026.3.7 file permission enforcement.
New config option:
security:
redact_secrets: false # default: true
When set to false, API keys, tokens, and passwords are shown in
full in read_file, search_files, and terminal output. Useful for
debugging auth issues where you need to verify the actual key value.
Bridged to both CLI and gateway via HERMES_REDACT_SECRETS env var.
The check is in redact_sensitive_text() itself, so all call sites
(terminal, file tools, log formatter) respect it.
Adds a simple config option to play the terminal bell (\a) when the
agent finishes a response. Useful for long-running tasks — switch to
another window and your terminal will ding when done.
Works over SSH since the bell character propagates through the
connection. Most terminal emulators can be configured to flash the
taskbar, play a sound, or show a visual indicator on bell.
Config (default: off):
display:
bell_on_complete: true
Closes#318
New browser capabilities and a built-in skill for agent-driven web QA.
## New tool: browser_console
Returns console messages (log/warn/error/info) AND uncaught JavaScript
exceptions in a single call. Uses agent-browser's 'console' and 'errors'
commands through the existing session plumbing. Supports --clear to reset
buffers. Verified working in both local and Browserbase cloud modes.
## Enhanced tool: browser_vision(annotate=True)
New boolean parameter on browser_vision. When true, agent-browser overlays
numbered [N] labels on interactive elements — each [N] maps to ref @eN.
Annotation data (element name, role, bounding box) returned alongside the
vision analysis. Useful for QA reports and spatial reasoning.
## Config: browser.record_sessions
Auto-record browser sessions as WebM video files when enabled:
- Starts recording on first browser_navigate
- Stops and saves on browser_close
- Saves to ~/.hermes/browser_recordings/
- Works in both local and cloud modes (verified)
- Disabled by default
## Built-in skill: dogfood
Systematic exploratory QA testing for web applications. Teaches the agent
a 5-phase workflow:
1. Plan — accept URL, create output dirs, set scope
2. Explore — systematic crawl with annotated screenshots
3. Collect Evidence — screenshots, console errors, JS exceptions
4. Categorize — severity (Critical/High/Medium/Low) and category
(Functional/Visual/Accessibility/Console/UX/Content)
5. Report — structured markdown with per-issue evidence
Includes:
- skills/dogfood/SKILL.md — full workflow instructions
- skills/dogfood/references/issue-taxonomy.md — severity/category defs
- skills/dogfood/templates/dogfood-report-template.md — report template
## Tests
21 new tests covering:
- browser_console message/error parsing, clear flag, empty/failed states
- browser_console schema registration
- browser_vision annotate schema and flag passing
- record_sessions config defaults and recording lifecycle
- Dogfood skill file existence and content validation
Addresses #315.
When the primary model/provider fails after retries (rate limit, overload,
auth errors, connection failures), Hermes automatically switches to a
configured fallback model for the remainder of the session.
Config (in ~/.hermes/config.yaml):
fallback_model:
provider: openrouter
model: anthropic/claude-sonnet-4
Supports all major providers: OpenRouter, OpenAI, Nous, DeepSeek, Together,
Groq, Fireworks, Mistral, Gemini — plus custom endpoints via base_url and
api_key_env overrides.
Design principles:
- Dead simple: one fallback model, not a chain
- One-shot: switches once, doesn't ping-pong back
- Zero new dependencies: uses existing OpenAI client
- Minimal code: ~100 lines in run_agent.py, ~5 lines in cli.py/gateway
- Three trigger points: max retries exhausted, non-retryable client errors,
and invalid response exhaustion
Does NOT trigger on context overflow or payload-too-large errors (those
are handled by the existing compression system).
Addresses #737.
25 new tests, 2492 total passing.